Advogato's Number On the Road: Paris

This week's installment of Advogato's Number is brought to you from
United flight 999, Boston to SFO, written on my old laptop, the
trusty spiral-bound notebook and Uni-ball Vision pen. My TP600 is
grounded with battery problems and besides, they'll even let me run
the notebook during takeoffs
and landings. I've been on the road for a couple of weeks now, and
will be home for a month or so before Guadec.

The first week of this month, Advogato was at Linux Expo Paris. He spent the
day before the conference strolling the city, observing the
culture. Wine, cheese, and bread are very important to French culture
(and Advogato certainly appreciates fine French cheeses!), but it
seems to me that the culture is defined primarily by the French
language. Having recently read Words and Rules by
Steven Pinker, I kept an ear out for language clues. It is very clear,
when you look, that French and English were once the same language -
so many of the words are the same, even if they have diverged in
shades of meaning.

Yet, they certainly are different languages now, and the Académie
Française is charged with a special mission to maintain the integrity
of the language, especially against the gale-force winds of the
English language. Consequently, while most languages in the world have
adopted English terms for computers and other technical fields, in
French there is a new, parallel set.

For example, "free software operating system" is rendered in French as
"système d'exploitation libre." "Système" is the same word as in
English, and "exploitation" is similar, but without the negative
connotation - a system for exploiting, or operating, a computer
("ordinateur"). Finally, "libre" means free, but in the sense of free
speech ("liberty") rather than free beer ("gratis").

"Système d'exploitation libre" was very much the theme of the
conference, in both the French and English senses of the words. Linux
has made it big in Europe, as well as in the US. The show floor was
packed with slick booths of large, powerful companies, eager to show
off their impressive marketing budgets. Yet, most of what they were
selling was plain, old-fashioned proprietary software to run on top of
Linux. Advogato spent an hour or two wandering the booths, then was
frankly bored - not much of the stuff being sold, or money being
spent, is really advancing the cause of making a complete free
software system into a richer, more powerful tool.

Nowhere was this contrast more striking than in Michael Cowpland's
keynote address, which was basically an extended advertisement for
Corel Linux and Corel Corporation. Cowpland spent a good part of the
talk extolling the benefits of Linux and free software in general,
including the robustness that comes from having so many eyes on the
source. He then proceeded smoothly to promote Corel's proprietary
software products running on Linux, such as Corel Office and Corel
Draw. Advogato could feel the gears shifting in his head. For some
strange reason, Cowpland did not feel it necessary to explain that,
since these products are not free software, users shouldn't
expect any of the benefits, such as the careful attention to
reliability and compatibility so characteristic of free software.

Advogato has no problem with the existence of proprietary software,
and it certainly makes a lot of sense to choose a solid, well
supported operating system on which to run this software. But I do
have a problem with the blurring between the two in the public's mind,
and it seems to me that this blurring is being actively promoted by
many Linux vendors. Linux is well known as a spectacularly successful
example of free software, yet when people plunk down a wad of cash to
buy some Linux, most of the time they're getting a significant dose of
proprietary software as well.

Why does this matter? The conference session on clustering provided an
excellent example. After the Mosix talk, an intriguing
research project, Alinka and Suse presented their clustering
solutions, both based on proprietary software. These systems probably
work very well, and the people who buy them should no doubt consider
them money well spent. However, I think it's fair to say that the
SRO audience in the room was blown away by Peter Braam's talk on the
work of the Linux-HA project,
which is entirely based on free software. It's clear that this project
intends to solve the hard problems of making clusters Just Work,
such as making sure that systems go down and recover in the
right sequence. They've come up with some very impressive stuff,
including the Intermezzo file
system and a general, scalable distributed lock manager. In the true
spirit of academic inquiry, much of the work is based on previously
published results, particularly the DEC VAXcluster systems of the early
'80s. I don't see how this kind of work would be possible without the
open cooperation of lots of bright people that free software
brings. It seems likely to me that this project will bear sweet fruit.

The opportunity to build software systems that suck less is one of the
main reasons I'm involved with free software. Yes, as Richard
Stallman's talk strongly emphasized, the freedom is important
too. Much of RMS's talk was taken up by a history of the Gnu project and a clear, compelling
description of the freedoms guaranteed by free software.

He then presented a plea that people call the complete operating
system with the Linux kernel "GNU/Linux" rather than simply "Linux."
Advogato has always been dubious of this request, because a Linux
system contains many components from many different sources,
each of which is essential for the system as a whole. In fact, the
Gnu project's adoption of X as
the windowing system is entirely analogous to RMS's assertion that the
Linux project adopted Gnu tools. But nobody is insisting that Gnu be
called "X/Gnu", for example.

It's easy to pick on RMS for coming across as a glory hound (or for
having sour grapes about the relative failure of Hurd), but to do so
misses an important point: calling the system "GNU/Linux" emphasizes
the free software nature of the project, and brings the
philosophy and goals of the Gnu project to the public's attention. In
a time when many users are probably quite confused about the roles of
free and proprietary software, this can only be a good thing. Maybe
it's worth swallowing a bit of pride and using the term, if only for
the pragmatic purposes of helping to educate the masses.

RMS described the requirements for being a Saint in the church of Gnu
(alongside St. IGNUcius himself); you simply have to eschew the use
of proprietary software. Many in the audience were uncomfortable with
this, at least until Mozilla
ships.

My talk on libart went well, followed by an excellent presentation by
Miguel de Icaza on new technologies in Gnome. He introduced his talk
with a slide about Helix Code,
the company he recently founded with his friend Nat Friedman, and
clarified that "no, we're not lovers, and no, we're not gay." Ok,
sure!

Overall, I had a great time in Paris, and was made to feel very
welcome by French free software hackers, and I especially enjoyed
meeting David Turner of FreeType
fame and Mathieu Lacage, the organizer of Guadec (even if Mathieu did stick
me with the check at dinner the first night!).

On the flight back, I was reminded, though, of the fact that our
shared interest in free software doesn't completely break down the
language barrier. After all, all the hackers I talked to spoke
English. Even though Linux is a worldwide phenomenon, it's still
worth remembering that the free software community is dominated by the
English language. One more reason why good, free documentation is
important: it makes life easier for the hard-working translators, and
makes free software more accessible to people who don't speak English.

I'm really looking forward to going back to Paris next month for
Guadec. Hopefully, I'll see you there!

Of course the English domination issue isn't just a free software
problem, but a
pretty widespread problem throughout the scientific and software
industries. From a coding perspective, one big suggestion I would have,
now that I am coming at it from the other direction (i.e. reading German
code), is that when you comment, use full words.

It is relatively easy
to look up a full word in a dictionary, but it can be really difficult
to figure out an abbreviation.

Translation of software in general is, in my opinion, not that great a problem. As English is taught to a larger and larger
percentage of the world's population in school, it will become less of a problem. It's a lot more interesting to see things like Pango, which will let you write and display text in all languages. This is a true point for
internationalization. Having the pulldown menus in every obscure language you can think of is not.

As for comments in code, and code in general (that is, identifier names), that should be in English no matter what, period. I'm
not a native English speaker, but I can read and understand English quite well, and I'll be pissed off if code I have to work with
is in Spanish (which happens quite a bit here, as well as MixedCaseIdentifiers, but I digress). If there is any chance
your code is going to be looked at by anyone else at all, your comments and identifiers should be in English (and you should
use accepted terms for things, which assumes a certain CompSci background, but that's a different issue).

raph, I would like to know what you imply here. quoting you:
""exploitation" is similar, but without the negative connotation - a
system for exploiting, or operating, a computer". What negative
connotation is involved in "operating" ? I probably have missed
something here...

I was talking about the negative connotation in "exploiting."
"Operating" really has no negative connotation in English.

I've definitely experienced the language barrier in the other direction.
For example, the GYVE project obviously has some very talented people
working on it, and was started well in advance of Gill. However, I was
very frustrated in trying to communicate the differences in vision,
particularly that I thought the libart imaging model had a much brighter
future than Display PostScript. So we ended up going our separate ways,
and another possibility for collaboration was missed. I have a feeling
things could have gone better without the language barrier.

Yet, I also feel that forcing the free software community to use English
is arrogant and imperialistic in a way. Monocultures are not good in
ecology, and I can't imagine they're any healthier in free software.
Yes, science has the same problem, but science is basically a sport for
rich countries, or at best a small elite within poor countries. I expect
a lot of people around the world to be using Linux and other free
software over the next few years.

Yet, I also feel that forcing the free software community to use English is arrogant and imperialistic
in a way.

Do you really see it that way? The forcing part I mean.

As with science and technology so with computers. I once worked for a Danish company (Danfoss), and when asked why all of their
senior staff were required to speak English, and why much of their documentation was in English, the response was, "We have to."
But not in the sense that they were being forced to; rather, since Danfoss is a worldwide business, all of their offices have to communicate
with each other, and communicate very specific and unambiguous things, such as schematics and notation. It is much more efficient for all to
communicate in the same language rather than having each office learn the language of every other office.

I was involved in a meeting in Sonderburg with representatives from France, Germany, Japan, Korea, Sweden and the UK (I am
from the US), and of course Denmark. We all had to speak the same language or things would have been very difficult.

Of course it makes it easier for people of English speaking countries. And I really feel, well, left out by not speaking other languages.
But I do not feel lazy, as many have labeled Americans, because English is the most common language in so many technical things. I think it
just worked out that way.

RMS described the requirements for being a Saint...you
simply have to eschew the use of proprietary software.

This is getting off topic, but one of my biggest complaints with RMS
is his refusal to apply the ideas of free software to anything but his
particular definition of software. It's nice to say that you shouldn't
use proprietary software; I agree with him. But I'd also like the
hardware my system runs on to be a product of free software. When I
ask him about this possibility he laughs it off, saying "there's no such
thing as a car copier." True enough, but not an argument. I think he's
wrong, and
that bothers me much more than any possible 'glory hounding'.

I'm not convinced that the lack of desktop chip fabs is all that
different from the situation 15 years ago, when RMS quit his job at MIT
to write free software--on a proprietary workstation owned by the
university. Of course the issues are different, and the only way I see
"everybody" buying a chip fabricator like they're buying computers is if
it also happens to make coffee, decorative craftwork, and everything
else. But that doesn't mean an interested developer couldn't build one
in the garage, based on cheap prototyping techniques developed in the
free by a community, and amortize material and equipment costs by
running things for others locally. We've done that model in software
already. It's called a mainframe. Remember "5 or so computers should be
enough for the nation's needs"?

One interesting thing to note about code is that APIs are usually
designed with the English language in mind (actually, I have NEVER used any
non-English API). As a consequence, code which mixes non-English variable
names and function names with English APIs should be banished.

I have been obliged to use some code from some networking profs here
at uni (people who should NEVER be asked to code anything, but that is
off-topic), and the most difficult part was getting used to the
intermingling of English API calls with French variable names (although I
am fluent in English and French is my native language). It is extremely
disturbing and requires a constant mental switch to read such code
(actually, the code was even worse than just this, but... So many buffer
overflows...).

Although, now that I think of it, I recall using a graphics library in
French. It was one of my worst experiences, but I think this was mainly
due to the fact that the API was simply badly designed.

First, a bit of background about myself: sometimes, I like to describe
myself as a French-speaking Belgian guy working in English in Germany
for a Swedish company (and driving through the Netherlands in order to
go from my home in Belgium to work in Germany). I probably speak and
write in English more often than in French, mostly because I work in a
big company in which a good knowledge of English is required and also
because I still do not know enough German so I use English when I talk
with my German co-workers.

As a researcher and developer working for an international company,
there is no question about what language should be used when writing
code or documentation: everything must be in English. It is likely that
the next person who will maintain the code that I am writing now does
not understand my native language and I probably do not understand hers,
so English is the only way to ensure that we can communicate.

But even if my job did not require it, it would not make sense for me
to write my code in a language other than English. Even if I write
some code for myself and I do not intend to give it to anyone else.
Most of the APIs that I ever had to use are using English names for all
functions, types and classes, so it looks very
bad if I mix that with French names. It could also be confusing, if
you consider that the French translation for "a queue" is "une file" and
the translation for "a file" is "un fichier". You can get in trouble if
you read some code that contains function calls like: queue_append
(file, fichier); or if you are not sure about the best mapping
between the words in the two languages: draw_outline (calque, fond,
bord);. As others have mentioned, it gets even worse if you start
using abbreviations.
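
The queue/file trap above can be sketched in a few lines. This is only a hypothetical illustration in Python, not code from any real project: queue_append, file and fichier come from the example call above, while the file name, pending_jobs and report_file_name are invented here for the all-English version.

```python
from collections import deque


def queue_append(queue, item):
    """Append item to the end of queue and return the queue."""
    queue.append(item)
    return queue


# Mixed French/English naming: "file" is French for "queue" and
# "fichier" is French for "file". A reader who only knows English
# will assume that "file" holds a file, not a queue.
file = deque()
fichier = "rapport.txt"  # hypothetical file name
queue_append(file, fichier)

# The same logic with English-only names (invented for this sketch).
pending_jobs = deque()
report_file_name = "rapport.txt"
queue_append(pending_jobs, report_file_name)
```

Both halves do exactly the same thing; only the second one can be read without knowing that the French "file" is a false friend.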

But on the other hand, some non-English CS students are faced with
the opposite problem: they do not know English well enough, so they try
to use their native language as much as possible in the code in order to
be able to understand it later. A good friend of mine once told me
something like: "It is already hard enough to understand the API
written in this cryptic English language, so why should I make my task
even more difficult by naming my variables and functions in a language
that I am not familiar with?" Well, he was right to some extent.
Besides, as a student, you often have to write your code and / or your
comments in French anyway (or in German, Spanish, or whatever is your
native language) because the other students and even your professor
could think that you are a pedantic defender of the English language or
that you simply copied the code from a book or from a more experienced
programmer and you forgot to rename the variables appropriately. Don't
laugh, I know that it happens...

Looking back at the first programs I wrote (good old ZX81, then ZX
Spectrum), I remember that I started with something like: "code and
comments in French," then later "code in English and comments in French"
and finally everything in English. One must be reasonably familiar with
the language before being able to name the functions and variables
correctly (and also being able to re-read one's own code later).
Learning a language takes time, and this applies to programming
languages as well as natural languages. It would be foolish to require
all CS students to use English exclusively in their code: many would
fail or would be disgusted.

But once you are past the first few learning steps, it is vital to
use English for everything. Because at some point in time, you learn
more by reading, modifying and trying to understand code written by
others than by reading books or writing programs in isolation. As soon
as you have to share code with others, English is the only good choice.
Besides, all the good reference books are written in English. Even the
translated versions contain enough references to English terms that you
could not read them if you do not know English.

This is especially true for Open Source programs: reading and
writing OSS is an excellent learning tool. This is a good opportunity
to share code, tips and tricks with experienced programmers from all
around the world. But it is only possible if everything is written in
English. Releasing a nice Open Source program written entirely in
French is probably not more useful to a Japanese developer than if the
source code had not been released at all. And vice-versa.

I believe that beyond a certain point, you cannot gain more
experience if you do not read and write code in English. And since
Open Source is an
excellent learning tool, I would conclude with: In order to be a
good Open Source programmer, one must be good at
English.

Hmmm... That was a long comment for a short conclusion. I will try to
write shorter comments next time... :-)

Programming is not the only field which uses a specific language. Ballet uses French for its technical terms. Music uses Italian.

Admittedly, these don't require fluency in a different human-language just to work in the field. I felt it appropriate, however, to point out
that there is precedent - and precedent other than medieval scholarship!

I disagree with an earlier comment that claimed that translating
programs (menus etc...) was unnecessary, since today young people in
most countries are learning English.

I think this comment disregards the fact that, even though there have
been improvements in foreign language teaching, people, even young, even
in highly developed countries, do not speak English well. They
are more often than not able to complete simple tasks such as asking for
directions, but are not able to read English documentation.

Do I exaggerate? No. I teach Java and basic algorithmics in a French university. My students asked
me for documentation. I pointed them to the official site for Java. They just could
not believe no documentation was available in French. Sure, the problem
might be that English is taught as a foreign language, not as a language
you need for day-to-day activities such as programming. A segmented view
of education is surely responsible for this "English belongs in the
English class" thought.

Nevertheless, that is what happens in real life at some allegedly good
university in a major city in one of the richest countries in the world.
Now imagine the situation in the middle of countries that cannot afford
the luxury of running extensive education systems.

We in the technical community have become so used to writing programs,
documentation, e-mail and WWW pages in English that we forget that
for normal people out there, English is just a faint memory from
high school days. It is the case for my mother, for example.

Providers of commercial proprietary software have long understood the
need for localized software. Let us not fool ourselves.

The problem with the internationalization and localization of software
is that it is often inadequate and amateurish. Often, the spelling of
the translated strings is approximate; some translations in Gnome or
the GNU Libc show that the translator did not have proper command of
English and/or the target language.

Is there any forum on internationalization issues?
