
Sooner or later that will need to be defined. I know next to nothing
about locales. (I know I dislike the design C++ uses.)
I was thinking of a design along the following lines. There are RFCs
and other standards dedicated to locale nomenclature:
http://tools.ietf.org/html/rfc4646 for language names
http://www.unicode.org/cldr/ for various locale names
So we know the basic names we want to follow, which is one less burden.
Then what I want to do is to define a hierarchical string table that
fills the appropriate names.
This is in opposition to defining an actual class hierarchy that mimics
the localization table. I think a hierarchical string table is better
because it allows simple extensibility.
The type stored by each slot of a locale is:
    Algebraic!(
        int,
        string,
        Variant delegate(Variant),
        This[string]);
meaning that a locale could store one of these types. (What else should
go in there?)
The access pattern goes like:
    // Get the date display pattern
    auto pat = myLocale.get("calendars", "calendar=default",
        "dateFormats", "dateFormatLength=medium", "pattern");
This will return an Algebraic with a string in it. The string looks like
e.g. "yyyy-MM-dd".
The access is rather verbose because the corresponding locale names tree
is equally (actually more) verbose, see
http://unicode.org/Public/cldr/1.6.1/core.zip. But the flexibility and
the standards compliance are there. Later we may add some convenience
functions for frequently used stuff such as dates, times, and numbers.
Extension is obvious:
    myLocale.put("my-category", "my-slot", "whatever");
Later, getting the stuff in "my-category", "my-slot" will return an
Algebraic containing the string "whatever".
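A minimal sketch of the get/put access pattern described above. It is only an illustration under stated assumptions: the `Node` and `Locale` classes here are hypothetical, each slot holds just a string (not the proposed Algebraic), and the last argument to `put` is taken to be the value, as in the "whatever" example.

```d
import std.stdio : writeln;

/// Hypothetical tree node: a leaf uses `value`, a subtable uses `children`.
final class Node
{
    string value;
    Node[string] children;
}

/// Hypothetical Locale sketch; not an existing library type.
final class Locale
{
    private Node root;

    this() { root = new Node; }

    /// Stores the last argument under the path formed by the preceding
    /// arguments, creating intermediate subtables as needed.
    void put(string[] args...)
    {
        assert(args.length >= 2, "need at least one key and a value");
        Node node = root;
        foreach (key; args[0 .. $ - 1])
        {
            if (auto p = key in node.children)
            {
                node = *p;
            }
            else
            {
                auto child = new Node;
                node.children[key] = child;
                node = child;
            }
        }
        node.value = args[$ - 1];
    }

    /// Returns the string stored at `path`, or null if any key is missing.
    string get(string[] path...)
    {
        Node node = root;
        foreach (key; path)
        {
            auto p = key in node.children;
            if (p is null)
                return null;
            node = *p;
        }
        return node.value;
    }
}

void main()
{
    auto myLocale = new Locale;
    myLocale.put("calendars", "calendar=default", "dateFormats",
                 "dateFormatLength=medium", "pattern", "yyyy-MM-dd");
    myLocale.put("my-category", "my-slot", "whatever");

    writeln(myLocale.get("calendars", "calendar=default", "dateFormats",
                         "dateFormatLength=medium", "pattern"));
    writeln(myLocale.get("my-category", "my-slot"));
}
```

Extension needs no new types in this sketch: a new category is just another path in the same table.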
There will be a global reference to a Locale class, e.g. defaultLocale.
By default the reference will be null, implying the C locale should be
in effect. Applications can assign to it as they find fit, and also pass
around multiple locale variables.
So I wanted to gather some good ideas about locale design. Is a
string-and-Algebraic design good for all uses? What kind of locale
functionality does it not capture? I must have missed a ton of details,
so if you don't understand what I mean by the above, it must be me.
Andrei

Andrei Alexandrescu wrote:
> There will be a global reference to a Locale class, e.g. defaultLocale.
> By default the reference will be null, implying the C locale should be
> in effect. Applications can assign to it as they find fit, and also pass
> around multiple locale variables.
I disagree with being able to assign to the global defaultLocale. This
is going to cause endless problems. Just one is that any function that
uses locale can no longer be pure. defaultLocale should be immutable.
Any function that is locale aware should be parameterized with a locale
parameter. (Not only is that better design, it self-documents the
dependency.)

Andrei Alexandrescu wrote:
> Sooner or later that will need to be defined. I know next to nothing
> about locales. (I know I dislike the design C++ uses.)
D uses UTF-8, and that is *good enough*!
This lets my programs "understand" Finnish, and doesn't give me undue
headaches.
Seriously tending to locale issues would be an *endless swamp*. Just for
this, I looked up something suitable to read:
http://www.manpagez.com/man/1/perllocale/
It may even be that you would find the time, but think about Walter and
us, please. There *really are* other things to do.
An excellent string hierarchy without the entire rest of i18n, is only
going to look like a Ferrari with a Trabant engine. Which is worse than
nothing at all.
Besides, there's more to this than just designing the perfect, or even a
good locale system in a language. *Somebody should actually use it*.
Now, the non-English programmer, what does he really want? He wants to
be able to type stuff into his program in his native character set. D
already does that, by way of UTF-8.
What else? Well, it is conceivable that he wants his program to print
dates and times the way it's done over there. He simply writes the
program "by hand" so it does dates and times like he wants. Even if
there was a locale thing in the language, he wouldn't bother with the
hassle. And he couldn't care less about Urdu.
The hypothetical Ambitious Programmer might want to use locale. He could
then have the dates and times (and currencies, etc.) follow the country.
Now, that might sound commendable, but in practice it *crumbles*.
He can't possibly know how to deal with languages that are written
backwards, languages where several characters make one letter, exotic
ways of writing dates, etc.
So, his fancy i18n project is doomed to be, at most, as usable as the
"normal" D program. Probably less, since his decisions will actually
worsen the user experience -- for users in another culture.
And, any project big enough to tackle this, will implement its own
locale handling anyway. I'm sorry to say.
----
Yes, locales are nice and all.
For D 3.5 that is.
Honestly.

Walter Bright wrote:
> Andrei Alexandrescu wrote:
>> There will be a global reference to a Locale class, e.g.
>> defaultLocale. By default the reference will be null, implying the C
>> locale should be in effect. Applications can assign to it as they find
>> fit, and also pass around multiple locale variables.
>
> I disagree with being able to assign to the global defaultLocale. This
> is going to cause endless problems. Just one is that any function that
> uses locale can no longer be pure. defaultLocale should be immutable.
>
> Any function that is locale aware should be parameterized with a locale
> parameter. (Not only is that better design, it self-documents the
> dependency.)
I don't understand this. That means there's no more default locale.
Here's what I had in mind:
    class Locale { ... }
    // function parameterized with an optional locale
    void foo(Data d, Locale loc = null);
So there's no more default locale. If you pass in null, that's the
default locale.
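A minimal sketch of the optional-parameter idea, assuming a hypothetical `Locale` class with a single `decimalSeparator` field and a made-up `formatPrice` function. Passing null (or nothing) falls back to C-locale-style defaults, so no mutable global is consulted.

```d
import std.format : format;
import std.stdio : writeln;

/// Hypothetical locale holding only one setting, for illustration.
final class Locale
{
    string decimalSeparator;

    this(string decimalSeparator)
    {
        this.decimalSeparator = decimalSeparator;
    }
}

/// Formats an amount; a null locale means "use the C-locale default".
string formatPrice(int units, int cents, Locale loc = null)
{
    string sep = loc is null ? "." : loc.decimalSeparator;
    return format("%d%s%02d", units, sep, cents);
}

void main()
{
    auto finnish = new Locale(",");
    writeln(formatPrice(12, 5));          // default: separator is "."
    writeln(formatPrice(12, 5, finnish)); // explicit locale: separator is ","
}
```

The locale dependency shows up in the signature, and callers that do not care simply omit the argument.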
Andrei

Walter Bright wrote:
> Andrei Alexandrescu wrote:
>> There will be a global reference to a Locale class, e.g.
>> defaultLocale. By default the reference will be null, implying the C
>> locale should be in effect. Applications can assign to it as they find
>> fit, and also pass around multiple locale variables.
>
> I disagree with being able to assign to the global defaultLocale. This
> is going to cause endless problems. Just one is that any function that
> uses locale can no longer be pure. defaultLocale should be immutable.
The two programs that are most "locale aware" are usually spreadsheets
and word processors.
It is usual that the user needs to write, say, in Swedish or in Russian,
while in a Finnish setting. Or that one wants to use a decimal separator
other than what is "proper" for the country.
For example, a lot of people use "." instead of the official "," in
Finland, and many use time as "18:23" instead of "18.23".
For this purpose, these programs let the users define these any way they
want.
I think the notion of locales is, slowly but steadily, going away.
It was a nice idea at the time, but with two problems: users don't use
it, and programmers don't use it.
Of course, eventually we will want to "do something" about this. But
that should be left to the day when real issues are all sorted out in D.
This is a non-urgent, low-priority thing.

Georg Wrede wrote:
> Andrei Alexandrescu wrote:
>> Sooner or later that will need to be defined. I know next to nothing
>> about locales. (I know I dislike the design C++ uses.)
>
>
> D uses UTF-8, and that is *good enough*!
>
> This lets my programs "understand" Finnish, and doesn't give me undue
> headaches.
>
>
> Seriously tending to locale issues would be an *endless swamp*. Just for
> this, I looked up something suitable to read:
>
> http://www.manpagez.com/man/1/perllocale/
>
> It may even be that you would find the time, but think about Walter and
> us, please. There *really are* other things to do.
I don't find that scary at all. It's quite what I expected. We should
phase it in, after we do a good design. Also I don't plan to sit down
and write locale definition files, I want to parse the XML in that
locale repository I referred to.
> An excellent string hierarchy without the entire rest of i18n, is only
> going to look like a Ferrari with a Trabant engine. Which is worse than
> nothing at all.
I don't understand this. What is the rest of i18n?
> Besides, there's more to this than just designing the perfect, or even a
> good locale system in a language. *Somebody should actually use it*.
>
> Now, the non-English programmer, what does he really want? He wants to
> be able to type stuff into his program in his native character set. D
> already does that, by way of UTF-8.
>
> What else? Well, it is conceivable that he wants his program to print
> dates and times the way it's done over there. He simply writes the
> program "by hand" so it does dates and times like he wants. Even if
> there was a locale thing in the language, he wouldn't bother with the
> hassle. And he couldn't care less about Urdu.
If we come up with a good design, then they will be compelled to use it.
Applications meant to be used across multiple countries have fumbled
with locale support because there's no good support in most languages.
So then why not offer compelling support in D?
> The hypothetical Ambitious Programmer might want to use locale. He could
> then have the dates and times (and currencies, etc.) follow the country.
> Now, that might sound commendable, but in practice it *crumbles*.
> He can't possibly know how to deal with languages that are written
> backwards, languages where several characters make one letter, exotic
> ways of writing dates, etc.
Well, my understanding is that the guys who wrote those RFCs and whatnot
spent time figuring out the right abstractions. Why not use them?
> So, his fancy i18n project is doomed to be, at most, as usable as the
> "normal" D program. Probably less, since his decisions will actually
> worsen the user experience -- for users in another culture.
>
>
> And, any project big enough to tackle this, will implement its own
> locale handling anyway. I'm sorry to say.
They will implement their own because the language doesn't offer an
extensible framework that they can build on.
> Yes, locales are nice and all.
> For D 3.5 that is.
> Honestly.
I just don't see where the big problem is. I'm talking about a blessed
hierarchical hashtable to begin with. My initial desire is to be able to
customize the array separators in writeln.
Andrei

Georg Wrede wrote:
> Walter Bright wrote:
>> Andrei Alexandrescu wrote:
>>> There will be a global reference to a Locale class, e.g.
>>> defaultLocale. By default the reference will be null, implying the C
>>> locale should be in effect. Applications can assign to it as they
>>> find fit, and also pass around multiple locale variables.
>>
>> I disagree with being able to assign to the global defaultLocale. This
>> is going to cause endless problems. Just one is that any function that
>> uses locale can no longer be pure. defaultLocale should be immutable.
>
> The two programs that are most "locale aware" are usually spreadsheets
> and word processors.
>
> It is usual that the user needs to write, say, in Swedish or in Russian,
> while in a Finnish setting. Or that one wants to use a decimal separator
> other than what is "proper" for the country.
>
> For example, a lot of people use "." instead of the official "," in
> Finland, and many use time as "18:23" instead of "18.23".
>
>
> For this purpose, these programs let the users define these any way they
> want.
That's exactly what my proposal is doing. People can start with the
defaults of the Finnish locale and then overwrite whichever parts they want.
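A minimal sketch of "start with the defaults, then overwrite", assuming a flat `string[string]` table for brevity. The key names and the `withOverrides` helper are made up for illustration and are not CLDR names.

```d
import std.stdio : writeln;

/// Returns a copy of `defaults` with every entry in `overrides` applied.
/// The `defaults` table itself is left untouched.
string[string] withOverrides(string[string] defaults, string[string] overrides)
{
    auto result = defaults.dup;
    foreach (key, value; overrides)
        result[key] = value;
    return result;
}

void main()
{
    // Hypothetical Finnish defaults.
    string[string] finnish = [
        "decimalSeparator": ",",
        "timeSeparator": "."
    ];

    // A user who prefers "." for decimals and "18:23" for times.
    auto mine = withOverrides(finnish, [
        "decimalSeparator": ".",
        "timeSeparator": ":"
    ]);

    writeln("18", mine["timeSeparator"], "23");
    writeln("18", finnish["timeSeparator"], "23");
}
```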
> I think the notion of locales is, slowly but steadily, going away.
Do you have any data backing this up?
> It was a nice idea at the time, but with two problems: users don't use
> it, and programmers don't use it.
Is it because it hasn't been properly packaged?
> Of course, eventually we will want to "do something" about this. But
> that should be left to the day when real issues are all sorted out in D.
> This is a non-urgent, low-priority thing.
I guess. Now please tell me how I print arrays in D.
Andrei

Andrei Alexandrescu wrote:
> Walter Bright wrote:
>> Andrei Alexandrescu wrote:
>>> There will be a global reference to a Locale class, e.g.
>>> defaultLocale. By default the reference will be null, implying the C
>>> locale should be in effect. Applications can assign to it as they
>>> find fit, and also pass around multiple locale variables.
>>
>> I disagree with being able to assign to the global defaultLocale. This
>> is going to cause endless problems. Just one is that any function that
>> uses locale can no longer be pure. defaultLocale should be immutable.
>>
>> Any function that is locale aware should be parameterized with a
>> locale parameter. (Not only is that better design, it self-documents
>> the dependency.)
>
> I don't understand this. That means there's no more default locale.
> Here's what I had in mind:
>
> class Locale { ... }
>
> // function parameterized with an optional locale
> void foo(Data d, Locale loc = null);
>
> So there's no more default locale. If you pass in null, that's the
> default locale.
That's fine, I was thrown off by your reference to a "global reference".

Georg Wrede wrote:
> What else? Well, it is conceivable that he wants his program to print
> dates and times the way it's done over there. He simply writes the
> program "by hand" so it does dates and times like he wants. Even if
> there was a locale thing in the language, he wouldn't bother with the
> hassle. And he couldn't care less about Urdu.
I've attempted to use locales, but the reason I'd always wind up doing
it by hand is because the existing libraries to do it are obtuse,
impenetrable, execrable, and pretty much unusable.
So it may be that it's an insoluble problem, or maybe nobody has come up
with the right abstraction yet. I don't have nearly enough experience
with it to know the answer.

Walter Bright wrote:
> Andrei Alexandrescu wrote:
>> Walter Bright wrote:
>>> Andrei Alexandrescu wrote:
>>>> There will be a global reference to a Locale class, e.g.
>>>> defaultLocale. By default the reference will be null, implying the C
>>>> locale should be in effect. Applications can assign to it as they
>>>> find fit, and also pass around multiple locale variables.
>>>
>>> I disagree with being able to assign to the global defaultLocale.
>>> This is going to cause endless problems. Just one is that any
>>> function that uses locale can no longer be pure. defaultLocale should
>>> be immutable.
>>>
>>> Any function that is locale aware should be parameterized with a
>>> locale parameter. (Not only is that better design, it self-documents
>>> the dependency.)
>>
>> I don't understand this. That means there's no more default locale.
>> Here's what I had in mind:
>>
>> class Locale { ... }
>>
>> // function parameterized with an optional locale
>> void foo(Data d, Locale loc = null);
>>
>> So there's no more default locale. If you pass in null, that's the
>> default locale.
>
> That's fine, I was thrown off by your reference to a "global reference".
Well, I was thinking a global reference might be handy for people who,
e.g., want to set the locale once and then be done with it. I think only
a few apps actually manipulate multiple locales simultaneously. Most
would just want to load the locale present on the user's computer and
then use it.
Andrei