> a'bc => A'Bc
Unfortunately, this rule will transform "I'll" and "I've" incorrectly (into "I'Ll" and "I'Ve").
Given that the smart people at Unicode have already tackled this issue and ended up with what is in section 5.18, Case Mappings, of Unicode 6.0 [1], I'm not convinced that reinventing titlecasing in CSS is a good idea. If there is a better approach, it should go into Unicode, so that not only CSS but every other application can use it.
Even for the "." (PERIOD) case Xaxio raised, I'm still skeptical about whether CSS should treat it differently from Unicode. I understand how "a.m." should be titlecased, but I haven't investigated whether there are counter-cases, nor asked whether the Unicode people considered that case. They must have had reasons to classify "." as MidNumLet rather than MidNum. IE must have reasons for not letting "." break words in titlecasing, and WebKit must have reasons for breaking there. I'm not saying that Xaxio is wrong, just that we still know too little to decide to do it differently from what Unicode defines.
If we can agree on option 3 from the previous e-mail [2], i.e., support whatever Unicode defines today, I'll need to investigate that.
Support for an option other than option 3, or proposals for further options, are still very welcome.
Also, if you have more information on "." (PERIOD), it would be greatly appreciated. I can think of some other cases such as "e.g.", "u.s.", file names with extensions, and domain names, but you must have much better ideas about the cases and how they should be titlecased.
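To make the "." question concrete, here is a rough Python sketch of the two behaviours mentioned above: one where "." does not break words (as described for IE) and one where it does (as described for WebKit). The function name `capitalize` and the flag `period_breaks_words` are made up for illustration, a "word" is assumed to be simply a run of letters, and `str.upper()` stands in for real Unicode titlecasing. It is not how either browser, or Unicode word segmentation, is actually implemented.

```python
import re

# One or more letters (any script); digits and "_" are excluded.
LETTERS = r"[^\W\d_]+"


def capitalize(text: str, period_breaks_words: bool) -> str:
    """Uppercase the first letter of every word in `text`.

    If `period_breaks_words` is False, a "." between letters is treated
    as part of the word, so only the first letter of "a.m" is changed.
    If it is True, each run of letters is a separate word.
    """
    if period_breaks_words:
        pattern = LETTERS
    else:
        pattern = LETTERS + r"(?:\." + LETTERS + r")*"

    def upper_first(match: re.Match) -> str:
        word = match.group(0)
        # Simplification: str.upper() instead of a true titlecase mapping.
        return word[:1].upper() + word[1:]

    return re.sub(pattern, upper_first, text)


for sample in ["a.m.", "e.g.", "u.s.", "readme.txt", "example.com"]:
    print(sample, "->", capitalize(sample, False), "|", capitalize(sample, True))
```

Neither setting gets every sample right: breaking on "." suits "a.m." and "u.s." but not file names or domain names, and not breaking does the opposite.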
Regards,
Koji
[1] http://www.unicode.org/versions/Unicode6.0.0/ch05.pdf
[2] http://lists.w3.org/Archives/Public/www-style/2011Feb/0583.html
-----Original Message-----
From: www-international-request@w3.org [mailto:www-international-request@w3.org] On Behalf Of Christoph Paper
Sent: Monday, February 21, 2011 9:37 PM
To: W3C style mailing list
Cc: 'WWW International' (www-international@w3.org)
Subject: Re: [css3-text] text-transform:capitalize
Brady Duga:
> it seems like there is no way to get both French (l'histoire -> L'Histoire) and English (can't -> Can't) titlecasing without using language-specific word break tables.
You could probably special-case on character count (0, 1, 2 or more) before and after the punctuation. The following is just an example, not necessarily best practice.
a'b => A'B ?
a' => A'
'b => 'b
ab' => Ab'
'ab => 'Ab
ab'c => Ab'c
a'bc => A'Bc
ab'cd => Ab'cd
same for '
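The table above can be sketched in Python roughly as follows. This is only one reading of the example, not a recommended algorithm: the function name `capitalize_word` is invented here, only the first apostrophe in a word is considered, and `str.upper()` is used as a stand-in for proper Unicode titlecasing.

```python
def capitalize_word(word: str, apostrophe: str = "'") -> str:
    """Capitalize `word` using the character-count rule from the table.

    The first letter before the apostrophe is always capitalized.
    The first letter after it is capitalized only when
      - exactly one character precedes the apostrophe, or
      - nothing precedes it and at least two characters follow.
    """
    before, sep, after = word.partition(apostrophe)
    if not sep:
        # No apostrophe at all: plain first-letter capitalization.
        return word[:1].upper() + word[1:]

    capitalize_after = (len(before) == 1 and len(after) >= 1) or (
        len(before) == 0 and len(after) >= 2
    )
    head = before[:1].upper() + before[1:]
    tail = after[:1].upper() + after[1:] if capitalize_after else after
    return head + sep + tail


for sample in ["a'b", "a'", "'b", "ab'", "'ab", "ab'c", "a'bc", "ab'cd",
               "l'histoire", "can't", "I'll"]:
    print(sample, "=>", capitalize_word(sample))
```

With this reading, "l'histoire" and "can't" come out as intended, while "I'll" falls into the a'bc row and becomes "I'Ll".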