Thank you all for the great contributions. It looks like we have consensus on the following points:
1. The feature should rely on Unicode to define its scope
2. The name of the value should stay unchanged
3. The wording "language-specific rules *must* be used"[1] should be weakened, at least for this value, as language-specific rules for this value are more complicated than those for upper/lower. We'd like to allow UAs to implement language-specific rules, but we might not be able to test them and make them interoperable.
4. Use UAX#29 for word break
5. Apply Titlecase_Mapping defined in Unicode[2] to the first letter of every word
A couple of concerns were left:
A. I'd like to add:
5.1. Except when numeric glyphs appear before the first letter of a word
since doing so looks like the right thing to me (e.g., '99ers), and since IE/Firefox/Opera/WebKit implement it that way.
I'm not sure whether this contradicts the Unicode definition, as Unicode defines how to titlecase a word but doesn't clearly define which words to apply it to. We define that ourselves in item 5 above, so I think this is safe to add; i.e., doing so does not mean we have different rules than Unicode.
Still, it would be good to send this to unicode@unicode.org; I'll do that anyway.
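For illustration only, here is a rough Python sketch of items 4, 5 and 5.1 taken together. It rests on several assumptions that are not part of the proposal: whitespace splitting stands in for UAX#29 word breaking (which the Python standard library does not provide), a single-character `str.title()` call stands in for Titlecase_Mapping, and 5.1 is read as "leave the word unchanged when a numeric character comes before its first letter". The names `capitalize` and `capitalize_word` are made up for this example.

```python
import re
import unicodedata


def capitalize_word(word: str) -> str:
    """Titlecase the first letter of one word (items 5 and 5.1)."""
    for i, ch in enumerate(word):
        category = unicodedata.category(ch)
        if category.startswith("N"):
            # Assumed reading of 5.1: a numeric character precedes the
            # first letter, so the word is left as it is (e.g. '99ers).
            return word
        if category.startswith("L"):
            # On a single character, str.title() applies the Unicode
            # titlecase mapping (which can differ from uppercase).
            return word[:i] + ch.title() + word[i + 1:]
    # No letter or number found (whitespace, punctuation only, or empty).
    return word


def capitalize(text: str) -> str:
    """Apply capitalize_word to every word of the text (item 4, simplified)."""
    # ASSUMPTION: whitespace splitting is only a stand-in for UAX#29
    # word boundaries; a real implementation would use a proper segmenter.
    parts = re.split(r"(\s+)", text)
    return "".join(capitalize_word(part) for part in parts)


if __name__ == "__main__":
    print(capitalize("the quick brown fox"))
    print(capitalize("'99ers rule"))
```

The digraph case shows why titlecase rather than uppercase matters: under this sketch U+01C6 (dž) maps to U+01C5 (Dž), not to U+01C4 (DŽ).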
B. No existing implementation matches this spec. I guess this is the ideal spec from our perspective, but we may need to tweak or compromise as implementation proceeds. I'm not sure at this point; I'll consult my co-editor about this.
I think we're very close to finalizing the requirements. Thank you again for all the help you have given so far!
Regards,
Koji
[1] http://dev.w3.org/csswg/css3-text/#text-transform
[2] http://www.unicode.org/versions/Unicode6.0.0/ch05.pdf
-----Original Message-----
From: Brady Duga [mailto:bradyduga@gmail.com] On Behalf Of Brady Duga
Sent: Tuesday, February 22, 2011 12:41 PM
To: Asmus Freytag
Cc: Brady Duga; Mark Davis ☕; Xaxio Brandish; John Cowan; Koji Ishii; Christoph Päper; W3C style mailing list; www-international@w3.org
Subject: Re: [css3-text] text-transform:capitalize
On Feb 21, 2011, at 7:06 PM, Asmus Freytag wrote:
>
> Hence, in this context, Mark is correct that "capitalize" is ambiguous.
I don't disagree with that assertion. It certainly is ambiguous, and if someone told me to capitalize the string 'the quick brown fox' I might quite reasonably respond with 'The quick brown fox', or 'The Quick Brown Fox', or 'THE QUICK BROWN FOX'. However, the term title case, when applied to a string, is also ambiguous, and a very common meaning is to rewrite it using language-specific rules for titles.
>
> Unicode has defined the terms Uppercase, Lowercase and Titlecase in very precise ways. Building on these definitions would seem useful.
Unicode defines titlecase in a specific way for words, but it leaves the term ambiguous for strings (what we are referring to here) as well as leaving the definition of a word open. The examples they provide for titlecasing strings would all fail with the proposed algorithm, so it seems inappropriate to use that term as the value of the transformation. As you point out, uppercase and title case are different, so we can't remove the term from our definition, but it doesn't seem helpful to change the name of the value since there at least does exist a meaning of capitalize when applied to a string that could mean making the first letter of a word uppercase. Our definition can remove the ambiguity by clearly stating that Titlecase_Mapping should be applied to the first character of every word and explaining that UAX#29 should be used for word breaks. Again, I just don't see any benefit in changing the name of the value.
--Brady