This profile focuses not only on Zhou’s role in the creation of Hanyu Pinyin but also on his political views, about which he has become increasingly outspoken.

About Mao, he said in an interview: “I deny he did any good.” About the 1989 Tiananmen Square massacre: “I am sure one day justice will be done.” About popular support for the Communist Party: “The people have no freedom to express themselves, so we cannot know.”

As for fostering creativity in the Communist system, Mr. Zhou had this to say, in a 2010 book of essays: “Inventions are flowers that grow out of the soil of freedom. Innovation and invention don’t grow out of the government’s orders.”

No sooner had the first batch of copies been printed than the book was banned in China.

Although the reporter’s assertion, following the PRC’s official figures, that “China all but stamp[ed] out illiteracy” is well wide of the mark, there is no denying Pinyin’s crucial role in this area. I recommend reading the whole article.

The ordering is primarily alphabetical. Diacritical marks, punctuation, juncture, and capitalization are taken into account only when the strings being compared are otherwise identical. For example, píng’ān sorts before pīnyīn, because pingan sorts before pinyin, because g precedes y alphabetically.

Only when two strings are alphabetically identical is non-alphabetical information taken into account.
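To make that first, purely alphabetical pass concrete, here is a minimal Python sketch. The name `primary_key` is invented for this illustration and does not come from the ABC dictionaries. It strips everything except the bare letters, so it deliberately ignores tones, apostrophes, and case, and it also folds ü into u, which is a simplification.

```python
import unicodedata

def primary_key(entry: str) -> str:
    """Reduce a Pinyin string to its bare lowercase letters (hypothetical helper)."""
    # NFD splits a letter such as "ī" into "i" plus a combining macron,
    # so the tone marks can be dropped by keeping only alphabetic characters.
    decomposed = unicodedata.normalize("NFD", entry)
    return "".join(ch.lower() for ch in decomposed if ch.isalpha())

words = ["pīnyīn", "píng'ān"]
print(sorted(words, key=primary_key))  # ["píng'ān", 'pīnyīn']
```

Because "pingan" compares as less than "pinyin", píng'ān comes first regardless of its tones or apostrophe.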

The series’ Reader’s Guide presents the specifics of the sort order. Since I don’t have to worry about how much space this takes up on my site, I have reformatted the information slightly to give the examples as numbered lists.

Head entry transcriptions with the same sequence of letters are ordered first strictly by letter sequence regardless of tones, then by initial syllable tone in the sequence 0 1 2 3 4. For entries with the same initial tone, arrangement is by the tone of the second syllable, again in the order 0 1 2 3 4. For example:

shīshi

shīshī

shīshí

shīshǐ

shīshì

shíshī

shíshì

shǐshī

shìshī

Irrespective of tones, entries with the vowel u precede those with ü.
For example:

lú

lǔ

lù

lǘ

lǚ

lǜ

nù

nǚ

Entries without apostrophe precede those with apostrophe. For example:

biàn — argue

bǐ’àn — the other shore

Lower-case entries precede upper-case entries. For example:

hòujìn — aftereffect

Hòu Jìn — Later Jin dynasty

For entries with identical spelling, including tones, arrangement is by order of frequency….

For most users, the most important thing to note is that the neutral tone is regarded as 0, not as 5. Thus, the order is not “ā á ǎ à a,” but “a ā á ǎ à.” And, because lowercase comes before uppercase, not “A a Ā ā Á á Ǎ ǎ À à” but “a A ā Ā á Á ǎ Ǎ à À.”
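The rules quoted above can be combined into a single multi-level sort key. What follows is only a sketch in Python under stated assumptions, not the ABC dictionaries' actual implementation. The name `sort_key` is hypothetical. The ranking of the levels (letters, then apostrophes, then tones, then case) is inferred from the examples. Tones are recorded per letter rather than per syllable, which gives the same result as long as two entries with identical letters and apostrophes also have identical syllable boundaries. Hyphens and spaces are simply ignored, and the final tiebreak by frequency is left out.

```python
import unicodedata

# Combining marks (as they appear after NFD decomposition) mapped to tone numbers.
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}
DIAERESIS = "\u0308"
APOSTROPHES = {"'", "\u2019"}

def sort_key(entry: str):
    """Build a comparison key for a Pinyin entry (hypothetical helper)."""
    letters = []      # bare lowercase letters, with ü kept distinct from u
    tones = []        # one tone digit per letter; 0 means unmarked (neutral)
    case = []         # 0 for lowercase, 1 for uppercase
    apostrophes = []  # positions of letters that follow an apostrophe
    for ch in unicodedata.normalize("NFD", entry):
        if ch in TONE_MARKS and tones:
            tones[-1] = TONE_MARKS[ch]
        elif ch == DIAERESIS and letters and letters[-1] == "u":
            letters[-1] = "ü"  # sorts after plain "u"
        elif ch in APOSTROPHES:
            apostrophes.append(len(letters))
        elif ch.isalpha():
            letters.append(ch.lower())
            tones.append(0)
            case.append(1 if ch.isupper() else 0)
        # Spaces, hyphens, and anything else are ignored in this sketch.
    return ("".join(letters), tuple(apostrophes), tuple(tones), tuple(case))

vowels = ["à", "Ā", "a", "á", "ǎ", "A", "ā", "Á", "Ǎ", "À"]
print(sorted(vowels, key=sort_key))
# ['a', 'A', 'ā', 'Ā', 'á', 'Á', 'ǎ', 'Ǎ', 'à', 'À']
```

An empty tuple of apostrophe positions compares as smaller than any non-empty one, which is what puts biàn ahead of bǐ'àn even though its tone number is higher.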

HPC [Hanyu Pinyin Cihui] gave hyphens and spaces the same priority as apostrophes, so that lìgōng sorted before lǐ-gōng, in spite of the tones. Usage of hyphens and spaces in pinyin is still far from being fully standardized. (The same is true in English orthography.) Consequently, for collation it makes sense to give less weight to hyphens and spaces, and more weight to tones, thus sorting lǐ-gōng before lìgōng. In ABC, hyphens and spaces don’t affect the sort order unless they change the pronunciation in the same way that apostrophe would; for example, ¹míng-àn and ²míng’àn are treated as homophones, and they sort after mǐngǎn.
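One way to picture the treatment of hyphens and spaces is as a normalization step that runs before sorting. The Python sketch below is an assumption-laden illustration, not the ABC procedure itself. The name `normalize_juncture` is made up. It interprets "change the pronunciation in the same way that apostrophe would" using the standard Pinyin rule that an apostrophe is written before a non-initial syllable beginning with a, o, or e.

```python
import unicodedata

JUNCTURE = {"-", " "}
VOWEL_INITIALS = {"a", "e", "o"}

def base_letter(ch: str) -> str:
    """Return the lowercase letter underneath any diacritics."""
    return unicodedata.normalize("NFD", ch)[0].lower()

def normalize_juncture(entry: str) -> str:
    """Turn significant hyphens/spaces into apostrophes; drop the rest (hypothetical helper)."""
    entry = unicodedata.normalize("NFC", entry)
    out = []
    for i, ch in enumerate(entry):
        if ch in JUNCTURE:
            nxt = entry[i + 1] if i + 1 < len(entry) else ""
            follows_letter = bool(out) and out[-1].isalpha()
            # Keep the break only where an apostrophe would otherwise be required.
            if nxt and follows_letter and base_letter(nxt) in VOWEL_INITIALS:
                out.append("'")
            continue
        out.append(ch)
    return "".join(out)

print(normalize_juncture("míng-àn"))  # míng'àn
print(normalize_juncture("lǐ-gōng"))  # lǐgōng
```

After this step míng-àn and míng'àn are the same string, while the hyphen in lǐ-gōng disappears and leaves the tones to decide its position.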

Today, the thirtieth anniversary of the death of the brilliant linguist and all-around interesting guy Y.R. Chao (Zhao Yuanren / Zhào Yuánrèn / 趙元任 / 赵元任), I’m remembering him by rereading some of his work. (Chao died twenty years and one day after his good friend Hu Shih.)

Here are some readings on Pinyin.info by or about Y.R. Chao that you may wish to review:

Today, on the fiftieth anniversary of the death of Hu Shih (Hú Shì/胡適/胡适), I’d like to say a few things in his memory. This is, after all, someone I regard as a hero in many ways. I even keep a photo of him in my office.

Hu Shi (1891–1962), “the Father of the Chinese Renaissance,” towered over China’s intellectual landscape in the first half of the twentieth century. Among other achievements, he is credited with having made everyday speech respectable as a medium of written communication. Groomed as a traditional scholar-bureaucrat in his father’s footsteps, he had already turned into an iconoclastic renegade by the time he left Shanghai at the age of eighteen to study in the United States. In John Dewey, whose approach to philosophy was to treat all doctrines as working hypotheses, Hu felt he found “the proper way to think.” He and his associates who studied with Dewey at Columbia University established the framework of China’s modern educational system. A dedicated humanist, social reformer and promoter of women rights, he was, at different periods of his life, president of Peking University, president of the Academia Sinica, and ambassador to Washington.

To return to the most important point, at least in terms of the focus of this site, it was he, more than anyone else, who helped break the stranglehold of Literary Sinitic (a.k.a. classical Chinese). The vernacular movement he spearheaded is of far greater significance and has had a much greater impact on Chinese culture and people’s lives than so-called character simplification. Yet it receives relatively little attention, perhaps because many do not understand — or do not want to admit — how very different Literary Sinitic is from modern standard Mandarin.

Hu Shih is also the one who, more than anyone else, popularized the use of modern punctuation in Chinese texts, such as through his book Zhōngguó Zhéxuéshǐ Dàgāng and his editions of earlier works. That alone should be enough to earn him the eternal gratitude of all who read texts written in Chinese characters.

There’s so much more to the man than this, though most of it falls outside the bounds of this site. So rather than go into it here I will just encourage people to read more by and about him.

Shortly after Hu Shih’s death his son wrote:

father passed away during a cocktail party in honor of the members of the Academia Sinica after the completion of the members’ meeting. He passed away without any pain, and from every one present at the party, I gathered that he died happy, for the last words he said was, “Let’s have some drinks!”

Some of the Pinyin-friendly font families I provide examples of on this blog are fun but not exactly the sort of thing you’d want to use in a book or other serious project. Others, though, are solid examples of the subtle and exacting art of type design. Today’s entry belongs in the latter group.

Brill — a Leiden-based publisher of work in the humanities, social sciences, law, and science — has released “the Brill,” a new font family designed to support the Latin and Greek scripts “to the fullest extent possible.” IPA and the Slavic parts of the Cyrillic range are also covered. This can handle the needs of just about any romanized script, including Hanyu Pinyin.

As someone with Brill explained to me:

Instead of limiting the fonts’ character set to known characters and character-plus-diacritic combinations, we chose a dynamic model in which, using OpenType GPOS features, any base character can carry any diacritic above or below it, and in which diacritics can be stacked as well—not forgetting all the precomposed characters that are already present in the Unicode Standard, of course. Finally, a huge assortment of punctuation marks, editorial marks, and other symbols known to occur in Brill publications were added to the spec.

In total, the Brill contains more than 5,100 characters. And that already immense range can be extended through combining diacritics, as noted above.
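The difference between precomposed characters and freely combined diacritics can be seen at the level of Unicode itself. The short Python sketch below uses only the standard `unicodedata` module; the helper name `nfc_length` is invented for the example. It shows one Pinyin letter that has a precomposed code point and one that does not, which is the kind of case where a font has to position the extra mark itself.

```python
import unicodedata

def nfc_length(text: str) -> int:
    """Count code points after composing as far as Unicode allows (hypothetical helper)."""
    return len(unicodedata.normalize("NFC", text))

# u + diaeresis + caron composes into the single precomposed letter ǚ (U+01DA).
print(nfc_length("u\u0308\u030c"))  # 1

# e + circumflex + macron has no fully precomposed form: ê (U+00EA) is
# followed by a separate combining macron that the font must place.
print(nfc_length("e\u0302\u0304"))  # 2
```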

The Brill is available now in roman and italic styles. Bold and bold italic versions will be released later this year, probably before July.

The Brill is considerably different from Brill Online, which has been available for some time and was aimed at helping users of Brill’s online reference works. Brill Online is based on v. 1.00 of the Gentium family of fonts. The glyph set was extended to support some very rare characters, such as Aegean numbers. “In essence it became a hybrid Latin-Greek-Cyrillic-IPA and ‘pi’ font family.”

Thanks to Lin Ai of Zhongweb.net for the heads up that this had been released, and to Dominique de Roo and Pim Rietbroek of Brill for patiently helping me with my questions.

But it was all for a good cause, of course. You see, the Mandarin expression chuī lǎba, when not referring to the literal playing of a trumpet, is usually taken in Taiwan to refer to a blow job. But in China, Ma explained, chuī lǎba means the same thing as the idiom pāi mǎpì (pat/kiss the horse’s ass; i.e., flatter). And now that we have the handy-dandy Zhōnghuá Yǔwén Zhīshikù (Chinese Language Database), which Ma was announcing, we can look up how Mandarin differs in Taiwan and China, and thus not get tripped up by such misunderstandings. Or at least that’s supposed to be the idea.

The database, which is the result of cross-strait cooperation, can be accessed via two sites: one in Taiwan, the other in China.

It’s clear that a lot of money has been spent on this. For example, many entries are accompanied by well-documented, precise explanations by distinguished lexicographers. Ha! Just kidding! Many entries are really accompanied by videos (some two hundred of them) of cutesy puppets gabbing about cross-strait differences in Mandarin expressions. But if there’s a video in there of the panda in the skirt explaining to the sheep in the vest that a useful skill for getting ahead in Chinese society is chuī lǎba, I haven’t found it yet. Will NMA take up the challenge?

Much of the site emphasizes not so much language as Chinese characters. For example, another expensively produced video feeds the ideographic myth by showing off obscure Hanzi, such as the one for ch?ng.

WARNING: The screenshot below links to a video that contains scenes with intense wawa-ing and thus may not be suitable for anyone who thinks it’s not really cute for grown women to try to sound like they’re only thwee-and-a-half years old.

Most of these characters are of relatively low frequency and, except for a few of them, neither their meanings nor their pronunciations are known by persons of average literacy.

Many more such characters consisting of two, three, or four repetitions of the same character exist, and their sounds and meanings are in most cases equally or more opaque.

The Hanzi for ch?ng (which looks like ??? run together as one character) in the video above is sufficiently obscure that it likely won’t be shown correctly in many browsers on most systems when written in real text: ????. But never fear: It’s already in Unicode and so should be appearing one of these years in a massively bloated system font.

Then he went on about how Chinese characters are a great system because, supposedly, they have a one-to-one correspondence with language that other scripts cannot match and people can know what they mean by looking at them (!) and that they therefore have a high degree of artistic quality (gāodù de yìshùxìng). Basically, the person in charge of this project seems to have a bad case of the Like Wow syndrome, which is not a reassuring trait for someone in charge of producing a dictionary.

The same cooperation that built the Web sites led to a new book, Liǎng’àn Měirì Yī Cí (《兩岸每日一詞》 / Roughly: Cross-Strait Term-a-Day Book), which was also touted at the press conference.