Chinese, Korean, Japanese - It's All the Same

Over thirty years of professional experience designing and developing high-performance, parallel, transactional server-side systems in "C", "C++" and Java on *nix and Windows platforms for military, financial and health platforms. Bachelor's in Geological Engineering and Master's in Computer Science.

Matt H. and company had recently added support for Cyrillic script to their PDF invoice generator when they discovered that none of the characters would print. The script used DOMPDF to convert the HTML invoices to PDF, and font handling across scripts can get a bit hairy, so it was not really a surprise. However, as he was digging through the code that generated the invoices, he found this little gem:

This snippet attempts to deliberately specify a Japanese font for all Chinese, Korean or Japanese characters. But since the condition for doing so is messed up, it actually applies this font to every HTML entity above &#999; (ϧ). That sweeps in the Cyrillic block (&#1024; through &#1279;), which presumably explains why none of those characters would print. It is also kind of a good thing, because large portions of the Japanese character set, such as hiragana and katakana, are not in the range 19968-40895. On the bright side, Matt was quite impressed with DOMPDF's spanking-good handling of unclosed tags.
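
To make the ranges concrete, here is a small Python sketch. It is not the code Matt found: DOMPDF is a PHP library, so the real thing was presumably PHP, and the function names and sample characters below are made up for illustration. It only compares the range the check was apparently meant to cover (19968-40895) with the "anything above 999" behavior described above.

```python
# Hypothetical illustration only; not the original invoice code.

# The 19968-40895 range from the article (U+4E00 through U+9FBF).
INTENDED_RANGE = range(19968, 40896)


def in_intended_range(ch: str) -> bool:
    """True if the character's code point is within 19968-40895."""
    return ord(ch) in INTENDED_RANGE


def in_effective_range(ch: str) -> bool:
    """True if the code point is above 999, the behavior the article describes."""
    return ord(ch) > 999


# Sample characters, chosen for this example.
samples = {
    "Latin": "A",            # U+0041
    "Cyrillic": "\u0416",    # U+0416, Zhe
    "Hiragana": "\u3042",    # U+3042, A
    "Katakana": "\u30a2",    # U+30A2, A
    "Kanji": "\u65e5",       # U+65E5, "day"
    "Hangul": "\ud55c",      # U+D55C, "han"
}

for label, ch in samples.items():
    print(
        f"{label:<9} U+{ord(ch):04X} "
        f"intended={in_intended_range(ch)} "
        f"effective={in_effective_range(ch)}"
    )
```

Of these samples, only the kanji lands in the intended range. The Cyrillic, hiragana, katakana and Hangul characters all fall outside it yet sit above 999, so a check that effectively tests "above 999" hands every one of them to the Japanese font.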