Wednesday, March 16, 2016

Unicode CLDR 29 provides an update to the key building blocks for software
supporting the world's languages. This data is used by all major software
systems for internationalization and localization, adapting software to the
conventions of different languages for common software tasks. The following
summarizes the main improvements in the release.

New BCP47 extension keys have been added for
specifying transliteration and emoji presentation, and for customizing
locales with region-specific settings. Many new transforms are provided, the
rule format has been simplified, and BCP47 IDs have been added for all
transforms. Region data now includes appropriate preferences for day periods
such as “6:00 in the morning” and “7:00 in the evening”, and there is new
structure for choosing appropriate units based on region and usage. A Cantonese
locale has been added. The emoji ordering has been improved, and annotations
are provided for more emoji and in more locales. The JSON-format data has
been extended to include number spellout (RBNF) and script metadata.
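To make the extension-key mechanism concrete, here is a minimal Python sketch that pulls the keywords out of the `-u-` extension of a BCP47 locale identifier. It is an illustration only, not part of CLDR: the function name `parse_unicode_extension` is hypothetical, and the example tags use the keys `rg` and `em`, which are assumed here to be the region-specific and emoji presentation keys mentioned above. The parser is deliberately simplified and ignores attributes, private-use subtags, and tag validation.

```python
def parse_unicode_extension(tag):
    """Return the keywords of a tag's -u- extension as a {key: type} dict."""
    subtags = tag.lower().split("-")
    if "u" not in subtags:
        return {}
    keywords = {}
    key = None
    for subtag in subtags[subtags.index("u") + 1:]:
        if len(subtag) == 1:
            # Another singleton (such as "t") starts a different extension.
            break
        if len(subtag) == 2:
            # Two-character subtags are keys; longer ones are type values.
            key = subtag
            keywords[key] = []
        elif key is not None:
            keywords[key].append(subtag)
        # Subtags before the first key are attributes and are ignored here.
    # A key with no type value is treated as "true".
    return {k: "-".join(v) if v else "true" for k, v in keywords.items()}


print(parse_unicode_extension("en-US-u-rg-gbzzzz"))
print(parse_unicode_extension("sr-u-em-emoji-ca-gregory"))
```

A production implementation would normally rely on a library built on CLDR data rather than hand-rolled parsing; the sketch only shows how keys and values are laid out in the identifier.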

Tuesday, March 15, 2016

Call for Participation Now Open

For twenty-five years the
Internationalization & Unicode® Conference (IUC) has been the
preeminent event highlighting the latest innovations and best
practices of global and multilingual software providers. The 40th
conference will be held this year on November 1-3, 2016 in Santa
Clara, California.

Two Key Themes for This Year

Breaking All Barriers: Explore how software providers can meet the globalization challenges
of supporting the burgeoning diversity of communication platforms
around the world, including mobile, tablets, social media, video,
and voice. Examine how online social platforms are supporting
multilingual text and rich content in hundreds of languages. Often
the task is not just to publish in multiple languages, but to accept
input in alternative forms, analyze it for meaning and sentiment,
look for patterns in big data, or automate its routing or
translation. This theme also includes the latest advances in
relevant standards, and emerging and historic scripts.

This is the conference where you can
promote your ideas and experience in working with natural languages
and multicultural user interfaces, producing and supporting
multinational and multilingual products, developing linguistic
algorithms, applying internationalization across mobile and social
media platforms, or advancing relevant standards.

Thursday, March 10, 2016

Mountain View, CA, USA – The Unicode® Consortium today announced the start of
the beta review for the forthcoming Unicode 9.0.0, which is scheduled for
release in June 2016. All beta feedback must be submitted by May 2, 2016.

Unicode is the foundation for all modern software and communications around
the world, including all modern operating systems, browsers, laptops, and smart
phones – plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). Thus it
is important to ensure a smooth transition to each new version of the Unicode
Standard.

Unicode 9.0.0 comprises several additions and changes which require careful
migration in implementations. These include asymmetric case mappings, numerous
variation sequences, new fractional numeric values, and changes to property
values, especially East_Asian_Width values. The line breaking and text
segmentation algorithms handle character sequences that represent emoji as
indivisible units via the addition of new property values and rules.
Implementers need to modify code and check assumptions for all affected
processes to support these additions and changes.
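One assumption worth checking is that case conversion round-trips. The Python sketch below flags characters for which uppercasing and then lowercasing does not return the ordinary lowercase form. It is a rough illustration, not a procedure from the beta documentation: the helper names are hypothetical, and the sample text uses long-standing asymmetric mappings (sharp s and long s) rather than the Unicode 9.0 additions, since results for newer characters depend on the Unicode version bundled with the Python runtime.

```python
import unicodedata


def case_round_trips(ch):
    """True if uppercasing then lowercasing ch gives the same result as ch.lower()."""
    return ch.upper().lower() == ch.lower()


def asymmetric_in(text):
    """List the characters in text whose case mapping does not round-trip."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
        if not case_round_trips(ch)
    ]


# The runtime's Unicode data version determines which mappings are known.
print("Unicode data version:", unicodedata.unidata_version)
print(asymmetric_in("Straße ſ"))
```

Running the same check over the full code point range before and after a Unicode data upgrade is one way to see which mappings changed.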

The new character repertoire includes 74 emoji symbols, 19 symbols used in
Japanese TV broadcasting, and multiple additions to existing scripts. There are
six new scripts, of which three are in modern use (Adlam, Osage, and Newa) and
three are historic (Bhaiksuki, Marchen, and Tangut). Adlam and Osage have case
pairs and require data updates for casing functions. Tangut is a large
ideographic script whose addition incurred changes to the Unicode Collation
Algorithm (used as the basis for sorting text in all languages).

Please review the documentation, adjust your code, test the data files, and
report errors and other issues to the Unicode Consortium by May 2, 2016.
Feedback instructions are on the beta page.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop,
extend and promote use of the Unicode Standard and related globalization
standards. The membership of the consortium represents a broad spectrum of
corporations and organizations, many in the computer and information processing
industry. Members include: Adobe, Apple, Emoji One, EmojiXpress, Facebook,
Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft,
Monotype Imaging, Sultanate of Oman MARA, Oracle, SAP, Tamil Virtual University,
The University of California (Berkeley), Yahoo!, plus well over a hundred
Associate, Liaison, and Individual members. For more information, please contact
the Unicode Consortium at http://www.unicode.org/contacts.html.