Detailed Description

C API: Locale.

ULoc C API for Locale

A Locale represents a specific geographical, political, or cultural region. An operation that requires a Locale to perform its task is called locale-sensitive and uses the Locale to tailor information for the user. For example, displaying a number is a locale-sensitive operation: the number should be formatted according to the customs and conventions of the user's native country, region, or culture. In the C APIs, a locale is simply a const char* string.

You create a Locale from one of three combinations of components: a language; a language and a country; or a language, a country, and a variant. The components are separated by '_' in the locale string.

The third option requires one additional piece of information: the Variant. Variant codes are vendor- and browser-specific. For example, use WIN for Windows, MAC for Macintosh, and POSIX for POSIX. Where there are two variants, separate them with an underscore and put the most important one first. For example, a Traditional Spanish collation might be referenced with "ES", "ES", "Traditional_WIN".
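As a minimal sketch of how the three combinations map onto a locale string, the helper below joins the components with '_'. The function name build_locale_id is hypothetical and is not part of the ICU API; in this sketch a variant is only used when a country is also given.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of ICU: joins locale components with '_'.
 * country and variant may be NULL or empty. In this sketch the variant is
 * ignored unless a country is present. Returns the length of the full ID,
 * excluding the terminating NUL (the snprintf convention). */
int build_locale_id(const char *language, const char *country,
                    const char *variant, char *out, size_t capacity)
{
    int has_country = country != NULL && country[0] != '\0';
    int has_variant = variant != NULL && variant[0] != '\0';

    if (has_country && has_variant) {
        return snprintf(out, capacity, "%s_%s_%s", language, country, variant);
    }
    if (has_country) {
        return snprintf(out, capacity, "%s_%s", language, country);
    }
    return snprintf(out, capacity, "%s", language);
}
```

Called with "es", "ES", and "Traditional_WIN", the helper writes "es_ES_Traditional_WIN"; with only "en" it writes "en".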

Because a Locale is just an identifier for a region, no validity check is performed when you specify a Locale. If you want to see whether particular resources are available for the Locale you asked for, you must query those resources. For example, ask the UNumberFormat for the locales it supports using its getAvailable method. Note: When you ask for a resource for a particular locale, you get back the best available match, not necessarily precisely what you asked for. For more information, look at UResourceBundle.

The Locale provides a number of convenient constants that you can use to specify the commonly used locales. For example, ULOC_US refers to a locale for the United States.

Once you've specified a locale you can query it for information about itself. Use uloc_getCountry to get the ISO Country Code and uloc_getLanguage to get the ISO Language Code. You can use uloc_getDisplayCountry to get the name of the country suitable for displaying to the user. Similarly, you can use uloc_getDisplayLanguage to get the name of the language suitable for displaying to the user. Interestingly, the uloc_getDisplayXXX methods are themselves locale-sensitive and have two versions: one that uses the default locale and one that takes a locale as an argument and displays the name or country in a language appropriate to that locale.

The ICU provides a number of services that perform locale-sensitive operations. For example, the unum_xxx functions format numbers, currency, or percentages in a locale-sensitive manner.

A Locale is just the mechanism for identifying the kind of service (for example, UNumberFormat) that you would like to get.

Each international service that performs locale-sensitive operations allows you to get all the available objects of that type. You can sift through these objects by language, country, or variant, and use the display names to present a menu to the user. For example, you can create a menu of all the collation objects suitable for a given language. Such classes implement these three class methods:

Concerning POSIX/RFC1766 Locale IDs, the getLanguage/getCountry/getVariant/getName functions understand the POSIX form language_COUNTRY.ENCODING@VARIANT, and if there is no ICU-style variant, uloc_getVariant(), for example, will return the one listed after the @ sign. The hyphen "-" is also recognized as a country/variant separator, similarly to RFC1766, so "en-us" will be interpreted as en_US. As a result, uloc_getName() is far from a no-op: it has the effect of converting POSIX/RFC1766 IDs into ICU form, although it does NOT map any of the actual codes (e.g., russian->ru) in any way. Applications should call uloc_getName() at the point where a locale ID comes from an external source (user entry, OS, web browser) and pass the resulting string to other ICU functions. For example, don't use de-de@EURO as an argument to resourcebundle.
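The separator and case handling described above can be modelled with a small toy function. This is only a sketch and not ICU's implementation: the name normalize_simple_id is hypothetical, it handles only a language and an optional country, and it simply stops at the encoding ('.') or POSIX variant ('@') part instead of interpreting it.

```c
#include <ctype.h>
#include <stddef.h>

/* Toy model, not ICU's implementation. Handles only "language" and
 * "language-COUNTRY" / "language_COUNTRY"; anything from '.' or '@'
 * onward is ignored. Lowercases the language, uppercases the country,
 * and uses '_' as the separator.
 * Returns 0 on success, -1 if the output buffer is too small. */
int normalize_simple_id(const char *in, char *out, size_t capacity)
{
    size_t n = 0;
    int in_country = 0;

    for (; *in != '\0' && *in != '.' && *in != '@'; ++in) {
        char c = *in;
        if (n + 1 >= capacity) {
            return -1;              /* no room for this char plus the NUL */
        }
        if (c == '-' || c == '_') {
            c = '_';
            in_country = 1;         /* everything after this is the country */
        } else if (in_country) {
            c = (char)toupper((unsigned char)c);
        } else {
            c = (char)tolower((unsigned char)c);
        }
        out[n++] = c;
    }
    if (n >= capacity) {
        return -1;                  /* covers capacity == 0 */
    }
    out[n] = '\0';
    return 0;
}
```

With this sketch, "en-us" becomes "en_US". In a real application the conversion should be left to uloc_getName(), as recommended above.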

For example, suppose a collator for "en_US_CALIFORNIA" is requested. In the current state of ICU (2.0), the requested locale is "en_US_CALIFORNIA", the valid locale is "en_US" (the most specific locale supported by ICU), and the actual locale is "root" (the collation data comes unmodified from the UCA). A locale is considered supported by ICU if there is a core ICU bundle for that locale (although it may be empty).
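The relationship between the requested, valid, and actual locales can be pictured as a walk from the most specific ID toward "root". The sketch below assumes the simplest possible fallback, dropping the last '_'-separated component at each step; the helper parent_locale is hypothetical, and ICU's real lookup may consult locale data rather than truncate blindly.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the next, less specific ID into out by
 * removing the last '_'-separated component; an ID with no '_' falls
 * back to "root". out must not alias id.
 * Returns 1 if a parent was written, 0 if id is already "root". */
int parent_locale(const char *id, char *out, size_t capacity)
{
    const char *cut;

    if (strcmp(id, "root") == 0) {
        return 0;
    }
    cut = strrchr(id, '_');
    if (cut == NULL) {
        snprintf(out, capacity, "%s", "root");
    } else {
        snprintf(out, capacity, "%.*s", (int)(cut - id), id);
    }
    return 1;
}
```

Starting from "en_US_CALIFORNIA", repeated calls yield "en_US", then "en", then "root". In the example above, the walk would stop at "en_US" for the valid locale and continue to "root" for the actual locale.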

If localeID is already in the maximal form, or there is no data available for maximization, it will be copied to the output buffer. For example, "und-Zzzz" cannot be maximized, since there is no reasonable maximization.

Examples:

"en" maximizes to "en_Latn_US"

"de" maximizes to "de_Latn_DE"

"sr" maximizes to "sr_Cyrl_RS"

"sh" maximizes to "sr_Latn_RS" (Note this will not reverse.)

"zh_Hani" maximizes to "zh_Hans_CN" (Note this will not reverse.)

Parameters

localeID

The locale to maximize

maximizedLocaleID

The maximized locale

maximizedLocaleIDCapacity

The capacity of the maximizedLocaleID buffer

err

Error information if maximizing the locale failed. If the length of the localeID and the null-terminator is greater than the maximum allowed size, or the localeID is not well-formed, the error code is U_ILLEGAL_ARGUMENT_ERROR.

Returns

The actual buffer size needed for the maximized locale. If it's greater than maximizedLocaleIDCapacity, the returned ID will be truncated. On error, the return value is -1.
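To make the "copy unchanged when there is no data" rule concrete, here is a toy lookup that uses a few of the pairs from the examples above as its entire data set. The name toy_add_likely_subtags and the table are assumptions of this sketch; the real function draws on much larger likely-subtags data and reports errors through its err parameter, which this model omits.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative data only: a handful of pairs taken from the examples. */
static const char *const LIKELY[][2] = {
    { "en",      "en_Latn_US" },
    { "sr",      "sr_Cyrl_RS" },
    { "sh",      "sr_Latn_RS" },
    { "zh_Hani", "zh_Hans_CN" },
};

/* Toy model: returns the length needed for the result (excluding the
 * NUL). If the input has no table entry, it is copied unchanged, which
 * also covers IDs that are already maximal. */
int toy_add_likely_subtags(const char *id, char *out, size_t capacity)
{
    const char *result = id;   /* default: no data, copy the input */
    size_t i;

    for (i = 0; i < sizeof LIKELY / sizeof LIKELY[0]; ++i) {
        if (strcmp(id, LIKELY[i][0]) == 0) {
            result = LIKELY[i][1];
            break;
        }
    }
    return snprintf(out, capacity, "%s", result);
}
```

In this model, "en" yields "en_Latn_US", while "en_Latn_US" and "und_Zzzz" are both copied to the output unchanged.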

Note: This has the effect of 'canonicalizing' the string to a certain extent. Upper and lower case are set as needed, and if the components were in 'POSIX' format they are changed to ICU format. It does NOT map aliased names in any way. See the top of this header file.

Parameters

localeID

the locale to get the full name with

name

the full name for localeID

nameCapacity

the size of the name buffer to store the full name with

err

error information if retrieving the full name failed

Returns

the actual buffer size needed for the full name. If it's greater than nameCapacity, the returned full name will be truncated.

If the specified language tag contains any ill-formed subtags, the first such subtag and all following subtags are ignored.

This implements the 'Language-Tag' production of BCP47, and so supports grandfathered (regular and irregular) as well as private use language tags. Private use tags are represented as 'x-whatever', and grandfathered tags are converted to their canonical replacements where they exist. Note that a few grandfathered tags have no modern replacement; these will be converted using the fallback described in the first paragraph, so some information might be lost.

Parameters

langtag

the input BCP47 language tag.

localeID

the output buffer receiving a locale ID for the specified BCP47 language tag.

localeIDCapacity

the size of the locale ID output buffer.

parsedLength

if not NULL, receives the length of the successfully parsed portion of the input language tag.
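The "ignore the first ill-formed subtag and everything after it" rule, together with the parsed length it implies, can be sketched as follows. The well-formedness test here is a deliberate simplification assumed for this sketch (a subtag is 1 to 8 ASCII alphanumerics); the real BCP47 grammar is stricter, and toy_parsed_length is a hypothetical name.

```c
#include <ctype.h>
#include <stddef.h>

/* Toy model: returns the length of the leading part of tag that consists
 * only of well-formed subtags separated by '-'. Simplifying assumption:
 * a subtag is well-formed if it is 1 to 8 ASCII alphanumerics. */
size_t toy_parsed_length(const char *tag)
{
    size_t good = 0;   /* length of the accepted prefix */
    size_t pos = 0;

    while (tag[pos] != '\0') {
        size_t start = pos;
        while (isalnum((unsigned char)tag[pos])) {
            ++pos;
        }
        if (pos == start || pos - start > 8) {
            break;                        /* empty or over-long subtag */
        }
        if (tag[pos] != '\0' && tag[pos] != '-') {
            break;                        /* illegal character in subtag */
        }
        good = pos;                       /* accept this subtag */
        if (tag[pos] == '-') {
            ++pos;                        /* step over the separator */
        }
    }
    return good;
}
```

For "en-US-toolongsubtag-x" the function stops at the over-long third subtag and reports 5, the length of "en-US".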

The return value is a pointer to an item of a locale name array. Both this array and the pointers it contains are owned by ICU and should not be deleted or written through by the caller. The locale name is a NUL-terminated string.

Gets the full name for the specified locale, like uloc_getName(), but without keywords.

Note: This has the effect of 'canonicalizing' the string to a certain extent. Upper and lower case are set as needed, and if the components were in 'POSIX' format they are changed to ICU format. It does NOT map aliased names in any way. See the top of this header file.

This API strips off the keyword part, so "de_DE@collation=phonebook" will become "de_DE". This API supports preflighting.
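The keyword stripping and the preflighting contract can be illustrated with a toy function. This sketch models only those two aspects: it does no canonicalization, toy_base_name is a hypothetical name, and it assumes the returned size excludes the terminating NUL.

```c
#include <stdint.h>
#include <string.h>

/* Toy model: copies id up to (not including) the first '@' into name.
 * Returns the length needed, excluding the NUL. If the result does not
 * fit, the output is truncated; the NUL is written only when there is
 * room for it. name may be NULL when nameCapacity is 0 (preflighting). */
int32_t toy_base_name(const char *id, char *name, int32_t nameCapacity)
{
    const char *at = strchr(id, '@');
    int32_t needed = (int32_t)(at != NULL ? (size_t)(at - id) : strlen(id));
    int32_t i;

    for (i = 0; i < needed && i < nameCapacity; ++i) {
        name[i] = id[i];
    }
    if (needed < nameCapacity) {
        name[needed] = '\0';
    }
    return needed;
}
```

Calling toy_base_name("de_DE@collation=phonebook", NULL, 0) returns 5 without writing anything, after which a buffer of at least 6 bytes can be passed to receive "de_DE".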

Parameters

localeID

the locale to get the full name with

name

fill in buffer for the name without keywords.

nameCapacity

capacity of the fill in buffer.

err

error information if retrieving the full name failed

Returns

the actual buffer size needed for the full name. If it's greater than nameCapacity, the returned full name will be truncated.

The returned string is a snapshot in time, and will remain valid and unchanged even when uloc_setDefault() is called. The returned storage is owned by ICU, and must not be altered or deleted by the caller.

Warning: this is for the region part of a valid locale ID; it cannot just be the region code (like "FR"). To get the display name for a region alone, or for other options, use ULocaleDisplayNames instead.

Parameters

locale

the locale to get the displayable country code with. NULL may be used to specify the default.

displayLocale

Specifies the locale to be used to display the name. In other words, if the locale's language code is "en", passing ULOC_FRENCH for this parameter would result in "Anglais", while passing ULOC_GERMAN would result in "Englisch". NULL may be used to specify the default.

country

the displayable country code for localeID

countryCapacity

the size of the country buffer to store the displayable country code with

status

error information if retrieving the displayable country code failed

Returns

the actual buffer size needed for the displayable country code. If it's greater than countryCapacity, the returned displayable country code will be truncated.

Specifies the locale to be used to display the name. In other words, if the locale's language code is "en", passing ULOC_FRENCH for this parameter would result in "Anglais", while passing ULOC_GERMAN would result in "Englisch". NULL may be used to specify the default.

dest

the buffer to which the displayable keyword should be written.

destCapacity

The size of the buffer (number of UChars). If it is 0, then dest may be NULL and the function will only return the length of the result without writing any of the result string (pre-flighting).

status

error information if retrieving the displayable string failed. Should not be NULL and should not indicate failure on entry.
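The pre-flighting convention described for destCapacity leads to a common two-pass calling pattern: measure first, then allocate and fill. The sketch below shows the caller's side against a stand-in producer. Both toy_fill and toy_fetch are hypothetical, and the sketch uses char rather than UChar and omits the status parameter to stay self-contained.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in producer (hypothetical) following the contract above: with
 * destCapacity 0 it writes nothing and only returns the length needed. */
static int32_t toy_fill(const char *src, char *dest, int32_t destCapacity)
{
    int32_t needed = (int32_t)strlen(src);

    if (destCapacity > 0 && dest != NULL) {
        int32_t n = needed < destCapacity ? needed : destCapacity - 1;
        memcpy(dest, src, (size_t)n);
        dest[n] = '\0';
    }
    return needed;
}

/* Caller side: the first pass measures, the second pass fills.
 * Returns a heap buffer the caller must free(), or NULL on failure. */
char *toy_fetch(const char *src)
{
    int32_t needed = toy_fill(src, NULL, 0);     /* pre-flight */
    char *buf = malloc((size_t)needed + 1);      /* +1 for the NUL */

    if (buf != NULL) {
        toy_fill(src, buf, needed + 1);
    }
    return buf;
}
```

The design choice here is to pay for two calls in exchange for an exactly sized buffer; a caller that prefers a single call can instead pass a fixed buffer and fall back to this pattern only when the returned length exceeds its capacity.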

Gets the value of the keyword suitable for display for the specified locale.

E.g., for the locale string de_DE@collation=PHONEBOOK, this API gets the display string for PHONEBOOK, in the display locale, when "collation" is specified as the keyword.

Parameters

locale

The locale to get the displayable keyword value for. NULL may be used to specify the default.

keyword

The keyword whose value should be used.

displayLocale

Specifies the locale to be used to display the name. In other words, if the locale's language code is "en", passing ULOC_FRENCH for this parameter would result in "Anglais", while passing ULOC_GERMAN would result in "Englisch". NULL may be used to specify the default.

dest

the buffer to which the displayable keyword value should be written.

destCapacity

The size of the buffer (number of UChars). If it is 0, then dest may be NULL and the function will only return the length of the result without writing any of the result string (pre-flighting).

status

error information if retrieving the displayable string failed. Should not be NULL and must not indicate failure on entry.

Specifies the locale to be used to display the name. In other words, if the locale's language code is "en", passing ULOC_FRENCH for this parameter would result in "Anglais", while passing ULOC_GERMAN would result in "Englisch".

language

the displayable language code for localeID

languageCapacity

the size of the language buffer to store the displayable language code with

status

error information if retrieving the displayable language code failed

Returns

the actual buffer size needed for the displayable language code. If it's greater than languageCapacity, the returned language code will be truncated.

the locale to get the displayable name with. NULL may be used to specify the default.

inLocaleID

Specifies the locale to be used to display the name. In other words, if the locale's language code is "en", passing ULOC_FRENCH for this parameter would result in "Anglais", while passing ULOC_GERMAN would result in "Englisch". NULL may be used to specify the default.

result

the displayable name for localeID

maxResultSize

the size of the name buffer to store the displayable full name with

err

error information if retrieving the displayable name failed

Returns

the actual buffer size needed for the displayable name. If it's greater than maxResultSize, the returned displayable name will be truncated.

the locale to get the displayable script code with. NULL may be used to specify the default.

displayLocale

Specifies the locale to be used to display the name. NULL may be used to specify the default.

script

the displayable script for the localeID

scriptCapacity

the size of the script buffer to store the displayable script code with

status

error information if retrieving the displayable script code failed

Returns

the actual buffer size needed for the displayable script code. If it's greater than scriptCapacity, the returned displayable script code will be truncated.

the locale to get the displayable variant code with. NULL may be used to specify the default.

displayLocale

Specifies the locale to be used to display the name. In other words, if the locale's language code is "en", passing ULOC_FRENCH for this parameter would result in "Anglais", while passing ULOC_GERMAN would result in "Englisch". NULL may be used to specify the default.

variant

the displayable variant code for localeID

variantCapacity

the size of the variant buffer to store the displayable variant code with

status

error information if retrieving the displayable variant code failed

Returns

the actual buffer size needed for the displayable variant code. If it's greater than variantCapacity, the returned displayable variant code will be truncated.

Note: This has the effect of 'canonicalizing' the ICU locale ID to a certain extent. Upper and lower case are set as needed. It does NOT map aliased names in any way. See the top of this header file. This API supports preflighting.

Parameters

localeID

the locale to get the full name with

name

fill in buffer for the name without keywords.

nameCapacity

capacity of the fill in buffer.

err

error information if retrieving the full name failed

Returns

the actual buffer size needed for the full name. If it's greater than nameCapacity, the returned full name will be truncated.

If localeID is already in the minimal form, or there is no data available for minimization, it will be copied to the output buffer. Since the minimization algorithm relies on proper maximization, see the comments for uloc_addLikelySubtags for reasons why there might not be any data.

Examples:

"en_Latn_US" minimizes to "en"

"de_Latn_DE" minimizes to "de"

"sr_Cyrl_RS" minimizes to "sr"

"zh_Hant_TW" minimizes to "zh_TW" (The region is preferred to the script, and minimizing to "zh" would imply "zh_Hans_CN".)

Parameters

localeID

The locale to minimize

minimizedLocaleID

The minimized locale

minimizedLocaleIDCapacity

The capacity of the minimizedLocaleID buffer

err

Error information if minimizing the locale failed. If the length of the localeID and the null-terminator is greater than the maximum allowed size, or the localeID is not well-formed, the error code is U_ILLEGAL_ARGUMENT_ERROR.

Returns

The actual buffer size needed for the minimized locale. If it's greater than minimizedLocaleIDCapacity, the returned ID will be truncated. On error, the return value is -1.
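Because minimization relies on maximization, one way to picture it is as a search over shorter candidates: try the language alone, then language plus region, then language plus script, and keep the first candidate that maximizes back to the full ID. The sketch below follows that idea with a tiny illustrative table. The function names and the table contents are assumptions of this sketch, not ICU's data or code.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative likely-subtags data (an assumption of this sketch). */
static const char *const LIKELY[][2] = {
    { "en",      "en_Latn_US" },
    { "sr",      "sr_Cyrl_RS" },
    { "zh",      "zh_Hans_CN" },
    { "zh_TW",   "zh_Hant_TW" },
    { "zh_Hant", "zh_Hant_TW" },
};

/* Returns the table entry for id, or id itself when there is no data. */
static const char *toy_maximize(const char *id)
{
    size_t i;
    for (i = 0; i < sizeof LIKELY / sizeof LIKELY[0]; ++i) {
        if (strcmp(id, LIKELY[i][0]) == 0) {
            return LIKELY[i][1];
        }
    }
    return id;
}

/* Toy model: full is expected in the form language_Script_REGION.
 * Tries "language", "language_REGION", then "language_Script", and
 * writes the first candidate that maximizes back to full. Anything
 * else is copied unchanged. Returns the length needed (excluding NUL). */
int toy_minimize(const char *full, char *out, size_t capacity)
{
    char lang[16], script[16], region[16], cand[3][48];
    int i;

    if (sscanf(full, "%15[^_]_%15[^_]_%15[^_]", lang, script, region) != 3) {
        return snprintf(out, capacity, "%s", full);
    }
    snprintf(cand[0], sizeof cand[0], "%s", lang);
    snprintf(cand[1], sizeof cand[1], "%s_%s", lang, region);
    snprintf(cand[2], sizeof cand[2], "%s_%s", lang, script);

    for (i = 0; i < 3; ++i) {
        if (strcmp(toy_maximize(cand[i]), full) == 0) {
            return snprintf(out, capacity, "%s", cand[i]);
        }
    }
    return snprintf(out, capacity, "%s", full);   /* no data: copy */
}
```

Trying the region before the script is what makes "zh_Hant_TW" come out as "zh_TW" rather than "zh_Hant" in this model, matching the preference stated in the examples.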

NOTE: Unlike almost every other ICU function which takes a buffer, this function will NOT truncate the output text, and will not update the buffer with unterminated text while setting a status of U_STRING_NOT_TERMINATED_WARNING. If a U_BUFFER_OVERFLOW_ERROR is received, it means a terminated version of the updated locale ID would not fit in the buffer, and the original buffer is untouched. This is done to prevent incorrect or possibly even malformed locales from being generated and used.
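The all-or-nothing buffer behavior in the note above can be sketched as: compute the final length first, and touch the buffer only if the terminated result fits. The helper toy_append_keyword is hypothetical and models just this contract; unlike the real function, it does not replace an existing keyword, remove keywords, or sort them.

```c
#include <stdio.h>
#include <string.h>

/* Toy model of the buffer contract only. Appends "@name=value" to the
 * locale ID in buffer, or ";name=value" if the ID already has keywords.
 * Returns the new length, or -1 if the terminated result would not fit,
 * in which case buffer is left untouched. */
int toy_append_keyword(char *buffer, size_t capacity,
                       const char *name, const char *value)
{
    size_t old_len = strlen(buffer);
    char sep = strchr(buffer, '@') != NULL ? ';' : '@';
    size_t new_len = old_len + 1 + strlen(name) + 1 + strlen(value);

    if (new_len + 1 > capacity) {
        return -1;                 /* would overflow: do not write at all */
    }
    snprintf(buffer + old_len, capacity - old_len, "%c%s=%s", sep, name, value);
    return (int)new_len;
}
```

With a 12-byte buffer holding "de_DE", appending collation=phonebook fails and the buffer still reads "de_DE"; with a 64-byte buffer the result is "de_DE@collation=phonebook".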

Parameters

keywordName

name of the keyword to be set; must not be NULL or empty, and must consist only of [A-Za-z0-9]. Case insensitive.

keywordValue

value of the keyword to be set. If 0-length or NULL, will result in the keyword being removed; no error is given if that keyword does not exist. Otherwise, must consist only of [A-Za-z0-9] and [/_+-].
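The character restrictions on keywordName and keywordValue can be checked up front. The two predicates below are a sketch of such a check with hypothetical names; they encode only the character classes stated above and assume the default "C" locale for isalnum.

```c
#include <ctype.h>
#include <string.h>

/* Name: must be non-NULL, non-empty, and consist only of [A-Za-z0-9]. */
int toy_valid_keyword_name(const char *name)
{
    const char *p;

    if (name == NULL || name[0] == '\0') {
        return 0;
    }
    for (p = name; *p != '\0'; ++p) {
        if (!isalnum((unsigned char)*p)) {
            return 0;
        }
    }
    return 1;
}

/* Value: NULL or empty is accepted (it requests removal); otherwise
 * only [A-Za-z0-9] and the characters / _ + - are allowed. */
int toy_valid_keyword_value(const char *value)
{
    const char *p;

    if (value == NULL) {
        return 1;
    }
    for (p = value; *p != '\0'; ++p) {
        if (!isalnum((unsigned char)*p) && strchr("/_+-", *p) == NULL) {
            return 0;
        }
    }
    return 1;
}
```

For instance, "collation" passes the name check and "America/Los_Angeles" passes the value check, while a name containing '=' or a value containing a space is rejected.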

Note: When strict is FALSE, any locale fields which do not satisfy the BCP47 syntax requirement will be omitted from the result. When strict is TRUE, this function sets err to U_ILLEGAL_ARGUMENT_ERROR if any locale fields do not satisfy the BCP47 syntax requirement.

Parameters

localeID

the input locale ID

langtag

the output buffer receiving BCP47 language tag for the locale ID.

langtagCapacity

the size of the BCP47 language tag output buffer.

strict

boolean value indicating if the function returns an error for an ill-formed input locale ID.

For example, the legacy type "phonebook" is returned for the input BCP 47 Unicode locale extension type "phonebk" with the keyword "collation" (or "co").

When the specified keyword is not recognized, but the specified value satisfies the syntax of legacy key, or when the specified keyword allows 'variable' type and the specified value satisfies the syntax, then the pointer to the input type value itself will be returned. For example, uloc_toLegacyType("Foo", "Bar") returns "Bar", uloc_toLegacyType("vt", "00A4") returns "00A4".

Parameters

keyword

the locale keyword (either legacy keyword such as "collation" or BCP 47 Unicode locale extension key such as "co").

For example, BCP 47 Unicode locale extension type "phonebk" is returned for the input keyword value "phonebook", with the keyword "collation" (or "co").

When the specified keyword is not recognized, but the specified value satisfies the syntax of the BCP 47 Unicode locale extension type, or when the specified keyword allows 'variable' type and the specified value satisfies the syntax, then the pointer to the input type value itself will be returned. For example, uloc_toUnicodeLocaleType("Foo", "Bar") returns "Bar", uloc_toUnicodeLocaleType("variableTop", "00A4") returns "00A4".

Parameters

keyword

the locale keyword (either legacy key such as "collation" or BCP 47 Unicode locale extension key such as "co").

value

the locale keyword value (either legacy type such as "phonebook" or BCP 47 Unicode locale extension type such as "phonebk").
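The two conversions between legacy types and BCP 47 Unicode locale extension types behave like lookups in a shared table, with the input pointer returned when no mapping applies. The sketch below uses a single row taken from the examples above. The function names are hypothetical, and the sketch makes simplifying assumptions: matching is exact-case, and the syntax checks that precede the pass-through are omitted.

```c
#include <string.h>

/* One table row: a keyword in both spellings and one value pair. */
struct toy_type_row {
    const char *legacy_key, *bcp_key, *legacy_type, *bcp_type;
};

/* Illustrative data only: the single pair named in the text. */
static const struct toy_type_row ROWS[] = {
    { "collation", "co", "phonebook", "phonebk" },
};
#define ROW_COUNT (sizeof ROWS / sizeof ROWS[0])

static int key_matches(const struct toy_type_row *r, const char *keyword)
{
    return strcmp(keyword, r->legacy_key) == 0 ||
           strcmp(keyword, r->bcp_key) == 0;
}

/* Legacy type -> BCP 47 type; returns value itself when nothing maps. */
const char *toy_to_unicode_type(const char *keyword, const char *value)
{
    size_t i;
    for (i = 0; i < ROW_COUNT; ++i) {
        if (key_matches(&ROWS[i], keyword) &&
            strcmp(value, ROWS[i].legacy_type) == 0) {
            return ROWS[i].bcp_type;
        }
    }
    return value;
}

/* BCP 47 type -> legacy type; returns value itself when nothing maps. */
const char *toy_to_legacy_type(const char *keyword, const char *value)
{
    size_t i;
    for (i = 0; i < ROW_COUNT; ++i) {
        if (key_matches(&ROWS[i], keyword) &&
            strcmp(value, ROWS[i].bcp_type) == 0) {
            return ROWS[i].legacy_type;
        }
    }
    return value;
}
```

Returning the input pointer for unmapped values mirrors the pass-through described above and means the caller never owns the result, so nothing needs to be freed.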