The Pathedral and the Kazoo (Entries tagged as l10n)
https://erack.org/blog/
Eike Rathke blogging about LibreOffice and the world (what a claim)
Serendipity 2.1.4 - http://www.s9y.org/
Mon, 23 Sep 2013 13:41:18 GMT
LibreOffice goes BCP 47
https://erack.org/blog/archives/30-LibreOffice-goes-BCP-47.html
Categories: Free Software, i18n, LibreOffice
Author: erAck
<p> This week I reached an important milestone in the major rewrite that &ndash; apart from the daily work such as fixing bugs, coding small enhancements and reviewing patches &ndash; I have been working on for about nine months. In current master, <strong><span title="The Free Office Suite, try it, love it, share it. Fantastic Project. Fun People." class="serendipity_glossaryMarkup">LibreOffice</span></strong> is finally able to transparently handle arbitrary (if valid) <strong><span title="Best Current Practice" class="serendipity_glossaryMarkup">BCP</span> <span title="IETF language tags" class="serendipity_glossaryMarkup">47</span> language tags</strong> and to fully support the <em>fo:script</em> and <em>*:rfc-language-tag</em> attributes defined in <span title="OpenDocument Format for Office Applications" class="serendipity_glossaryMarkup">ODF</span> 1.2.
<p> So what does this mean? It means that <strong>you'll be able to get your language in</strong>.
<p> It means that already supported languages or writing scripts that so far needed a kludge to squeeze them into <span title="The International Organization for Standardization" class="serendipity_glossaryMarkup">ISO</span> <span title="Codes for the Representation of Names of Languages" class="serendipity_glossaryMarkup">639</span> language codes and ISO <span title="Codes for the Representation of Names of Countries" class="serendipity_glossaryMarkup">3166</span> country codes are finally supported using the proper language tags registered with <span title="Internet Assigned Numbers Authority" class="serendipity_glossaryMarkup">IANA</span>. For example:
<dl>
<dt> <strong>ca-ES-valencia</strong> Catalan Valencian </dt>
<dd> The Valencian variant of Catalan previously used the <em>ca-XV</em> kludge, where <em>XV</em> is an ISO 3166 code <em>reserved for private use</em>. That meant it could be used for UI translation purposes but not for document content. This is now stored in ODF as style:rfc-language-tag='ca-ES-valencia' attributes.</dd>
<dt> <strong>sr-Latn</strong> Serbian Latin </dt>
<dd> Previously the deprecated <strong>sh</strong> kludge was used to differentiate between Serbian Latin and <strong>sr</strong> Serbian Cyrillic. Serbian Latin in Serbia <strong>sr-Latn-RS</strong> is now stored in ODF as fo:language='sr' fo:script='Latn' fo:country='RS' attributes. </dd>
</dl>
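<p> The difference between the two examples above can be sketched in a few lines of Python. This is only an illustration and not LibreOffice's implementation: the function name <em>odf_language_attributes</em> and the simplified regular expression are assumptions, the tag is expected in its usual casing, and a real document may carry additional fo:* attributes alongside an rfc-language-tag, which the sketch omits.

```python
import re

# Simplified assumption: a "plain" tag is language[-Script][-COUNTRY] only,
# with no variant, extension or private-use subtags.
SIMPLE_TAG = re.compile(r"^([a-z]{2,3})(?:-([A-Z][a-z]{3}))?(?:-([A-Z]{2}))?$")


def odf_language_attributes(tag):
    """Return a dict of ODF attributes for a language tag (illustrative only)."""
    match = SIMPLE_TAG.match(tag)
    if match is None:
        # Anything beyond language/script/country, e.g. a variant subtag,
        # is kept as a complete tag.
        return {"style:rfc-language-tag": tag}
    language, script, country = match.groups()
    attrs = {"fo:language": language}
    if script:
        attrs["fo:script"] = script
    if country:
        attrs["fo:country"] = country
    return attrs


if __name__ == "__main__":
    for example in ("sr-Latn-RS", "ca-ES-valencia", "de-DE-1901"):
        print(example, odf_language_attributes(example))
```

<p> With these assumptions, sr-Latn-RS maps to the three fo:* attributes, while ca-ES-valencia and de-DE-1901 are kept as a complete tag.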
<p> It also means that the tag <strong>en-GB-oed</strong> can be supported and in fact already is, with the corresponding entry already added to the language list. This is <a href="https://erack.org/blog/exit.php?url_id=87&amp;entry_id=30" title="https://www.iana.org/assignments/lang-tags/en-GB-oed" onmouseover="window.status='https://www.iana.org/assignments/lang-tags/en-GB-oed';return true;" onmouseout="window.status='';return true;"><em>English, Oxford English Dictionary spelling</em></a>, which is mandatory for <span title="United Nations" class="serendipity_glossaryMarkup">UN</span> documents and, as it seems, also used for <span title="European Union" class="serendipity_glossaryMarkup">EU</span> documents. LibreOffice will be the first free office suite to support spell-checkers with Oxford English Dictionary spelling along with <strong>en-GB</strong> and <strong>en-US</strong> spelling at the same time.
<p> <em>Transparently handle arbitrary tags</em> means that when a document is read that contains language attribution not specifically known to LibreOffice (i.e. one that does not have an entry in the language list), positioning the cursor on such text or selecting it shows the language tag in the status bar and in the language list of the character attribution. You will not see <em>Unknown</em> or, even worse, nothing or the system locale's language. If a dictionary is installed that handles such a tag, it can be used for spell-checking. Transparently of course also means that the tag is stored to ODF again when saving the document, so the attribution is not lost.
<p> The following screenshot shows an example of a document that uses the tag <strong>de-DE-1901</strong> to designate <a href="https://erack.org/blog/exit.php?url_id=88&amp;entry_id=30" title="https://www.iana.org/assignments/lang-tags/de-DE-1901" onmouseover="window.status='https://www.iana.org/assignments/lang-tags/de-DE-1901';return true;" onmouseout="window.status='';return true;"><em>German, German variant, traditional orthography</em></a>:
<div class="serendipity_imageComment_center" style="width: 754px"><div class="serendipity_imageComment_img"><!-- s9ymdb:17 --><img class="serendipity_image_center" width="754" height="697" src="https://erack.org/blog/uploads/screenshots/LibreOffice_de-DE-1901.png" title="Screenshot of LibreOffice displaying a BCP 47 language tag." alt="Screenshot of LibreOffice displaying a BCP 47 language tag." /></div><div class="serendipity_imageComment_txt">Screenshot of LibreOffice displaying a BCP 47 language tag.</div></div>
<p> I'm extremely glad to have this step ready just in time, and of course I'll talk about it at the <a href="https://erack.org/blog/exit.php?url_id=89&amp;entry_id=30" title="http://conference.libreoffice.org/2013/" onmouseover="window.status='http://conference.libreoffice.org/2013/';return true;" onmouseout="window.status='';return true;">LibreOffice Conference 2013 in Milano</a>. To get all the details, please join me and attend <a href="https://erack.org/blog/exit.php?url_id=90&amp;entry_id=30" title="http://conference.libreoffice.org/2013/en/program/libreoffice-community/getting-your-language-in" onmouseover="window.status='http://conference.libreoffice.org/2013/en/program/libreoffice-community/getting-your-language-in';return true;" onmouseout="window.status='';return true;"><strong>Getting your language in</strong></a> on Thursday, 26 September at 15:30 in Sala Alfa.
<div align="center"><a href="https://erack.org/blog/exit.php?url_id=89&amp;entry_id=30" title="http://conference.libreoffice.org/2013/" onmouseover="window.status='http://conference.libreoffice.org/2013/';return true;" onmouseout="window.status='';return true;"><img src="http://conference.libreoffice.org/2013/conference01.png" alt="LibreOffice Milano Conference 2013 logo" /></a></div>
<p> If you are interested in the technical details of BCP 47 language tags I recommend <a href="https://erack.org/blog/exit.php?url_id=92&amp;entry_id=30" title="http://erack.de/bookmarks/D.html#Language_Tags" onmouseover="window.status='http://erack.de/bookmarks/D.html#Language_Tags';return true;" onmouseout="window.status='';return true;">my bookmarks</a> as a starting point.
Sun, 22 Sep 2013 21:33:00 +0200
https://erack.org/blog/archives/30-guid.html
Tags: BCP 47, Free Software, i18n, l10n, language tags, LibreOffice, ODF, OpenDocument Format

Editable Date Acceptance Patterns in LibreOffice
https://erack.org/blog/archives/22-Editable-Date-Acceptance-Patterns-in-LibreOffice.html
Categories: Calc, i18n, LibreOffice
Author: erAck
<p> The introduction of more restrictive date acceptance patterns in <span title="The Free Office Suite, try it, love it, share it. Fantastic Project. Fun People." class="serendipity_glossaryMarkup">LibreOffice</span> 3.6 (see earlier blog entries <a href="https://erack.org/blog/exit.php?url_id=56&amp;entry_id=22" title="http://erack.org/blog/archives/8-LibreOffice-date-acceptance-patterns.html" onmouseover="window.status='http://erack.org/blog/archives/8-LibreOffice-date-acceptance-patterns.html';return true;" onmouseout="window.status='';return true;">here</a> and <a href="https://erack.org/blog/exit.php?url_id=57&amp;entry_id=22" title="http://erack.org/blog/archives/18-Does-your-LibreOffice-locale-need-a-date-acceptance-pattern-for-incomplete-date-input.html" onmouseover="window.status='http://erack.org/blog/archives/18-Does-your-LibreOffice-locale-need-a-date-acceptance-pattern-for-incomplete-date-input.html';return true;" onmouseout="window.status='';return true;">here</a>) generated quite some discussion about whether the change was good or bad. The fact that not all locales had patterns for incomplete (only day and month) date input added to their data also drew some angry voices.
<p> Independently of that, one thing was overlooked: users want to be able to input dates using the numeric keypad, and in locales with a '.' dot date separator that was no longer possible, because the keypad usually has no dot when the decimal separator is a different character. That certainly needed to be addressed.
<p> There is no way to satisfy everyone with a default set of patterns. I therefore implemented a <strong>Date acceptance patterns</strong> edit field in the <em>Tools&rarr;Options&rarr;Language Settings&rarr;Languages</em> dialog that follows the selected locale and enables users to add, edit and remove patterns.
<div class="serendipity_imageComment_center" style="width: 600px"><div class="serendipity_imageComment_img"><!-- s9ymdb:8 --><img class="serendipity_image_center" width="600" src="https://erack.org/blog/uploads/screenshots/EditDateAcceptancePatterns.png" title="EditDateAcceptancePatterns.png" alt="Date acceptance patterns edit field." /></div><div class="serendipity_imageComment_txt">Date acceptance patterns edit field.</div></div>
<p> The change is currently in <em>master</em> and pending review as a late feature for inclusion in the 3.6.2 release.
<p> Example for the German de-DE locale:
<ul>
<li> default patterns: D.M.Y;D.M.
<li> to enable additional input on numeric keypad: D.M.Y;D.M.;D-M-Y;D-M
<ul>
<li> if 3-4 shall not result in a date, D-M- could be used instead of D-M
<li> note that, with a D-M-Y pattern active, entering an ISO 8601 Y-M-D date requires a year greater than 31 or one written with at least 3 digits, e.g. 011
</ul>
<li> instead of D-M-Y;D-M also D/M/Y;D/M could be used
</ul>
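<p> To make the effect of the pattern list more tangible, here is a minimal Python sketch of how input could be checked against such a semicolon-separated list. It is an assumption-laden illustration, not the number scanner's actual code: the helper names <em>pattern_to_regex</em> and <em>matching_pattern</em> are made up, every D, M or Y simply stands for a run of digits, and no range checks or ISO 8601 handling are done.

```python
import re


def pattern_to_regex(pattern):
    """Translate one acceptance pattern such as 'D.M.Y' into a compiled regex."""
    parts = []
    for char in pattern:
        if char in "DMY":
            parts.append(r"\d+")           # one numeric field
        else:
            parts.append(re.escape(char))  # separators are matched literally
    return re.compile("".join(parts))


def matching_pattern(text, patterns):
    """Return the first pattern of a ';'-separated list that fully matches text."""
    for pattern in patterns.split(";"):
        if pattern and pattern_to_regex(pattern).fullmatch(text):
            return pattern
    return None


if __name__ == "__main__":
    defaults = "D.M.Y;D.M."
    keypad = "D.M.Y;D.M.;D-M-Y;D-M"
    print(matching_pattern("3-4", defaults))  # no pattern accepts this input
    print(matching_pattern("3-4", keypad))    # accepted by the added D-M pattern
```

<p> Replacing D-M with D-M- in the list makes the input 3-4 fall through again, while 3-4- is accepted.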
<p> Changes to the patterns become effective immediately after having confirmed and closed the dialog.
<p>
Fri, 31 Aug 2012 23:02:00 +0200
https://erack.org/blog/archives/22-guid.html
Tags: Calc, cell input, dates, i18n, l10n, LibreOffice, locale data, number scanner, spreadsheet, Writer

LibreOffice date acceptance patterns
https://erack.org/blog/archives/8-LibreOffice-date-acceptance-patterns.html
Categories: Calc, i18n, LibreOffice
Author: erAck
<p> <strong>Update 2012-08-31T23:08+0200</strong> : <a href="https://erack.org/blog/exit.php?url_id=59&amp;entry_id=8" title="http://erack.org/blog/archives/22-Editable-Date-Acceptance-Patterns-in-LibreOffice.html" onmouseover="window.status='http://erack.org/blog/archives/22-Editable-Date-Acceptance-Patterns-in-LibreOffice.html';return true;" onmouseout="window.status='';return true;">Editable Date Acceptance Patterns in <span title="The Free Office Suite, try it, love it, share it. Fantastic Project. Fun People." class="serendipity_glossaryMarkup">LibreOffice</span></a>
<p> Abstract: Cell input in <span title="The LibreOffice Spreadsheet Application" class="serendipity_glossaryMarkup">Calc</span> (and in Writer tables) now needs to match locale
dependent date acceptance patterns before it is recognized as a valid date.
<p> Previously the number formatter's input scanner was very lax in what it
accepted as a "valid" date. Any combination of 2-3 numbers separated by '.'
'/' '-' or the locale's date separator, even with blanks in between, that somehow
could be interpreted as a date was accepted as such. This was especially
confusing with incomplete dates containing only 2 numbers, which in many cases
were meant as textual input instead. For example
<ul>
<li> In en-US locale, M/D is a valid date input to be interpreted as day of
month of current year. However, M/D/ and M.D. were accepted as well.
<li> In de-DE locale, D.M. is a valid date input to be interpreted as day of
month of current year. However, D.M and D/M and D/M/ were accepted as well.
</ul>
<p> In the case of an input like 1.2 in a de-DE locale or others using the '.'
separator, meant as some sort of textual numbering, this was extremely
annoying: it was interpreted as the 1st of February of the current year and the user
had to prepend a single quote / apostrophe to suppress date recognition.
The same applied to 1.2.3 in locales that do not use the '.' date separator.
<p> Now, at build time, one full date acceptance pattern is generated for each locale
from the existing locale data's number format <em>FormatElement</em>
with <em>formatindex="21"</em>, which is also used to edit dates, taking the DMY
order and the defined <em>DateSeparator</em>. For example, in the en-US locale
this generates <b>M/D/Y</b> from the MM/DD/YYYY <em>FormatCode</em>, and in the
de-DE locale <b>D.M.Y</b> from the DD.MM.YYYY code. For this to work correctly,
the separator used in the FormatCode must match the DateSeparator element
defined in <em>Separators</em>. As with all rules, there's one exception though
;) If the format code uses a different separator and that is one of the known
'-' '.' '/' separators, a second pattern is generated using the format's
separator. This is a generalized case for locales that, for example, use an
<span title="The International Organization for Standardization" class="serendipity_glossaryMarkup">ISO</span> 8601 edit format, as hu-HU does, regardless of what the date separator is
defined as.
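<p> As a rough sketch of the generation rule just described, the following Python function derives the pattern(s) from a format code and a date separator. It is not the build-time generator itself: the name <em>generated_patterns</em> is hypothetical, only the first separator found in the format code is considered, and format codes with trailing separators or blanks are not handled.

```python
import re

KNOWN_SEPARATORS = "-./"


def generated_patterns(format_code, date_separator):
    """Derive acceptance patterns from an edit format code (illustrative only)."""
    # Collapse each run of D, M or Y into one letter, keeping the order.
    letters = re.findall(r"([DMY])\1*", format_code)
    patterns = [date_separator.join(letters)]
    # Exception: the format code uses a different, known separator.
    used = [char for char in format_code if char in KNOWN_SEPARATORS]
    if used and used[0] != date_separator:
        patterns.append(used[0].join(letters))
    return patterns


if __name__ == "__main__":
    print(generated_patterns("MM/DD/YYYY", "/"))
    print(generated_patterns("DD.MM.YYYY", "."))
    # Hypothetical locale with an ISO 8601 edit format and a '.' date separator.
    print(generated_patterns("YYYY-MM-DD", "."))
```

<p> The third call is the exception case: it yields one pattern with the locale's separator and a second one with the separator taken from the format code.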
<p> In addition to the date acceptance patterns, every locale of course still
accepts input in an ISO 8601 <b>Y-M-D</b> pattern, and since LibreOffice 3.5
that also leads to the YYYY-MM-DD format being applied.
<h5> Localizers, HEADS UP please </h5>
<p> If your locale should accept incomplete dates, or needs additional
patterns that differ from the generated full date pattern, define them in
the locale data <em>LC_FORMAT</em> element. A new
<em>DateAcceptancePattern</em> element exists for this purpose; zero or more of them can occur
before the <em>FormatElement</em> elements. Currently only the following
patterns are defined, as they are the only ones I knew to be plausible:
<ul>
<li> bg-BG, a trailing breaking or non-breaking space followed by lower case
or upper case Cyrillic letter GHE and a dot, as defined in the edit format
<ul>
<li> D.M.Y г.
<li> D.M.Y г.
<li> D.M.Y Г.
<li> D.M.Y Г.
</ul>
<li> de-DE, incomplete date
<ul>
<li> D.M.
</ul>
<li> en-US, incomplete date
<ul>
<li> M/D
</ul>
<li> sl-SI, date separator dot plus space
<ul>
<li> D. M. Y
</ul>
</ul>
<p> For example see <a href="https://erack.org/blog/exit.php?url_id=18&amp;entry_id=8" onmouseover="window.status='http://cgit.freedesktop.org/libreoffice/core/plain/i18npool/source/localedata/data/en_US.xml';return true;" onmouseout="window.status='';return true;" title="the English-US locale data file">i18npool/source/localedata/data/en_US.xml</a>
<p> Happy date accepting :-)
<p><strong>Update:</strong> an updated list of locales and patterns is available in <a href="archives/18-Does-your-LibreOffice-locale-need-a-date-acceptance-pattern-for-incomplete-date-input.html">a newer blog post</a>.
<p>
Wed, 11 Jan 2012 20:46:20 +0100
https://erack.org/blog/archives/8-guid.html
Tags: Calc, cell input, dates, i18n, l10n, LibreOffice, locale data, number formatter, number scanner, spreadsheet, Writer

LibreOffice possessive genitive case and partitive case month names
https://erack.org/blog/archives/2-LibreOffice-possessive-genitive-case-and-partitive-case-month-names.html
Categories: i18n, LibreOffice
Author: erAck
<p> Poss... what? you may ask. Yes, the month that owns the day-of-month.
<p> In some languages (and thus locales) a month name has different cases, depending on the context the name is used in. That's for example the case in Slavic languages, Greek, Russian, Finnish, Gaelic, ... and probably a few more I haven't heard of. Totally unknown to native English speakers ;-) (which I'm not)
<p>
<ul>
<li> a standalone month name is the nominative case, the noun, as in <em>November</em>
<li> a possessive genitive case month name can be described as "the month's day", as in <em>November's 17th</em>
<li> a partitive case month name can be described as "day of month", as in <em>17 of November</em>
</ul>
<p> This feature in number formatting had been requested for quite some time, and recently I found time to implement it. To achieve this, I added optional elements to <span title="The Free Office Suite, try it, love it, share it. Fantastic Project. Fun People." class="serendipity_glossaryMarkup">LibreOffice</span>'s internal locale data &lt;LC_CALENDAR&gt;&lt;Calendar&gt; element and implemented the general rules in the number formatter.
<p>
<h5> Locale data (submitted by localizers) </h5>
<ul>
<li> &lt;MonthsOfYear&gt; element, nominative (nouns) month names
<ul>
<li> always specified
<li> includes &lt;Month&gt;, &lt;MonthID&gt;, &lt;DefaultAbbrvName&gt; and &lt;DefaultFullName&gt; elements
</ul>
<li> &lt;GenitiveMonths&gt; element, genitive case month names
<ul>
<li> optional
<li> follows the &lt;MonthsOfYear&gt; element
<li> consists of same elements as &lt;MonthsOfYear&gt; element
<li> if &lt;GenitiveMonths&gt; are not specified then &lt;MonthsOfYear&gt; names are used in the context of the number formatter's genitive case
</ul>
<li> &lt;PartitiveMonths&gt; element, partitive case month names
<ul>
<li> optional
<li> follows the &lt;GenitiveMonths&gt; element, or follows the &lt;MonthsOfYear&gt; element if the &lt;GenitiveMonths&gt; element is not specified
<li> consists of same elements as &lt;MonthsOfYear&gt; element
<li> if &lt;PartitiveMonths&gt; are not specified then &lt;GenitiveMonths&gt; names are used, if that is not specified then &lt;MonthsOfYear&gt; names are used
</ul>
</ul>
<p>
<h5> Rules for use of nominative / genitive / partitive case month names in number formatter when encountering MMM or MMMM </h5>
<ul>
<li> MMM or MMMM immediately preceded or followed by a literal character other than space &rArr; nominative month name (noun), for Excel and backwards compatibility such as Finnish MMMM"ta"
<li> no day of month (D or DD) present in format code &rArr; nominative name
<li> day of month (D or DD) after MMM or MMMM &rArr; genitive name
<ul>
<li> no genitive names defined &rArr; nominative names
</ul>
<li> day of month (D or DD) before MMM or MMMM &rArr; partitive name
<ul>
<li> no partitive names defined &rArr; genitive names
<ul>
<li> no genitive names defined &rArr; nominative names
</ul>
</ul>
</ul>
<h5> NOTE: </h5>
<p> If only &lt;MonthsOfYear&gt; and &lt;PartitiveMonths&gt; are specified but not &lt;GenitiveMonths&gt;, then for MMM(M) D(D) formats the &lt;MonthsOfYear&gt; nominative name is displayed. The &lt;PartitiveMonths&gt; name is displayed only for D(D) MMM(M) formats.
<p> If the &lt;GenitiveMonths&gt; names are to be displayed only for MMM(M) D(D) formats, with nominative names for D(D) MMM(M), then specify &lt;PartitiveMonths&gt; identical to &lt;MonthsOfYear&gt;. Do not omit it, as it would otherwise inherit from &lt;GenitiveMonths&gt; again.
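<p> The rules and the fallback behaviour can be condensed into a small Python sketch. It is a simplification under stated assumptions and not the number formatter's code: the function name <em>month_name_case</em> and its <em>available</em> parameter are hypothetical, only the first MMM or MMMM in the format code is looked at, nominative names are assumed to always exist, and if a day of month appears on both sides the one after the month wins.

```python
import re

# A day of month is D or DD, but not part of a longer run of D.
DAY = re.compile(r"(?<!D)D{1,2}(?!D)")
QUOTED = re.compile(r'"[^"]*"')


def month_name_case(format_code, available=("nominative", "genitive", "partitive")):
    """Pick the month name case for the first MMM/MMMM in a format code."""
    month = re.search(r"M{3,4}", format_code)
    if month is None:
        return None
    before = format_code[:month.start()]
    after = format_code[month.end():]
    # An adjacent literal other than a space forces the nominative name.
    neighbours = before[-1:] + after[:1]
    if any(char != " " and char not in "DMY" for char in neighbours):
        return "nominative"
    # Quoted literal text must not be mistaken for a day keyword.
    if DAY.search(QUOTED.sub("", after)):
        wanted = ("genitive", "nominative")
    elif DAY.search(QUOTED.sub("", before)):
        wanted = ("partitive", "genitive", "nominative")
    else:
        return "nominative"
    # Fall back along the chain until a defined set of names is found.
    return next((case for case in wanted if case in available), "nominative")


if __name__ == "__main__":
    print(month_name_case("D MMMM YYYY"))
    print(month_name_case("MMMM D", available=("nominative", "partitive")))
```

<p> The second call reproduces the situation from the note: without genitive names, a MMM(M) D(D) format falls back to the nominative name even though partitive names exist.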
<h5> Screenshot </h5>
<p> To illustrate, here's a screenshot using the Finnish fi-FI locale. Finnish is an extraordinary case that uses all three: nominative, genitive and partitive case month names.
<div class="serendipity_imageComment_center" style="width: 558px"><div class="serendipity_imageComment_img"><!-- s9ymdb:1 --><img class="serendipity_image_center" width="558" height="484" src="https://erack.org/blog/uploads/screenshots/date_nominative_genitive_partitive_fi_FI.png" title="Screenshot of nominative, genitive and partitive month names in Finnish." alt="Screenshot of nominative, genitive and partitive month names in Finnish." /></div><div class="serendipity_imageComment_txt">Screenshot of nominative, genitive and partitive month names in Finnish.</div></div>
<h5> Locales featuring month name cases </h5>
<p> Genitive and/or partitive case month names have so far been contributed for the following locales:
<ul>
<li> [an-ES] Aragonese, Spain
<li> [ast-ES] Asturian, Spain
<li> [be-BY] Belarusian, Belarus
<li> [fi-FI] Finnish, Finland
<li> [gd-GB] Gaelic (Scottish), United Kingdom
<li> [la-VA] Latin, State of the Vatican City
<li> [lt-LT] Lithuanian, Lithuania
<li> [ru-RU] Russian, Russia
</ul>
<p> As you can see, that's only a few locales and not all that should be covered.
So if you're working on localization of LibreOffice and your language uses
month name cases, please contribute the locale data additions as outlined
above. Ideally, send a patch of your locale's .xml data file as an attachment to the
developer mailing list and I'll pick it up. If you are uncertain how to do that, just
ask and we'll help. For an example of what the data looks like, see
<a href="https://erack.org/blog/exit.php?url_id=3&amp;entry_id=2" onmouseover="window.status='http://cgit.freedesktop.org/libreoffice/core/plain/i18npool/source/localedata/data/lt_LT.xml';return true;" onmouseout="window.status='';return true;" title="the Lithuanian locale data file">i18npool/source/localedata/data/lt_LT.xml</a>
and search for GenitiveMonths. If you're interested in technical details of locale data files see
<a href="https://erack.org/blog/exit.php?url_id=4&amp;entry_id=2" onmouseover="window.status='http://cgit.freedesktop.org/libreoffice/core/plain/i18npool/source/localedata/data/locale.dtd';return true;" onmouseout="window.status='';return true;" title="the locale data DTD file">i18npool/source/localedata/data/locale.dtd</a>
<p> Happy month casing :-)
<p>
Tue, 20 Dec 2011 18:12:00 +0100
https://erack.org/blog/archives/2-guid.html
Tags: i18n, l10n, LibreOffice, locale data, number formatter