A string is often a combination of a fixed string and something changing. For
example, “Welcome, James” is a combination of the fixed part “Welcome, ”
and the changing part “James”. The naive solution is to localize the first
part and then follow it with the name:

_('Welcome, ') + username

This is wrong!

In some locales, the word order may be different. Use Python string formatting
to interpolate the changing part into the string:

_('Welcome, {name}').format(name=username)

Python gives you a lot of ways to interpolate strings. The best way is to use
str.format() (Py3k-style formatting) with keyword arguments. That’s the
clearest for localizers.

The worst way is to use %(label)s, as localizers seem to have all manner
of trouble with it. Options like %s and {0} are somewhere in the
middle, and generally OK if it’s clear from context what they will be.
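
To see why concatenation breaks, here is a minimal sketch. The `_` function and `CATALOG` are hypothetical stand-ins for the real gettext machinery, and the "translations" are invented purely to model a locale that puts the name first.

```python
# Hypothetical stand-in for a translation catalog. The translated strings are
# invented for illustration; they only model a locale that puts the name first.
CATALOG = {
    'Welcome, ': 'Be welcome, ',
    'Welcome, {name}': '{name}, be welcome',
}


def _(msgid):
    """Stand-in for ugettext: return the translation, or the original."""
    return CATALOG.get(msgid, msgid)


username = 'James'

# Concatenation pins the name to the end, whatever the locale needs.
concatenated = _('Welcome, ') + username

# A named placeholder lets the localizer move the name within the sentence.
interpolated = _('Welcome, {name}').format(name=username)

print(concatenated)
print(interpolated)
```

With the concatenated version, the localizer only ever sees the fragment and has no way to reorder the sentence.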

Sometimes, it can help localizers to describe where a string comes from,
particularly if it can be difficult to find in the interface, or is not very
self-descriptive (e.g. very short strings). If you immediately precede the
string with a comment that starts L10n:, the comment will be added to the
PO file, and visible to localizers.

Example:

rev_data.append({
    'x': 1000 * int(time.mktime(rdate.timetuple())),
    # L10n: 'R' is the first letter of "Revision".
    'title': _('R', 'revision_heading'),
    'text': unicode(_('Revision %s')) % rev.created
    #'url': 'http://www.google.com/'  # Not supported yet
})

Strings may be the same in English, but different in other languages. English,
for example, has no grammatical gender, and sometimes the noun and verb forms
of a word are identical.

To make it possible to localize these correctly, we can add “context” (known in
gettext as “msgctxt”) to differentiate two otherwise identical strings.

For example, the string “Search” may be a noun or a verb in English. In a
heading, it may be considered a noun, but on a button, it may be a verb. It’s
appropriate to add a context (like “button”) to one of them.

Generally, we should only add context if we are sure the strings aren’t used in
the same way, or if localizers ask us to.

Example:

from tower import ugettext as _

...

foo = _('Search', context='text for the search button on the form')

ungettext takes three arguments (the singular string, the plural string, and
the count) because English only needs three, i.e., zero is considered “plural”
for English. Other locales may have different plural rules, and require
different phrases for, say, 0, 1, 2-3, 4-10, and >10. That’s absolutely fine,
and gettext makes it possible.
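
As a rough illustration, the sketch below uses `ngettext` from Python's standard `gettext` module as a stand-in for `ungettext`, with no catalog loaded, so the English rule applies. `describe` and `polish_plural_index` are hypothetical helpers; the latter hand-codes the commonly published Polish plural rule to show how another locale selects among three forms. In practice that rule lives in the PO file header, not in application code.

```python
import gettext

# With no catalog loaded, NullTranslations falls back to the English rule:
# the singular form only when n == 1, so zero takes the plural form.
english = gettext.NullTranslations()


def describe(n):
    """Pick the English singular or plural template, then fill in n."""
    template = english.ngettext('{n} new message', '{n} new messages', n)
    return template.format(n=n)


def polish_plural_index(n):
    """Index (0, 1 or 2) of the plural form used for n under the Polish rule."""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


for n in (0, 1, 2, 5, 22):
    print(describe(n), '| Polish form index:', polish_plural_index(n))
```

The calling code passes the same three arguments regardless of locale; the catalog decides how many forms exist and which one a given count selects.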

When a string is very long, i.e. long enough to make GitHub scroll sideways, it
should be line-broken and put in a {% trans %} block. {% trans %}
blocks work like other block-level tags in Jinja2, except that they cannot
contain other tags; only strings may appear inside them.

The only thing that should be inside a {% trans %} block is printing a
string with {{ string }}. These are defined in the opening {% trans %}
tag:

You can use ugettext or ungettext only in views or functions called
from views. If the function will be evaluated when the module is loaded, then
the string may end up in English or the locale of the last request! (We’re
tracking down that issue.)

Examples include strings in module-level code, arguments to functions in class
definitions, strings in functions called from outside the context of a view. To
localize these strings, you need to use the _lazy versions of the above
methods, ugettext_lazy and ungettext_lazy. The result doesn’t get
translated until it is evaluated as a string, for example by being output or
passed to unicode():

from tower import ugettext_lazy as _lazy

PAGE_TITLE = _lazy(u'Page Title')

ugettext_lazy also supports context.

It is very important to pass Unicode objects to the _lazy versions of these
functions. Failure to do so results in significant issues when they are
evaluated as strings.

If you need to work with a lazily-translated string, you’ll first need to
convert it to a unicode object:
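
The sketch below models the difference between eager and lazy translation, and the conversion step. It is not tower's or Django's implementation: `LazyString`, `CATALOGS`, `active_locale`, and this `ugettext` are hypothetical stand-ins, and it is written for Python 3, where `str()` plays the role of `unicode()`.

```python
# Hypothetical catalogs and a stand-in for the per-request locale.
CATALOGS = {
    'fr': {
        'Page Title': 'Titre de la page',
        'Welcome, {name}': 'Bienvenue, {name}',
    },
}
active_locale = 'en'


def ugettext(msgid):
    """Stand-in: translate using whatever locale is active right now."""
    return CATALOGS.get(active_locale, {}).get(msgid, msgid)


class LazyString:
    """Minimal stand-in for a lazy translation: look up only on str()."""

    def __init__(self, msgid):
        self.msgid = msgid

    def __str__(self):
        return ugettext(self.msgid)


# Module-level code runs once, at import time, before any request exists.
EAGER_TITLE = ugettext('Page Title')
LAZY_TITLE = LazyString('Page Title')
LAZY_WELCOME = LazyString('Welcome, {name}')

# Later, a request for French arrives.
active_locale = 'fr'

print(EAGER_TITLE)      # still the import-time result
print(str(LAZY_TITLE))  # translated at the moment of conversion

# Convert to a real string first, then interpolate.
greeting = str(LAZY_WELCOME).format(name='James')
print(greeting)
```

The eager value was fixed when the module loaded, while the lazy one picks up the locale that is active when it is finally converted.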

Some user-generated content needs to be localizable. For example, karma
titles can be created in the admin site and need to be localized when
displayed to users. A Django management command is used for this. The first
step to making a model’s field localizable is adding it to DB_LOCALIZE in
settings.py:

DB_LOCALIZE = {
    'karma': {
        'Title': {
            'attrs': ['name'],
            'comments': ['This is a karma title.'],
        }
    },
    'appname': {
        'ModelName': {
            'attrs': ['field_name'],
            'comments': ['Optional comments for localizers.'],
        }
    }
}

Then, all you need to do is run the extract_db management command:

$ python manage.py extract_db

Be sure to have a recent database from production when running the command.

By default, this will write all the strings to kitsune/sumo/db_strings.py
and they will get picked up during the normal string extraction (see below).

The test_locales.sh script extracts all the strings, creates a .pot file, then
creates a Pirate translation of all strings. The Pirate strings are available in
the xx locale. After running the test_locales.sh script, you can
access the xx locale with:

gettext is fast for localization because it doesn’t parse text files; it
reads a binary format (MO files). You can easily compile that binary file from
the PO files in the repository.
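
To give a feel for what that binary format contains, the sketch below hand-packs a tiny catalog in the GNU MO layout and reads it back with Python's standard `gettext.GNUTranslations`. `build_mo` is a hypothetical helper written for this illustration, not part of the repository's tooling, and it handles only simple ASCII msgids with no plurals or contexts. Real MO files should be compiled from the PO files.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Pack {msgid: msgstr} into the GNU MO layout (little-endian, no hash table)."""
    # The empty msgid carries catalog metadata such as the charset.
    entries = {'': 'Content-Type: text/plain; charset=UTF-8\n'}
    entries.update(messages)
    keys = sorted(entries)  # MO files list msgids in sorted order
    ids = [key.encode('utf-8') for key in keys]
    strs = [entries[key].encode('utf-8') for key in keys]

    header_size = 7 * 4            # seven 32-bit header fields
    table_size = 8 * len(keys)     # one (length, offset) pair per string
    ids_start = header_size + 2 * table_size
    strs_start = ids_start + sum(len(blob) + 1 for blob in ids)

    def table(blobs, offset):
        """Build the (length, offset) index for NUL-terminated blobs."""
        out = b''
        for blob in blobs:
            out += struct.pack('<II', len(blob), offset)
            offset += len(blob) + 1
        return out

    header = struct.pack(
        '<7I',
        0x950412de,                 # magic number
        0,                          # format version
        len(keys),                  # number of strings
        header_size,                # offset of the msgid index
        header_size + table_size,   # offset of the msgstr index
        0, 0)                       # hash table size and offset (unused)
    body = b''.join(blob + b'\0' for blob in ids + strs)
    return header + table(ids, ids_start) + table(strs, strs_start) + body


mo_bytes = build_mo({'Search': 'Rechercher'})
catalog = gettext.GNUTranslations(io.BytesIO(mo_bytes))
print(catalog.gettext('Search'))
print(catalog.gettext('Not in catalog'))  # falls back to the msgid
```

Because the index stores the length and offset of every string, a reader can slice translations straight out of the file without tokenizing anything.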

We don’t store MO files in the repository because they need to change every
time the corresponding PO file changes, so it’s silly and not worth it. They
are ignored by .gitignore, but please make sure you don’t forcibly add them
to the repository.

We use Dennis to lint .po files for errors that cause HTTP 500 errors in
production, such as malformed variables and variables in the translated
string that aren’t in the original.
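
The kind of check involved can be sketched as follows. This is not Dennis's implementation or its API: `format_fields` and `lint` are hypothetical helpers that cover only `str.format`-style named variables, using the standard library's `string.Formatter` to parse them.

```python
import string


def format_fields(text):
    """Return the named str.format fields in text.

    Raises ValueError if the braces are malformed.
    """
    parsed = string.Formatter().parse(text)
    return {name for _literal, name, _spec, _conv in parsed if name}


def lint(msgid, msgstr):
    """Return a list of problems that would break formatting at runtime."""
    try:
        translated = format_fields(msgstr)
    except ValueError as exc:
        return ['malformed variable: {0}'.format(exc)]
    extra = translated - format_fields(msgid)
    return ['unknown variable: {0}'.format(name) for name in sorted(extra)]


print(lint('Welcome, {name}', 'Bienvenue, {name}'))  # no problems
print(lint('Welcome, {name}', 'Bienvenue, {nom}'))   # variable not in original
print(lint('Welcome, {name}', 'Bienvenue, {name'))   # missing closing brace
```

Either failing case would raise an exception when the view calls `.format(name=...)` on the translated string, which is how a bad translation turns into a server error.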

When we do a deployment to production, we dump all the Dennis output into: