The Translate extension is extensible in many ways. The most common ways to extend Translate are to add support for new file formats (link to section) or new message groups (link to section). Sometimes it is also useful to write new message checks (link to section) or to extend Translate via hooks (link to section). Sometimes you can get by using only the existing web API.

In addition to the concepts already mentioned, there are many more important concepts and classes in Translate that are useful to understand when hacking Translate. This page aims to comprehensively detail all components of Translate.

In addition to hooks and interfaces that can only be used from PHP code, the WebAPI provides access to much of the information and many of the actions related to message groups and translating. It is based on the MediaWiki API framework and supports multiple output formats, such as JSON and XML.
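As a rough illustration, a client can reach the WebAPI over plain HTTP. The sketch below only builds a request URL and sends nothing. The wiki address is hypothetical, and the `meta=messagegroups` module name is an assumption that should be checked against the API help page of your own wiki.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the api.php URL of your own wiki.
API_ENDPOINT = "https://wiki.example.org/w/api.php"


def build_api_url(params, output_format="json"):
    """Return a MediaWiki API URL for the given parameters."""
    query = dict(params)
    query["format"] = output_format  # for example "json" or "xml"
    # Sort the parameters so the resulting URL is deterministic.
    return API_ENDPOINT + "?" + urlencode(sorted(query.items()))


# Assumed module name: a query "meta" module that lists message groups.
url = build_api_url({"action": "query", "meta": "messagegroups"})
print(url)
```

The same helper can request XML instead by passing `output_format="xml"`.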

Message groups bring together a collection of messages. They come in various types: translatable pages, SVG files or software interface messages stored in various file formats. Each message group instance has a unique identifier, name and description. In the code, message groups are primarily referenced by their identifier, and the MessageGroups class can be used to get the instance for a given id. Message groups can also control many aspects of the translation process, such as the allowed translation languages and the message group workflow states. Usually these behaviors fall back to the global defaults.
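The relationship between identifiers, instances and global defaults can be modelled in a few lines. This is a language-neutral sketch in Python, not Translate's PHP API: `MessageGroup`, `GroupRegistry` and `GLOBAL_ALLOWED_LANGUAGES` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical global default: None stands for "all languages allowed".
GLOBAL_ALLOWED_LANGUAGES: Optional[List[str]] = None


@dataclass
class MessageGroup:
    id: str
    label: str
    description: str
    # None means this group does not override the global default.
    allowed_languages: Optional[List[str]] = None

    def get_allowed_languages(self) -> Optional[List[str]]:
        """Return the group's own setting, or fall back to the global one."""
        if self.allowed_languages is not None:
            return self.allowed_languages
        return GLOBAL_ALLOWED_LANGUAGES


class GroupRegistry:
    """Looks up group instances by their unique identifier."""

    def __init__(self) -> None:
        self._groups: Dict[str, MessageGroup] = {}

    def register(self, group: MessageGroup) -> None:
        if group.id in self._groups:
            raise ValueError(f"duplicate group id: {group.id}")
        self._groups[group.id] = group

    def get_group(self, group_id: str) -> Optional[MessageGroup]:
        return self._groups.get(group_id)
```

Code elsewhere would then pass around only the identifier string and ask the registry for the instance when it needs group-specific behavior.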

Translation aids are little modules that provide helpful and necessary information for the translator when translating. Different aids can provide suggestions from translation memory and machine translation, documentation about the message or even such a basic thing as the message definition – the text that needs to be translated.

Translate comes with many aid classes. Currently there is no hook to add new classes. Each class that extends the TranslationAid class only needs to implement one method called getData. It should return the information in a structured format (nested arrays), which is then exposed via the ApiQueryTranslationAids WebAPI module. In addition to the aid class, changes are needed to actually use the provided data in the translation editor(s).
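The shape of an aid can be sketched as follows. This is an illustrative Python model, not the PHP class: `TranslationAidSketch`, `DefinitionAid`, `collect_aids` and the field names in the returned dictionary are all assumptions.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class TranslationAidSketch(ABC):
    """Hypothetical stand-in for a class extending TranslationAid."""

    def __init__(self, definition: str, source_language: str) -> None:
        self.definition = definition
        self.source_language = source_language

    @abstractmethod
    def get_data(self) -> Dict[str, Any]:
        """Return nested, serializable data for the API layer to expose."""


class DefinitionAid(TranslationAidSketch):
    """The most basic aid: the text that needs to be translated."""

    def get_data(self) -> Dict[str, Any]:
        return {"value": self.definition, "language": self.source_language}


def collect_aids(aids: Dict[str, TranslationAidSketch]) -> Dict[str, Any]:
    """Gather the output of every aid under its name, as an API module might."""
    return {name: aid.get_data() for name, aid in aids.items()}
```

Because every aid returns plain nested data, the API layer does not need to know anything about how an individual aid computed it.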

Machine translation services are a special case of translation aids. See the next section.

Adding more machine translation services can easily be done by extending the TranslationWebService class. See the webservices subdirectory for examples. You will need some basic information to implement such a class:

URL for the service

What language pairs are supported

Whether they use language codes that differ from the codes used in MediaWiki

Whether the service needs an API key

When you have this information, it is straightforward to write the mapCode, doPairs and doRequest methods. You should use the TranslationWebServiceException to signal errors. The errors are automatically logged and tracked, and if the service goes down, it is automatically suspended to avoid unnecessary requests to it. The suggestions are automatically displayed in the translation editor via the MachineTranslationAid class and the ApiQueryTranslationAids WebAPI module. See also $wgTranslateTranslationServices for how those services are registered.
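The division of labor between the three methods and the automatic suspension can be sketched like this. It is a Python model under stated assumptions, not the PHP base class: the method names mirror mapCode, doPairs and doRequest, while `FAILURE_LIMIT`, `suggest` and `FakeService` are hypothetical, and the real suspension rules may differ.

```python
from typing import Optional, Set, Tuple


class WebServiceError(Exception):
    """Stand-in for TranslationWebServiceException."""


class WebServiceSketch:
    FAILURE_LIMIT = 3  # hypothetical threshold, not Translate's real value

    def __init__(self, url: str, api_key: Optional[str] = None) -> None:
        self.url = url
        self.api_key = api_key
        self.failures = 0

    def map_code(self, code: str) -> str:
        """Map a MediaWiki language code to the service's own code."""
        return code

    def pairs(self) -> Set[Tuple[str, str]]:
        """Return supported (source, target) pairs in MediaWiki codes."""
        raise NotImplementedError

    def request(self, text: str, source: str, target: str) -> str:
        """Perform the actual query; raise WebServiceError on failure."""
        raise NotImplementedError

    def is_suspended(self) -> bool:
        return self.failures >= self.FAILURE_LIMIT

    def suggest(self, text: str, source: str, target: str) -> Optional[str]:
        # Skip suspended services and unsupported language pairs.
        if self.is_suspended() or (source, target) not in self.pairs():
            return None
        try:
            result = self.request(text, self.map_code(source), self.map_code(target))
        except WebServiceError:
            self.failures += 1  # tracked so the service can be suspended
            return None
        self.failures = 0
        return result


class FakeService(WebServiceSketch):
    """Toy service that needs no network, for demonstration only."""

    CODE_MAP = {"zh-hans": "zh-CN"}

    def map_code(self, code: str) -> str:
        return self.CODE_MAP.get(code, code)

    def pairs(self) -> Set[Tuple[str, str]]:
        return {("en", "fi"), ("en", "zh-hans")}

    def request(self, text: str, source: str, target: str) -> str:
        return f"[{source}>{target}] {text}"
```

A subclass only describes the service; the failure counting lives in the base class, which is why new services get suspension behavior for free in this model.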

We use computers to catch simple errors in translations, such as unbalanced parentheses or failing to use a variable placeholder. These checkers can emit warnings, which are displayed in the translation editor and updated continuously. Any warning present in a saved translation also marks the translation as outdated (fuzzy in jargon). Each message group determines which checks it uses.
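The two example checks mentioned above can be sketched as follows. This is a minimal Python model, not Translate's checker classes: the function names are hypothetical, and placeholders are assumed to look like `$1`, `$2`, and so on.

```python
import re
from typing import List, Tuple

# Assumption: variable placeholders have the form $1, $2, ...
PLACEHOLDER = re.compile(r"\$\d+")


def check_balanced_parentheses(translation: str) -> List[str]:
    """Warn if parentheses do not open and close in matching pairs."""
    depth = 0
    for char in translation:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:  # closing parenthesis without an opening one
                return ["unbalanced parentheses"]
    return ["unbalanced parentheses"] if depth else []


def check_placeholders(definition: str, translation: str) -> List[str]:
    """Warn about placeholders used in the definition but not the translation."""
    missing = set(PLACEHOLDER.findall(definition)) - set(PLACEHOLDER.findall(translation))
    return [f"missing placeholder {name}" for name in sorted(missing)]


def run_checks(definition: str, translation: str) -> Tuple[List[str], bool]:
    """Return all warnings and whether the translation would be marked fuzzy."""
    warnings = check_balanced_parentheses(translation)
    warnings += check_placeholders(definition, translation)
    return warnings, bool(warnings)
```

For example, `run_checks("Hello $1 (new)", "Hei (uusi")` reports both an unbalanced parenthesis and the missing `$1`, so the translation would be treated as outdated in this model.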

Message collection provides access to the list of messages for a message group. It is used to load a set of messages for a certain group in a certain language. It provides paging and filtering functionality.
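To make the paging and filtering idea concrete, here is a small language-neutral model in Python. It does not reproduce the MessageCollection PHP API: `Message`, `CollectionSketch` and their methods are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Message:
    key: str
    definition: str
    translation: Optional[str] = None  # None means not yet translated


class CollectionSketch:
    """Holds the messages of one group in one language."""

    def __init__(self, messages: Iterable[Message]) -> None:
        self.messages: List[Message] = list(messages)

    def filter_translated(self, keep_translated: bool) -> None:
        """Keep only translated (True) or only untranslated (False) messages."""
        self.messages = [
            m for m in self.messages
            if (m.translation is not None) == keep_translated
        ]

    def slice(self, offset: int, limit: int) -> None:
        """Paging: keep at most `limit` messages starting at `offset`."""
        self.messages = self.messages[offset:offset + limit]

    def keys(self) -> List[str]:
        return [m.key for m in self.messages]
```

Filtering before slicing matters: a page of ten untranslated messages is obtained by filtering first and then taking the slice, not the other way around.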

There is currently a limitation that all messages in a collection must be in the same namespace. This prevents the creation of aggregate groups that include groups which have messages in different namespaces.

Here is a short example of how to use message collection to load all Finnish translations of group core and print the first ten of them:

When rendering bitmap graphics, suitable fonts are needed for each language or script. To solve this problem, the FCFontFinder class was written. It uses the fc-match command of the fontconfig package (so it does not work on Windows) to find a suitable font. Many additional fonts should be installed on the server to make this useful. It can return either a path to a font file or the name of the font, whichever is more suitable.
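The underlying idea can be tried out with a few lines of Python. This is a sketch of querying fc-match for a language, not a port of FCFontFinder: the function names are hypothetical, and the exact pattern and format strings the PHP class uses are not taken from the source.

```python
import shutil
import subprocess
from typing import List, Optional


def fc_match_command(language_code: str, fmt: str) -> List[str]:
    """Build an fc-match call that asks for a font covering the language."""
    # fmt is a fontconfig format string, e.g. "%{file}" or "%{family}".
    return ["fc-match", "--format=" + fmt, ":lang=" + language_code]


def find_font(language_code: str, want_path: bool = True) -> Optional[str]:
    """Return a font file path or family name, or None if unavailable."""
    if shutil.which("fc-match") is None:
        return None  # fontconfig is not installed (for example on Windows)
    fmt = "%{file}" if want_path else "%{family}"
    result = subprocess.run(
        fc_match_command(language_code, fmt),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

Note that fc-match always returns its best guess, so the quality of the result depends on which fonts are installed on the server.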

The messages of file-based message groups are stored in CDB files. Each language of each group has its own CDB cache file. The reasons for the cache files are twofold.

First, they provide constant and efficient access to message data, avoiding the potentially expensive parsing of files in various places. For example, the list of message keys for each group can be loaded efficiently when rebuilding a message index.

The second reason is that the cache files are used together with the translations in the wiki to process external message changes. Having a snapshot of the state of translations in files and wiki (hopefully consistent at that point) allows us to automatically deduce whether something has been changed in the wiki or externally and to make intelligent choices, leaving only real conflicts (messages changed both externally and on the wiki since the last snapshot) to be resolved by the translation administrator.
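The snapshot comparison amounts to a three-way check per message. The following Python sketch illustrates that reasoning and is not Translate's actual implementation: the function name and result labels are hypothetical, and treating identical changes on both sides as harmless is an assumption.

```python
def classify_change(snapshot: str, in_file: str, in_wiki: str) -> str:
    """Compare the cached snapshot with the current file and wiki texts."""
    file_changed = in_file != snapshot
    wiki_changed = in_wiki != snapshot
    if not file_changed and not wiki_changed:
        return "unchanged"
    if file_changed and not wiki_changed:
        return "external"  # safe to import the file version into the wiki
    if wiki_changed and not file_changed:
        return "wiki"  # safe to keep the wiki version
    # Both sides changed since the snapshot.
    if in_file == in_wiki:
        return "same-change"  # assumption: identical edits need no resolution
    return "conflict"  # left for the translation administrator
```

Without the snapshot only the file and wiki texts could be compared, and any difference between them would look like a conflict.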

Message index is a reverse map of all known messages. It provides an efficient answer to the questions "is this a known message?" and "what groups does this message belong to?". It needs to be fast for both single and multiple message key lookups. Several backends are implemented, with different trade-offs.

The serialized file backend is fast to parse, but it does not provide random access and is very memory inefficient when the number of keys grows.

The database backend provides efficient random access and full loads at the expense of slightly slower individual lookups. It also does not need to write to any files, which avoids permission problems.

A memory backend (memcached, apc) is also provided. It can be a useful alternative to the database backend in multiple-server setups to reduce database contention.
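Regardless of the backend, the data structure itself is a map from message key to group identifiers. The Python sketch below builds it from scratch from a mapping of groups to keys; the function names and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_index(groups: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Build the reverse map: message key -> ids of groups containing it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for group_id, keys in groups.items():
        for key in keys:
            index[key].append(group_id)
    return dict(index)


def is_known(index: Dict[str, List[str]], key: str) -> bool:
    """Answer: is this a known message?"""
    return key in index


def groups_of(index: Dict[str, List[str]], key: str) -> List[str]:
    """Answer: what groups does this message belong to?"""
    return index.get(key, [])


# Made-up sample data for illustration only.
index = build_index({
    "core": ["mainpage", "login"],
    "ext-example": ["login", "example-desc"],
})
print(groups_of(index, "login"))
```

Every call to `build_index` walks all keys of all groups, which is the full, non-incremental rebuild that a backend then stores in its own format.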

Message index does not support incremental rebuilds. Rebuilding the index therefore becomes relatively resource intensive as the number of message groups and message keys increases. Depending on the message group, this might involve parsing files or doing database queries and loading the definitions, which can take a lot of memory. The message index rebuild is triggered in various places in Translate, and by default it is executed immediately during the request. If that becomes too slow, the rebuild can be deferred to the MediaWiki job queue and run outside of web requests.