This question was originally asked on the linguistics SE (which is not a good place to ask this kind of thing).

Full question follows

I follow the software localization guideline, that states that any
constructed text should be constructed by the format string, not by
gluing pieces together.

For example: To get string Oranges: 50, use '${OBJECT_NAME}:
${QUANTITY}' string, not variables object_name + ': ' + quantity.
The point here is to give the person who will translate the text full
control of the order of elements in the string.

Now, my colleague insists that in this particular case with <Object
name>: <quantity> notation it is OK to just go ahead and glue strings
together.

I strongly suspect that it is not.

Please help me with examples of popular languages where this notation
would be a bad style or worse. (Popular means that we have at least
some chance to actually localize the software to that language. That
being said, exotic language examples are welcome as well.)

Are you sure that the spelling of the object name doesn't change with the quantity in every language? Probably not... That's why you need to keep them together, and use the support of your software tools to deal with these things.
– André, Feb 25 '13 at 12:47

So you have two options. Option 1 (the format string) will always work, and option 2 (gluing the pieces together) will usually work, but only if all the languages follow the English convention.

It may be that option 2 will work, but you might just as easily find languages or situations down the line where that isn't a good idea. So why choose option 2? What do you gain by not using the localisation guidelines? I suspect not much.

You need to be thinking "what is the best way of doing this" rather than "what can I get away with not doing properly".
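To make the difference between the two options concrete, here is a minimal Python sketch. It uses the standard library's `string.Template`, whose placeholder syntax happens to match the `${OBJECT_NAME}: ${QUANTITY}` notation from the question. The `CATALOG` dictionary, the locale keys, and the made-up `xx` locale are assumptions for illustration only, and a real catalog would also translate the object name itself.

```python
from string import Template

# Hypothetical per-locale catalog. The locale keys and the "xx" entry are
# illustrative assumptions, not taken from any real product.
CATALOG = {
    "en": Template("${OBJECT_NAME}: ${QUANTITY}"),
    # French typography conventionally puts a no-break space before the colon.
    "fr": Template("${OBJECT_NAME}\u00a0: ${QUANTITY}"),
    # Made-up locale whose translator prefers the quantity first.
    "xx": Template("${QUANTITY} x ${OBJECT_NAME}"),
}


def render(locale: str, object_name: str, quantity: int) -> str:
    """Fill the translator-owned template for the given locale."""
    return CATALOG[locale].substitute(OBJECT_NAME=object_name, QUANTITY=quantity)


def render_glued(object_name: str, quantity: int) -> str:
    """The concatenation approach: order and separator are fixed in code."""
    return object_name + ": " + str(quantity)


if __name__ == "__main__":
    for locale in ("en", "fr", "xx"):
        print(locale, render(locale, "Oranges", 50))
```

With the template, changing the separator or the order of the elements is a translation change. With `render_glued`, the same change needs a code change for every locale that differs from English.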

André's comment is right on. There is an interplay between the quantity and the noun form. And as Nicolas correctly points out, the expected separator string is not the same in every language.

In English, there are singular and plural forms: orange and oranges, which are sometimes glossed over with orange(s). Even this could be tricky to apply programmatically, because of cases like brush(es). See "An Algorithmic Approach to English Pluralization" for one example of how programmatic pluralization could be approached in English.

Not all languages follow the same rules, so a noun could take different forms depending on whether it is the subject or the object of a sentence, or more generally on its grammatical case.

Assumptions about how other languages work cannot be relied upon. For example, in English and many other languages, nouns are either singular or plural (two or more). In Arabic, there are forms for singular, dual, and plural (three or more).

Without seeing the information in situ it would be difficult to suggest a workable solution for all languages. At the very least I would think you should store 1x, 2x, and 3+x forms of the noun, the preferred separator string, and the number. And do not assume that the localized strings stay the same when you rearrange the order (e.g. "3 oranges" vs. "Oranges: 3"). You should work with a user interface localization expert early on to make sure that this part of your solution is designed correctly. Character encoding, direction of text flow, and text expansion will also need to be considered.
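As a rough illustration of the "store 1x, 2x, and 3+x forms, the separator, and the number" idea, here is a minimal Python sketch. The `CountedLabel` class and its field names are hypothetical, and treating 0 like the 3+ form is an assumption of this sketch, not something the scheme above specifies. It is a starting point only; it does not cover languages that need more categories than these three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountedLabel:
    """Hypothetical per-locale record: noun forms, separator, and layout."""

    one: str          # form used with exactly 1
    two: str          # form used with exactly 2 (dual)
    three_plus: str   # form used with 3 or more
    separator: str = ": "
    # The translator may reorder the placeholders in this layout string.
    template: str = "{noun}{separator}{count}"

    def noun_for(self, count: int) -> str:
        if count == 1:
            return self.one
        if count == 2:
            return self.two
        # Assumption: 0 and everything from 3 up share one form.
        return self.three_plus

    def render(self, count: int) -> str:
        return self.template.format(
            noun=self.noun_for(count), separator=self.separator, count=count
        )


# English has no dual, so the "two" and "three_plus" forms coincide.
ORANGES_EN = CountedLabel(one="Orange", two="Oranges", three_plus="Oranges")

if __name__ == "__main__":
    for n in (0, 1, 2, 3):
        print(ORANGES_EN.render(n))
```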

You could use the plural in all cases; in English, I guess something like "Number of Files: 1" would still be comprehensible. But globally, plural versus singular could be more of an issue, and some languages (Russian?) may also have different forms for different amounts, beyond just singular and plural.
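Russian is indeed such a case: for whole numbers the noun form depends on the last digits of the count, not just on "one versus many". The Python sketch below shows the usual three-way split for integers. The function name, the category labels, and the `FILE_FORMS` table are illustrative assumptions; fractions and grammatical case are deliberately left out.

```python
def russian_plural_category(n: int) -> str:
    """Pick the plural category for a whole number in Russian."""
    n = abs(n)
    # 1, 21, 31, ... but not 11, 111, ...
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    # 2-4, 22-24, ... but not 12-14, 112-114, ...
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    # Everything else: 0, 5-20, 25-30, ...
    return "many"


# Forms of the Russian word for "file" used after a number.
FILE_FORMS = {"one": "файл", "few": "файла", "many": "файлов"}


def count_files(n: int) -> str:
    return f"{n} {FILE_FORMS[russian_plural_category(n)]}"


if __name__ == "__main__":
    for n in (1, 3, 5, 11, 21, 22):
        print(count_files(n))
```

A single stored plural would be wrong for a large share of counts here, which is why the choice of form is better left to a localization library or to the translator's rules than hard-coded.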

Some technologies and frameworks, such as Mozilla's L20n, allow the translator to make these adjustments. For example:

For Bi-Di languages, the UI rendering technology should do the RTL (right to left) work; you don't need to manually flip the screen. For complete strings, allow the translator to move the tokens about.

The optimal solution is always to write complete strings. That's not always possible, or desirable for storage or performance reasons in software (and tokens are an established development practice), so you may need to a) research common patterns and code for those, and/or b) allow translators to control the order or variations during localization.