Language detection

Detecting the language of a piece of content automatically is often more convenient than asking the user to pick it from a dropdown list. This tutorial shows how to detect a language with M#.

Context

You are writing a trading platform application in which users can sell goods and provide a description for each item. The requirements state that the description must be displayed in the current user's language. For example, a French seller who targets German buyers may want to write the description directly in German. Your application must be able to detect the language of the description and, if needed, translate it into another language.

Implementation

We will use the Google API to detect the language, and M# makes this easy: once the feature is configured, a single function call is enough.
First, add your Google API key to the web.config file under the "Google.Translate.Key" parameter, then add the "Enable.Google.Autodetect" parameter and set it to "true" to enable the feature.
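As an illustration, the two settings might look like this in web.config. This is a sketch that assumes the parameters live in the standard appSettings section; the key value is a placeholder, not a real key.

```xml
<configuration>
  <appSettings>
    <!-- Placeholder: replace with your own Google API key -->
    <add key="Google.Translate.Key" value="YOUR_GOOGLE_API_KEY" />
    <!-- Set to "true" to enable language auto-detection -->
    <add key="Enable.Google.Autodetect" value="true" />
  </appSettings>
</configuration>
```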
The GoogleAutodetectLanguage function returns an object of type GoogleAutodetectResponse, from which you can get the language, the ISO code and the reliability of the detection.
If the detected language is not one of your application's languages (the "Language" entity), Language will be null, but you will still be able to get the detected ISO code.
Now, when a user requests the description in our application, we can detect its language and translate it into the current user's language if the two differ.
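The detect-then-translate flow described above can be sketched as follows. Because the exact M# signatures are not shown here, detection and translation are passed in as delegates; DetectionResult, DescriptionLocalizer and Localize are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical stand-in for the detection response described above.
public sealed class DetectionResult
{
    public string IsoCode { get; set; }      // e.g. "de"
    public double Reliability { get; set; }  // reliability reported by the detection
}

public static class DescriptionLocalizer
{
    // detect:    text -> detection result (assumed wrapper around the M# call)
    // translate: (text, fromIsoCode, toIsoCode) -> translated text (assumed wrapper)
    public static string Localize(
        string description,
        string userIsoCode,
        Func<string, DetectionResult> detect,
        Func<string, string, string, string> translate)
    {
        var detected = detect(description);

        // Nothing to do if detection failed or the text is already in the user's language.
        if (detected == null || detected.IsoCode == null ||
            string.Equals(detected.IsoCode, userIsoCode, StringComparison.OrdinalIgnoreCase))
        {
            return description;
        }

        return translate(description, detected.IsoCode, userIsoCode);
    }
}
```

Comparing ISO codes rather than Language entities keeps the check working even when the detected language is not one of the application's languages.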

Going further

Disable the auto-detect feature

You can disable the language detection feature by setting the "Enable.Google.Autodetect" parameter to "false" in web.config. If this parameter is missing, the language detection feature is disabled by default.

Configuration problem

As with translations, the auto-detection API can run into a configuration problem. You can check for this by calling the Translator.IsGoogleMisconfigured() function. To reconfigure it, simply call the Translator.ReconfigureGoogleTranslate() method.

Google API limits

As with translations, you cannot send requests of more than 2000 characters to the Google API, which should be long enough for detecting the language. If your query is longer than 2000 characters, M# will throw an ArgumentOutOfRangeException, so make sure to provide a valid request to the API.
You can get the maximum number of characters allowed in the query from Translator.GOOGLE_PHRASE_LIMIT. Do not forget that your content will be URL encoded, so if you want to check the length of your query, measure the result of HttpUtility.UrlEncode("Your text to detect...").
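A pre-flight length check along these lines might look like the sketch below. DetectionQuery, PhraseLimit and FitsLimit are hypothetical names; the constant simply mirrors the 2000-character limit stated above, and in an M# project you would read Translator.GOOGLE_PHRASE_LIMIT instead of hard-coding it.

```csharp
using System;
using System.Web;

public static class DetectionQuery
{
    // Assumption: mirrors the 2000-character limit described in the text.
    public const int PhraseLimit = 2000;

    // True when the URL-encoded form of the text fits within the limit.
    public static bool FitsLimit(string text)
    {
        var encoded = HttpUtility.UrlEncode(text ?? string.Empty);
        return encoded.Length <= PhraseLimit;
    }
}
```

Note that non-ASCII characters grow when encoded (for example, "é" becomes six characters), so a text well under 2000 characters can still exceed the limit.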

Save money

Improve workflow

In our context we detect the description's language and translate it each time a user requests the trade description. This is very bad practice: first, the Google API is not free, and second, it is inefficient. Instead, you can create a Language property on the Trade entity and read the persisted language of the description each time a user requests it; if it is null, detect the language and save it. Once a language is stored, you do not need to call the API again. Moreover, you should persist the translation for each requested language and use a cache, to avoid unnecessary calls to the API.
Process - Simplified example:
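One way to sketch this process in code, under stated assumptions: the Trade class below is a simplified, hypothetical stand-in for the real entity (it stores an ISO code and an in-memory dictionary rather than a persisted Language and translation records), and detection and translation are passed in as delegates because the exact M# signatures are not shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified entity; persistence is omitted from this sketch.
public class Trade
{
    public string Description { get; set; }
    public string LanguageIsoCode { get; set; }  // null until detected

    // Stored translations, keyed by target ISO code.
    public Dictionary<string, string> Translations { get; } =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
}

public static class TradeDescriptions
{
    public static string GetFor(
        Trade trade,
        string userIsoCode,
        Func<string, string> detectIsoCode,
        Func<string, string, string, string> translate)
    {
        // Detect once, then reuse the stored language on later requests.
        if (trade.LanguageIsoCode == null)
            trade.LanguageIsoCode = detectIsoCode(trade.Description);

        // Return the original if detection failed or no translation is needed.
        if (trade.LanguageIsoCode == null ||
            string.Equals(trade.LanguageIsoCode, userIsoCode, StringComparison.OrdinalIgnoreCase))
        {
            return trade.Description;
        }

        // Translate once per target language, then reuse the stored translation.
        if (!trade.Translations.TryGetValue(userIsoCode, out var translated))
        {
            translated = translate(trade.Description, trade.LanguageIsoCode, userIsoCode);
            trade.Translations[userIsoCode] = translated;
        }

        return translated;
    }
}
```

With this shape, repeated requests for the same trade in the same language trigger at most one detection call and one translation call.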

Limit number of words

At the time of writing, the Google API is billed according to the number of characters in each request. The limit is 2000 characters, which is far more than you need just to detect the language. For translations you have to provide the full text, but for language detection a good practice is to pass only a reasonable number of characters, enough for Google to identify the language (e.g. the first 200 characters or the first sentence).
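A small helper for trimming the text before detection could look like this. It is only a sketch: DetectionSample and Take are hypothetical names, the 200-character default follows the suggestion above, and the sentence split is a naive search for '.', '!' or '?'.

```csharp
using System;

public static class DetectionSample
{
    // Returns the first sentence of the text, capped at maxLength characters.
    public static string Take(string text, int maxLength = 200)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;

        // Cap the sample at maxLength characters.
        var sample = text.Length <= maxLength ? text : text.Substring(0, maxLength);

        // Cut at the first sentence terminator, if the sample contains one.
        var end = sample.IndexOfAny(new[] { '.', '!', '?' });
        return end >= 0 ? sample.Substring(0, end + 1) : sample;
    }
}
```

A very short first sentence may not give the detector enough to work with, so you may prefer to fall back to the plain character cap when the sentence is only a few words long.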