Detect the language of a string

I have a string coming from a form, and I need to use PHP to identify the language it was written in. I have seen that there are ways to do this with the Google API, but I think there are two problems: first, after a certain number of requests I have to pay; second, it requires additional Zend libraries that I cannot install on my hosting. Or am I wrong?
Is there a way that is 100% free and does not require libraries that are difficult to install on a hosting account?

There are a few things to know about this. First, a very short text won't give you a very good guess-- the more text you have, the better the guess will be, but it will still be a guess. The writing system will give you a good clue (Chinese vs. English, for example), but it won't help you tell apart similar languages that share a script.

So one thing to consider is why you want to do this-- for example, if you need to detect the language for the purpose of text formatting (left-to-right vs. right-to-left, etc.) or text encoding*, then you can probably do it based on the writing system alone (which isn't too hard to figure out-- you could even do that yourself just by checking which Unicode ranges the input characters fall into).
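To make the "check the Unicode range" idea concrete, here is a rough sketch. It's in Python just to keep it short; in PHP the same check can be done with preg_match and the /u modifier, using classes like \p{Cyrillic} or \p{Han}. The range table is my own abbreviated selection (a handful of common blocks, not a complete list), and the function names are made up for illustration.

```python
from collections import Counter

# Abbreviated table of Unicode blocks (an assumption for this sketch;
# a real script would cover more blocks and extensions).
SCRIPT_RANGES = [
    ("Latin", 0x0041, 0x024F),
    ("Greek", 0x0370, 0x03FF),
    ("Cyrillic", 0x0400, 0x04FF),
    ("Hebrew", 0x0590, 0x05FF),
    ("Arabic", 0x0600, 0x06FF),
    ("Devanagari", 0x0900, 0x097F),
    ("Kana", 0x3040, 0x30FF),
    ("Han", 0x4E00, 0x9FFF),
    ("Hangul", 0xAC00, 0xD7AF),
]

RTL_SCRIPTS = {"Hebrew", "Arabic"}


def script_of(ch):
    """Return the script name for one character, or None if it is not in the table."""
    cp = ord(ch)
    for name, lo, hi in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return None


def dominant_script(text):
    """Return the script that most letters in the text belong to, or None."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():  # skip digits, punctuation, spaces
            continue
        name = script_of(ch)
        if name is not None:
            counts[name] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def text_direction(text):
    """Return 'rtl' for right-to-left scripts, 'ltr' for everything else."""
    return "rtl" if dominant_script(text) in RTL_SCRIPTS else "ltr"
```

This only tells you the writing system, so it can separate Russian from English but not Italian from Spanish.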

[*However, in most cases the best solution for text encoding is going to be Unicode (typically UTF-8)-- no major disadvantages, and it can encode anything.]

For anything more precise than that, you will probably need to use an external service (as far as I know, there's nothing ready-made you can install on the server, beyond an API client). Google's detection is quite good, but they've made some weird decisions with the Translate API, and I would strongly recommend against relying on it-- they discontinued it with little notice to developers, which came as a shock and received a lot of negative feedback. So Google probably isn't the answer (even though the technology is good). At this point I'm not sure what they're offering (or how long they will continue to support it), but I wouldn't trust it.

Other companies do have similar services. In general, language detection is easier than translation because there are often a few key indicators that can separate languages (such as the writing system and details within it, like characters that only one language uses). In difficult cases, the detector has to rely on a dictionary to guess based on the words in the text. Either way, there is a noticeable failure rate.
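For the dictionary approach, the simplest version I can think of is counting very common words ("stopwords") per language. This is only a sketch in Python (the logic ports directly to PHP arrays); the word lists are tiny samples I picked by hand, and the "confidence" number is a crude ratio I made up for illustration, not a real probability.

```python
import re

# Tiny hand-picked samples of very common words (an assumption for this
# sketch; a usable script needs much longer lists and more languages).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "it", "you", "this"},
    "it": {"il", "la", "di", "che", "è", "e", "un", "una", "per", "non"},
    "fr": {"le", "la", "de", "et", "est", "un", "une", "les", "des", "que"},
    "es": {"el", "la", "de", "que", "y", "en", "un", "una", "los", "es"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "eine", "zu", "mit"},
}


def guess_language(text):
    """Return (language_code, confidence); (None, 0.0) if no known word is found."""
    # Split into lowercase words made of letters only (Unicode-aware).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return None, 0.0
    # Crude confidence: the winner's share of all stopword hits.
    confidence = scores[best] / sum(scores.values())
    return best, confidence
```

Notice that words like "la" or "un" appear in several lists, which is exactly why short inputs give unreliable guesses.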

Many of these services are available for JavaScript (as APIs), so you might need to adapt them for PHP.

As for Google, I'm not sure about the technical details you mention (the Zend libraries). There's probably some workaround (such as adapting the JavaScript API). What you need is a way to send a request and get the response-- then you need a way to parse it. The PHP library is probably not strictly required, just the part that requests the info. But whether you're able to adapt it will depend on your skill in PHP (and your time). You can probably do it if you want to put the time into it.
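The "send a request, get the response, parse it" part looks roughly like this. Everything service-specific here is hypothetical: the URL, the "q" parameter, the bearer-token header and the JSON shape are placeholders I invented, so check the documentation of whichever service you pick. It's in Python for brevity; in PHP you would do the same with cURL or file_get_contents plus json_decode, no extra library needed.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint, used only for illustration.
API_URL = "https://api.example.com/detect"


def build_request(text, api_key):
    """Build a POST request carrying the text as a form field (assumed name: 'q')."""
    data = urllib.parse.urlencode({"q": text}).encode("utf-8")
    headers = {"Authorization": "Bearer " + api_key}  # assumed auth scheme
    return urllib.request.Request(API_URL, data=data, headers=headers)


def parse_detection(body):
    """Parse an assumed JSON reply like {"language": "it", "confidence": 0.93}."""
    payload = json.loads(body.decode("utf-8"))
    return payload["language"], float(payload["confidence"])


def detect_remote(text, api_key, timeout=5.0):
    """Send the request and return (language, confidence) from the reply."""
    request = build_request(text, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_detection(response.read())
```

Keeping the parsing separate from the network call makes it easy to swap in a different service later.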

As a general answer, you will need to search for various services and look at what they offer.

[**I just tested it out now, and I really like it. The response is clear and includes a confidence level so you know how reliable the guess is. And I tried a handful of languages, and the results were good.]

There is a limit on the number of queries (5,000/day), but that's a lot, and you should realize that paying a fee is not unreasonable here-- you're using an external service for something you can't do yourself. I realize that might not fit your budget, but that's the nature of this project.

One option would be to try to write your own script, at least to get a general idea about the language. If your script can figure it out, use that result; if not, query their service-- that way you only use up queries when you need to.
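That "local first, service only when needed" idea could be wired up something like this. It's a sketch in Python; the class name, the 0.6 threshold and the counter are my own choices for illustration. The two detector functions are passed in, each assumed to return a (language, confidence) pair. The counter is never reset here, so a real version would have to reset it once a day.

```python
class FallbackDetector:
    """Try a local guess first; call the remote service only when unsure."""

    def __init__(self, local_detect, remote_detect,
                 min_confidence=0.6, daily_limit=5000):
        self.local_detect = local_detect
        self.remote_detect = remote_detect
        self.min_confidence = min_confidence  # arbitrary threshold, tune it
        self.daily_limit = daily_limit
        self.remote_calls = 0  # NOTE: resetting this daily is left out

    def detect(self, text):
        """Return (language, confidence, source), source being 'local' or 'remote'."""
        lang, conf = self.local_detect(text)
        if lang is not None and conf >= self.min_confidence:
            return lang, conf, "local"
        if self.remote_calls >= self.daily_limit:
            # Quota used up: return the weak local guess rather than nothing.
            return lang, conf, "local"
        self.remote_calls += 1
        lang, conf = self.remote_detect(text)
        return lang, conf, "remote"
```

With a setup like this, the easy cases never touch the query limit at all.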

If you have other questions about this, let me know-- this is an area I'm interested in (an intersection of my interest in coding and studying Linguistics).