
English–Portuguese machine translation quality improvement stalled

We regularly measure machine translation (MT) quality for English–Spanish and English–Portuguese. The latest results show that the English–Portuguese MT quality of the major MT vendors has not improved at all in four months.

That is, the improvement in MT quality has stalled, and we do not expect the quality to improve much in the near future. The situation is similar in English–Spanish MT, which seems to have reached a plateau already last year.

This finding is in line with the expectations of many industry analysts. It has been known for some time that increasing the amount of MT training material produces smaller and smaller improvements in MT quality. Google has said that a 50% increase in training material improves quality by only 0.5%.
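To give a feel for what such a figure implies, here is a minimal back-of-the-envelope sketch. It rests on two assumptions of ours that are not part of the quoted statement: that the 0.5% is a relative gain, and that every further 50% increase in training material yields the same relative gain. The function names are hypothetical and chosen only for illustration.

```python
import math


def increases_needed(data_multiplier: float, step: float = 1.5) -> float:
    """How many successive `step`-fold increases add up to `data_multiplier`."""
    return math.log(data_multiplier) / math.log(step)


def projected_gain(data_multiplier: float,
                   gain_per_step: float = 0.005,
                   step: float = 1.5) -> float:
    """Relative quality gain after growing the training data `data_multiplier`-fold.

    ASSUMPTION (ours, not from the source): each 50% increase in data
    multiplies quality by (1 + gain_per_step), and the gains compound.
    """
    n = increases_needed(data_multiplier, step)
    return (1 + gain_per_step) ** n - 1


for multiplier in (1.5, 2, 10, 100):
    print(f"{multiplier:>5}x training data -> "
          f"about {projected_gain(multiplier):.2%} better quality")
```

Under these assumptions, even ten times more training material would improve quality by less than 3%, which illustrates why adding data alone is unlikely to move the needle for languages that already have a lot of it.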

With practically all the available training material already in use, the quality improvements have indeed been small. Our findings now appear to confirm this view, first in English–Spanish and now in English–Portuguese.

As for the future of MT, this means we do not expect English–Spanish and English–Portuguese MT to improve much in the near future. Any larger improvement would require advances in the underlying MT technology, that is, implementing linguistic ideas and research in the actual MT engines. And that takes time.

Increasing the amount of training material cannot be expected to produce any considerable improvement in these languages. For other languages, however, the amount of training material is often smaller than for Spanish, Portuguese and other major languages.

Therefore, MT for small and medium-sized languages can be expected to improve considerably more when the amount of training material is increased. As a result, the quality in other languages is expected to approach the quality of English–Spanish and English–Portuguese MT.