Language problems hold up Asians on net

By Frederick Noronha

Penang (Malaysia), June 25 (IANS) Asians were the most numerous internet users worldwide as of 2001. Yet access to the net and to computing is still largely restricted to those who know English.

A Pakistani localisation expert, working on solutions here, has suggested that the market is overlooking the computing needs of a number of smaller Asian tongues.

“Most internet activity is in languages that most don’t speak in Asia. Much content is in Chinese and English, for example, while there are 3,000 plus languages in Asia, and those are not addressed,” Sarmad Hussain, a Pakistani localization expert, told IANS.

Supported by the Canadian government’s IDRC (International Development Research Centre), the PAN Localization project seeks to boost the computing prowess of languages like Bangla, Dzongkha (of Bhutan), Khmer (of Cambodia), Bahasa Indonesia, Lao, Mongolian, Nepali, Pashto, Sinhala and Urdu.

Hussain said the project was started in 2003 with six countries — Bangladesh, Bhutan, Cambodia, Laos, Nepal and Sri Lanka. Afghanistan was added later.

“The initial part was mostly enabling the very basics. Which means, operating system, developing the keyboards and fonts. For most of these languages, even at the national level, it was (earlier) impossible to do word processing or simple emailing (in the smaller languages),” he said here.

The PAN Localization project aims to develop character sets, fonts, spelling and grammar checkers, speech recognition systems, machine translation and other related local language applications. This would make it easier to publish online in Asian languages.

Studies conducted under this project show that 20 different Asian languages have “varying degrees of support” for computing in their local scripts. For instance, Chinese, Korean and Japanese language computing is “very mature”.

But there are extreme cases too.

Khmer, the official language of Cambodia, has 13 million speakers and shares features with Thai, yet it lacked a standard keyboard until 2005.

“Residents of rural areas in developing Asian countries are particularly limited by their lack of understanding of English. Efforts to provide Internet infrastructure and training must be complemented by efforts to provide content in languages these users understand,” he said.

Hussain gives the example of the Bhutanese language of Dzongkha, used by some 600,000 people - too small and unconnected a number for any big player to offer solutions for.

Working with the Department of IT and the Dzongkha Development Authority, the PAN Localization project developed GNU/Linux Free Software-based solutions for the Dzongkha language.

He said different countries had achieved diverse levels of success based on their “capacity-context”.

“Some advanced partners were Bangladesh and Sri Lanka, where some of the basic work was already done, (and they) moved on to things like spell checkers for Bangla, OCR (optical character recognition) and text-to-speech in Sri Lanka,” said Hussain.

Asked about the state of Urdu computing in Pakistan, a language India also shares, Hussain said much of the “basic work” had been done. Fonts and keyboard standards — important to ensure that solutions work across computers — had been developed with the government, he said.

“I know of an Open Office Urdu localised release and Firefox localisation done for Urdu in India. We’ve used them too,” he said.

“Since it’s open source, we took that technology. That helped us for 60-70 percent of the work, but we adapted it to a Pakistani context. That made our work easier.”

He added that Pakistan’s English-educated techies might also be less sensitive to local-language computing needs, as are their Indian counterparts.

Specialized localization centres have also been developed in four countries (Bangladesh, Sri Lanka, Nepal and Laos) through this project. Resources are being developed on both the GNU/Linux (free) and Microsoft (proprietary) platforms.