Multi-Lingual Web – Challenges & Opportunities

My article was published in Thinking Aloud! magazine (December 2011). The entire issue focused on Indian languages on the internet and mobile.

India has about 100 million desktop and 50-150 million mobile internet users. These numbers vary depending on whom you ask. IAMAI’s recent report says 18 million people use the internet in India on a daily basis. Clearly, we have a long way to go to make this ‘daily’ number bigger.

One reason I constantly hear is that Indians are not going online because the internet is for an English-speaking elite, and there isn’t enough content in Indian languages to engage the masses. That is true when compared with English, but there is sufficient language content to keep a user busy.

One of the main reasons publishers have stayed away from the language space is monetization. Publishers are under the impression that the user base for language content isn’t big enough, while “prospective” internet users are staying away from the internet because there isn’t enough language content. One side has to take the first step and, in my opinion, it has to be the publishers.

Indian Language Font Issues

Currently, all major operating systems (Windows, Mac, Linux) support Indian language fonts (Unicode). However, the same is not true of mobile handsets. It is deplorable that major handset makers like Nokia and Samsung, and even the Android OS, have not fully embraced Indian languages.

Recently it was announced that the Murty Classical Library of India, which is funded by the legendary Narayana Murthy, would create new Indic fonts, the Murty Indic Fonts. We hope that with such a big name behind the effort, people will have easier access to Indic fonts.

Wikipedia Promoting Indian Languages

Wikipedia India has done a great job in promoting Indian languages on the internet. Indian language wikis get about 40 million page views a month. That may not be big, but the number has grown very well over the last two years. The blog post gives a clear picture of where Indic Wikipedia stands today.

Indic Discovery Challenges On Google

As mentioned earlier, there is a reasonable amount of content in Indian languages on the internet, but most new users don’t know it exists. Search engines like Google and Bing, and now Facebook, are the only way to discover content, yet on all of these platforms the language intelligence for Indian languages still lags well behind English.

Until Unicode fonts came into wider use, search engines were simply not able to index Indic websites. Once publishers realized what they were missing, the majority of them moved to Unicode fonts. A site like Oneindia saw a 50% increase in referrals from Google after it moved to Unicode.
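To see why Unicode makes such a difference to indexing, here is a minimal Python sketch. In Unicode, every Indic character carries a code point that falls inside its script's block, so software can tell what language family a page is written in. The usual explanation for the older problem is that font-specific encodings stored Indic glyphs at Latin code points, so the underlying text looked like gibberish to a crawler. The function name `detect_script` and the sample strings are my own illustrations, not part of any search engine, and the table covers only a few scripts.

```python
# Code point ranges of a few Indic blocks, as defined in the Unicode Standard.
# This table is illustrative and deliberately incomplete.
INDIC_BLOCKS = {
    "Devanagari": (0x0900, 0x097F),
    "Bengali": (0x0980, 0x09FF),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}


def detect_script(text):
    """Return the Indic script with the most characters in text, or None."""
    counts = {}
    for ch in text:
        code_point = ord(ch)
        for name, (low, high) in INDIC_BLOCKS.items():
            if low <= code_point <= high:
                counts[name] = counts.get(name, 0) + 1
                break
    if not counts:
        return None
    return max(counts, key=counts.get)


# A Unicode Hindi word is recognizable as Devanagari.
print(detect_script("नमस्ते"))

# A made-up stand-in for legacy font-encoded text: only Latin code points,
# so no Indic script can be detected.
print(detect_script("uEkLrs"))
```

The second call returns `None` because nothing in the string identifies it as Indic text, which is roughly the position a crawler is in when it meets a page that depends on a custom font.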

However, a lot more has to be done on the search engine front. A typical English content site could get as much as 60% of its traffic from Google. Language sites in India don’t get referral traffic anywhere close to that number.

That said, one must thank Google for its “Google Bus” project, which took a bus with internet access to many Tier-2 towns in India.

Indian Language Keyboard Issues

Venture capitalists in India have hesitated to invest in Indian language portals for a long time. A few mentioned to me that this space had to scale, and that the only way to scale was to get User Generated Content (UGC).

We have met a few individuals who were interested in contributing Indic content but had difficulty keying it in on English keyboards.

There are three well-known input methods:

Transliteration/Phonetic keyboard

Inscript/Language keyboard

Soft/Virtual keyboard

Each of these keyboards is explained in detail in one of my previous blog posts.
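To make the transliteration/phonetic method concrete, here is a toy Python sketch that turns Roman letters into Devanagari. It is only an illustration of the idea: the mapping tables cover a handful of letters, the names `CONSONANTS`, `VOWELS` and `transliterate` are my own, and real tools use far richer rules. In this simplified version a consonant followed by a vowel takes the vowel's dependent sign, and a consonant with no vowel after it gets a virama (halant).

```python
# Toy Roman-to-Devanagari tables; deliberately incomplete.
CONSONANTS = {
    "k": "क", "g": "ग", "t": "त", "d": "द", "n": "न", "p": "प",
    "b": "ब", "m": "म", "r": "र", "l": "ल", "s": "स", "h": "ह",
}

# roman vowel -> (independent letter, dependent sign used after a consonant)
VOWELS = {
    "a": ("अ", ""),          # inherent vowel: no sign needed
    "aa": ("आ", "\u093e"),
    "i": ("इ", "\u093f"),
    "u": ("उ", "\u0941"),
    "e": ("ए", "\u0947"),
    "o": ("ओ", "\u094b"),
}

VIRAMA = "\u094d"  # suppresses the inherent vowel of a consonant


def _match(table, text, pos):
    """Return the longest key of table found at text[pos:], or None."""
    for length in (2, 1):
        key = text[pos:pos + length]
        if key in table:
            return key
    return None


def transliterate(roman):
    """Convert a lowercase Roman string to Devanagari with the toy tables."""
    out = []
    i = 0
    while i < len(roman):
        consonant = _match(CONSONANTS, roman, i)
        if consonant:
            out.append(CONSONANTS[consonant])
            i += len(consonant)
            vowel = _match(VOWELS, roman, i)
            if vowel:
                out.append(VOWELS[vowel][1])  # dependent vowel sign
                i += len(vowel)
            else:
                out.append(VIRAMA)  # no vowel follows the consonant
            continue
        vowel = _match(VOWELS, roman, i)
        if vowel:
            out.append(VOWELS[vowel][0])  # independent vowel letter
            i += len(vowel)
            continue
        out.append(roman[i])  # pass anything unknown through unchanged
        i += 1
    return "".join(out)


print(transliterate("namaste"))
```

Typing `namaste` produces नमस्ते. The appeal of this method is that the user needs nothing more than a standard English keyboard and a rough sense of how the word sounds.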

There are sufficient tools to input text in Indian languages. It would be useful for the tool providers (including Gmail) to support all the input methods listed in this post. This would ensure that the majority of online users are able to type language text.

State governments should take steps to train school students in Indic language keyboards.

Indic on Social Networking

Twitter recently started officially supporting Indian languages, although individuals and publishers like Oneindia have been using Indian languages on the internet for a while.

Facebook has been supporting a few popular Indian languages for a while now. If you change the language in your settings, the entire interface changes to the language of your choice. Facebook has been smart in recognizing the Indic opportunity. It is not as big as English, but over time we will see some traction for Indian languages.

Upcoming Opportunities in Indic Space

With internet penetration growing in India, the consumption of Indian language content on the internet and mobile is set to grow. There is a huge shortage of qualified content writers in the language space. Content writers should look at tapping this space as early as possible and establishing themselves in it.

Translation companies have a good opportunity in this space provided they don’t overcharge the customers.

Mobile app companies can also benefit by developing language-based apps for India’s 600+ million mobile user base.