IIT Madras develops easy OCR system for reading Bharati script

Researchers from IIT Madras have developed easy multi-lingual optical character recognition (OCR) schemes for reading documents in Bharati script. They have also developed a universal finger-spelling method for the nine Indian languages. The method was developed in collaboration with TCS, Mumbai, and can be used to generate sign language for hearing-impaired persons.

About Bharati script

It is a unified script for nine Indian languages that is being proposed as a common script for India to bring down communication barriers.

The nine languages and scripts covered are Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Telugu, Kannada, Malayalam and Tamil. English and Urdu have not yet been integrated because they have a very different phonetic organisation.

It was developed by a team of researchers from IIT Madras headed by Professor V. Srinivasa Chakravarthy.

OCR schemes for Bharati script

The scheme first separates (segments) the document into text and non-text regions. The text is then segmented into paragraphs, sentences, words and letters. Each letter is then recognised as a character in a machine-readable encoding such as Unicode or ASCII. A letter has various components, such as the basic consonant, consonant modifiers and vowels.
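
The segmentation steps described above can be illustrated with a small sketch. It uses projection profiles, a common classical technique that counts ink pixels per row and per column; this is an assumption made for illustration, not a description of the IIT Madras team's actual method. The function names (runs, segment_lines, segment_glyphs) and the toy page are hypothetical.

```python
def runs(profile):
    """Return (start, end) index pairs for each maximal run of non-zero values."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ended at a blank row/column
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_lines(image):
    """Split a binary image (rows of 0/1, 1 = ink) into text lines."""
    row_profile = [sum(row) for row in image]
    return [image[top:bottom] for top, bottom in runs(row_profile)]


def segment_glyphs(line):
    """Split one text line into glyph images using column ink counts."""
    col_profile = [sum(col) for col in zip(*line)]
    return [[row[left:right] for row in line]
            for left, right in runs(col_profile)]


# Toy page (assumed data): two text lines separated by one blank row.
page = [
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
]
lines = segment_lines(page)
glyph_counts = [len(segment_glyphs(line)) for line in lines]
print(len(lines), glyph_counts)
```

In a full system, each glyph image produced this way would be passed to a classifier that outputs a character code, and real documents would also need the text/non-text separation and word-level grouping that this sketch leaves out.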