A Proposal for Standardization of English to Bangla
Transliteration and Bangla Universal Editor

Joy Mustafi, M.C.A. & B. B. Chaudhuri, Ph.D.

1. Introduction

Indian language technology is becoming an increasingly challenging field in linguistics and computer science. Bangla (also written as Bengali) is one of the most widely spoken languages in the world [Chinese Mandarin 13.69%, Spanish 5.05%, English 4.84%, Hindi 2.82%, Portuguese 2.77%, Bengali 2.68%, Russian 2.27%, Japanese 1.99%, German 1.49%, Chinese Wu 1.21%]. Bangla is a member of the New Indo-Aryan language family and is spoken by a vast population within the Indian subcontinent and abroad. It offers considerable scope for research on computational aspects.

Efficient processors for Bangla that deal exhaustively with all the general and particular phenomena of the language are yet to be developed. A transliteration system is one of them. Transliteration is the representation of letters or words in the corresponding characters of another alphabet.

There is as yet no standard for English to Bangla transliteration. Some early systems, such as Lekho [1], Pata [2], and Bangla Pad [3], are not very user-friendly because they have complex rules for character mapping. Some keyboard layouts, such as Ekushey [4], Avro [5], and Bijoy [6], have Unicode [7], ASCII [8], or ISCII [9] mappings, which are again very hard to use, particularly when dealing with compound clusters of consonant characters. The main problem is that there is no definite rule for English to Bangla transliteration.

The system described here proposes a standard, a definite rule, and an application program for writing, editing, storing, reusing, and viewing Bangla text in digital media. The text in English script can be stored as a simple plain text file on any platform and may be used for other research activities such as machine translation, information retrieval, spell checking, optical character recognition, speech technology, and other Bangla language technologies [10].

This system is designed for English to Bangla character conversion and for the representation of Bangla in Unicode. The mapping of characters follows the morphological structure and the spelling rules of Bangla. Although some earlier systems were developed for phonological representation, for a visual editor or for the storage of a Bangla corpus the spelling is more important than the pronunciation.
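As a minimal sketch of spelling-based character conversion, the Python fragment below maps a string in Roman script to Bangla Unicode by longest-match lookup. The Roman keys (`k`, `kh`, `A`, `M`, and so on) and the function name `transliterate` are assumptions made purely for illustration; they are not the character chart proposed in this paper. Only the Unicode code points are standard. Consonant clusters are deliberately left out of this sketch.

```python
# Illustrative Roman-to-Bangla tables (hypothetical keys, NOT the paper's chart).
CONSONANTS = {
    "k": "\u0995",   # BENGALI LETTER KA
    "kh": "\u0996",  # BENGALI LETTER KHA
    "b": "\u09AC",   # BENGALI LETTER BA
    "m": "\u09AE",   # BENGALI LETTER MA
    "l": "\u09B2",   # BENGALI LETTER LA
}
# Each vowel has an independent form and a dependent sign used after a consonant.
VOWELS = {
    "a": ("\u0985", ""),        # LETTER A; inherent after a consonant, so no sign
    "A": ("\u0986", "\u09BE"),  # LETTER AA / VOWEL SIGN AA
    "i": ("\u0987", "\u09BF"),  # LETTER I / VOWEL SIGN I
}
SIGNS = {"M": "\u0982"}         # BENGALI SIGN ANUSVARA

# Try longer keys first so that "kh" wins over "k".
KEYS = sorted([*CONSONANTS, *VOWELS, *SIGNS], key=len, reverse=True)


def transliterate(text):
    """Convert Roman-script text to Bangla Unicode using the tables above."""
    out = []
    after_consonant = False
    i = 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                break
        else:
            # Unknown character (space, punctuation): copy it unchanged.
            out.append(text[i])
            after_consonant = False
            i += 1
            continue
        if key in CONSONANTS:
            out.append(CONSONANTS[key])
            after_consonant = True
        elif key in VOWELS:
            independent, sign = VOWELS[key]
            out.append(sign if after_consonant else independent)
            after_consonant = False
        else:
            out.append(SIGNS[key])
            after_consonant = False
        i += len(key)
    return "".join(out)
```

Under these assumed keys, `transliterate("bAMlA")` produces the five code points BA, VOWEL SIGN AA, ANUSVARA, LA, VOWEL SIGN AA, which spell the word for "Bangla". The same vowel key yields an independent letter at the start of a word and a dependent sign after a consonant, which is one way to keep the input tied to spelling and not to pronunciation.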

A universal editor for Bangla is also proposed here; it follows the standard and represents Bangla in Unicode [11] with a suitable Bangla OpenType font. Text in English script is used as the input, which can be browsed from any location; the Bangla Universal Editor converts the text into Bangla and displays it in the specified window.

The Bangla Unicode output can be used for the development of Bangla software such as operating systems, compilers, word processors, dictionaries, web pages [12], and other software. It is also useful for writing emails, messages, and blogs in Bangla. The standards, methodologies, and applications are described here.
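To illustrate the web-page use mentioned above, the following sketch turns Bangla Unicode text into HTML numeric character references, so that the markup itself stays pure ASCII. The function name `to_numeric_references` is hypothetical, and this is only one possible way to embed the output; a page served as UTF-8 could carry the Bangla characters directly.

```python
import html


def to_numeric_references(text):
    """Return HTML-safe ASCII markup for text that may contain Bangla characters."""
    escaped = html.escape(text)  # escape &, <, > and quotes first
    # Replace every non-ASCII character with a decimal reference such as &#2476;
    return "".join(ch if ch.isascii() else f"&#{ord(ch)};" for ch in escaped)
```

A browser, or `html.unescape` in Python, restores the original characters from these references.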

1.1 Objective

The main objective of this work is to introduce a standard for English to Bangla transliteration. Our earlier research on Bangla language technology [13][15][16][17] is already established. A Bangla corpus is used for developing many language technology systems.

Some early research on Bangla also proposed transliteration rules or character mappings [14]. As there is no standard for Bangla transliteration, a Bangla corpus cannot be stored in a specific format. As a result, researchers get different representations of Bangla text from different sources. If Bangla text can be written in English script and stored as plain text, no other Bangla-specific software is required: a simple ASCII editor (such as Notepad, gedit, or nedit) will work. The text also becomes platform independent.
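The plain-text storage idea can be sketched as follows. The helper names `save_romanized` and `load_romanized` are assumptions introduced for illustration and are not part of the proposed system. The check simply enforces that a file holds nothing but ASCII, which is what allows any ordinary text editor on any platform to open it.

```python
from pathlib import Path


def save_romanized(text, path):
    """Write Roman-script Bangla to a plain ASCII file, rejecting other characters."""
    if not text.isascii():
        raise ValueError("romanized text must contain ASCII characters only")
    Path(path).write_text(text, encoding="ascii")


def load_romanized(path):
    """Read the Roman-script text back from the ASCII file."""
    return Path(path).read_text(encoding="ascii")
```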

To view and edit the English script written to represent Bangla, a universal editor is introduced here, which can be used to correct or modify Bangla text written in English script. In this editor, one can see the text in English script and in Bangla font simultaneously, in separate frames of the same application program.

1.2 Justification of the Proposal

The proposal is necessary because no standard is available for Bangla transliteration. Some important points discussed here are:

Phonetic Character Chart (English characters are chosen to be very close to the phonetics of the Bangla characters, but word construction follows the correct spelling of Bangla words. In most cases the phonetic character chart is maintained, but there are a few exceptions).

Simple Representation of Character Clusters (easy to parse the input text).
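To show why a simple cluster representation is easy to parse, the sketch below assumes one hypothetical convention: a run of consonant letters with no vowel between them denotes a conjunct. Under that assumption a single regular expression finds every cluster, and the conjunct is rendered by joining the consonants with the Unicode hasanta (BENGALI SIGN VIRAMA, U+09CD). The Roman keys and the name `render_clusters` are illustrative only and are not taken from the proposed chart.

```python
import re

VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA (hasanta), joins consonants into a conjunct

# Hypothetical single-letter consonant keys, for illustration only.
CONSONANTS = {
    "k": "\u0995",  # BENGALI LETTER KA
    "t": "\u09A4",  # BENGALI LETTER TA
    "n": "\u09A8",  # BENGALI LETTER NA
    "m": "\u09AE",  # BENGALI LETTER MA
    "r": "\u09B0",  # BENGALI LETTER RA
    "s": "\u09B8",  # BENGALI LETTER SA
}

# One or more consecutive consonant keys form a single unit.
CONSONANT_RUN = re.compile("[" + "".join(CONSONANTS) + "]+")


def render_clusters(text):
    """Replace each consonant run with Bangla consonants joined by hasanta."""
    def join(match):
        return VIRAMA.join(CONSONANTS[c] for c in match.group())
    return CONSONANT_RUN.sub(join, text)
```

For instance, `CONSONANT_RUN.findall("mantra")` returns `['m', 'ntr']`, and `render_clusters("ntr")` yields NA, hasanta, TA, hasanta, RA. Vowel letters are left untouched here, since this sketch only covers the cluster step.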

Please ensure that your name, academic degrees, institutional affiliation and institutional address, and your e-mail address are all given on the first page of your article. Also include a declaration that your article or work submitted for publication in LANGUAGE IN INDIA is an original work by you and that you have duly acknowledged the work or works of others you either cited or used in writing your articles, etc. Remember that by maintaining academic integrity we not only do the right thing but also help the growth, development and recognition of Indian scholarship.