Distributed Speech Recognition Issues

In my diploma thesis ``Distributed Speech Recognition for wired/wireless data networks (Internet, GPRS) using 2 Kbps coding'' (see [7] and related publications [8], [9], [10]), the use of Distributed Speech Recognition (DSR) over data networks instead of voice networks is proposed as a promising technology for information retrieval, especially from Wireless Information Devices (WIDs) such as PDAs and smartphones. The emergence of wireless data networks like GPRS and the low data rate achieved (only 2 Kbps) make this technology promising, although many related issues still have to be considered before it can be deployed in the real world.

For a technology to be successful, criteria such as user friendliness, low cost and standards compliance need to be met, along with competitive advantages over similar technologies. Many related issues and topics could be studied, but to keep the scope limited to the most interesting ones and the organization as clean as possible, four areas are discussed in this report, each covering several related issues. First, a related standard for DSR proposed by ETSI (European Telecommunications Standards Institute) is reviewed. Then a close look at the GPRS standard is given, since it is the standard expected to give a boost to data applications over wireless networks. Next, some related work by the W3C and the WAP Forum is reviewed, and the report closes by examining application development issues related to the WID platform.

Although all four areas are more or less related to DSR, they might seem unrelated to each other. The purpose of this short report is not to connect the four topics closely but rather to look at each of them from the DSR perspective. Collecting the related information, although a bit difficult, was fruitful, since knowledge from many interesting areas was acquired. Compacting all that information into a small report was also a bit tedious. I hope that this report will be interesting to the reader too.

This section deals with a DSR proposal by ETSI. Most of the content was extracted from [3] and describes the STQ Aurora group's work on the standardization of a front-end for DSR applications. The general idea is well described there, but some interesting extra information is missing (or can only be obtained through membership of the group) and some parts are rather confusing. An effort is made here to present most of the information as clearly as possible.

The DSR idea is based on decoupling the front-end mechanism from the recognition process. This way the front-end can be located, for example, on a mobile terminal/phone (client) while the recognition process takes place on a remote central host (server). ETSI's work focuses on standardizing the front-end so that the front-end information produced by different clients is identical, no matter what kind of device the client is. According to the ETSI proposal, the following requirements should be met:

data transmission rate of 4.8 kbps

low computational & memory requirements

low latency

robustness to transmission errors

The focus is on using a data network instead of a voice network. The first reason is that mobile voice networks degrade recognition performance due to low bit rate speech coding and channel transmission errors. Using an "error protected" data channel instead, recognition performance can be kept at a higher level. The second reason is the use of a single speech coding scheme instead of today's 4 different speech coding schemes. Apart from that, easy integration of speech and data applications can be established and ubiquitous access from different networks with guaranteed performance can be achieved.

The block diagram of the DSR system is given in the following figure.

[scale = 0.41]images/ETSI/eps/fig1.epsi

The front-end mechanism produces Mel Frequency Cepstral Coefficients (MFCC), which are quantized using split VQ techniques in order to reduce the data rate to a minimum. Before this information stream is put onto the wireless channel, the bytes are grouped into blocks (frames) and a simple error correction technique is applied. On the server side, error detection and (if needed) correction take place, the frames are decoded, and the reproduced MFCC features are used for recognition (using continuous acoustic models).

The front-end mechanism was proposed by Nokia and follows the approach used by most modern recognition systems. 13 coefficients per frame are produced (C0 to C12) and logE is added as an extra coefficient (the front-end is depicted in Figure 2).

[scale = 0.35]images/ETSI/eps/fig2.epsi

The VQ technique was proposed by Motorola and uses a split VQ scheme.
A 64-entry codebook is used for each pair of coefficients from C1 to C12, yielding 6 x 6 bits, a total of 36 bits.
A 256-entry codebook is used for the pair (C0, logE), adding 8 bits for a total of 44 bits per frame.
Adding 4 bits of CRC for error protection gives a data rate of 4.8 Kbps (48 bits/frame x 100 frames/sec).
To make sure that VQ does not degrade recognition performance compared with using the floating point MFCC parameters directly, experiments were carried out using both the Resource Management (RM) and ATIS databases.
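The bit budget above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codebooks are random placeholders (real codebooks are trained), the helper names are hypothetical, and the CRC generator polynomial x^4 + x + 1 is assumed, since the text does not name one.

```python
import random

# Sizes quoted in the text: six 64-entry codebooks for the pairs
# (C1,C2) ... (C11,C12) and one 256-entry codebook for (C0, logE).
PAIR_CODEBOOK_SIZE = 64
ENERGY_CODEBOOK_SIZE = 256
FRAMES_PER_SECOND = 100
CRC_BITS = 4
CRC_POLY = 0b10011  # x^4 + x + 1, an ASSUMED generator polynomial


def make_codebook(size, rng):
    """Hypothetical random 2-D codebook; real codebooks are trained."""
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(size)]


def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: (codebook[i][0] - vec[0]) ** 2
                             + (codebook[i][1] - vec[1]) ** 2)


def crc4(value, nbits):
    """Remainder of value * x^4 divided by CRC_POLY over GF(2)."""
    reg = value << CRC_BITS
    for shift in range(nbits - 1, -1, -1):
        if reg & (1 << (shift + CRC_BITS)):
            reg ^= CRC_POLY << shift
    return reg & 0xF


def encode_frame(features, pair_books, energy_book):
    """Quantize 14 features (C1..C12, C0, logE) into (word, bit count)."""
    payload, nbits = 0, 0
    for k, book in enumerate(pair_books):        # pairs (C1,C2)...(C11,C12)
        idx = nearest(book, features[2 * k:2 * k + 2])
        payload, nbits = (payload << 6) | idx, nbits + 6
    idx = nearest(energy_book, features[12:14])  # the pair (C0, logE)
    payload, nbits = (payload << 8) | idx, nbits + 8
    word = (payload << CRC_BITS) | crc4(payload, nbits)
    return word, nbits + CRC_BITS


if __name__ == "__main__":
    rng = random.Random(0)
    pair_books = [make_codebook(PAIR_CODEBOOK_SIZE, rng) for _ in range(6)]
    energy_book = make_codebook(ENERGY_CODEBOOK_SIZE, rng)
    frame = [rng.uniform(-1, 1) for _ in range(14)]
    word, bits = encode_frame(frame, pair_books, energy_book)
    print(bits, "bits/frame ->", bits * FRAMES_PER_SECOND, "bps")
```

Each index costs log2 of its codebook size, so the frame word is 6 x 6 + 8 + 4 = 48 bits regardless of the codebook contents; running the CRC over the whole 48-bit word gives a zero remainder, which is how a receiver would detect a corrupted frame in this sketch.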

[scale = 0.35]images/ETSI/eps/fig3.epsi

Some extra experiments evaluating error robustness were also carried out using the TIdigits task. The process of creating a test database that has been subjected to channel errors is shown in Figure 3. A test set of digits at 20 dB SNR was used with models trained on multicondition 8 kHz data. TETRA and GSM channels of varying quality (EP1 to EP3) were used, and the channel error mask was applied before decoding. The results are shown in Table 3. Although EP3 represents an extreme, only a 5% drop in performance was observed, in contrast with EFR GSM, which gives a performance of 78.1% as shown in Figure 4.

[scale = 0.35]images/ETSI/eps/fig4.epsi

The latency introduced over the wireless channel is 10 ms for the encoder, 9.6 ms for transmission over GSM (at 9.6 kbps), and 0 or 20 ms for the decoder (error free or error mitigation), yielding a total of about 20 ms in the error-free case and about 40 ms when error mitigation is needed. In conclusion, the STQ Aurora group has completed the preparation of a standard for DSR, proposed in February 2000. A second standard that will give half the error rate in noise is expected sometime in 2002. The working group will also cooperate with other organizations (WAP, W3C) in order to define appropriate combinations of protocols in the chain from the client terminal to the recognition server using the DSR standard.

Since 1992, when GSM was first deployed in Europe, GSM mobile networks have grown enormously,
creating a mass market with more than 150 million subscribers, about half the population. In parallel, the evolution
and growth of the Internet has raised the demand for integrating mobile and data communications in order to
create new mobile data services. This industry shift to personal communication services opens up exciting new business
opportunities for operators and content providers. Given the business theory that successful
applications and products are customer-driven, the adoption of the first mobile protocol to offer this integration can
be considered a major move towards the so-called "mobile internet". What GPRS really offers is integration with
the Internet/intranets by data-enabling current GSM networks. In this section some key features of GPRS networks are
given, along with more technical information on network characteristics, architecture and protocol stack (see [2]).

[scale = 0.45]images/GPRS/eps/table1.epsi

Before talking about GPRS, a short review of GSM is given (see [1]). GSM was the first digital mobile network to be deployed and
incorporates key ideas like cellular coverage with adaptive cell size (see Table 1). For multiple access a combination of FDMA and TDMA is used. Specifically, a total bandwidth of 25 MHz is available, which is divided into 124 carriers of 200 kHz each (FDMA), each of which in turn carries 8 time-slots (TDMA). The speech coding used is RPE-LTP, producing a data rate of 13 Kbps (260 bits per 20 msec frame). The number of bits per frame finally put onto the radio link is 456, since error protection is a major consideration. The 260 bits are divided into 3 classes according to their sensitivity to errors. Class Ia contains the first 50 bits, Class Ib the following 132 bits and Class II the last 78 bits. A 3-bit CRC code is added to Class Ia; Class Ib and a 4-bit tail are then appended, and the resulting 189 bits are passed through a half-rate convolutional code, yielding 378 bits, which together with the uncoded 78 Class II bits give the total of 456 bits. Interleaving is used: the 456 bits are divided into 8 blocks of 57 bits, with 26 additional bits used for equalization, as shown in Figure 2. Finally the signal is modulated using Gaussian-filtered Minimum Shift Keying (GMSK).
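The GSM figures in this paragraph can be checked with a short calculation. The sketch below uses the numbers quoted in the text, with one labeled assumption: Class Ia is taken as 50 bits (the standard GSM full-rate value), which is what makes the three classes add up to the 260 speech bits of a frame. Function names are illustrative only.

```python
# Figures quoted in the text, except CLASS_IA = 50, which is ASSUMED here
# (standard GSM full-rate value) so that the classes sum to 260 bits.
CARRIERS = 124
SLOTS_PER_CARRIER = 8
SPEECH_BITS, FRAME_MS = 260, 20
CLASS_IA, CLASS_IB, CLASS_II = 50, 132, 78
CRC_BITS, TAIL_BITS = 3, 4
INTERLEAVE_BLOCKS = 8


def physical_channels():
    """FDMA carriers times TDMA time-slots per carrier."""
    return CARRIERS * SLOTS_PER_CARRIER


def speech_rate_bps():
    """Source rate of the speech coder in bits per second."""
    return SPEECH_BITS * 1000 // FRAME_MS


def coded_bits():
    """Bits per 20 ms frame after CRC, tail bits and half-rate coding."""
    protected = CLASS_IA + CRC_BITS + CLASS_IB + TAIL_BITS
    return 2 * protected + CLASS_II  # Class II bits are sent uncoded


def interleave_block_size():
    """Size of each of the 8 blocks the coded frame is split into."""
    size, rest = divmod(coded_bits(), INTERLEAVE_BLOCKS)
    if rest:
        raise ValueError("coded frame does not split evenly")
    return size


if __name__ == "__main__":
    print("channels:", physical_channels())
    print("speech rate:", speech_rate_bps(), "bps")
    print("coded bits per frame:", coded_bits())
    print("bits per interleaving block:", interleave_block_size())
```

The half-rate code doubles only the protected part (50 + 3 + 132 + 4 = 189 bits), which is why the frame grows from 260 to 456 bits rather than to 520.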

[scale = 0.42]images/GPRS/eps/gsm2.epsi

In order for GPRS to become a mass product it must offer features that add user value and friendliness. In this paragraph some of those key features are briefly discussed. One of the most important features GPRS offers is instant and constant connectivity. Instant connectivity means that the cumbersome setup procedure of circuit switched data calls will no longer be needed. Constant connectivity means that the mobile terminal is always ``online'' (attached to the network) whether or not data are being transmitted. In contrast with circuit switched calls, GPRS does not require the preallocation and setup of a virtual circuit; resources are allocated dynamically instead, yielding better network efficiency and utilization for the operator and a new billing scheme for the subscriber, who will pay only for the data traffic generated. This is perhaps the most beneficial feature for the end user, alongside others such as increased data rates (which theoretically can reach up to 171.2 Kbps, see next paragraph). GPRS will benefit many existing applications found in today's circuit switched environment (SMS, WAP, browsing, email, news alerts), and new, specially designed services and applications that take advantage of the GPRS environment are also expected to become available.

[scale = 0.38]images/GPRS/eps/fig3.epsi

The maximum data rate of 171.2 Kbps is a theoretical limit that is far from the data rate actually expected. Because much of the press coverage of the upcoming GPRS is inaccurate, mostly because of marketing, this paragraph aims to clarify the topic. The truth is that the expected data rate will be modest and will vary significantly. Three restrictions are responsible for this: allocation of timeslots, terminal restrictions and availability of Coding Schemes (CS). As noted above, each carrier ``carries'' 8 timeslots, but not all of them will be available for GPRS traffic, since circuit switched traffic will remain the dominant traffic for many years. So perhaps 1 timeslot will be statically assigned to GPRS traffic, while the remaining 7 timeslots will be dynamically shared between GPRS and circuit switched traffic, as depicted in Figure 3. Next, it is extremely difficult for terminals to support 8 timeslots for uplink/downlink, since this would impose great complexity and would require great processing and transceiver power for such a small device. So manufacturers will initially support (1,3) and up to (2,4) timeslots for uplink/downlink respectively. Third and last, the availability of Coding Schemes is also a crucial factor. GPRS will support 4 different Coding Schemes (CS1-CS4) with data rates ranging from 9.05 to 21.4 Kbps per timeslot (see Table 2). 171.2 Kbps can be realized only when CS4 is used on all 8 timeslots (8 x 21.4 Kbps), but this is not expected to happen, since supporting CS3 or CS4 requires changes to the Abis link (see Figure 4) and is very risky from an economic perspective. Given the above restrictions, a maximum of 53.6 Kbps (4 timeslots, CS2) can be achieved for the downlink on the radio link. In practice the data rate seen at application level will be lower, depending on QoS, available timeslots and noise, with a typical value of about 30 Kbps, far less than the advertised 171.2 Kbps!
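The relation between coding scheme, timeslots and radio-link rate discussed above is a simple product, sketched below. The CS1 and CS4 rates are the range limits quoted in the text and CS2 follows from 53.6 / 4; the CS3 value of 15.6 Kbps is the commonly cited figure and is an assumption here, since Table 2 is not reproduced. The function names are illustrative.

```python
# Per-timeslot radio-link rates in Kbps. CS3 = 15.6 is an ASSUMED value
# (commonly cited for GPRS); the others follow from figures in the text.
RATE_PER_SLOT = {"CS1": 9.05, "CS2": 13.4, "CS3": 15.6, "CS4": 21.4}
SLOTS_PER_CARRIER = 8


def radio_rate(scheme, timeslots):
    """Radio-link rate in Kbps for one coding scheme on N timeslots."""
    if not 1 <= timeslots <= SLOTS_PER_CARRIER:
        raise ValueError("a carrier has between 1 and 8 timeslots")
    return round(RATE_PER_SLOT[scheme] * timeslots, 2)


def best_downlink(max_slots, schemes):
    """Best rate for a terminal limited to max_slots downlink timeslots
    on a network that only offers the given coding schemes."""
    return max(radio_rate(s, max_slots) for s in schemes)


if __name__ == "__main__":
    print("theoretical maximum:", radio_rate("CS4", 8), "Kbps")
    print("4 slots, CS1/CS2 only:", best_downlink(4, ["CS1", "CS2"]), "Kbps")
```

These are radio-link figures only; as the text notes, the rate seen by an application is lower still once protocol overhead, QoS and radio conditions are taken into account.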

[scale = 0.49]images/GPRS/eps/table2.epsi

To obtain access to GPRS, a mobile phone or terminal that supports GPRS is needed, together with a subscription to and an activated account on a GPRS enabled mobile network. Most network operators and mobile manufacturers are expected to support GPRS by the 1st quarter of 2001. By early 2002 GPRS will be incorporated as a standard feature into new GSM phones, and later that year the second standardization phase of GPRS will become available, addressing even more issues and offering even better services (for the GPRS roadmap see Table 4). In the early phase of GPRS adoption the most common use will be connecting a laptop computer to the Internet via a GPRS enabled mobile phone, but as applications mature and mobile phones become more advanced, direct use of GPRS from the phones themselves is expected to form the bulk of the market.

[scale = 0.41]images/GPRS/eps/table4.epsi

GPRS may be the most successful standard on the way to 3rd generation mobile networks (UMTS), since it offers so many new features while remaining compatible with existing GSM networks. To upgrade an existing GSM network to GPRS, 2 new nodes need to be added, namely the Gateway and Serving GPRS Support Nodes (GGSN & SGSN). Hardware upgrades to the Base Station Controller (BSC or just BS) are needed, as well as software updates for most of the remaining GSM nodes. A GPRS enabled GSM network is depicted in Figure 4. The main difference to note is that the BS now has two connections: one with the Mobile Switching Center (MSC) for circuit switched traffic and one with the SGSN for packet switched data (GPRS traffic). The GPRS backbone is connected to other GPRS operators' backbones or directly to an external data network (Internet/intranet) via the GGSN.

[scale = 0.36]images/GPRS/eps/fig4.epsi

Going into more detail, it should be noted how the upgrade will affect the network components. Beginning with mobile stations (terminals), three classes will be available. Class A terminals will support simultaneous circuit and packet switched traffic, Class B will support simultaneous attach but not simultaneous traffic, and Class C will support only one kind of attach at a time. In the GPRS architecture the BS acts as a concentrator, bringing together connections from Base Transceiver Stations (25-150). The BS is responsible for setting up, supervising and disconnecting circuit switched (CS) and packet switched (PS) connections to the MSC and SGSN respectively, as noted above. To support packet switched data, a hardware upgrade, the Packet Control Unit (PCU), is required, which adds the Radio Link Control (RLC) and Medium Access Control (MAC) layers to the radio interface. The functionality of the Serving GPRS Support Node (SGSN) is given below:

mobility management for class A and B mobile terminals by interworking with the MSC/VLR

to route and transfer packets between mobile terminals and the GGSN

to handle packet data protocol (PDP) contexts

to interwork with the radio resource management in the BSS

to generate charging data

Gateway GSN functionality includes :

to function as a border gateway between the PLMN and external networks

to set up communication with external packet data networks

to authenticate users to external packet networks

to route and tunnel packets to and from the SGSN

to generate charging data

To give a better idea of how the GPRS protocol stack is implemented, Figure 6 highlights the transmission of data traffic between the ``main'' GPRS nodes (namely MS, BSS, SGSN and GGSN) and a host. The Application Layer may include other protocols like HTTP, SNMP, IMAP built over TCP/IP. TCP/IP is needed for interoperability with the Internet/intranets and poses some interesting challenges regarding its successful implementation over wireless networks, which are briefly addressed below:

[scale = 0.30]images/GPRS/eps/fig6.epsi

First of all, recall that TCP was designed with wireline communications in mind and offers
a reliable flow of data on the end-to-end connection,
retransmission of unacknowledged packets (with buffers at both sender and receiver) and
an adaptive timeout mechanism for its Automatic Repeat Request (ARQ) scheme.

Running TCP over the GPRS wireless environment is radically different, since
delay will be high and variable due to limited bandwidth and high noise levels.
Varying radio conditions will cause retransmissions at the Radio Link Control layer.
When TCP assumes that packets were lost rather than just delayed, it enters the well known "slow start" state.
To avoid this, GPRS offers fast RLC ARQ, allowing many RLC retransmissions before TCP times out.
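To illustrate why a sudden radio-link delay looks like packet loss to TCP, the following sketch models an adaptive retransmission timeout. It uses the smoothing constants and 1-second floor of the standard TCP timer computation (RFC 6298) as an assumption about the sender; real stacks differ in details, the RTT values are made up, and the class and function names are hypothetical.

```python
class RtoEstimator:
    """Smoothed RTT and retransmission timeout (RTO), RFC 6298 style."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variation
    K = 4           # variation multiplier

    def __init__(self, initial_rto=1.0, min_rto=1.0):
        self.srtt = None
        self.rttvar = None
        self.min_rto = min_rto
        self.rto = initial_rto

    def update(self, rtt):
        """Feed one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = max(self.min_rto, self.srtt + self.K * self.rttvar)
        return self.rto


def count_spurious_timeouts(rtts):
    """Count samples that arrive later than the RTO in force at the time.

    Simplification: every delayed packet is still delivered, so each
    timeout counted here is spurious (the packet was late, not lost).
    """
    est = RtoEstimator()
    spurious = 0
    for rtt in rtts:
        if rtt > est.rto:
            spurious += 1
        est.update(rtt)
    return spurious


if __name__ == "__main__":
    steady = [0.6] * 10           # made-up steady round-trip times
    spike = steady + [3.0]        # one burst of link-layer retransmissions
    print(count_spurious_timeouts(steady), count_spurious_timeouts(spike))
```

With steady delays the timeout shrinks towards its floor, so a single delay spike caused by link-layer retransmissions exceeds it and triggers a timeout although nothing was lost. Fast RLC ARQ aims to keep such spikes shorter than the TCP timeout.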

As far as IP is concerned, user IP addresses will be allocated via the GGSN, by an ISP or LAN, using RADIUS/DHCP techniques. Both dynamic and static addresses, and both private and public ones, will be available, for either IPv4 or IPv6.

Speech Recognition technology may be the dominant application for voice networks (wireline/wireless), but what are the chances of successful deployment of DSR applications on wireline/wireless data networks? Speech Recognition will have to compete with other technologies that try to adapt the Internet information space to the WID platform (WAP, W3C's XHTML). DSR is expected to be a complementary service where the other services fail to be appealing enough (e.g. because of the small screen size for XHTML). In this section the WID platform is presented and some of the competing technologies are reviewed, such as XHTML, WAP, the ETSI STQ Aurora applications group and W3C's Voice Interfaces.

The evolution of the Internet has made access to the enormous information space of the WWW easy. The demand for information retrieval is expected to be high for devices like PDAs and smartphones (WIDs) too. But Internet technology has been designed for desktop computers with medium to high bandwidth connectivity over generally reliable data networks. Wireless devices present a more constrained computing environment than desktop computers (less powerful CPUs, less memory (ROM and RAM), restricted power consumption and limited input/output capabilities). Moreover, wireless data networks present a more constrained communication environment than wired networks (less bandwidth, more latency, less connection stability and less predictable availability). So what approaches are currently available for presenting WWW information on WIDs?

The W3C approach is motivated by the huge device diversity versus the content to deliver. The first step to this approach is to categorize the different devices used for information retrieval along with the various input/output methods used for user interface. Such a possible categorization is shown in the following table.

Output Methods      | Devices            | Input Methods
--------------------+--------------------+--------------------
1600 x 1200 pixels  | Desktop Computers  | Windows+icons+mouse
1024 x 768 pixels   | Set top box + TVs  | Full keyboards
800 x 600 pixels    | Public Kiosks      | On screen keyboard
640 x 480 pixels    | Watches            | Touch Tablet
Web TV              | Web pads           | Pens
Public kiosk        | Handhelds/PDAs     | Speech only devices
Palm Pilot size     | Mobile Phones      | Speech augmented
Cell phone size     | Standard phones    | Phone keypads
Very large screens  | Pagers             | Wheels, knobs
Audio only          | Electronic books   | Buttons
Audio augmentation  | Walkmans           | Hand gestures
3D screens, VR      | Automobiles        | Facial expressions
Color, Black/white  | TTYs (no graphics) | Brainwave detection
Video capable       | Cameras            | Combinations
Electronic paper    | Toys               | ...
Projection screens  | ...                | ...

A simplified model for study is depicted in the next figure, dividing devices into linear, semi-linear and non-linear. Voice browsing, for example, is perhaps the only linear method, since information is presented to the user in a ``one piece at a time'' fashion.

[scale = 0.41]images/Interface/wearfig2.epsi

Another simple model is to divide devices according to their presentation capabilities. One such categorization is shown in the next figure.

[scale = 0.41]images/Interface/wearfig5.epsi

Since no single ``unified'' device can represent all devices, the key idea is to accept device diversity but keep that diversity away from the content source. To support service convergence, the content source is kept the same regardless of the device it is presented on, and a transformation layer is inserted that performs the right content transformation for a specific family of devices or a specific device. In other words, content is completely separated from its presentation. To add support for a new device, only a new presentation method is needed.

This way it would be possible to avoid creating different content for different devices, as is done today. Today one has to create separate sites:

a WAP site for handheld devices

a HTML site for larger screens and

possibly another site for voice browsers

Following the approach proposed by W3C, a unified information space becomes possible. The content reuse approach proposed by W3C is depicted in the following figure.

[scale = 0.39]images/Interface/wearfig4.epsi

The content source is written in XML (Extensible Markup Language) and transformed into [X]HTML by XSLT (Extensible Stylesheet Language Transformations) or DOM (Document Object Model), which in turn can be rendered for different device families using different CSS (Cascading Style Sheets). XHTML will be the convergence of HTML and WML (Wireless Markup Language, used by WAP) and will be renderable everywhere (W3C and WAP cooperation is addressed later in this section). Several groups inside W3C have been formed in order to address content reuse issues. Composite Capability/Preferences Profiles (CC/PP) is one such group, which deals with the specification of the profiles the content transformation is based on. Another related group, the ``Mobile Interest Group'', deals with issues specific to information access for mobile devices and is in close cooperation with the WAP Forum.
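The separation of content from presentation can be sketched as one XML source with one renderer per device family. The example below is only an illustration of the idea, not the W3C toolchain: plain Python functions stand in for XSLT stylesheets, and the element names, the sample content and the device family names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical device-independent content source (element names made up).
CONTENT = """
<bulletin>
  <item><title>Weather</title><text>Sunny, 25 degrees.</text></item>
  <item><title>Traffic</title><text>No delays reported.</text></item>
</bulletin>
"""


def to_xhtml(root):
    """Presentation for large screens: headings and paragraphs."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for item in root.findall("item"):
        ET.SubElement(body, "h1").text = item.findtext("title")
        ET.SubElement(body, "p").text = item.findtext("text")
    return ET.tostring(html, encoding="unicode")


def to_wml(root):
    """Presentation for WAP handsets: one card per item."""
    wml = ET.Element("wml")
    for item in root.findall("item"):
        card = ET.SubElement(wml, "card", title=item.findtext("title"))
        ET.SubElement(card, "p").text = item.findtext("text")
    return ET.tostring(wml, encoding="unicode")


def to_voice(root):
    """Presentation for voice browsers: one spoken prompt per item."""
    return [f"{item.findtext('title')}. {item.findtext('text')}"
            for item in root.findall("item")]


# The transformation layer: supporting a new device family means adding
# one entry here, while the content source stays untouched.
RENDERERS = {"desktop": to_xhtml, "wap": to_wml, "voice": to_voice}


def render(xml_text, device_family):
    """Apply the presentation registered for device_family."""
    return RENDERERS[device_family](ET.fromstring(xml_text.strip()))


if __name__ == "__main__":
    for family in RENDERERS:
        print(family, "->", render(CONTENT, family))
```

The voice renderer returns a linear list of prompts, which matches the ``one piece at a time'' nature of voice browsing noted earlier.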

The WAP Forum was formed by telecommunication leaders (Ericsson, Nokia, Motorola, etc.) in order to bring Internet content to mobile devices in the best possible way.
The objectives of WAP Forum are :

This effort created an initial hype (as with many promising technologies), but soon after the initial adoption of the standard, disappointment became evident in almost every media comment about WAP. The truth seems to lie somewhere between hype and disappointment. First of all, WAP has the advantage of being bearer independent. Today it is experienced through GSM CSD, which not only imposes a cumbersome procedure on the user but is also expensive, since the user pays for the duration of the connection, independently of the amount of information retrieved. If GPRS proves to be a successful standard, WAP may come to be regarded as a successful application after all. A possible mistake of the WAP approach is the rush to bring the Internet to mobile devices, which led to the unavoidable marriage of content and presentation, a radically different approach from W3C's content reuse.

In order to close the gap between the W3C and WAP approaches, a cooperation has been established. The cooperation goals include the creation of a unified information space based on common standards and technologies, and the design and delivery of sophisticated information and services to mobile devices. W3C's Mobile Access Interest Group is responsible for the presentation of information (e.g. through CSS), the management of information (e.g. through RDF) and technologies that structure and distribute data as objects (e.g. XML and HTTP-NG). WAP related work will cover bandwidth efficiency, smart web proxies, efficient protocols and content encoding, latency constraints and content scalability.

W3C has recently realized the importance of voice interfaces and has formed the Voice Browser Working Group.
The interest for voice interfaces comes from the following facts :

far more people today have access to a telephone than to a computer with an Internet connection

sales of cell phones are booming, enabling access to services any time and anywhere

voice interfaces are especially useful for devices that are not equipped with full browsers or even the screens to support them

The Voice Browser Working Group works on a draft for defining markup languages for speech interfaces. The Markup Languages to be specified include :

N-gram or Speech Recognition Grammar ML

Natural Language Semantics ML

VoiceXML for Dialog Management

Speech Synthesis ML

These markup languages, along with the architecture shown in the following figure, compose the W3C Speech Interface Framework.

[scale = 0.41]images/Interface/voice-intro-fig1.epsi

A second figure, showing the incorporation of an IP interface into the Speech Interface Framework, follows.

[scale = 0.36]images/Interface/new_arch2-crop.epsi

The IP interface can be used by standards like ETSI's DSR. In fact, ETSI has formed the ``DSR Applications & protocols'' group in order to implement complete end-to-end DSR services using the front-end standard (protocol elements, system architecture, API, etc.). The group intends to cooperate with the W3C and the WAP Forum on harmonising protocol elements and a multimodal markup language for DSR applications.

In this section, issues affecting the development of software for WIDs are brought into focus. First of all, it should be made clear that application development for WIDs is entirely different from development for desktop PCs. WIDs impose a constrained environment for application development, mainly because of their hardware characteristics (slow CPU, limited amount of RAM and ROM, slow network connections). Although there are already many applications for PDA-like devices (for which well known APIs are offered), what about devices that are less like PCs, such as smartphones or even ``simple'' mobile phones? What are the choices for application development on such devices? Two development platforms that appear to be promising application environments for WIDs, namely the Symbian and Java platforms, are presented in this section.

As previously noted, WIDs impose a constrained environment for application development. CPU speeds are low, although new generations of CPUs have lately been adopted for the embedded market. For example, in the PDA market there is currently a wide range of CPUs available, from the 16 MHz Motorola DragonBall processor to the latest 206 MHz Intel StrongARM processor. Available memory is limited to at most 16 MB ROM and 32 MB RAM. But the User Interface is what is radically different. There is no mouse or keyboard, and the preferred input modality is pen and buttons. Screen resolution is limited to 320x240 for advanced PDAs. And that is just for PDAs, which are considered a ``high-end'' solution. PDAs are powered by Operating Systems like Windows CE versions 2.* and 3.0 (mainly high-end PDAs), Symbian's EPOC, Palm OS and Linux, which is also a promising solution since it is open source and widely adopted by programmers. Symbian's EPOC is an OS designed from the ground up for devices ranging from high-end PDAs to smartphones. That is why we will have a closer look at it later in this section (see [6]).

For non-PDA devices, the specifications of hardware and operating system are hardly known to the public (for example, nobody knows what OS runs on Ericsson's T10 mobile phone). That is why application development for those devices is still largely closed to third party developers and remains manufacturer-specific. This is where the Java platform comes in, acting as an OS wrapper that opens up application development for such devices. In fact many mobile phone manufacturers claim they will support J2ME (Java 2 Micro Edition, see [4]) in their next generation phones, and estimates suggest that very soon at least 60% of development will be done using the Java platform.

The Symbian Platform is supported by Ericsson, Nokia, Motorola, Panasonic and Psion. As noted above it is an OS specially designed for WIDs from the ground up. Some key features of the Symbian platform, version 6.0, are :

two reference designs : Quartz and Crystal

four program and content development options : C++, Java, WAP, and Web

close integration of contacts information, messaging, browsing and wireless telephony

Also, the next release of EPOC (version 6.1) will incorporate GPRS support, enabling many new applications to take advantage of the wireless data channel.

The huge diversity of computing platforms, ranging from servers to mobile phones, is a fact. Java's promise to bring a ``common'' platform independent application environment is ``accomplished'' by providing 3 different Java editions: Enterprise, Standard and Micro. Java 2 Micro Edition is defined for a wide range of devices (from mobile phones to PDAs).
But even within this range, big differences can exist between different ``families'' of devices. To address this problem J2ME defines ``configurations'' and ``profiles''. A configuration defines the minimum Java technology libraries and VM capabilities for a family of devices, whereas a profile is a collection of APIs that supplement a configuration to provide capabilities for a specific vertical market or device type (Figure 1).

Limited power & intermittent connectivity to a network (often wireless)

Extremely constrained UIs, small screens

Such devices include many mobile phones. Sun provides a CLDC reference implementation built on the KVM, a new VM designed especially for WIDs. Note that the KVM lacks support for floating point arithmetic and JNI! The first can be dealt with by using fixed point arithmetic and the second by using an external (native) program.
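The fixed point workaround mentioned above can be sketched as follows. The example is written in Python only to illustrate the arithmetic; the Q16.16 format (16 fractional bits) and the function names are assumptions, and on a real KVM target the same operations would be done on integers, with a wider intermediate for the products.

```python
# Q16.16 fixed point: a real number x is stored as the integer x * 2**16.
# The format choice is an ASSUMPTION made for this sketch.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS


def to_fixed(x):
    """Convert a float to fixed point (only for preparing test data on a
    machine that has floats; a float-less device would use constants)."""
    return int(round(x * ONE))


def to_float(f):
    """Convert fixed point back to float, for inspection only."""
    return f / ONE


def fx_mul(a, b):
    """Multiply two fixed point numbers; the raw product carries
    2 * FRAC_BITS fractional bits, so shift back by FRAC_BITS."""
    return (a * b) >> FRAC_BITS


def fx_div(a, b):
    """Divide two fixed point numbers, pre-scaling the numerator."""
    return (a << FRAC_BITS) // b


def fx_dot(xs, ys):
    """Dot product of two fixed point vectors, a typical DSP building
    block (e.g. applying filter weights), using integer ops only."""
    return sum(fx_mul(x, y) for x, y in zip(xs, ys))


if __name__ == "__main__":
    a, b = to_fixed(1.5), to_fixed(-2.25)
    print(to_float(fx_mul(a, b)), to_float(fx_div(to_fixed(1.0), to_fixed(4.0))))
```

Addition and subtraction need no special handling in this format, since both operands share the same scale; only multiplication and division require the extra shift.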

The only currently available profile for CLDC is the Mobile Information Device Profile (MIDP), which
targets mobile devices implementing the J2ME CLDC configuration and addresses:

Display toolkit, User input methods

Persistent data storage using simple record-oriented database model

HTTP-based networking using CLDC Generic Connection framework

Figure 1:
3 editions of Sun's Java platform

[scale = 0.36]images/Develop/eps/j2me.epsi

Java applications for WIDs are expected to be simple and to require a minimal amount of resources. For more
resource demanding Java applications, a JIT compiler, a static compiler or even a native Java processor (www.zucotto.com)
will be needed. For even more advanced applications a custom hardware solution may be needed. Parthus
(www.parthus.com) offers such solutions.

Developers can use a standard PC for testing the application. Emulators for both Symbian & Java platforms
are already available. For GPRS testing Ericsson's test center ([5]) claims to support testing &
troubleshooting of GPRS apps.

The world of wireless communications changes rapidly, providing exciting new technologies and offering new
opportunities for the development of new applications. The different efforts to bring Internet content to WIDs
(W3C, WAP) can be complemented by DSR technology, which can enable information retrieval from
WIDs by exploiting the best user interface method (human speech), the low bit rate achieved in [7] and
the GPRS protocol to offer easy and low cost deployment of such services.