
Abstract:

The invention relates to remote access systems and methods using
automatic speech recognition to access a computer system. The invention
also relates to an intelligent agent resident on the computer system for
facilitating remote access to, and receipt of, information on the
computer system through speech recognition or text-to-speech read-back.
The remote access systems and methods can be used by a user of the
computer system while traveling. The user can dial into a server system
which is configured to interact with the user by automatic speech
recognition and text-to-speech conversion. The server system establishes
a connection to an intelligent agent running on the user's remotely
located computer system by packet communication over a public network.
The intelligent agent sources information on the user's computer system
or a network accessible to the computer system, processes the information
and transmits it to the server system over the public network. The server
system converts the information into speech signals and transmits the
speech signals to a telephone operated by the user.

Claims:

1. A method of providing access to a computer system over a network, the
method comprising the steps of: receiving at the server system a first
speech command from a user; determining a type of command for the first
speech command; loading a set of indexed possible utterances associated
with the type of command; receiving at the server system a second speech
command associated with the first speech command; interpreting the second
speech command using the set of indexed possible utterances; at the
server, generating first packet data based on the first and second speech
command; transmitting the first packet data over the network from the
server to an agent; at the server system, receiving second packet data
from the agent in response to the first packet data, wherein the agent is
operable to generate at least a portion of the second packet data from a
dynamic set of sources; and generating and transmitting a speech signal
to the user based on the second packet data.

2. The method of claim 1 further comprising receiving user identification
information; authenticating the user at the server system based on the
user identification information; and prior to transmitting the first
packet data, sending an initialization packet comprising the user
identification information to the agent, wherein the identification
information is for use by the agent to validate the user by comparing the
identification information contained in the initialization packet to
corresponding information stored at the agent.

3. The method of claim 1, wherein each utterance of the set of indexed
possible utterances is ranked based on a probability of usage.

4. The method of claim 1, wherein the second packet data corresponds to
an object accessible to the computer system.

5. The method of claim 4, wherein the second packet data comprises a
summarized form of the object, and wherein the summarized form is
generated by the agent based on the object.

6. The method of claim 1, further comprising the steps of establishing an
encrypted connection between the server system and the agent resident on
the computer system.

7. The method of claim 1 wherein the speech command received from the
user is selected from the group consisting of: retrieving, creating, and
modifying data accessible to the agent.

8. The method of claim 1 further comprising receiving a username and
password from the user in order to generate identification information.

9. The method of claim 1 wherein the first packet data and the second
packet data are generated based on a communication protocol that provides
a mechanism for a variety of speech requests and payload replies to be
handled by the server system, wherein the protocol specifies that the
first packet data and the second packet data comprise a name of a service
type to be accessed or delivered, an action to be performed, and
parameters.

10. A system for providing access to information over a network, the
system comprising a server system, the server system having computer
program code accessible thereto which, when executed by the server
system, causes the server system to: receive a voice call from a user;
receive a first speech command from the user; determine a type of command
for the first speech command; load a set of indexed possible utterances
associated with the type of command; receive at the server system a
second speech command associated with the first speech command; interpret
the second speech command using the set of indexed possible utterances;
establish a connection with an agent; generate first packet data based on
the first and second speech commands; transmit the first packet data over
the network to the agent; receive second packet data from the agent,
wherein the agent is operable to generate at least a portion of the
second packet data from a dynamic set of sources accessible by the agent;
and generate and transmit speech signals to the user based on the second
packet data.

11. The system of claim 10, wherein the computer program code executed by
the server system, causes the server system to: receive user identification
information; authenticate the user at the server system based on the user
identification information; and prior to transmitting the first packet
data, send an initialization packet comprising the user identification
information to the agent, wherein the identification information is for
use by the agent to validate the user by comparing the identification
information contained in the initialization packet to corresponding
information stored at the agent.

12. The system of claim 10, wherein the computer program code executed by
the server system, causes the server system to configure a security
manager to manage a security policy for authenticating the user.

13. The system of claim 10, wherein the computer program code executed by
the server system, causes the server system to configure a security
manager to manage a secure connection to the agent on the computer
system.

14. The system of claim 10, wherein the agent has access to a remote
server and the user is associated with a user account for the remote
server.

15. The system of claim 10, wherein the computer program code executed by
the server system, causes the server system to configure a speech server
for communicating with the user using automated speech recognition for
received speech commands and automatic text-to-speech conversion to
generate speech signals for transmission to the user.

16. The system of claim 15, further comprising a voice relay server in
communication with the speech server and in communication with the agent
for receiving data from the agent and for transmitting command request
data to the agent corresponding to the speech commands received from the
user at the speech server.

17. The system of claim 16, wherein the computer program code executed by
the server system causes the server system to maintain a user information
datastore; and wherein the voice relay server compares user
authentication information received from the user at the speech server to
the user information data store to authenticate the user for access to
the agent.

18. The system of claim 17, wherein the user authentication information
is determined by the speech server based on an identification utterance
received from the user over the public telephone network.

19. The system of claim 14, wherein the remote server hosts a plurality
of user accounts and the agent facilitates remote access to each of the
user accounts via the server system.

20. The system of claim 10 wherein the server system comprises a voice
relay server configured to index the speech commands received from the user
by maintaining a profile of speech commands for the user, wherein each
speech command in the profile of speech commands is associated with a
probability based on the frequency of usage by the user; the profile of
speech commands for facilitating automated speech recognition of speech
commands received from the user.

21. The system of claim 10 wherein the first packet data and the second
packet data are generated based on a communication protocol that provides
a mechanism for a variety of speech requests and payload replies to be
handled by the server system, wherein the protocol specifies that the
first packet data and the second packet data comprise a name of a service
type to be accessed or delivered, an action to be performed, and
parameters.

Description:

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation of application Ser. No.
12/631,241, filed on Dec. 4, 2009, which is a divisional of application
Ser. No. 11/560,161, filed on Nov. 15, 2006, which is a continuation of
International Application No. PCT/CA2005/000776, filed on May 20, 2005,
which claims the benefit of U.S. Provisional Application Ser. No.
60/573,022, filed on May 21, 2004, the contents of which are herein
incorporated by reference.

FIELD OF THE INVENTION

[0002] The present invention relates to remote access systems and methods
using automatic speech recognition to access a computer system. The
invention also relates to an intelligent agent on a computer system for
facilitating remote access to, and receipt of, information on the
computer system through speech recognition or text to speech read-back.

BACKGROUND OF THE INVENTION

[0003] Desktop computers and larger computer systems are increasingly
being used to store private, secure and personal information for both
consumers and business users alike. This information is not easily
accessible from any remote location where the consumer or business user
may find themselves. Mobile phone data technologies have been developed
by companies such as Visto and GoToMyPC to allow a remote customer to
retrieve some information stored at their business on a server, or on a
personal computer, but these programs require very good data coverage
networks, high priced smart mobile devices, expensive data plans and
considerable sophistication on the part of the user. In addition, they
are not practical or safe for use in a vehicle while driving.

[0004] It is desired to address or ameliorate one or more disadvantages or
shortcomings of existing methods for remote access of a computer system,
or to at least provide a useful alternative thereto.

SUMMARY OF THE INVENTION

[0005] One aspect of the invention relates to a method of providing remote
access to a computer system over a public network, the method comprising
the steps of:

[0006] receiving at a server system a voice call from a user of the
computer system;

[0007] processing information received from the user
during the voice call to generate identification information;

[0008] authenticating the user based on the identification information;

[0009] establishing a connection between the server system and the
computer system based at least in part on the identification information;

[0010] receiving at the server system speech commands from the user;

[0011] generating first packet data based on said speech commands;

[0012] transmitting the first packet data over the public network to an
agent on the computer system;

[0013] receiving second packet data at the server system from the agent in
response to the first packet data; and

[0014] generating a speech signal to the user over the voice call based on the
second packet data.

[0015] Preferably, the agent is configured to interpret the first packet
data, interrogate application software installed on the personal computer
and generate second packet data based on the interrogation.

[0016] In one embodiment, the method further comprises receiving at the
server system an IP address of the computer system from the agent
resident on the computer system and the establishing is based also on the
IP address. This embodiment is applicable where the computer system is
protected by a firewall.

[0017] Another aspect of the invention relates to a system for providing
remote access to a personal computer over a network, the system
comprising a server system connected to the network and to a telephone
network, the server system being configured to:

[0018] receive a voice call over the telephone network from a user of the
personal computer and to receive information and speech commands from the
user during the voice call;

[0019] determine an IP address of the personal computer based at least in
part on the information received from the user;

[0020] generate first packet data based on the speech commands;

[0021] transmit the first packet data over the network to an agent running
on the personal computer;

[0022] receive second packet data from the agent via the personal
computer; and

[0023] generate speech signals to the user based on the second packet
data.

[0024] In a related aspect, the invention also relates to computer
readable program storage having stored thereon computer program
instructions which, when executed on a computer system, cause the
computer system to perform the method of providing remote access
described above.

[0025] Another aspect of the invention relates to a method of providing
automatic notification to a user, comprising the steps of:

[0026] providing on a computer associated with the user, at least one rule
for determining the occurrence of a notification event;

[0027] applying the at least one rule to determine the occurrence of the
notification event;

[0029] transmitting the notification data over a network to a server
system for communication of a notification, based on the notification
event, to a telephone of the user;

[0030] generating speech signals corresponding to the notification;

[0031] establishing a telephone connection to the telephone of the user;
and

[0032] transmitting the speech signals to the telephone of the user.

[0033] In a related aspect, the invention also relates to a system for
providing automatic notification to a user, wherein the system comprises
an intelligent agent running on a computer associated with the user and
the system comprises program code for causing the system to perform the
steps of the method of providing automatic notification described above.

[0034] In a further related aspect, the invention also relates to computer
readable media having stored thereon computer program instructions
executable by one or more computer processors to cause the one or more
computer processors to perform the steps of the method of providing
automatic notification described above.

[0035] A further aspect of the invention relates to a method of providing
notification to a user, comprising:

[0036] receiving a connection request over a network from an agent on a
computer associated with the user;

[0037] receiving notification data over the network from the agent;

[0038] processing the notification data to generate a speech notification;

[0039] establishing a telephone connection to a telephone of the user
based on the connection request; and

[0040] transmitting the speech notification to the telephone of the user
over the telephone connection.

[0041] Preferably, the step of establishing comprises determining a user
identifier from the connection request, accessing a database to
determine a telephone number of the user based on the user identifier and
calling the telephone number.

[0042] In a related aspect, the invention also relates to a system for
providing notification to a user, wherein the system comprises a server
system in communication with the user and an agent and the server system
is programmed to perform the steps of the method of providing
notification described above.

[0043] In a further related aspect, the invention also relates to computer
readable media having stored thereon computer program instructions
executable on a computer system to cause the computer system to perform
the steps of the method of providing notification described above.

[0044] A further aspect of the invention relates to a system for remote
access of a computer system over a public network, the system comprising:

[0045] a server system, the server system being in communication with a
public telephone network for receiving voice calls from a telephone
operated by a user and for transmitting automatically generated speech
signals to the telephone; and

[0046] an agent resident on the computer system, the agent being in
communication with the server system and programmed to index information
accessible to the computer system based on predetermined user rules and
to transmit the information to the server system in response to a
predetermined event;

[0047] wherein, when the server system receives the information from the
agent, the server system is programmed to automatically generate speech
signals corresponding to the information and to transmit the speech
signals to a registered telephone associated with the user.

[0048] In one embodiment, the computer system is separated from the server
system by a firewall and wherein the agent is programmed to periodically
transmit an IP address of the computer system to the server system.

[0049] In one embodiment, the computer system is a personal computer.
Alternatively, the computer system is a server system and the user has a
user account on the server system.

[0050] In one embodiment, the predetermined event is a notification event
originating from the agent, the notification event being generated by the
agent based on at least one user-defined rule.

[0051] Preferably, the server system comprises a speech server for
communication with the user over a public telephone network using
automated speech recognition for the received speech signals and
automatic text-to-speech conversion for transmitted speech signals.

[0052] Preferably, the server system further comprises a voice relay
server in communication with the speech server and in communication with
the agent for receiving the information from the agent and for
transmitting command request data to the agent corresponding to a voice
command received from the user at the speech server.

[0053] Preferably, the server system comprises a user information data
store and the voice relay server compares user authentication information
corresponding to an identification utterance received from the user at
the speech server to the user information data store to authenticate the
user for access to the computer system.

[0054] Another aspect of the invention relates to a method of providing
access to a computer system over a network, the method comprising the
steps of:

[0055] receiving at the server system a first speech command from a user;

[0056] determining a type of command for the first speech command;

[0057] loading a set of indexed possible utterances associated with the
type of command;

[0058] receiving at the server system a second speech command associated
with the first speech command;

[0059] interpreting the second speech command using the set of indexed
possible utterances;

[0060] at the server, generating first packet data based on the first and
second speech command;

[0061] transmitting the first packet data over the network from the server
to an agent;

[0062] at the server system, receiving second packet data from the agent
in response to the first packet data, wherein the agent is operable to
generate at least a portion of the second packet data from a dynamic set
of sources; and

[0063] generating and transmitting a speech signal to the user based on
the second packet data.
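
By way of illustration only, the command-type determination, loading of indexed utterances and interpretation of the second speech command in the steps above might be sketched as follows. The index contents, the keyword matching and the names `UTTERANCE_INDEX`, `determine_command_type`, `load_utterances` and `interpret` are assumptions made for this sketch and are not taken from the specification; text strings stand in for recognized speech.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Utterance:
    text: str
    probability: float  # assumed probability of usage, used only for ranking


# Hypothetical index: command type -> possible follow-up utterances.
UTTERANCE_INDEX: Dict[str, List[Utterance]] = {
    "email": [
        Utterance("reply", 0.3),
        Utterance("read new messages", 0.6),
        Utterance("delete", 0.1),
    ],
    "calendar": [
        Utterance("create entry", 0.3),
        Utterance("read today", 0.7),
    ],
}


def determine_command_type(first_command: str) -> str:
    """Pick the command type named in the first (already recognized) command."""
    words = first_command.lower().split()
    for command_type in UTTERANCE_INDEX:
        if command_type in words:
            return command_type
    raise ValueError(f"no known command type in {first_command!r}")


def load_utterances(command_type: str) -> List[Utterance]:
    """Load the indexed utterances for a type, most probable first."""
    return sorted(
        UTTERANCE_INDEX[command_type],
        key=lambda u: u.probability,
        reverse=True,
    )


def interpret(second_command: str, candidates: List[Utterance]) -> Optional[Utterance]:
    """Match the second command against the loaded set only."""
    spoken = second_command.lower().strip()
    for candidate in candidates:
        if candidate.text == spoken:
            return candidate
    return None
```

Restricting the second match to the set loaded for the command type keeps the candidate vocabulary small, which is one plausible reason for loading the set before the second speech command is received.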

[0064] A further aspect of the invention relates to a system for providing
access to information over a network, the system comprising a server
system, the server system having computer program code accessible thereto
which, when executed by the server system, causes the server system to:

[0065] receive a voice call from a user;

[0066] receive a first speech command from the user;

[0067] determine a type of command for the first speech command;

[0068] load a set of indexed possible utterances associated with the type
of command;

[0069] receive at the server system a second speech command associated
with the first speech command;

[0070] interpret the second speech command using the set of indexed
possible utterances;

[0071] establish a connection with an agent;

[0072] generate first packet data based on the first and second speech
commands;

[0073] transmit the first packet data over the network to the agent;

[0074] receive second packet data from the agent, wherein the agent is
operable to generate at least a portion of the second packet data from a
dynamic set of sources accessible by the agent; and

[0075] generate and transmit speech signals to the user based on the
second packet data.

BRIEF DESCRIPTION OF THE DRAWINGS

[0076] Embodiments of the invention are described in further detail below,
by way of example only, with reference to the accompanying drawings, in
which:

[0077] FIG. 1 is a block diagram of a remote access system according to
one embodiment;

[0078] FIG. 2 is a block diagram of a voice relay server according to
another embodiment;

[0079] FIG. 3 is a block diagram of an intelligent agent according to
another embodiment;

[0080] FIG. 4 is a process flow diagram of a method of establishing remote
access to a user computer system from a user phone; and

[0081] FIG. 5 is a process flow diagram of a method for automatically
notifying a user by telephone of an information event monitored by the
intelligent agent.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0082] In one aspect, the invention relates to arrangements made for using
the human voice to interact with a server system (including a speech
server and a voice relay server), which in turn interacts with an
intelligent agent running on a remote computer system to retrieve and
review information, such as personal information, accessible to the
computer system. Generally, the invention relates to methods which
involve converting speech input commands to computer commands which are
transferred to the specific computer system (i.e., PC or server) associated
with the caller or the server where the caller's account or user-specific
information resides, in a secure manner. This method may be used where
the computer system does not necessarily have a public address (IP
Address) or where the server does not necessarily have a public IP
address, and where the PC or server is behind a firewall connected to an
Internet connection at all times.

[0083] Another aspect of the invention relates to an intelligent agent
residing in the computer system of a user and programmed to react to
rules-based events within or accessible to the computer system (such as a
new data item appearing in a database, or a news item appearing on a news
service with a particular matched keyword, or a new email message
containing a certain key word or phrase) and then to push a synthesized
speech message via the server system out to a (mobile or terrestrial)
telephone to alert the user to the event in a voice speech format.

[0084] Using voice commands to access information stored on a personal
desktop computer, and to have personal information on the desktop
computer converted back into voice and sent via voice lines to a
telephone, requires secure two-way communications between three
components of the system. A spoken command or identity phrase identifies
the caller to a server which acts as a relay between the original voice
call and the caller's home or business-based desktop computer or server
system. The caller must identify herself via a user identity, PIN
or through biometric speech pattern recognition so that the voice relay
server may directly connect the caller to her PC or server containing her
personal information. The connection is then held open in order that the
caller can issue speech commands to hear information from the caller's
personal PC or server account.

[0085] The intelligent agent on the PC or server is able to extract
information from sources, such as the Internet or local application
software programs, in order to securely deliver the information to the
voice server to be converted into a speech pattern for the caller to hear
the information. The user may issue speech commands to stop and start
services and programs on the user's desktop computer or server, or to
append, delete, forward or reply to information, which the intelligent
agent may source or otherwise be able to access. The user may also issue
speech commands to compose or create new data objects on the PC or
server, such as a calendar entry, email or database entry.

[0086] Referring now to FIG. 1, there is shown a block diagram of a remote
access system 100. The remote access system 100 comprises a server system
120 linked to a caller using a mobile or land-line telephone 110 over a
connection 115 through a public telephone network. Server system 120
processes speech signals received from the user and responds with
synthesized speech signals in order to facilitate verbal interaction
between server system 120 and the user of telephone 110.

[0087] Remote access system 100 further comprises one or more computer
systems (shown as 280 in FIGS. 2 and 3) located remotely from server 120,
but in communication with the server 120 over a public network 140, such
as the Internet. Such remote computer systems 280 may comprise a personal
computer (PC) 145 or a server 150 having fixed IP addresses known to
server 120. Alternatively, or in addition, computer systems 280 may
comprise a PC 160 or server 165 behind respective firewalls 170. Server
150 and server 165 may have a plurality of user accounts 155 or 175,
respectively, associated therewith so that the user accounts are hosted
by the server. Computer systems 280 may be any kind of computer system,
including intelligent mobile devices and large server systems, that has
the ability to communicate over public network 140.

[0088] For a PC 160 or server 165 protected by firewalls 170, the
firewalls obscure the IP addresses from server 120. Accordingly, in order
for server 120 to communicate with PC 160 or server 165, PC 160 and
server 165 must communicate their respective IP addresses to server 120.
Such notification is performed by an intelligent agent 180 resident on
the PC 160 or server 165 and programmed to periodically notify server 120
of any change in the IP address of PC 160 or server 165. Alternatively,
the intelligent agent may perform such a notification function even when
the PC 160 or server 165 is not protected by a firewall 170. For example,
if PC 160 is connected to internet 140 by a dial-up modem connection
through an internet service provider, the PC 160 may not have a static IP
address. Accordingly, in order for the server 120 to establish a
connection to PC 160, PC 160 must notify server 120 of its current IP
address. The intelligent agent resident on the PC 160 or server 165
notifies server 120 of the IP address of the PC 160 or server 165 by
transmitting notification packet data 173 to server 120 via network 140.
The intelligent agent 180 is described in further detail below, with
reference to FIGS. 2 and 3.
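
As a minimal sketch of the agent-side notification described above, the following assumes a JSON-encoded notification packet and a caller-supplied `send` function; neither the packet layout nor the names `AddressNotifier` and `check` come from the specification. The agent would call `check` periodically and a packet is sent only when the address has changed since the last report.

```python
import json
from typing import Callable, Optional


class AddressNotifier:
    """Reports the computer system's IP address to the server when it changes."""

    def __init__(self, user_id: str, send: Callable[[bytes], None]) -> None:
        self._user_id = user_id
        self._send = send  # hypothetical transport, e.g. a secure socket write
        self._last_reported: Optional[str] = None

    def check(self, current_ip: str) -> bool:
        """Send a notification packet if the address changed; return True if sent."""
        if current_ip == self._last_reported:
            return False
        # Assumed packet layout: the user ID lets the server find the user entry.
        packet = json.dumps(
            {"type": "ip_notification", "user_id": self._user_id, "ip": current_ip}
        ).encode("utf-8")
        self._send(packet)
        self._last_reported = current_ip
        return True
```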

[0089] Server system 120 comprises a speech server 122, a voice relay
server 124, a user table 135 and an administration interface 138. Speech
server 122 is responsible for automatic speech recognition (ASR) of
utterances received from the user of telephone 110 over the public
telephone network at link 115 and for automatic text-to-speech (TTS)
conversion of information received from the user's PC 145, 160 or server
account 155, 175 into audio speech signals for communication to the user
of telephone 110. Speech server 122 is preferably an IBM voice platform
including an IBM WebSphere Voice Response server (telephone and
application integration) coupled with an IBM WebSphere Voice Server
(voice response, TTS/ASR). Alternatively, speech server 122 may be a
VoiceGenie speech server running ASR and TTS resource management.

[0090] If speech signals received at speech server 122 from the user of
telephone 110 are recognized by the ASR module of speech server 122,
the corresponding text strings generated by the ASR module are
transmitted to voice relay server 124 through a two-way data link 125.
Voice relay server 124 processes the converted speech thus received from
speech server 122 and initiates appropriate actions based on the
processed speech. For example, during initiation of the remote access
process, the user is required to provide a user identifier (ID) and a
password. When these are received by voice relay server 124 from speech
server 122, voice relay server 124 compares the user identifier and
password with entries in a user table 135 (or other form of database) to
authenticate the user as a valid user of the remote access system 100.
Once the user has been authenticated by server 120, voice relay server
124 treats further converted speech from speech server 122 as commands
concerning the user's interaction with the remote computer system. Voice
relay server 124 also uses user table 135 to store the current or last
known IP address of the computer system associated with each user. Once a
user is authenticated by server 120, data transmissions by voice relay
server 124 to the user's computer system will be directed to the IP
address associated with that user in user table 135.

[0091] Administration interface 138 is used for entering the fixed IP
Addresses of PC 145 and server 150, as well as inserting and updating the
user phone numbers, IDs and passwords used in user table 135.

[0092] FIG. 1 shows remote access system 100 having a data link 142
connecting server 120 with PCs 145, 160 and servers 150, 165 via network
140. It should be understood that data link 142 may be any kind of
connection (virtual or physical) according to the appropriate network
protocols employed by public network 140. Data link 142 is unlikely to be
a physical connection. Further, data link 142 is unlikely to be the same
for each PC or server in communication with server 120. Rather, each PC
145, 160 or server 150, 165 communicates independently with server
system 120. Data link 142 is preferably a Secure Sockets Layer (SSL)
connection.

[0093] Referring now to FIG. 2, the voice relay server 124 is shown and
described in further detail. Voice relay server 124 comprises several
functional modules for interacting with speech server 122, intelligent
agent 180 and user table 135. Voice relay server 124 comprises a VoiceXML
conversion module 210 for accepting incoming voice commands from speech
server 122 and translating these commands into an internal data structure
(which is described in further detail below) for communication to a
command request module 230 of the voice relay server 124. The command
request module 230 interprets the internal data structure and generates
packet data based on the speech commands processed through the VoiceXML
conversion module 210. Command request module 230 communicates with
intelligent agent 180 via data link 142 over a public network, such as
the Internet, by transmitting to the agent the packet data corresponding
to the user's converted speech command.
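
The packet data generated by the command request module might, for illustration, carry the three fields named in claims 9 and 21: a service type name, an action and parameters. The sketch below assumes a JSON wire encoding and the hypothetical names `CommandRequest`, `encode_request` and `decode_request`; the specification does not state the actual wire format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class CommandRequest:
    service: str  # name of the service type to be accessed, e.g. "email"
    action: str   # action to be performed, e.g. "read"
    parameters: Dict[str, Any] = field(default_factory=dict)


def encode_request(request: CommandRequest) -> bytes:
    """Serialize a request into packet data (JSON is an assumption here)."""
    return json.dumps(asdict(request), sort_keys=True).encode("utf-8")


def decode_request(data: bytes) -> CommandRequest:
    """Rebuild the request on the agent side from received packet data."""
    return CommandRequest(**json.loads(data.decode("utf-8")))
```

Under these assumptions a spoken command such as "read my inbox" would become `CommandRequest("email", "read", {"folder": "inbox"})` before being encoded and sent over data link 142.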

[0094] Voice relay server 124 further comprises a notification module 220
for converting packet data received from intelligent agent 180 (as
described further in relation to FIG. 5) into the internal data structure
recognized by VoiceXML conversion module 210 for transmission to the user
of telephone 110 using the text-to-speech conversion feature of speech
server 122. The notification module 220 may receive the incoming packet
data from intelligent agent 180 directly or through command request
module 230.

[0095] Voice relay server 124 further comprises a network ID registration
module 250 for receiving the periodically transmitted IP addresses of the
computer system 280 on which intelligent agent 180 is resident. This is
only done by intelligent agents 180 resident on computer systems having
dynamic IP addresses or which are otherwise unable to provide a fixed IP
address. Network ID registration module 250 receives the IP address
notification packet 173 and stores the IP address in user table 135 with
the relevant user entry. For this purpose, the IP address notification
packet (or packets) indicates the user ID (and optionally also the
password) for which the updated IP address is being provided. Intelligent
agent 180 knows the IP address of voice relay server 124 from when the
intelligent agent 180 is first installed on the user's computer system
280.
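
The registration step described above can be sketched as follows. This is
a minimal illustration only: the `UserEntry` fields, the `register_ip`
function and the sample values are assumptions made for the sketch and
are not taken from the described system.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UserEntry:
    """One row of the user table (field names are illustrative)."""
    password: str
    phone_number: str
    last_known_ip: Optional[str] = None


def register_ip(user_table: Dict[str, UserEntry], user_id: str, ip: str,
                password: Optional[str] = None) -> bool:
    """Store the IP address carried by a notification packet.

    Returns False if the user ID is unknown, or if the optional password
    was supplied and does not match the stored one.
    """
    entry = user_table.get(user_id)
    if entry is None:
        return False
    if password is not None and password != entry.password:
        return False
    entry.last_known_ip = ip
    return True


user_table = {"alice": UserEntry(password="s3cret", phone_number="555-0100")}
register_ip(user_table, "alice", "203.0.113.7")
```

Keeping the address on the user entry means a later authentication
lookup can return the last known IP address without a second table.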

[0096] Voice relay server 124 also has a user authentication module 270
for authenticating the user prior to setting up communications between
the voice relay server 124 and the intelligent agent 180 on the user's
computer system 280. User authentication module 270 communicates with
speech server 122 to receive a converted text string corresponding to the
user's spoken identification and password utterances. User authentication
module 270 then interrogates user table 135 and compares the user ID and
password strings received from speech server 122 with the entries in user
table 135. If there is a match of the received string with a user entry,
the user is authenticated to establish a link to the user's computer
system 280 and the last known IP address stored for that user in the user
table 135 is provided to voice relay server 124 for initiating
communication with the intelligent agent 180 resident on the user's
computer system 280.

[0097] If the user authentication module 270 is not able to match the
received user ID and password strings to an entry in user table 135, user
authentication module 270 communicates this match failure to speech
server 122, which provides a corresponding synthesized speech message to
the telephone 110 of the user. Such a message is also transmitted to the
user from speech server 122 if voice relay server 124 is unable to obtain
a response to the initiation packets transmitted from the voice relay
server 124 to intelligent agent 180 or if the user ID and password are
not recognized by the intelligent agent 180 or the user's computer system
280.
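
As a hedged sketch of the lookup performed by user authentication module
270, the following assumes the user table maps a user ID to a password
and a last known IP address. The `UserTable` alias and the `authenticate`
function are hypothetical names used only for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical user table: user ID -> (password, last known IP or None).
UserTable = Dict[str, Tuple[str, Optional[str]]]


def authenticate(user_table: UserTable, user_id: str,
                 password: str) -> Tuple[bool, Optional[str]]:
    """Return (authenticated, last_known_ip) for the recognized strings."""
    entry = user_table.get(user_id)
    if entry is None or entry[0] != password:
        # Match failure: the speech server would play an error message.
        return False, None
    return True, entry[1]


table: UserTable = {"alice": ("s3cret", "203.0.113.7")}
```

A real deployment would store password hashes rather than plain strings;
plain strings are used here only to keep the sketch short.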

[0098] Referring now to FIG. 3, the intelligent agent 180 is shown and
described in further detail. Intelligent agent 180 comprises a command
request module 310 for handling the transmission of command (or
initialization) related data packets between voice relay server 124 and
intelligent agent 180. If command request module 310 receives an
initialization packet from Voice Relay Server 124 to establish a
connection with the computer system 280 which is a server on which
intelligent agent 180 is resident, the command request module 310 will
check the user ID and password information embedded in the initialization
packet by forwarding the packet to a user validation module 370 in
intelligent agent 180. If user validation module 370 matches the user ID
and password in the initialization packet with a stored user ID and
password for a valid user account on the server 280, the user is
validated for the computer system. Otherwise, user validation module 370
indicates to command request module 310 that the received user ID and
password do not correspond to a valid account on the server 280 and a
corresponding message is sent back to voice relay server 124 for
communication to the user via speech server 122.

[0099] Intelligent agent 180 further comprises a cache module 320 in
communication with the command request module 310 and in communication
with an indexing module 340. Cache module 320 keeps a local store of
personal or other information that can be retrieved quickly without
having to source the information from an external source, such as a data
source available through network 390.

[0100] Intelligent agent 180 further comprises indexing module 340, a
sourcing module 350, and a rules engine 360. The rules engine 360 is
programmed to source information on a scheduled basis according to a
number of user-defined rules. These rules may specify the kinds of
information which are to be sourced, such as emails, calendar items, news
items, database entries or other dynamic forms of information which are
updated or newly created over time.

[0101] The rules engine 360 is used in concert with sourcing module 350 to
source information primarily for the purpose of alerting a user
(according to the rules defined by the user) of new or updated
information of interest to the user so that a telephone call can be
placed to the user and the salient details of the information read to the
user by speech server 122. Alternatively, rules engine 360 may be used to
configure the user interface that the user encounters when connecting
into the user's computer system. For example, according to the rules, the
intelligent agent 180 may present only unread emails for the user to
retrieve and hear through speech server 122, rather than making the
user's entire email inbox available.

[0102] The sourcing module is responsible for periodically sourcing the
information (according to the rules in the rules engine 360) from data
sources on network 390 or on computer system 280. Network 390 may be a
local area network or a large network, such as the Internet. Sourcing
module 350 provides the source information to indexing module 340, which
compresses and indexes the information and provides it to cache module
320, which stores the compressed and indexed information in a data source
or database 330 accessible to the computer system 280. Sourcing module
350 may source information from within computer system 280 as well as, or
instead of, sourcing information from data sources in network 390.
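
The hand-off from indexing module 340 to cache module 320 might look like
the sketch below. The choice of zlib compression, the dictionary used as
a stand-in for database 330 and the `type:id` index key are all
assumptions; the text does not specify a compression scheme or key
format.

```python
import zlib
from typing import Dict, Iterable, List


def index_and_cache(items: Iterable[Dict[str, str]],
                    cache: Dict[str, bytes]) -> List[str]:
    """Compress each sourced item and store it under an index key."""
    keys = []
    for item in items:
        key = f"{item['type']}:{item['id']}"   # hypothetical index key
        cache[key] = zlib.compress(item["body"].encode("utf-8"))
        keys.append(key)
    return keys


def read_cached(cache: Dict[str, bytes], key: str) -> str:
    """Decompress a cached item for read-back."""
    return zlib.decompress(cache[key]).decode("utf-8")
```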

[0103] Intelligent agent 180 further comprises a network ID notification
module 380 which transmits notification packets on a regular (scheduled)
basis to network ID registration module 250 of voice relay server 124 if
intelligent agent 180 is resident on a computer system 280 that does not
have a fixed IP Address.

[0104] Referring now to FIG. 4, a method 400 for establishing remote
access to computer system 280 from a user phone 110 is shown and
described. Method 400 begins at step 405, when a user calls into the
speech server 122. The speech server 122 can receive calls from either
mobile or terrestrial telephone networks. The user is prompted by an
interactive voice response system executing on speech server 122 to
provide the user's ID and password. This may be through spoken utterances
or by direct input on the keypad of the user's phone (or other
telecommunication device) 110 or a combination of these. The automatic
speech recognition function of speech server 122 converts the user's
utterances to one or more text strings at step 410. If the utterances are
not recognizable at step 415, speech server 122 requests the user, at
step 420, to try again to provide the user ID and password.

[0105] If the text is recognized at step 415, the user ID and password
strings are transmitted to the voice relay server 124 from speech server
122. At step 425, the voice relay server 124 checks the user ID and
password against entries in user table 135 to authenticate the user. If
the user ID and password are not recognized (i.e. not contained in user
table 135), the user is prompted for further input at step 420 and steps
410 to 425 are repeated (up to a predetermined number of times).

[0106] If the user ID and password are recognized at step 425, the last
known IP Address corresponding to the matched user ID and password is
determined, at step 430. If no corresponding IP Address is found at step
435 or if a match is found but that IP Address fails to respond to an
initiation packet from Voice Relay Server 124, the user is notified that
the system does not have access to a valid or current IP Address, at step
440. The user may then be prompted for further input at step 420 or the
voice call may be terminated.

[0107] If, at step 435, an IP Address corresponding to the user ID is
found in user table 135, a communication initiation packet is transmitted
from voice relay server 124 to the intelligent agent 180 on computer
system 280 and, unless the intelligent agent 180 fails to respond with an
acknowledgement packet, a communication session is established between
the server system 120 and the user's computer system 280, at step 445.
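
The control flow of steps 410 to 445 can be summarized in a short sketch.
The function name, the outcome labels and the retry limit of three are
assumptions (the text says only that the number of repeats is
predetermined), and the `agent_acknowledges` callback stands in for the
initiation and acknowledgement packet exchange.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

MAX_ATTEMPTS = 3  # assumed value; the text only says "predetermined"


def establish_session(
    attempts: Iterable[Optional[Tuple[str, str]]],
    user_table: Dict[str, Tuple[str, Optional[str]]],
    agent_acknowledges: Callable[[str], bool],
) -> str:
    """Walk through steps 410-445 and return a short outcome label.

    Each item of `attempts` is the (user_id, password) pair produced by
    speech recognition, or None when the utterance was not recognized.
    """
    for count, recognized in enumerate(attempts, start=1):
        if count > MAX_ATTEMPTS:
            break
        if recognized is None:                        # step 415 fails
            continue                                  # step 420: prompt again
        user_id, password = recognized
        entry = user_table.get(user_id)
        if entry is None or entry[0] != password:     # step 425 fails
            continue                                  # step 420: prompt again
        ip = entry[1]                                 # step 430
        if ip is None or not agent_acknowledges(ip):  # step 435
            return "no-valid-ip"                      # step 440
        return "session-established"                  # step 445
    return "authentication-failed"
```

For simplicity the sketch ends the attempt at step 440, whereas the
method also allows the user to be prompted again at that point.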

[0108] Once a communication session is established, intelligent agent 180
and voice relay server 124 exchange packet data so that information can
be retrieved through command request module 310 of intelligent agent 180
and commands can be transmitted from command request module 230 of voice
relay server 124 for retrieving information requested by the user. Such
command transmission and information retrieval in response is performed
by transmission of data packets between computer system 280 and server
system 120 while the user of telephone 110 communicates with speech
server 122 in real-time, issuing speech commands and listening to
synthesized speech signals corresponding to the information retrieved
from computer system 280.

[0109] Method 400 is used to facilitate the primary purpose of remote
access system 100, which is to enable a user to access information on the
user's computer system 280 while the user is located remotely from the
user's computer system 280, for example while traveling or working from
home. Once method 400 is performed and the communication session is
established between intelligent agent 180 and Voice Relay Server 124, the
user may retrieve, create or modify information or objects as desired,
including documents, emails, news items, calendar objects, database
entries or other information or objects susceptible of translation
between speech and other data forms.

[0110] In another aspect, the remote access system 100 may be used to
automatically notify a user of new or updated information available to
computer system 280 (as sourced and indexed by intelligent agent 180) by
automatically establishing a connection between intelligent agent 180 and
server system 120 and automatically placing a telephone call to a phone
registered to the user. Such a method of automatic notification is shown
and described in FIG. 5 as method 500.

[0111] Method 500 begins with the creation by the user of rules for
sourcing information content to be provided to the user on an automatic
basis. The rules are created by the user at step 505, preferably using a
rules creation wizard to assist the user in establishing the type of
information to be sourced, the frequency with which it is to be sourced
and the conditions or timing by which the user is to be
notified. The rules may also specify key words or other criteria for
sourcing the information.

[0112] At step 510, the intelligent agent 180 sources information from
computer system 280 or network 390 as described previously. At step 515,
intelligent agent 180 checks that the source information matches one or
more of the user defined rules. If the information does not match a rule,
it is discarded or at least not indexed by indexing module 340 and the
intelligent agent continues to source further information at step 510. If
the information does match one of the user defined rules at step 515, the
information is processed through indexing module 340 and cache module 320
as content objects (described below) and stored in database 330 until the
scheduled time for notifying the user. At the scheduled time, information
satisfying the user-defined rules is loaded into command request module
310 from cache module 320 and converted to packetized text at step 520.
Each of the packets having the notification information also includes the
user ID.

[0113] At step 525, the notification information packets constructed at
step 520 are transmitted to the command request module 230 of Voice Relay
Server 124 and passed on to notification module 220. Using the user ID
attached to the notification information packets received from computer
system 280, notification module 220 accesses user table 135 to determine
a telephone number corresponding to the user ID at step 530. Once all of
the notification information packets are received at Voice Relay Server
124, the notification module 220 processes the packets to place them in a
form readable by the VoiceXML conversion module 210 and the notification
information is transmitted to speech server 122 as a VoiceXML object. At
step 535, speech server 122 converts the VoiceXML to synthesized speech
signals for transmission to the user.

[0114] At step 540, speech server 122 dials the user's phone number and,
if the user picks up the call, the speech signals are transmitted to the
user. If the user does not pick up the call, server system 120 may
prepare a non-speech notification to the user, such as a text message or
other form of short message indicating that the user has a notification
awaiting receipt on the server system 120 and providing a return access
number for the user to call to retrieve the notification (according to
method 400).
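
Steps 525 to 540 can be illustrated with the following sketch. The packet
layout (user ID, sequence number, text fragment), the `notify` function
and the `place_call` callback are hypothetical; in particular, the text
does not say that packets carry a sequence number, so that field is an
assumption added to show reassembly.

```python
from typing import Callable, Dict, List, Tuple

Packet = Tuple[str, int, str]  # (user_id, sequence number, text fragment)


def notify(packets: List[Packet], phone_numbers: Dict[str, str],
           place_call: Callable[[str, str], bool]) -> Tuple[str, str]:
    """Reassemble packet text and deliver it by call, else by short message.

    `place_call(number, text)` is a stand-in for the speech server; it
    returns True if the user picked up the call.
    """
    user_id = packets[0][0]
    number = phone_numbers[user_id]                      # step 530
    text = "".join(p[2] for p in sorted(packets, key=lambda p: p[1]))
    if place_call(number, text):                         # step 540
        return ("call", number)
    # Fallback: a short message saying a notification is waiting.
    return ("short-message", number)
```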

[0115] Advantageously, remote access system 100 enables the user to easily
access information on the user's computer system 280 from a remote
location where the user would not otherwise be able to access his or her
desired information. Further advantageously, the remote access system 100
can be configured by the user to be proactive in providing information to
the user, automatically notifying the user of available information which
the user has already indicated he or she would like to receive remotely.

[0116] In one embodiment, the voice relay server uses a probabilistic
indexing technique for indexing utterances from the user. This indexing
technique allows for improved performance and greater efficiency for
accepting speech requests on a speech server and converting these to data
requests to be sent to the voice relay server.

[0117] A probability that a certain request via voice will be made by a
voice caller is used to assess what key words would be loaded at the
point when a caller requests information. For example, if the user
requests an email to be heard via the server system 120, then by a
probability match the most likely responses will be commands commonly (or
previously) associated with emails, such as "delete, remove, next, reply,
respond, forward, previous, repeat". However, the user may, with a lower
likelihood, say commands such as "transmit, response, subsequent, erase,
replay". Based upon the speech command that is said, the server system
120 builds a profile of speech commands, ranks them according to
frequency of usage and associates them to a particular event in the
speech interaction process.

[0118] Thus, the voice relay server probabilistic indexing technique
provides dynamically created pattern matched maps of possible utterances
as input to the speech server 122. Whenever a user interacts with the remote access
system 100, a mapping of user interactions is developed and maintained
such that an indexed list of possible interactions is developed
dynamically. When the user subsequently accesses the system, the speech
server 122 will be fed an indexed list of possible inputs from the voice
relay server 124 to ensure that the recognition process success rate
rises above the previous failure rate norm.

[0119] In order to perform the probabilistic indexing, the voice relay
server comprises an interaction analyzer and indexed list generator (both
not shown).

[0120] Interaction Analyser--For each user interaction with the system the
components accessed are analyzed and the interaction ranked as a probable
future voice interaction. These interaction lists are kept available for
future processing. A predictive model of utterances is generated and
stored on disk or in memory and adapts in real-time to continued use.
This list is sorted according to the patterns and profiling of the use of
the service and is continually revised.

[0121] Indexed List Generator--Based on a particular voice interaction and
the content of a voice page of information (VoiceXML page), this
component accesses the lists created by the Interaction Analyser and
dynamically generates a list of indexed possible utterances. The latter
lists are made available to the voice relay server 124 based on the
specific context, and are generated as a set of grammar which is loaded
onto the speech server 122 as necessary.
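
One simple way to realize the two components is a frequency count per
interaction context, sketched below. The class and function names, the
use of raw counts as the ranking measure and the `limit` parameter are
assumptions; the text does not specify the predictive model.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List


class InteractionAnalyser:
    """Counts which commands a user speaks in each interaction context."""

    def __init__(self) -> None:
        self._counts: DefaultDict[str, Counter] = defaultdict(Counter)

    def record(self, context: str, command: str) -> None:
        self._counts[context][command] += 1

    def ranked(self, context: str) -> List[str]:
        # Most frequent first; ties keep first-seen order.
        return [cmd for cmd, _ in self._counts[context].most_common()]


def indexed_utterances(analyser: InteractionAnalyser, context: str,
                       defaults: List[str], limit: int = 5) -> List[str]:
    """Build the grammar list: learned commands first, then defaults."""
    ranked = analyser.ranked(context)
    merged = ranked + [d for d in defaults if d not in ranked]
    return merged[:limit]
```

Limiting the list keeps the grammar loaded onto the speech server small,
which is the stated reason the technique improves recognition.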

[0122] The intelligent agent 180 may be programmed via rules engine 360 to
respond to new information or to changes to information on the
desktop PC or in the server. New or changed information that is captured
by a rule triggers an outbound alert to the voice relay server in the
form of a packet message, which is converted by the voice relay server
into speech packets or signals and then forwarded to a telephone (mobile
or terrestrial). The user, upon receiving the call, listens to the
automatically generated alert message.

[0123] The rules engine allows for discrete profiles to be created for
each data source so that the user is notified only of information that
they specifically choose. Below are brief descriptions of components of
rules engine 360.

[0124] Rule Builder--This component accepts a set of inputs to construct a
rule. It predefines a set of conditions and constraints which should be
met for the execution of a particular action.

[0125] Reasoning Module--This component analyzes a set of conditions and
inputs and checks if rule and policy criteria are met. It locates rules,
confirms permissions on rules and also checks that execution times and
conditions are adhered to.

[0126] Rule Validator--This component is responsible for
evaluating the validity of the rule. It checks for completeness,
integrity and the presence of the necessary conditions and their
measurement indicators before a rule is considered valid for processing
by the rule processor.

[0127] Rule Processor--This component is
responsible for using the Rule Validator to validate the rule. Next it
uses the Reasoning Module to check if the rule has to be executed. Then
it is responsible for the execution of the Rule Action.

[0128] Functional processing--A rule typically consists of a set of
strings that are generated by a Rule Builder, which may be a programmatic
interface or may be a user form for input of these strings. Each string
entered is validated by the Rule Validator, which may, for example,
inform the Rule Builder of inconsistent structure in a string (extra
characters or non-conforming characters) or inconsistent data structure
(for example, a date in EU format 25/12/04 may not be processed or accepted
unless it is in the US format 12/25/04). This rule is then stored and associated
with a particular user.
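
The interplay of the Rule Validator, Reasoning Module and Rule Processor
can be sketched as follows. The `Rule` fields, the equality-based
condition and the function names are assumptions made for illustration;
only the validate-then-reason-then-execute order and the US date format
check are taken from the description.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict


@dataclass
class Rule:
    user_id: str
    field_name: str     # name of the data field to inspect
    expected: str       # string the field must equal
    action: Callable[[Dict[str, str]], None]


def validate(rule: Rule) -> bool:
    """Rule Validator: completeness plus a US-format date check."""
    if not (rule.user_id and rule.field_name and rule.expected):
        return False
    if rule.field_name == "date":
        try:
            datetime.strptime(rule.expected, "%m/%d/%y")
        except ValueError:
            return False
    return True


def should_execute(rule: Rule, item: Dict[str, str]) -> bool:
    """Reasoning Module: are the rule's conditions met by this item?"""
    return item.get(rule.field_name) == rule.expected


def process(rule: Rule, item: Dict[str, str]) -> bool:
    """Rule Processor: validate, reason, then run the Rule Action."""
    if validate(rule) and should_execute(rule, item):
        rule.action(item)
        return True
    return False
```

With this sketch a rule whose date is written as 25/12/04 is rejected by
`validate`, while 12/25/04 is accepted.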

[0129] The Intelligent agent 180 application is designed to manage one or
more rules for one or more users. A scheduler built into the intelligent
agent is designed to interrogate incoming information and match the
strings or user defined rules against the information which may be
structured. Hence a news item with a date field may be compared against a
user selected date item and when a match occurs a notification event is
generated. The intelligent agent also supports the ability to be
triggered by an in-coming data item and can perform the same matching
process against these data items as well.

[0130] The voice relay server 124 may have several connections to remote
PCs or servers. The main services that handle the communication between
the two end points are the intelligent agent's command request module in
the PC or server and the command request module in the voice relay
server. A command processor (not shown) within the command request module
is responsible for constructing and de-constructing the packets of
information that are sent to and from the remote services. The command
processor module is designed to handle the variety of requests that may
originate at either point in the network.

[0131] The Command Processor listens for in-coming requests over a TCP/IP
connection. If the user is first logging on to the service then
credentials are sent to the remote PC or server intelligent agent as a
serialized string representation of the user name and password. This
establishes the authenticity of the user and allows for a continued
dialog with the remote intelligent agent, or will reject the request.
Upon successful authentication, a TCP/IP session is established between
the two systems and the Command Processor then deals with the information
received at either end. Serialized strings are passed between the
intelligent agent 180 and the voice relay server 124. A protocol for
communication, which provides a mechanism to allow for a wide variety of
speech requests and payload replies to be handled by the system, is
implemented in the intelligent agent 180 and the voice relay server 124
and is used to invoke actions and services. An example of this protocol
is shown:

[0132] In the example above a Servicenameheader is a standard name of a
data service type that is required to be accessed or delivered, the
methodname determines what action is to be performed on that item and the
method parameters are the dynamic variables such as a list of items or
specific data elements.

[0134] This protocol also can be used to invoke actions or services on
both the voice relay server and the intelligent agent on the PC or
server. The service method handler is different at each of the voice
relay server and the intelligent agent, since the services are different
for either. For example the intelligent agent will send a request to the
voice relay server to prompt a user for some voice input--such as
requesting the name of a person to retrieve from a local telephone book
stored on the PC/server. The voice relay server 124 will in turn request
that the intelligent agent open the phone book for interrogation. This
protocol provides the framework for bi-directional communication between
the voice relay server and the intelligent agent at the end points.
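
The three parts of a request (service name header, method name and
method parameters) and the per-end-point service method handlers can be
illustrated as follows. The JSON layout, the field names and the
`Endpoint` class are assumptions made for this sketch; the text says only
that serialized strings are exchanged and does not define the wire
format.

```python
import json
from typing import Any, Callable, Dict, List, Tuple


def encode_request(service: str, method: str, params: List[Any]) -> str:
    """Serialize one request as a string (assumed JSON layout)."""
    return json.dumps({"service": service, "method": method, "params": params})


def decode_request(wire: str) -> Tuple[str, str, List[Any]]:
    """Split a serialized request back into its three parts."""
    message = json.loads(wire)
    return message["service"], message["method"], message["params"]


class Endpoint:
    """Holds the service method handlers available at one end point."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[str, str], Callable[..., Any]] = {}

    def register(self, service: str, method: str,
                 handler: Callable[..., Any]) -> None:
        self._handlers[(service, method)] = handler

    def handle(self, wire: str) -> Any:
        service, method, params = decode_request(wire)
        return self._handlers[(service, method)](*params)


# The agent exposes a phone book lookup; the relay server would expose
# different services, such as prompting the caller for voice input.
agent = Endpoint()
phone_book = {"Bob": "555-0101"}
agent.register("phonebook", "lookup", phone_book.get)
result = agent.handle(encode_request("phonebook", "lookup", ["Bob"]))
```

Because each end point registers its own handlers, the same message
format serves both directions of the exchange.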

[0135] The intelligent agent manages a set of user-defined rules that
pertain to electronic information filtering and retrieval from programs
and information connected to the PC or server. For example, the user may
create a rule that reads from a database file a specific field entry and,
when there is a change to this entry that satisfies the rule, the
intelligent agent will recognize the change and will send an alert
(summarized or otherwise) to the voice relay server, which is targeted to
the specific user's telephone number, to read the alert to the user as a
speech stream.

[0136] Changes to content to which the rules apply are monitored and
recognized by intelligent agent software that controls content sourcing
on the PC. Indexing techniques for content are described below.

[0137] The intelligent agent also creates a local cached object reference
in cache module 320 to data items that satisfy the rules, which is
maintained and managed by the intelligent agent for expediency and
efficiency. Each cached object in the object reference is called a
personal assistant content object (PACO). The handling of such objects is
described further below. The local cached object reference allows changes
to the interfaces to data information services to be easily added and
maintained.

[0138] To interface with local data on the PC or server, the intelligent
agent preferably uses a commercial interface to Microsoft Outlook such as
Outlook Redemption available at www.dimastr.com (as of May 21, 2004).
Data content from outside the PC is obtained by sourcing module 350. In
the case of computer system 280 being a server system, the intelligent
agent may utilize a commercial interface such as Compoze Exchange
connector (www.compoze.com as of May 2004) to be able to connect to mail
and calendaring services within Microsoft Exchange.

[0139] The intelligent agent 180 has a software interface designed to
allow the user to customize the rules which they would like to have
configured for their desktop PC or server information. The intelligent
agent also has a set of software interfaces for communicating with any
structured data source, such as XML data, SQL data, HTML data, email
data, RSS news data, operating system data, file data, word processed
document data.

[0140] The voice relay server 124 performs a number of critical functions.
These services or functions include: maintenance of a physical network
connection to the remote PC or server, logical connection to the remote
PC or server through which data packet exchange can be performed,
presence detection, and secure connection management. These functions and
services are described in more detail below.

[0141] Before a logical connection can be set up, one or more physical
connections must be established between the voice relay server and the
intelligent agent on the PC or server. In the case that neither the voice
relay server nor the desktop PC or server can be reached directly (for
example, because systems may be behind firewalls or on an internal
network and do not have physical Internet addresses), then a physical
connection manager (PCM) (not shown) on the voice relay server will
establish a connection to the remote PC using a secure port. This PCM
connection through a secure port is established by the PC or server
intelligent agent declaring itself to the Voice Relay Server and
establishing a port connection in one direction to the voice relay server
and then using piggy-back capabilities of TCP/IP allowing the voice relay
server to communicate back with the PC or Server over the same secure
TCP/IP connection.

[0142] Since only the voice relay server has a well-known IP address and
can be connected to directly, the PCM relies on participating intelligent
agent PCs or servers to initiate their respective connections to the PCM
(to declare themselves to the voice relay server). The PCM maintains the
intelligent agent connections, associates them with the identities of
their owners, and performs the actual data exchange at the protocol
level, upon the direction of the users, using voice commands to access
the set of services available to them on the remote PC or server.
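
The agent-initiated connection can be demonstrated on the loopback
interface with the sketch below. The `HELLO` and `DONE` line formats, the
`run_demo` function and the in-memory `connections` map are assumptions;
encryption of the port is omitted to keep the example short.

```python
import socket
import threading


def run_demo() -> str:
    # Voice relay server side: the only party with a reachable address.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # port 0: let the OS pick one
    listener.listen(1)
    port = listener.getsockname()[1]

    def agent() -> None:
        # Agent side: opens the connection outbound and declares itself,
        # then waits for a command arriving over the same connection.
        with socket.create_connection(("127.0.0.1", port)) as conn, \
                conn.makefile("r") as reader:
            conn.sendall(b"HELLO alice\n")
            command = reader.readline().strip()
            conn.sendall(f"DONE {command}\n".encode())

    worker = threading.Thread(target=agent)
    worker.start()

    conn, _ = listener.accept()
    with conn, conn.makefile("r") as reader:
        owner = reader.readline().split()[1]   # identity declared by agent
        connections = {owner: conn}            # PCM: connection per owner
        connections[owner].sendall(b"read-email\n")   # server -> agent
        reply = reader.readline().strip()
    worker.join()
    listener.close()
    return reply
```

The server never dials the agent; it only answers on the connection the
agent opened, which is why the arrangement works behind a firewall.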

[0143] Before messages can be exchanged, a data pipe, or logical
connection must exist between the voice relay server and the intelligent
agent on the PC or server. A logical connection manager (LCM) (not shown)
on the voice relay server is responsible for defining such a pipe, and
establishing and maintaining it. A logical connection can be made up of
one or more physical connections. If the IP address of either party
engaged in a message exchange session can be reached via the open
network, a physical connection can be set up directly between the parties
and serves as the logical connection, without the server LCM being
involved. If a logical connection is made up of more than one physical
connection, the LCM has to keep track of the identities and states of
these physical connections.

[0144] Before messages can be exchanged between the voice relay server and
the remote intelligent agent PC, the communication end point (i.e. IP
address) of the participants must be known. A presence manager (PM)
(which is the network ID registration module 250) running on the voice
relay server is responsible for keeping track of this information. This
is a relatively passive component. It relies on the intelligent agents to
inform the PM on the voice relay server of their IP addresses. A PM
client (which is the network ID notification module 380) must be running
in each participating client device (i.e. as part of the intelligent
agent function) to register the device's presence to the PM on the voice
relay server periodically over time.

[0145] The presence manager is constantly managing a table of intelligent
agent PCs or servers such that an incoming user connection is always
guaranteed to be able to connect to the corresponding remote intelligent
agent on the user's PC or server, assuming that this PC or server is
switched on and declaring its presence.

[0146] In order to determine that a physical connection request is from an
authorized participant, the requester's identity needs to be
authenticated. An identity unique in the voice relay server namespace
must be associated with each participating user and end device. Each such
device belongs to an owner who must also have a unique identity. These
identities also serve as the basis to control access to services and
resources anchored on the voice relay server and on the remote
intelligent agent PC or server. A security manager (SM) (which is the
user authentication module 270) is responsible for managing, maintaining and
enforcing the security policy for authenticating the user in an internal
table (typically in a database). The security manager is also responsible
for managing a secure connection to the remote PC or server, such that a
data pipe may be encrypted between the voice relay server and the remote
intelligent agent PC. The security may be at the packet level or at the
connection level (such as SSL type secure connections).

[0147] The intelligent agent responds to commands from the user, which may
entail retrieving, creating or modifying personal data on the user's PC,
such as a database item, email items, news items, personal local files,
calendar items, word processed documents or other structured data. The
user may also issue commands to start processes on the personal PC, such
as launching or stopping programs. Advantageously, the user may issue
commands to manage specific data items on their personal PC, such as
composing an email, deleting a database item, reviewing an item in a
database or reading a web page from their PC.

[0148] The intelligent agent has a number of defined `data interfaces`
which can communicate with information stored locally on the PC, or
available to the PC through a network connection. One such data interface
is to SQL databases, such that the intelligent agent may read, update,
delete, add or replace data items inside a standard SQL database on the
PC or on a network connected to the PC.
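
A data interface of this kind can be sketched with Python's built-in
sqlite3 module. The in-memory database, the `contacts` table and its
columns are hypothetical and serve only to show the add, update, read and
delete operations named above.

```python
import sqlite3

# In-memory database standing in for a SQL database on the PC.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT PRIMARY KEY, phone TEXT)")

# Add a data item.
conn.execute("INSERT INTO contacts VALUES (?, ?)", ("Bob", "555-0101"))

# Update it.
conn.execute("UPDATE contacts SET phone = ? WHERE name = ?",
             ("555-0199", "Bob"))

# Read it back.
phone = conn.execute("SELECT phone FROM contacts WHERE name = ?",
                     ("Bob",)).fetchone()[0]

# Delete it and count what is left.
conn.execute("DELETE FROM contacts WHERE name = ?", ("Bob",))
remaining = conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
conn.close()
```

Parameter placeholders are used throughout so that values derived from
recognized speech are never spliced into the SQL text.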

[0149] Advantageously, once the intelligent agent on the PC determines
(using the rules engine) that alert data is to be sent, an alert or data
packet is sent to the voice relay server in order to place an outbound
call to the appropriate remote user's telephone. The intelligent agent is
configured to access local information in an unattended and scheduled
manner. The rules engine on the intelligent agent has a built in
scheduler with which a user can configure periodic accesses to
information on the local PC or a network that the PC is connected to. The
rules engine also provides a method for certain data strings to be
created, which are compared against local data on the PC or the local
network.

[0150] In the event that a match or satisfying `data parameter` is
detected (according to the user-defined rules), an alert mechanism
triggers the intelligent agent to send a command to the voice relay
server to place an out-bound call to the correct user (i.e. the user
registered for that PC). The voice relay server establishes a telephone
connection to the user and converts the alert to a speech stream for the
user to hear. The user can customize the alerts that are to be sent to
the voice relay server and the user can set the frequency of these alerts
using a desktop administration software tool, which is a part of the
intelligent agent program on the PC.

[0151] In indexing module 340, whenever any content source is accessed, a
mathematical digest of the individual item is calculated and stored in
the sourcing component's store. At any point, if content sourcing module
350 retrieved an item whose calculated digest did not match any of the
digests stored for that particular item type, it would be considered as
non-duplicate and original for the purpose of filtering and processing.

[0152] If, however, the calculated digest matches a stored digest for that
item type, the item is considered a duplicate and is ignored for the
purpose of filtering and processing. This provides a way to
mathematically represent a data object (such as a text email) in a compact
form that significantly improves performance and minimizes the data
storage requirement for the content filtering of the intelligent agent.
Hence an email that is still on a mail server and has not previously been
viewed appears as a new data content item for the intelligent agent to
recognize and match against.

[0154] Digest Generator--This component would accept a content source
object and create a mathematically computed digest of the inputted
object. The algorithm used could be MD5 or SHA or a combination of the
two. It would have an interface for accepting a content object such as an
email or a news item or a stock quote and returning a mathematical
digest.
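
A digest generator of this kind might look like the sketch below. It
assumes that a content object is available as a dictionary of fields, that
"SHA" means SHA-256, and that "a combination of the two" means the two hex
digests concatenated; none of these choices is stated in the description.
The function name generate_digest is hypothetical.

```python
import hashlib
import json


def generate_digest(content):
    """Return a hex digest for a content object given as a dict of fields."""
    # Canonical form: sorted keys and fixed separators, so that equal
    # content always produces the same bytes regardless of field order.
    canonical = json.dumps(
        content, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    md5_part = hashlib.md5(canonical).hexdigest()
    sha_part = hashlib.sha256(canonical).hexdigest()
    # Assumed combination: MD5 hex digest followed by SHA-256 hex digest.
    return md5_part + sha_part
```

The canonicalization step matters more than the choice of hash: two
retrievals of the same email must serialize to identical bytes, or their
digests will differ and the item will be treated as new.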

[0155] Indexed Digest List Repository--This component is a repository for
digests of different content source types. It provides mechanisms for
storing, matching and reconciling digests of different types.
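
As an illustration, the repository could be sketched as a mapping from
item type to a set of digests. The class and method names are
hypothetical, SHA-256 is an assumed choice of algorithm, and the
interpretation of "reconciling" as discarding digests of items that no
longer exist at the source is an assumption, not something the
description specifies.

```python
import hashlib
from collections import defaultdict


class DigestRepository:
    def __init__(self):
        # item type -> set of hex digests seen for that type
        self._digests = defaultdict(set)

    @staticmethod
    def digest(raw: bytes) -> str:
        return hashlib.sha256(raw).hexdigest()

    def matches(self, item_type, raw):
        return self.digest(raw) in self._digests[item_type]

    def store(self, item_type, raw):
        self._digests[item_type].add(self.digest(raw))

    def is_new(self, item_type, raw):
        """True (and remembered) the first time an item is seen."""
        if self.matches(item_type, raw):
            return False  # duplicate: ignore for filtering and processing
        self.store(item_type, raw)
        return True

    def reconcile(self, item_type, current_items):
        """Assumed meaning: keep only digests of items still at the source."""
        live = {self.digest(raw) for raw in current_items}
        self._digests[item_type] &= live
```

Because digests are kept per item type, the same bytes arriving as an
email and as a news item are tracked independently.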

[0156] The remote access system 100 may be required to monitor, filter and
deliver different types of content, via a voice notification, to end users
with devices of varying form factors. The content changes constantly as
needs vary. Content from these sources may have to be stored in a cache or
in a persistent database for processing. As content sources change, the
structure of the underlying data store will also have to change. For
example, the content store may be required to store email and news
presently, while in the future it may store stock transactions and
weather information or traffic alerts.

[0157] An analysis of most content sources identified attributes which
were consistent across content sources such as source, destination,
title. The content data varied with the type. Thus, the common attributes
can be normalized at the data persistence level while the varying
attributes can be normalized at the application level. Additionally,
serialization and de-serialization mechanisms ensure that the varying
content objects can be persisted to a data store and recreated from the
data store.

[0158] Functional components of the PACO (PA Content Object) structure are
described below.

[0159] Generic PACO--All content source objects should be capable of being
stored as generic PACO objects. The component has a mechanism to allow
conversion from a specific content object to a generic PACO object.
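
The conversion can be sketched as below. The five common field names
(identifier, type, source, destination, title) are taken from the
searchable fields listed in the requirements; the class names GenericPaco
and Email, the details dictionary and the to_paco method are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GenericPaco:
    # Attributes common to all content sources.
    identifier: str
    type: str
    source: str
    destination: str
    title: str
    # Type-specific attributes, kept opaque at the persistence level.
    details: dict = field(default_factory=dict)


@dataclass
class Email:
    message_id: str
    sender: str
    recipient: str
    subject: str
    body: str

    def to_paco(self):
        """Map this specific content object onto the generic one."""
        return GenericPaco(
            identifier=self.message_id,
            type="email",
            source=self.sender,
            destination=self.recipient,
            title=self.subject,
            details={"body": self.body},
        )
```

A new content type, such as a stock quote, would add its own class with a
to_paco method and leave GenericPaco unchanged.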

[0160] PACO Serializer--This component is capable of serializing a generic
PACO object into a format suitable for the underlying data store. The
latter could be a relational database, a local file or even memory
storage. It is also capable of deserializing a stored generic PACO
object.
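
One possible serializer is sketched here, assuming JSON text as the
storage format and a dataclass as the generic object; both are
illustrative choices, and the names GenericPaco and PacoSerializer are
hypothetical. JSON text can be written to a relational column, a local
file or kept in memory without modification.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class GenericPaco:
    identifier: str
    type: str
    source: str
    destination: str
    title: str
    details: dict = field(default_factory=dict)


class PacoSerializer:
    @staticmethod
    def serialize(paco: GenericPaco) -> str:
        # sort_keys gives a stable text form for identical objects.
        return json.dumps(asdict(paco), sort_keys=True)

    @staticmethod
    def deserialize(text: str) -> GenericPaco:
        return GenericPaco(**json.loads(text))
```

This sketch only round-trips details values that JSON can represent
(strings, numbers, booleans, lists, dictionaries and null).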

[0161] PACO Encoder--The PACO Encoder is capable of encoding the
serialized generic PACO object into a form which meets the security
requirements of the system. It is also capable of decoding the encoded
object.

[0162] PACO Access Component--This component is capable of storing
encoded, serialized generic PACO objects. It has a
PACOAccessObjectFactory, which creates the requisite PACOAccessObject. A
PACOAccessObject must exist for each kind of underlying data store, since
it is the object that interfaces with that store.
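
The factory arrangement might be sketched as follows. Only the names
PACOAccessObjectFactory and PACOAccessObject come from the description;
the put and get methods, the two concrete store classes, the "memory" and
"sqlite" keys and the table layout are assumptions made for illustration.

```python
import sqlite3
from abc import ABC, abstractmethod


class PACOAccessObject(ABC):
    """Interface to one kind of underlying data store."""

    @abstractmethod
    def put(self, identifier: str, payload: bytes) -> None: ...

    @abstractmethod
    def get(self, identifier: str): ...


class MemoryAccessObject(PACOAccessObject):
    def __init__(self):
        self._items = {}

    def put(self, identifier, payload):
        self._items[identifier] = payload

    def get(self, identifier):
        return self._items.get(identifier)


class SqliteAccessObject(PACOAccessObject):
    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS paco "
            "(identifier TEXT PRIMARY KEY, payload BLOB)"
        )

    def put(self, identifier, payload):
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO paco VALUES (?, ?)",
                (identifier, payload),
            )

    def get(self, identifier):
        row = self._conn.execute(
            "SELECT payload FROM paco WHERE identifier = ?", (identifier,)
        ).fetchone()
        return row[0] if row else None


class PACOAccessObjectFactory:
    _kinds = {"memory": MemoryAccessObject, "sqlite": SqliteAccessObject}

    @classmethod
    def create(cls, kind):
        try:
            return cls._kinds[kind]()
        except KeyError:
            raise ValueError(f"unknown data store kind: {kind!r}") from None
```

Callers depend only on put and get, so changing the underlying data store
means registering another class with the factory.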

[0163] Requirements--The PACO and its data layer module are to be used to
encapsulate all disparate content objects within a generic content
object. The latter object is called a PA Content Object. The PA Content
Object should lend itself to quick data access by virtue of mapping
disparate content objects to the generic content object at the
application level. At the persistence level, the generic content object
should always map to the same table in a relational database. The main
searchable fields, namely the source, the destination, the title, the
identifier and the type, should be normalized at the database level,
whereas the integrity of the application-specific details field should be
maintained at the application level. This will allow a multiplicity of
content object types to be stored in a single representation at the
data persistence level, while still offering simplicity of use and
requiring no changes to the database when the content type changes.
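
A single-table layout meeting these requirements could be sketched as
below, assuming SQLite as the relational database and JSON as the
application-level format of the details field. The table name
pa_content_object and the helper functions save and find_by_type are
hypothetical; the five normalized columns follow the searchable fields
named above.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS pa_content_object (
    identifier  TEXT PRIMARY KEY,
    type        TEXT NOT NULL,
    source      TEXT NOT NULL,
    destination TEXT NOT NULL,
    title       TEXT NOT NULL,
    details     TEXT NOT NULL  -- opaque to the database; JSON from the app
)
"""


def save(conn, identifier, type_, source, destination, title, details):
    # The application, not the database, is responsible for the
    # structure of the details field.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO pa_content_object "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (identifier, type_, source, destination, title,
             json.dumps(details)),
        )


def find_by_type(conn, type_):
    rows = conn.execute(
        "SELECT identifier, title, details FROM pa_content_object "
        "WHERE type = ? ORDER BY identifier",
        (type_,),
    )
    return [(i, t, json.loads(d)) for i, t, d in rows]
```

Storing an email and a stock quote side by side requires no schema
change, since only the contents of details differ between them.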

[0164] Embodiments of the invention have been described herein by way of
example. Some modifications or alterations may be made to the
embodiments, as will be apparent to persons skilled in the art, without
departing from the spirit and scope of the invention.