Overview of SpeechView

The SpeechView feature lets you receive voice messages in your mailbox as text. When a voice message arrives, it is delivered to the recipient's mailbox with a blank text attachment. When the transcription service returns the completed transcription, the text attachment is updated with the transcription text, or with an error message if the transcription failed. Only the first 500 characters of a message transcription are provided, so longer transcriptions are truncated. However, users always have access to the complete original recording.
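The attachment lifecycle described above can be sketched in a few lines. This is a minimal illustration, not Unity Connection code: the `VoiceMessage` class, the `apply_transcription_result` function, and their field names are hypothetical. Only the blank initial attachment, the error-message fallback, and the 500-character limit come from the text.

```python
from dataclasses import dataclass
from typing import Optional

MAX_TRANSCRIPT_CHARS = 500  # limit stated in the overview


@dataclass
class VoiceMessage:
    """Hypothetical model of a delivered voice message."""
    audio: bytes
    text_attachment: str = ""  # delivered blank, filled in later


def apply_transcription_result(
    message: VoiceMessage,
    transcript: Optional[str] = None,
    error: Optional[str] = None,
) -> VoiceMessage:
    """Update the text attachment once the transcription service responds."""
    if error is not None:
        # A failed transcription replaces the blank attachment with an error message.
        message.text_attachment = error
    elif transcript is not None:
        # Only the first 500 characters are kept; the audio is never modified.
        message.text_attachment = transcript[:MAX_TRANSCRIPT_CHARS]
    return message
```

In this sketch the recording is stored separately from the text attachment, which mirrors the point that truncation affects only the transcription and never the original audio.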

SpeechView is a feature of the Unity Connection unified messaging solution, so the original audio version of each voice message remains available to you anywhere, anytime.

Note When a voice message is sent from Web Inbox to VMO, the voice message is delivered to the recipient's mailbox with the transcribed text in both the transcript view box and the mail body.

When the SpeechView feature is enabled, Unity Connection uses a third-party external transcription service to convert voice messages to text. In Unity Connection 8.6(2) and later, the SpeechView feature provides the following types of transcription services:

Standard Transcription Service: A fully automated transcription service with no human assistance. The transcription service converts the voice message to text automatically, and the transcription that Unity Connection receives is then sent to the recipient by email.

Professional Transcription Service: Combines automated transcription with human intervention where required. The transcription service first converts the voice message to text automatically and checks the accuracy of the transcription. If the accuracy of any part of the transcription is low, that part is sent to a human operator, who reviews the audio and improves the quality of the transcription. Because the professional transcription service combines automatic transcription with human intervention, its transcriptions are more accurate than those of the standard transcription service. The professional transcription service is also known as the SpeechView Pro service.

Note Unity Connection versions 8.0(2) through 8.6(1) support only the standard transcription service, which converts voice messages to text automatically without any human intervention.

Unity Connection sends the audio portion of a voice message to the transcription service, without details about the sender or recipients of the message. Communication between Unity Connection and the external transcription service is secured using S/MIME over SMTP.

To use SpeechView, users must belong to a class of service that enables transcriptions of voice messages. The class of service specifies the type of transcription service, either Standard or Professional, to which the user is subscribed. Members of the class of service can view the transcriptions of their messages using an IMAP client that is configured to access their Unity Connection messages. The original voice message remains attached to the transcribed text message.

You can also configure an SMS or SMTP notification device for users so that Unity Connection sends transcriptions to an SMS-compatible phone or an external email address. Users can configure SMS or SMTP notification devices for themselves if they have access to the Unity Connection Messaging Assistant web tool. The original voice message is not attached to transcription messages sent to notification devices. However, the device can be configured to include the phone number that users call to reach Unity Connection, so that after viewing the transcription they can call Unity Connection to listen to the voice message.

The following messages are never transcribed:

Private messages

Broadcast messages

Dispatch messages

Secure messages (configurable)

Messages with no recipients in an enabled class of service

Note Messages with no recipients in an enabled class of service (COS) include messages that are sent to a group of recipients in which no recipient is subscribed to the SpeechView transcription service. In such cases, the voice message is not transcribed for any of the recipients.

Secure messages are transcribed only if the user belongs to a class of service for which the Allow Transcriptions of Secure Messages option is enabled.
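The rules above can be read as a single eligibility check. The following is a minimal sketch under stated assumptions, not the product's logic: the `ClassOfService` and `Message` classes and their field names are hypothetical. The text does not say how a secure message with several recipients is handled, so the sketch assumes it is transcribed if at least one transcription-enabled recipient's class of service allows transcriptions of secure messages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClassOfService:
    """Hypothetical model of the two COS settings mentioned in the text."""
    transcription_enabled: bool = False
    allow_secure_transcription: bool = False


@dataclass
class Message:
    """Hypothetical voice message with one COS entry per recipient."""
    recipients_cos: List[ClassOfService]
    private: bool = False
    broadcast: bool = False
    dispatch: bool = False
    secure: bool = False


def is_transcribed(message: Message) -> bool:
    """Return True if the message would be sent for transcription."""
    # Private, broadcast, and dispatch messages are never transcribed.
    if message.private or message.broadcast or message.dispatch:
        return False
    # At least one recipient must belong to a transcription-enabled COS.
    enabled = [cos for cos in message.recipients_cos if cos.transcription_enabled]
    if not enabled:
        return False
    # Secure messages also need the secure-transcription option (assumed: any recipient).
    if message.secure:
        return any(cos.allow_secure_transcription for cos in enabled)
    return True
```

For example, a message sent to a group in which no member has `transcription_enabled` set returns `False`, which matches the note that such a message is not transcribed for any recipient.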

SpeechView Security Considerations

All transcriptions are handled by a third-party transcription service outside the customer environment. Communication between the customer's Cisco Unity Connection server and the transcription service uses S/MIME. The S/MIME public and private key negotiation happens transparently during registration with the transcription service. A new key pair is created each time a system registers.

When messages are sent to the transcription service, customer information is not passed with the message. The transcription service is unaware of the specific user the message belongs to. If human involvement is required in the transcription, the person working on the transcription is not able to determine the user or company from which the message originated. All audio stays on the conversion system and is never stored on the workstation of the person who processes the transcription. After the transcribed message is sent to the Unity Connection server, the copy at the transcription service is purged.

Recommendations for Deploying SpeechView

Advantages to Forwarding Personal Phones to Unity Connection

To take full advantage of SpeechView, encourage users to configure their mobile phones to forward to Unity Connection so that all of their voice messages are available in one mailbox and are all transcribed. To do this, users would configure the mobile phone to forward to their work phone number, which should correspond to their primary extension on Unity Connection. To configure the mobile phone for call forwarding, obtain specific instructions from the mobile phone carrier. A generic procedure is provided in the “Task List for Consolidating Your Voicemail from Multiple Phones into One Mailbox” section of the “Changing Your User Preferences” chapter of the User Guide for the Cisco Unity Connection Messaging Assistant Web Tool, at http://www.cisco.com/c/en/us/td/docs/voice_ip_comm/connection/8x/user/guide/assistant/b_8xcucugasst/8xcucugasst_chapter3.html.

Note that because the call goes to the mobile phone first and then to the work phone, callers may hear many rings before reaching the user's mailbox. To avoid this problem, you can instead forward the mobile phone to a special DID number that does not ring a phone and forwards directly to the user's mailbox. To set this up, add the DID number as an alternate extension for the user.

Using SpeechView with Networked Unity Connection Servers

To consolidate the interface between the customer and the third-party transcription service, configure one of your Unity Connection servers (or clusters) to act as a proxy for the other Unity Connection servers in the network. In this configuration, only the proxy server registers with the transcription service. This can make it easier to troubleshoot problems with transcriptions, track your transcription usage, and monitor the load that transcription traffic places on your network.

If one of your Unity Connection servers has a lower call volume than others in the network, consider designating it as the proxy server for transcriptions.

If you do not use a proxy server for transcriptions, you need a separate external-facing SMTP address for each server (or cluster) in the network.

If user accounts are configured to relay voice messages to an alternate SMTP address, their voice messages cannot be transcribed. If users want transcriptions as well as the relay feature, you can instead configure user accounts to accept and relay voice messages. This allows the copy of the message that is stored on the Unity Connection server to be transcribed. Configure SMTP notification devices for users so that the transcription is sent to their SMTP address. Users then receive two emails at their SMTP address: the first is the relayed copy of the message WAV file, and the second is the notification that includes the transcription. If users do not want two emails for each message, consider setting their accounts to accept messages so that they receive only the email with the transcription. If they need to access the original recording, users can call Unity Connection or use an IMAP client to access their Unity Connection account.
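The three message-action choices described above can be compared side by side with a small sketch. The function name, the action labels (`"relay"`, `"accept_and_relay"`, `"accept"`), and the returned descriptions are hypothetical, chosen only to mirror the paragraph. The sketch also assumes that the user belongs to a class of service that enables transcriptions.

```python
from typing import List

VALID_ACTIONS = ("relay", "accept_and_relay", "accept")  # hypothetical labels


def emails_at_smtp_address(message_action: str, smtp_notification: bool) -> List[str]:
    """List the emails a user receives at the SMTP address for one voice message."""
    if message_action not in VALID_ACTIONS:
        raise ValueError(f"unknown message action: {message_action}")

    emails = []
    # Relaying always delivers a copy of the recording itself.
    if message_action in ("relay", "accept_and_relay"):
        emails.append("relayed voice message (WAV)")

    # Only a copy stored on the Unity Connection server can be transcribed,
    # and the transcription reaches the SMTP address through a notification device.
    stored_on_server = message_action in ("accept", "accept_and_relay")
    if stored_on_server and smtp_notification:
        emails.append("notification with transcription")
    return emails
```

Calling `emails_at_smtp_address("accept_and_relay", True)` yields both emails, while `"accept"` with a notification device yields only the transcription, which is the trade-off the paragraph describes.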

Step 2 On the Smart Host page, in the Smart Host field, enter the IP address or fully qualified domain name of the SMTP smart host server. (Enter the fully qualified domain name of the server only if DNS is configured.)

To Configure Your Email System to Route Incoming SpeechView Traffic to the Unity Connection Server

Step 1 Select an external-facing SMTP address that the third-party transcription service will use to send transcriptions to the Cisco Unity Connection server. For example, select “transcriptions@<yourdomain.com>”.

If you have more than one Unity Connection server, you need a separate external-facing SMTP address for each server, unless the servers are part of a Unity Connection cluster. Alternatively, you can configure one Unity Connection server or cluster to act as a proxy for the remaining servers or clusters in the digital network.

Step 2 For each external-facing SMTP address that you selected in Step 1, configure your email infrastructure to route messages that are sent to that SMTP address to the “stt-service” alias on the Connection server. For example, if the SMTP domain for the Connection server is “connectionserver1.cisco.com”, the email infrastructure must be configured to route “transcriptions@cisco.com” to “stt-service@connectionserver1.cisco.com”.

If you are configuring SpeechView on a Unity Connection cluster, configure the smart host to resolve the SMTP domain of the cluster to both the publisher and subscriber servers, so that incoming transcriptions reach the subscriber server if the publisher server is down.

Step 3 Add “nuancevm.com” to the “safe senders” list in the email infrastructure so that incoming transcriptions do not get filtered out as spam.

Note In Cisco Unity Connection 8.5 and later versions, to avoid timeout or failure of the registration request with the Nuance server, make sure to:

Remove the email disclaimers from the inbound and outbound email messages between Unity Connection and the Nuance server.
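The routing rule in Step 2 and the safe-sender rule in Step 3 can be illustrated with a short sketch. It is not a configuration for any particular mail system, which would be set up through that system's own administration tools. The function names are hypothetical, and only the “stt-service” alias, the “nuancevm.com” domain, and the example addresses come from the steps above.

```python
STT_ALIAS = "stt-service"            # alias named in Step 2
SAFE_SENDER_DOMAIN = "nuancevm.com"  # domain named in Step 3


def build_routes(external_to_domain):
    """Map each external-facing address to the stt-service alias on its server."""
    return {
        external.lower(): f"{STT_ALIAS}@{domain}"
        for external, domain in external_to_domain.items()
    }


def route(recipient, routes):
    """Return the rewritten recipient, or the original if no route applies."""
    return routes.get(recipient.lower(), recipient)


def is_safe_sender(sender):
    """True when the sender's domain is the transcription service domain."""
    return sender.rsplit("@", 1)[-1].lower() == SAFE_SENDER_DOMAIN


# One entry per Unity Connection server or cluster, as described in Step 1.
routes = build_routes({"transcriptions@cisco.com": "connectionserver1.cisco.com"})
print(route("transcriptions@cisco.com", routes))
```

With the example values from Step 2, the sketch rewrites “transcriptions@cisco.com” to “stt-service@connectionserver1.cisco.com” and leaves every other recipient unchanged.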

Step 5 In the Registration Name field, enter a name that will uniquely identify the Unity Connection server within your organization. This name is used by the third-party transcription service to identify this server for registration and subsequent transcription requests.

Step 6 If you want this server to offer transcription proxy services to other Unity Connection locations in a digital network, check the Advertise Transcription Proxy Services to Other Connection Locations check box.

Step 7 Select Save.

Step 8 Select Register.

Another window displaying the results will open. The registration process normally takes several minutes. Wait for the registration process to complete successfully before going on to the next step.

If registration does not complete within 5 minutes, there may be a configuration issue. The registration process times out after 30 minutes.

Step 9 Select Test.

Another window displaying the results will open. The test will usually take several minutes, but can take up to 30 minutes.
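The registration timings given in Step 8 can be summarized as a simple status check, for example when writing a checklist or a monitoring note for administrators. This is a hypothetical helper, not part of Unity Connection: the function name and the status strings are invented for illustration, and only the 5-minute and 30-minute thresholds come from the text.

```python
REGISTRATION_WARNING_MINUTES = 5   # beyond this, suspect a configuration issue
REGISTRATION_TIMEOUT_MINUTES = 30  # the registration process times out here


def registration_status(elapsed_minutes: float, completed: bool) -> str:
    """Classify a registration attempt by how long it has been running."""
    if completed:
        return "registered"
    if elapsed_minutes >= REGISTRATION_TIMEOUT_MINUTES:
        return "timed out"
    if elapsed_minutes > REGISTRATION_WARNING_MINUTES:
        return "still running: check configuration"
    return "in progress"
```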

SpeechView Reports

Unity Connection can generate the following reports about SpeechView usage:

SpeechView Activity Report by User: Shows the total number of transcribed messages, failed transcriptions, and truncated transcriptions for a given user during a given time period. If the report is run for all users, the output is broken out by user.

SpeechView Activity Summary Report: Shows the total number of transcribed messages, failed transcriptions, and truncated transcriptions for the entire system during a given time period. Note that when a message is sent to multiple recipients, it is transcribed only once, so the transcription activity is counted only once.
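The once-per-message counting rule in the summary report can be sketched as a deduplication by message. The record format below, `(message_id, recipient, outcome)` tuples with the outcome labels `"transcribed"`, `"failed"`, and `"truncated"`, is a hypothetical stand-in and not the report's actual data model. Treating the three outcomes as mutually exclusive labels is also an assumption.

```python
from collections import Counter


def summarize_activity(deliveries):
    """Count transcription outcomes once per message, not once per recipient.

    deliveries: iterable of (message_id, recipient, outcome) tuples
    (hypothetical format).
    """
    outcome_by_message = {}
    for message_id, _recipient, outcome in deliveries:
        # A message is transcribed once, so keep one outcome per message ID.
        outcome_by_message.setdefault(message_id, outcome)
    return Counter(outcome_by_message.values())
```

A message delivered to two recipients therefore contributes one count to the summary, even though both recipients see the transcription.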