

Top 10 Tips for Designing Telephony Applications : Page 2

Using Microsoft Speech Server, .NET developers can build telephony or voice-only applications quickly and easily. This article lists 10 tips to consider before designing these types of applications.





Tip 3—Let Users Enter Multiple Items with One Command
In many cases it is faster for a user to speak a complex command than to type a query or click through a series of controls. For example, to make an airline reservation, you need to provide several key pieces of information, such as departure and destination locations, arrival dates and times, and travel preferences. Online systems typically collect this information through a series of text boxes and combo boxes, but a telephony system must collect it entirely through spoken commands. Such an application could prompt users for each required piece of information in turn; this is the most straightforward approach and involves a relatively simple grammar. However, it is faster to give users the option of speaking more than one piece of information at a time. For instance, they might be allowed to say "Find me all Delta flights out of Louisiana for October 18th." The downsides of allowing this type of query are that the potential for error or misunderstanding increases and that developers must spend more time building the grammar. Weigh ease of use against development resources when you design a telephony application.
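To make the tradeoff concrete, here is a minimal Python sketch of the idea: pull as many items as possible out of a single utterance, then prompt only for what is still missing. It is an illustration only, not the Speech Server grammar format; the slot names, the regular expressions, and the `extract_slots` and `missing_slots` helpers are all hypothetical.

```python
import re

MONTHS = (
    "january|february|march|april|may|june|july|august|"
    "september|october|november|december"
)

# Hypothetical slot patterns for a toy flight-search application.
# A real system would express these as a speech grammar instead.
SLOT_PATTERNS = {
    "airline": re.compile(r"\b(delta|united|american)\b", re.IGNORECASE),
    "origin": re.compile(
        r"\b(?:out of|from)\s+([a-z ]+?)(?=\s+(?:to|for|on)\b|$)",
        re.IGNORECASE,
    ),
    "date": re.compile(
        r"\b(?:for|on)\s+((?:" + MONTHS + r")\s+\d{1,2}(?:st|nd|rd|th)?)",
        re.IGNORECASE,
    ),
}


def extract_slots(utterance):
    """Return every slot value that can be found in one utterance."""
    slots = {}
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            slots[name] = match.group(1).strip()
    return slots


def missing_slots(slots, required=("airline", "origin", "date")):
    """List the required slots the caller has not supplied yet."""
    return [name for name in required if name not in slots]


full = extract_slots("Find me all Delta flights out of Louisiana for October 18th")
partial = extract_slots("Delta flights")
print(full)
print(missing_slots(partial))  # the application would prompt for these next
```

A caller who supplies everything at once skips the follow-up prompts, while a caller who says only "Delta flights" is still asked for the origin and the date. Notice how much pattern-writing even this toy version needs; that is the extra grammar effort the tip warns about.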

Tip 4—Limit the Number and Depth of Choices
The overall goal of a telephony application should be to make information available to the caller as quickly as possible. For the same reasons users of traditional Web-based applications do not want to waste time clicking through nested menus, telephony callers do not want to waste time navigating complex voice-only menus. This means that as the designer of a telephony application, you will need to limit the number of menu choices and nested submenus. I recommend limiting each menu to no more than three choices, and submenu depth to no more than two levels.
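One way to keep yourself honest about these limits is to check the menu design mechanically. The sketch below is a hypothetical Python helper, not part of the SASDK: it assumes a menu is modeled as a nested dictionary in which a leaf choice maps to `None`, and it assumes that "two levels" means the main menu plus at most two levels of submenus beneath it.

```python
MAX_CHOICES = 3        # recommended maximum choices per menu
MAX_SUBMENU_DEPTH = 2  # recommended maximum submenu levels below the main menu


def menu_violations(menu, path="main", depth=0):
    """Return a list of messages describing where a menu tree breaks the limits.

    menu maps each choice name to a submenu dict, or to None for a leaf.
    """
    problems = []
    if len(menu) > MAX_CHOICES:
        problems.append(f"{path}: {len(menu)} choices (limit {MAX_CHOICES})")
    for choice, submenu in menu.items():
        if submenu:
            child_path = f"{path} > {choice}"
            if depth + 1 > MAX_SUBMENU_DEPTH:
                problems.append(
                    f"{child_path}: submenu depth {depth + 1} "
                    f"(limit {MAX_SUBMENU_DEPTH})"
                )
            problems.extend(menu_violations(submenu, child_path, depth + 1))
    return problems


# A hypothetical banking menu that stays within both limits.
banking_menu = {
    "balance": None,
    "transfer": {"checking": None, "savings": None},
    "help": None,
}
print(menu_violations(banking_menu))
```

An empty list means the design is within the recommended limits; each message otherwise names the menu path that needs trimming.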

In addition to limiting the number and depth of menus, you also need to limit the number of words in each menu choice. If you force users to speak a four-word menu choice in its entirety, they're likely to make mistakes and become frustrated. You can reduce the likelihood of mistakes by defining the grammar carefully. For example, if a menu choice has four words, such as "get bank account balance," make sure that the grammar also accepts shorter forms such as "get bank balance" or "account balance."
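The effect of marking words as optional can be sketched in a few lines of Python. This is an illustration of the concept rather than the SASDK grammar format: the `expand_phrase` helper is hypothetical, and the choice of which words are optional is an assumption made for the example.

```python
from itertools import product


def expand_phrase(words):
    """Expand (word, required) pairs into the set of phrases a caller may say.

    Optional words may be dropped; required words must always be present.
    """
    options = [(word,) if required else (word, "") for word, required in words]
    return {" ".join(w for w in combo if w) for combo in product(*options)}


# Assumption for this example: only "balance" is mandatory.
BALANCE_PHRASES = expand_phrase(
    [("get", False), ("bank", False), ("account", False), ("balance", True)]
)


def is_balance_request(utterance):
    """Case-insensitive match of an utterance against the accepted phrases."""
    normalized = " ".join(utterance.lower().split())
    return normalized in BALANCE_PHRASES


for spoken in ("get bank account balance", "Get bank balance", "account balance"):
    print(spoken, "->", is_balance_request(spoken))
```

Three optional words yield eight accepted phrasings from a single definition, which is why optional items are usually easier to maintain than a hand-written list of alternatives.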

Tip 5—Keep Prompts Short and Simple
Limiting the length of prompt messages helps you avoid overwhelming users with too much information. Experienced users of telephony applications often know exactly what piece of information they are trying to retrieve. A long prompt that goes into great detail about all the available options will aggravate anyone but a first-time caller, and even for first-time callers, too much information can be confusing.

As an alternative to lengthy instructional prompts, you can include help commands, designing the application so that users can access help at any time by simply saying "help" or "I need help." The help messages can change depending on context, that is, on where the user is in the application. Remember to tell users that help is available; you may also need to remind them periodically.
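A minimal Python sketch of context-sensitive help might look like the following. Everything in it is hypothetical (the context names, the message text, and the `handle_utterance` and `should_remind` helpers); it only illustrates the lookup logic, not how Speech Server routes commands.

```python
HELP_PHRASES = {"help", "i need help"}

# Hypothetical contexts and messages for a flight-booking application.
HELP_MESSAGES = {
    "main_menu": "You can say flights, reservations, or agent.",
    "flight_search": "Say an airline, a departure city, and a travel date.",
}
DEFAULT_HELP = "Say main menu to start over."


def handle_utterance(utterance, context):
    """Return the help message for the current context, or None if the
    caller did not ask for help."""
    text = " ".join(utterance.lower().split()).strip(".,!? ")
    if text in HELP_PHRASES:
        return HELP_MESSAGES.get(context, DEFAULT_HELP)
    return None


def should_remind(turn_count, every=3):
    """Decide whether to mention the help command again on this turn."""
    return turn_count > 0 and turn_count % every == 0


print(handle_utterance("I need help.", "flight_search"))
print(should_remind(3))
```

Keeping the help text in one table keyed by context lets the main prompts stay short while the detailed guidance lives where only the callers who ask for it will hear it.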

Limiting the length of the prompt message is especially important for the welcome message, which a telephony application plays when a call is first initiated. Messages that are too long or complex will discourage first-time callers from using the system. Return callers will be frustrated and will generally tune out the message, even if it changes and contains some valuable piece of information.

Tip 6—Make Sure Prompts Are Understandable
A telephony application built with the Microsoft SASDK uses a prompt database to store all potential prompts. Your application can deliver prompts either as synthesized text-to-speech output or as pre-recorded messages. The pre-recorded messages can be recorded by the developer or by a professional voice talent. If you choose to record prompts, be sure to have the messages spoken slowly and clearly so that all users can understand them. In addition, if your telephony application will be accessed by callers from different areas of the country or from outside the United States, you may want to consider using regional dialects.

The Microsoft SASDK gives you the ability to tune both pre-recorded and text-to-speech prompts. For text-to-speech prompts, you can use SSML (Speech Synthesis Markup Language) tags to control properties such as volume, pitch, and speech rate. For example, the following code uses the ssml:prosody element to adjust the pitch and rate of the spoken text. The content is wrapped inside the ssml:speak tag, which is the required root element for SSML.

<ssml:speak version="1.0"
    xmlns:ssml="http://www.w3.org/2001/10/synthesis"
    xml:lang="en-US">
  <ssml:sentence>
    No flights are available on October
    <ssml:prosody pitch="+0.5st" rate="-10%">18th</ssml:prosody>
  </ssml:sentence>
</ssml:speak>
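If prompts like this are assembled from dynamic data, building the markup with an XML library rather than by string concatenation helps keep it well-formed. The following Python sketch uses the standard `xml.etree.ElementTree` module to produce the same structure as the example above; the `build_prompt` function and its parameters are hypothetical, and the element names simply mirror the article's example.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize SSML elements with the "ssml" prefix used in the example above.
ET.register_namespace("ssml", SSML_NS)


def build_prompt(lead_text, emphasized_text, pitch="+0.5st", rate="-10%"):
    """Build an SSML prompt whose final words get their own pitch and rate."""
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": "en-US"},
    )
    sentence = ET.SubElement(speak, f"{{{SSML_NS}}}sentence")
    sentence.text = lead_text + " "
    prosody = ET.SubElement(
        sentence, f"{{{SSML_NS}}}prosody", {"pitch": pitch, "rate": rate}
    )
    prosody.text = emphasized_text
    # Special characters in the text are escaped during serialization.
    return ET.tostring(speak, encoding="unicode")


print(build_prompt("No flights are available on October", "18th"))
```

Because the library handles escaping and closing tags, a date or city name pulled from a database cannot accidentally break the prompt markup.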

The spectrogram view, part of the Microsoft SASDK's built-in Prompt Editor, can help you tune pre-recorded messages. When a recording is imported into the prompt database, the speech recognizer does a good job of aligning the recording with the typed text; however, sometimes adjustments are needed to make the prompt sound clearer. The spectrogram view lets developers adjust the word boundaries for imported recordings.
