Design and Implement a Voice-only Web Application in ASP.NET : Page 3

This whitepaper demonstrates how to use the Microsoft .NET Speech SDK to build a complete e-commerce starter application. Use these detailed techniques to build your own commerce system that will have your customers browsing, shopping, and making purchases using nothing but the sounds of their voices.





Designing the Application
We designed the system for a group of target users ranging from novices with little or no experience using voice-only systems to technophiles with a lot of experience. With this in mind, we tried to include advanced features to enable an experienced user to navigate the system quickly, while keeping the system simple and well-explained enough so that a novice would not feel lost.

Target User and Voice Personality
For the personality of the speaking voice, we had two goals:

  • Speed: The recorded sentences should be spoken at a pace that a new user can easily understand and that leaves enough time to commit several commands to memory. An appropriate speaking pace helps usability by striking a balance between speaking so fast that users miss options and speaking so slowly that they begin to lose attention.
  • Mood: The system's voice should be friendly and patient, and may use a bit of accentuation. Any voice-based system should make users feel good about using it, both for the sake of usability and to give them a good experience with the company.
Navigation Design
Designing a voice-only system is much different from designing traditional GUI-based applications. Whereas a Web page is a two-dimensional interface, the voice medium is one-dimensional. For example, a table of data on a Web page needs to be read item by item over the phone. As one designer put it, the challenge becomes, "How do you chop things up to establish a coherent flow of information? How do you express content in a way that the user can digest, understand, and then act upon?"

Start With a User-Centered Design Approach
We started our design process by following our standard methodology of user-centered design. The 80/20 rule is a good guide: 80 percent of the users use 20 percent of the application. We focused on ideal scenarios and common user paths rather than considering exceptional cases in the preliminary stages. We acted out sample dialogues that helped us get a better sense of how a typical conversation might go.

From these sample dialogues, we began creating high-level flow charts for each major component of the system (see Figure 3).

Figure 3: This diagram illustrates the high-level flow of the application.
In addition to the flow diagram above, several global commands are available to the user throughout the application:

  • Main Menu: Returns the user to the main menu.

  • Help: Provides the user with context-sensitive help text at any prompt.

  • Instructions: Provides instructions on the basic usage of the system and the global commands available to the user at any point.

  • Repeat: Repeats the most relevant last prompt. If the last prompt informed the user that the input was invalid, Repeat replays the previous question prompt instead of repeating the error message.

  • Representative: Transfers the user to a customer service representative.

  • Goodbye: Ends the call.
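
The global commands above can be sketched as a small dispatcher that is consulted before any prompt-specific handling. This is only an illustrative sketch in Python, not the Speech SDK's API: the `DialogueState` class, the `handle_global` function, the prompt identifiers, and the wording of the Instructions, Representative, and Goodbye responses are all assumptions made for the example. The help texts are the ones given in the prompt specification later in this article.

```python
from dataclasses import dataclass

# Context-sensitive help, keyed by a hypothetical prompt identifier.
HELP_TEXT = {
    "main_menu": ("You have reached the IBuySpy store. Our store is pretty simple. "
                  "If you want to shop, say start shopping. "
                  "To review your previous orders, say review previous orders."),
    "start_shopping": ("You can place orders quickly by saying the three-digit "
                       "product number."),
}

@dataclass
class DialogueState:
    current_prompt: str = "main_menu"  # hypothetical prompt identifier
    last_question: str = ""            # last question prompt the system asked
    last_spoken: str = ""              # last thing the system said (may be an error)

    def ask(self, text: str) -> str:
        """Speak a question prompt and remember it for Repeat."""
        self.last_question = text
        self.last_spoken = text
        return text

    def report_error(self, text: str) -> str:
        """Speak an error message without overwriting the last question."""
        self.last_spoken = text
        return text

def handle_global(state: DialogueState, utterance: str):
    """Return the response to a global command, or None if it is not one."""
    command = utterance.strip().lower()
    if command == "main menu":
        state.current_prompt = "main_menu"
        return state.ask("Main menu.")  # assumed wording
    if command == "help":
        return HELP_TEXT.get(state.current_prompt, "")
    if command == "instructions":
        return "You can say main menu, help, repeat, representative, or goodbye."  # assumed wording
    if command == "repeat":
        # Replay the question, never the error message that may have followed it.
        return state.last_question
    if command == "representative":
        return "Transferring you to a customer service representative."  # assumed wording
    if command == "goodbye":
        return "Goodbye."  # assumed wording
    return None  # not a global command: let the current prompt handle it
```

Keeping the last question separate from the last thing spoken is what lets Repeat skip over an error message, as described in the list above.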

Special Case: Implicit Confirmation
One of the more interesting navigational scenarios in the Commerce Application occurs when the user enters a product ID after saying "Start Shopping" from the main menu. We wanted to take advantage of the Speech SDK's "implicit confirmation" feature here: if the product ID is recognized with high confidence and the recognized ID exists in the system, the application bypasses the explicit confirmation of that prompt. A typical scenario, illustrated by the flow diagram in Figure 4, might look like this:

System: If you know the three-digit product number of the item you want, say it now. If not, say browse.

User: 3 5 5 (Mumbled, recognized with low confidence).

System: I understood, 3, 5, 5, Rain Racer 2000. Is this correct?

User: No, I said 3 5 9 (Clearer, recognized with high confidence).

System: You selected product 3 5 9, Escape Vehicle (Water). How many would you like?

Figure 4: This high-level flow diagram for a typical scenario shows how the system would interact with a user to obtain and confirm a product ID number.
This scenario combines three of the Speech SDK's user input types: answers, extra answers, and confirmations. Together they make complicated flow-control situations like this one possible.
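
The decision at the heart of this scenario can be sketched in a few lines. The following Python is an illustration of the logic only and does not use the Speech SDK: the `respond_to_product_id` function, the 0.8 confidence threshold, the in-memory `CATALOG`, and the wording of the "not found" reprompt are assumptions made for the example. The two product names and the other two responses come from the sample dialogue above.

```python
# Assumed threshold; a real system would tune this against recorded calls.
CONFIDENCE_THRESHOLD = 0.8

# Hypothetical in-memory catalog holding the two products from the sample dialogue.
CATALOG = {
    "355": "Rain Racer 2000",
    "359": "Escape Vehicle (Water)",
}

def respond_to_product_id(product_id: str, confidence: float):
    """Return (next_step, prompt_text) for a recognized product ID."""
    name = CATALOG.get(product_id)
    if name is None:
        # Unknown ID: ask again regardless of confidence (assumed wording).
        return "reprompt", "I could not find that product. Please say the number again."
    if confidence >= CONFIDENCE_THRESHOLD:
        # Implicit confirmation: state what was understood and move straight on.
        digits = " ".join(product_id)
        return "ask_quantity", f"You selected product {digits}, {name}. How many would you like?"
    # Low confidence: confirm explicitly before continuing.
    digits = ", ".join(product_id)
    return "confirm", f"I understood, {digits}, {name}. Is this correct?"
```

With these assumptions, `respond_to_product_id("355", 0.4)` asks for explicit confirmation, while `respond_to_product_id("359", 0.95)` skips it and goes directly to the quantity question, mirroring the two user turns in the dialogue.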

Prompt Design
The design team found creation of a prompt specification document to be a challenge in itself. The number of paths available to the user at any one prompt leads to a complicated flowchart diagram that, while technically accurate, loses a sense of the conversation flow that the designers had worked to achieve. The design team arrived at a compromise specification that allowed them to illustrate an ideal scenario while also handling exceptions. The following example illustrates the beginning of the "Start Shopping" scenario from the main menu:

Prompt: Main Menu

Expected User Input

"Start Shopping"


System Response

Recognized Expected Input

Remember, you can start over by saying main menu. If you know the three-digit product number of the item you want, say it now. If not, say browse.

Recognized Alternate Input: "Help"

You have reached the IBuySpy store. Our store is pretty simple. If you want to shop, say start shopping. To review your previous orders, say review previous orders.

Prompt: Start Shopping

Expected User Input

"3 5 5"


System Response

Recognized Expected Input

You selected product 3 5 5, Rain Racer 2000. How many would you like?

Recognized Alternate Input: "Help"

You can place orders quickly by saying the three-digit product number. Say each digit with a clear pause between each number or enter it on your Touch-Tone phone. If you don't know the product number, say browse.

This format of specifying functionality makes it very easy to conduct "Wizard of Oz"-style testing. In this scenario, the test subject calls a tester who has the functional documents at hand. The tester acts as the system, prompting the test subject as the system would and responding to the subject's input in the same way. Trouble spots are easily identified and fixed using this style of testing.
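
One way to see why this format suits "Wizard of Oz" testing is to write the specification down as plain data and look responses up in it, which is essentially what the human tester does with the paper document. The Python below is a sketch under stated assumptions, not part of the original design: the `PROMPT_SPEC` layout, the `wizard_reply` function, and the "Quantity" prompt name are hypothetical, while the prompt texts are taken from the example specification above.

```python
# The example specification as data: prompt name -> expected input and responses.
PROMPT_SPEC = {
    "Main Menu": {
        "expected": "start shopping",
        "next": "Start Shopping",
        "responses": {
            "start shopping": ("Remember, you can start over by saying main menu. "
                               "If you know the three-digit product number of the item "
                               "you want, say it now. If not, say browse."),
            "help": ("You have reached the IBuySpy store. Our store is pretty simple. "
                     "If you want to shop, say start shopping. "
                     "To review your previous orders, say review previous orders."),
        },
    },
    "Start Shopping": {
        "expected": "3 5 5",
        "next": "Quantity",  # hypothetical name for the follow-up prompt
        "responses": {
            "3 5 5": "You selected product 3 5 5, Rain Racer 2000. How many would you like?",
            "help": ("You can place orders quickly by saying the three-digit product number. "
                     "Say each digit with a clear pause between each number or enter it on "
                     "your Touch-Tone phone. If you don't know the product number, say browse."),
        },
    },
}

def wizard_reply(prompt: str, heard: str):
    """Return (next_prompt, response_text); the text is None when the spec has no answer."""
    entry = PROMPT_SPEC[prompt]
    key = heard.strip().lower()
    text = entry["responses"].get(key)
    if text is None:
        # The spec does not cover this input: a trouble spot worth noting.
        return prompt, None
    # Only the expected input advances the conversation; alternates stay put.
    next_prompt = entry["next"] if key == entry["expected"] else prompt
    return next_prompt, text
```

Any input that comes back with no response text marks a gap in the specification, which is the kind of trouble spot this style of testing is meant to surface.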
