Buy and Sell Stocks with the Sound of Your Voice Using the .NET Speech SDK : Page 2

Some applications are even more useful when people can interact with them using nothing but a telephone. We used the .NET Speech SDK to voice-enable the existing FMStocks sample application—and learned some useful lessons along the way.

Designing the Application
The FMStocks Voice system targets advanced users who use the system frequently and are familiar with its options and navigation. With this in mind, we included advanced features that let the typical user navigate the system quickly, along with enough help and explanation that the user does not get lost.

You should select the speaking voice for your application carefully, being sure to account for both speed and personality. The recorded sentences should be spoken at a moderate-to-quick pace, since the user usually knows what will be said, yet not so quickly that the recordings are mumbled or too difficult to follow. The system voice should be businesslike and factual.

Designing a voice-only system is very different from designing a traditional GUI-based application. Whereas a Web page is a two-dimensional interface, the voice medium is one-dimensional. For example, a table of data on a Web page must be read item by item over the phone. As one designer put it, the challenge becomes, "How do you chop things up to establish a coherent flow of information? How do you express content in a way that the user can digest, understand, and then act upon?"
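
As a rough illustration of reading a table item by item, the sketch below turns each row into one spoken sentence and announces its position so the listener keeps a sense of place. It is written in Python for brevity and does not use the .NET Speech SDK; the function name, the sentence template, and the portfolio rows are all hypothetical.

```python
from typing import Dict, List


def table_to_sentences(rows: List[Dict[str, object]], template: str) -> List[str]:
    """Turn each table row into one spoken sentence, announcing its position."""
    total = len(rows)
    return [
        f"Item {i} of {total}: " + template.format(**row)
        for i, row in enumerate(rows, start=1)
    ]


# Hypothetical portfolio rows, invented for this example.
rows = [
    {"company": "Microsoft Corporation", "shares": 100},
    {"company": "Contoso", "shares": 25},
]

sentences = table_to_sentences(rows, "{company}, {shares} shares.")
for sentence in sentences:
    print(sentence)
```

Announcing "Item 1 of 2" is one possible way to replace the visual cues a table gives for free; a real prompt design would tune the wording through the sample dialogues described below.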

We started our design process by following our standard methodology of user-centered design. The 80/20 rule is a good guide: 80 percent of the users use 20 percent of the application. We focused on ideal scenarios and common user paths rather than considering exceptional cases in the preliminary stages. We acted out sample dialogues that helped us get a better sense of how a typical conversation might go.

From these sample dialogues, we began creating flow charts for each major component of the system. Figure 2 shows the high-level flow diagram for the application:

Figure 2: This application flow diagram shows the high-level flow.

In addition to the flow diagram above, several global commands are available to the user throughout the application:

  • Main Menu: Returns the user to the main menu.
  • Help: Provides the user with context-sensitive help text at any prompt.
  • Instructions: Provides instructions on basic usage of the system and on the global commands available at any point.
  • Repeat: Repeats the most relevant previous prompt. If the last prompt informed the user that the input was invalid, Repeat plays the preceding question prompt instead of repeating the error message.
  • Representative: Transfers the user to a customer service representative.
  • Goodbye: Ends the call.
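
The Repeat behavior is the subtle one in this list: an error message must not overwrite the question it relates to. A minimal Python sketch of that idea follows. It is not the .NET Speech SDK implementation; the `DialogState` class and its method names are assumptions made for illustration, and only the command names come from the list above.

```python
from dataclasses import dataclass

# Command names come from the list above; the handling logic is an assumed sketch.
GLOBAL_COMMANDS = {
    "main menu", "help", "instructions", "repeat", "representative", "goodbye",
}


@dataclass
class DialogState:
    """Hypothetical per-call state; not part of the .NET Speech SDK."""
    last_question: str = ""  # most recent question prompt
    last_spoken: str = ""    # most recent thing said, possibly an error message

    def ask(self, question: str) -> str:
        self.last_question = question
        self.last_spoken = question
        return question

    def report_invalid(self, message: str) -> str:
        # An error message never replaces the remembered question.
        self.last_spoken = message
        return message

    def repeat(self) -> str:
        # "Repeat" replays the question, not the error message.
        return self.last_question


def is_global_command(utterance: str) -> bool:
    """True if the utterance is one of the always-available commands."""
    return utterance.strip().lower() in GLOBAL_COMMANDS
```

For example, after `ask("Which company?")` followed by `report_invalid("Sorry, I did not understand.")`, calling `repeat()` returns the question rather than the apology.
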
To buy stock, sell stock, or get a quote on a stock, the user must first choose a company from a large company list. To get started, the user says either a company name, such as "Microsoft Corporation," or a partial ticker symbol, such as "M." If there is more than one match, the user enters a speech Navigation control and selects the company they want. In each of these three pages, we implement a user control called the "Selectable Navigator," which encapsulates a Statement-only QA control, a Navigator application control, and a Select command control. This user control is discussed in detail in the next section, "How It Works."
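
The selection logic described above can be sketched independently of the speech controls. The Python below is an assumed illustration, not the Selectable Navigator itself: the company table is tiny sample data, and the rule "exact company name or ticker prefix" is one plausible reading of "a company name or a partial ticker symbol."

```python
from typing import Dict, List, Tuple

# Sample data for illustration only; the real company list is much larger.
COMPANIES: Dict[str, str] = {
    "MSFT": "Microsoft Corporation",
    "MMM": "3M Company",
    "IBM": "International Business Machines",
}


def find_matches(spoken: str, companies: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (ticker, name) pairs whose name equals the input
    or whose ticker starts with it, sorted by ticker."""
    text = spoken.strip().lower()
    if not text:
        return []
    return sorted(
        (ticker, name)
        for ticker, name in companies.items()
        if name.lower() == text or ticker.lower().startswith(text)
    )


def next_step(matches: List[Tuple[str, str]]) -> str:
    """Decide what the dialog does with the match list (hypothetical step names)."""
    if not matches:
        return "reprompt"   # nothing recognized: ask again
    if len(matches) == 1:
        return "confirm"    # single match: confirm it with the user
    return "navigate"       # several matches: let the user pick from a list
```

With this sample data, saying "M" yields two candidates and leads to the navigation step, while saying "Microsoft Corporation" yields one candidate and leads straight to confirmation.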

The design team found creation of a prompt specification document to be a challenge in itself. The number of paths available to the user at any one prompt leads to a complicated flow-chart diagram that, while technically accurate, loses a sense of the conversation flow that the designers had worked to achieve. The design team arrived at a compromise specification that allowed them to illustrate an ideal scenario while also handling exceptions. The following example illustrates the beginning of the "Buy Stock" scenario from the main menu:

Prompt: Main Menu

Expected User Input: "Buy Stock"

System Response:

  • Recognized Expected Input: Please say the name of the company or spell the ticker symbol of the company that you are interested in. You may also say Main Menu, Help, or Representative at any time.
  • Recognized Alternate Input ("Help"): To help me direct your call, please say one of the following: Quotes, Buy Stock, Sell Stock, or Check Portfolio.

Prompt: Buy Stock

Expected User Input: "Microsoft Corporation"

System Response:

  • Recognized Expected Input: I understood 'Microsoft Corporation.' Is that correct?
  • Recognized Alternate Input ("Help"): Please say the name of the company or spell the ticker symbol, leaving clear pauses between letters. You may say Main Menu if you wish to cancel this transaction.

This format of specifying functionality makes it very easy to conduct "Wizard-of-Oz" style testing. In this scenario, the test subject calls a tester who has the functional documents at hand. The tester acts as the system, prompting the test subject as the system would and responding to input in the same way. This style of testing makes trouble spots easy to identify and fix.
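
Because the specification pairs each prompt with expected inputs and scripted responses, it also maps naturally onto a lookup table that a tester could follow during a Wizard-of-Oz session. The Python sketch below encodes the two prompts from the example above. The prompt text is copied from the specification; the dictionary layout, the function name, and the fallback message are assumptions.

```python
# Prompt text is copied from the specification above; the data layout
# and the function below are assumptions made for illustration.
PROMPT_SPEC = {
    "Main Menu": {
        "Buy Stock": (
            "Please say the name of the company or spell the ticker symbol of "
            "the company that you are interested in. You may also say Main "
            "Menu, Help, or Representative at any time."
        ),
        "Help": (
            "To help me direct your call, please say one of the following: "
            "Quotes, Buy Stock, Sell Stock, or Check Portfolio."
        ),
    },
    "Buy Stock": {
        "Microsoft Corporation": (
            "I understood 'Microsoft Corporation.' Is that correct?"
        ),
        "Help": (
            "Please say the name of the company or spell the ticker symbol, "
            "leaving clear pauses between letters. You may say Main Menu if "
            "you wish to cancel this transaction."
        ),
    },
}

UNSCRIPTED = "[no scripted response: note this as a trouble spot]"


def system_response(prompt: str, user_input: str) -> str:
    """Look up what the tester playing the system should say next."""
    wanted = user_input.strip().lower()
    for expected, reply in PROMPT_SPEC[prompt].items():
        if expected.lower() == wanted:
            return reply
    # The specification excerpt does not cover this input, so flag it.
    return UNSCRIPTED
```

Inputs that fall outside the script return a marker instead of a guess, which mirrors how a tester would record a trouble spot rather than improvise a response.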
