Buy and Sell Stocks with the Sound of Your Voice Using the .NET Speech SDK

Some applications are even more useful when people can interact with them using nothing but a telephone. We used the .NET Speech SDK to voice-enable the existing FMStocks sample application—and learned some useful lessons along the way.





Running the Application
Our user tests were designed with two main goals in mind:

  • Verify that the system performed well in real-life scenarios: The main goal was simply to verify that testers could manage the basic tasks that real customers would want to perform.
  • Exercise the full feature-set of the application: In addition to testing standard goals, it was important to make sure that the complete feature set of the application was tested as well. Testers were guided to parts of the system that might not necessarily be on a most-likely-path scenario, in order to make sure that the entirety of the system worked as expected.
To accomplish these goals we gave our testers scenarios that included both common tasks and edge cases designed to guide the user toward less common situations. A sample script might look like this:

TASK ONE (Researching and Buying)
1. You are considering buying shares from IBM, Microsoft or Grey Advertising, but you are not sure which one. Check the market value of each of these companies.
2. Once you know the market values, buy as many shares as you can of the least expensive stock with the money in your account.

TASK TWO (Checking a Portfolio)
1. Check your portfolio to verify that your purchase has been made.

TASK THREE (Searching for a Company)
1. You hear a hot stock tip about a company, but you can't remember the full name. You only remember that it starts with the word "American." Find companies that might match, select the correct one when you hear it, and buy ten shares. (Since you don't actually know the company, choose whichever one you want.)

TASK FOUR (Selling Stock)
1. After your purchase you want to sell all of the shares of the two holdings with the least expensive per-share cost. Look up the company.

Test subjects were given account numbers and PINs to log into their account, but otherwise were left alone to complete the tasks. Tests were repeated with a number of different test subjects and over a number of successive product revisions.

Lessons Learned
We learned a great deal about building voice-only applications through the process of building these samples. Here we note some of the major points in the areas of user testing, design, and development.

Testing: The testing and tuning phase is important in any application, but it is especially important in voice applications. We found that tuning our prompts, accept thresholds, and timeouts was key to making the application useful. Here are a few suggestions for conducting effective testing and tuning of voice-only systems.

Properly Configure Testing Equipment First: Many of our early user tests generated numerous usability problems that were due to improper configuration of the microphone. The microphone was too sensitive, picking up background noise, feedback from the speaker output, and slight utterances as user input. Users became increasingly frustrated as they found it difficult to hear a prompt in its entirety. This affected test results significantly.

Select Testers Carefully: We found that test subjects brought a variety of expectations to the testing process. Developers we used as subjects often made assumptions about the way the system was working and became confused by ambiguous prompts like, "Would you like to start shopping or review your previous orders?" They preferred more explicit choices: "Say start shopping to start shopping or review orders to review your account history." Testers with a less technical background preferred less structured prompting; they felt they were speaking with a friendlier system.

To conduct effective tests, make sure the user group you are testing matches the target user group for your application.

Design: The most important lesson in designing the application was the importance of tuning the prompts throughout development. From the first stages of implementation through user testing of the completed system, we made changes to prompts to achieve a more fluid program flow. Our experience speaking with other teams who have attempted similar projects is that this is a fundamental part of voice-only application development.

With that in mind, here are a few points that will make the tuning process much more efficient:

  • Long Prompts Don't Equal Helpful Prompts: At the outset, our design team approached the goal of a friendly interface by writing friendly text. Testing quickly revealed that verbose prompts were a serious impediment to usability. By keeping prompts short, users understood better what to do.
  • Express Sentiment with Tone/Inflection: We found that helpfulness is best expressed through intonation and inflection, rather than extra words. A prompt like, "I'm sorry. I still didn't understand you. My fault again," expresses an apologetic sentiment on paper quite well, but spoken, it becomes excessive. This prompt became, "I'm sorry. I still didn't understand you," and we let the inflection of the speaker express the emotion.
  • Build Cases For Invalid (but likely) Responses: Our tests surprised us when a majority of users answered, "Yes," to the question, "Would you like to start shopping or review your previous orders?" We realized that part of the problem was the way in which the question was asked, but still, we built in a command to accept that answer and respond helpfully.
  • Maintain a Prompt Style Guide: Design teams are used to maintaining style guides for their designs, and voice-only applications should be no exception. Having a consistent set of prompt styles and standard phrasings is paramount to creating a sense of familiarity for the user. Our team recommends an iterative process: modify the guide liberally in the early stages of a project as new cases arise. Then, toward the later stages, tweak new cases to fit the existing rules. This process should lead to a consistent user experience throughout your system.
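The "invalid but likely" case above can be sketched as a simple dialogue handler. This is an illustrative outline only; the prompt-matching names and return strings are hypothetical, not taken from the FMStocks Voice source:

```python
# Hypothetical sketch of handling an invalid-but-likely answer ("yes")
# to an either/or menu prompt. All names here are illustrative.

MENU_PROMPT = "Would you like to start shopping or review your previous orders?"

def handle_menu_answer(answer: str) -> str:
    """Map a recognized utterance to the system's next prompt."""
    answer = answer.strip().lower()
    if answer in ("start shopping", "shopping"):
        return "OK, let's start shopping."
    if answer in ("review orders", "review my previous orders"):
        return "Here is your order history."
    if answer == "yes":
        # Invalid but likely: rather than re-asking blindly, explain
        # what the system actually needs to hear.
        return ("I need to know which one. "
                "Say start shopping, or say review orders.")
    return "I'm sorry. I still didn't understand you."
```

The key point is the dedicated branch for "yes": an answer the grammar technically shouldn't accept, but that real users give often enough to deserve a helpful response.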
Development: We also made several changes to our development strategy that are worth noting here.

Necessary Modifications to the Business and Data Layers: Building a voice-only presentation layer as a replacement for a GUI necessitated a few changes to the database and business-logic layers that we didn't foresee.

Account Balance: Instead of tracking the user's account balance by recalculating it from their portfolio transactions, we added a CurrentBalance field to the Accounts table. The stored procedures Ticker_ListByTicker and Ticker_ListByCompany were modified to accept the account number as a parameter, and now return the user's account balance in addition to the matching companies.
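The CurrentBalance approach described above can be illustrated with a minimal sketch: the balance is stored alongside the account and adjusted on each trade, instead of being recomputed from the full transaction history. The field name mirrors the article; the code itself is hypothetical, not the FMStocks implementation:

```python
# Illustrative sketch (not FMStocks code): keep a running balance on the
# account record instead of recalculating it from every past transaction.
from dataclasses import dataclass

@dataclass
class Account:
    account_number: str
    current_balance: float  # maps to the Accounts.CurrentBalance field

def buy(account: Account, price: float, shares: int) -> None:
    """Debit the stored balance when shares are bought."""
    cost = price * shares
    if cost > account.current_balance:
        raise ValueError("insufficient funds")
    account.current_balance -= cost

def sell(account: Account, price: float, shares: int) -> None:
    """Credit the stored balance when shares are sold."""
    account.current_balance += price * shares
```

In a voice-only flow this matters because the balance is read back to the caller constantly ("you have X dollars available"), so a cheap single-field lookup beats an aggregate query over the transaction history.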

Limited Number of Companies: We chose only 100 of the original 7,950 companies, because we wanted to keep the grammar manageable, and we felt it was unrealistic to record over 7,000 company prompts.

Field for Grammar Names: We added a field to the TickerList table called CompanyGrammar, for creating a dynamic grammar file. This field contains a slightly normalized version of the company name so that it is easier to load into the grammar. The stored procedure Speech_PopulateCompanyGrammarField was created to automatically read in the company names, normalize the text, and populate the CompanyGrammar field.



For example:

  • J.P. Morgan & Co. → "j p morgan and company"
  • Nat'l Western Life Insur. → "national western life insurance"
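A rough sketch of the kind of normalization Speech_PopulateCompanyGrammarField might perform is shown below. The abbreviation table and rules are illustrative guesses that reproduce the two examples above, not the actual stored-procedure logic:

```python
# Hypothetical normalization, written in Python for illustration; the real
# work happens in a stored procedure. The abbreviation map is an assumption.

ABBREVIATIONS = {
    "natl": "national",
    "insur": "insurance",
    "co": "company",
    "corp": "corporation",
    "inc": "incorporated",
}

def normalize_company_name(name: str) -> str:
    """Turn a display name into speakable grammar text."""
    text = name.lower().replace("&", " and ")
    text = text.replace("'", "").replace(".", " ")  # "J.P." -> "j p"
    words = [ABBREVIATIONS.get(w, w) for w in text.split()]
    return " ".join(words)
```

Expanding abbreviations and spelling out symbols like "&" matters because the recognizer matches what callers actually say ("and company"), not what appears on a stock listing ("& Co.").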

Different Login Information: The Web version of FMStocks accepts an email address and password as its login information. Neither of these pieces of information is easily expressed in a voice context. We replaced these fields with "Account Number" and "PIN" fields, which would typically also necessitate database changes.
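The reason numeric fields work better over the phone is that they can be spoken or keyed in via DTMF and validated with a trivial digits-only check. A minimal sketch of such a check follows; the four-digit PIN length is an assumption, not something the article specifies:

```python
# Hypothetical login-input check for a voice/DTMF context.
# The 4-digit PIN length is an assumed convention, not from the article.

def is_valid_login_input(account_number: str, pin: str) -> bool:
    """Accept only all-digit values, as spoken digits or DTMF keypresses."""
    return account_number.isdigit() and pin.isdigit() and len(pin) == 4
```

An email address, by contrast, would force the caller to spell out letters and symbols like "@" and ".", which is slow and highly error-prone for a recognizer.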

For More Information
The complete documentation and source code for the FMStocks Voice application can be obtained at http://www.microsoft.com/speech.

Matt Hempey is a software developer for Vertigo Software, Inc., a San Francisco Bay Area software consulting firm. Since joining the team in November 2002, Matt has helped develop sample applications for the .NET Speech SDK. Prior to Vertigo, he was a project lead and senior developer for Takira, Inc., an online marketing provider. Matt graduated magna cum laude from Washington University in St. Louis and went on to receive a master's degree in computer science from Stanford University.