

How to Build Grammars for Speech-enabled Applications

Speech-enabled applications require specialized grammars that clearly define the types of input they're expected to parse and understand. Find out how to build grammars by walking through the process of building a grammar for an order status retrieval system that lets callers retrieve orders by voice.





Most companies you call have automated voice attendants that walk you through a series of menu choices until you are finally directed to a live person. As the automated portions of these systems gain use and popularity, they're becoming increasingly sophisticated, and even helpful. For example, voice-automated systems can now let you reset passwords, check email, or get flight information.

The good news is that as a .NET developer you already have some of the expertise needed to create these kinds of cutting-edge applications. Microsoft Speech Server lets you build complex voice-only applications in the same environment used to build today's Web-based applications. Instead of a point-and-click graphical user interface, users access information using a phone and a series of voice commands. The dialog between the user and the computer can be natural, intuitive, and in many cases more convenient for the end user.

Don't worry if you don't know anything about developing a speech-based application. To start, go to the Microsoft Speech Web site and download a free copy of the Speech Application Software Development Toolkit (SASDK). After installing the SASDK, you may want to familiarize yourself with the terminology for speech-based applications by walking through the SDK tutorial that comes with the download.

To be useful, a speech-based application needs to understand what the user says; in other words, it needs a grammar. For the computer to understand callers without a formal training process, it must have a good idea of what the caller might say. The grammar supplies that knowledge: it identifies the likely word choices a caller might use when accessing the system. This article walks you through the process of creating a grammar for a voice-only application using the SASDK.
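To make the idea concrete, here is a minimal sketch of what a grammar can look like, assuming the grammar file follows the W3C Speech Recognition Grammar Specification (SRGS) XML format. The rule name OrderStatus and the word choices are hypothetical and are not taken from the downloadable project; they simply illustrate how a grammar lists the phrases a caller is likely to say.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative SRGS grammar; the rule id and phrases are assumptions -->
<grammar version="1.0" xml:lang="en-US" mode="voice" root="OrderStatus"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="OrderStatus" scope="public">
    <!-- The caller must say exactly one of these words -->
    <one-of>
      <item>shipped</item>
      <item>pending</item>
    </one-of>
    <!-- Optional trailing word, so "pending orders" also matches -->
    <item repeat="0-1">orders</item>
  </rule>
</grammar>
```

With this rule, the recognizer would accept "shipped", "pending", "shipped orders", or "pending orders", and treat anything else as unrecognized.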

What You Need
Microsoft SQL Server 2000 or 2005, Visual Studio .NET 2003, and the Microsoft Speech Application SDK

The downloadable application that accompanies this article, HowToBuildGrammar.csproj, uses data from the Northwind database that installs by default with SQL Server 2000. This ASP.NET application uses the Microsoft.Speech.Web.UI namespace, and allows callers to access order information for pending or shipped orders. Each caller hears a welcome message and is then asked to say a contact name that identifies the company they wish to access. After the company is confirmed, callers are asked whether they want to hear information about shipped or pending orders. Based on their answer, the application retrieves a list of orders that callers can navigate through using voice commands such as Next, Previous, or Select. The start page for the HowToBuildGrammar project is Default.aspx. Because the interface is not graphical, the page contains no fancy graphics; instead it contains a series of speech-based Web controls that guide callers through the application. Each Question/Answer (QA) control on this page references a grammar file built with the SASDK's Grammar Editor.
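As a rough illustration of the kind of grammar file a QA control might reference, the sketch below covers the navigation commands mentioned above. It assumes the W3C SRGS XML format; the rule names Navigation and Command, and the optional "please", are hypothetical and do not come from the HowToBuildGrammar project.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative SRGS grammar for list navigation; names are assumptions -->
<grammar version="1.0" xml:lang="en-US" mode="voice" root="Navigation"
         xmlns="http://www.w3.org/2001/06/grammar">
  <!-- Root rule: an optional polite prefix followed by one command -->
  <rule id="Navigation" scope="public">
    <item repeat="0-1">please</item>
    <ruleref uri="#Command"/>
  </rule>
  <!-- The commands live in their own rule so other rules can reuse them -->
  <rule id="Command">
    <one-of>
      <item>next</item>
      <item>previous</item>
      <item>select</item>
    </one-of>
  </rule>
</grammar>
```

Splitting the command list into a separate rule keeps the root rule short and lets the same set of commands be referenced from other grammars in the application.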
