How to Build Grammars for Speech-enabled Applications : Page 2

Speech-enabled applications require specialized grammars that clearly define the types of input they're expected to parse and understand. Find out how to build grammars by walking through the process of building a grammar for an order status retrieval system that lets callers retrieve orders by voice.





Building Grammars with the Grammar Editor
If you download the code that accompanies this article, you can access the Grammar Explorer by opening the project file and drilling down to the Orders.grxml file located in the Grammars subdirectory. From there, you should see a list of rules associated with this grammar file. Double-click the OrderTypeRule item, and the Grammar Editor opens (see Figure 1). From here you can build, edit, and test grammar files.

Figure 1. The Grammar Editor: The figure shows the Grammar Editor opened on the OrderTypeRule item.
You build grammars by dragging controls from the toolbox on the left side of the Grammar Editor onto the design surface. Table 1 identifies the grammar controls available in the grammar toolbox. The OrderTypeRule in Figure 1 uses a group control to relate three different rule references. The rule enables the OrderTypeQA control to identify exactly what order type the caller wishes to hear. It's referenced by the OrderTypeQA control in the Default.aspx page through the grammar collection. You can see this reference if you view the HTML for Default.aspx. The reference should appear as follows:

<Grammars>
  <speech:Grammar Src="Grammars/OrderType.grxml" ID="OrderTypeQA_Grammar"></speech:Grammar>
</Grammars>

Table 1: The table lists the grammar controls available through the grammar toolbox in Visual Studio.

Phrase: Represents the actual phrase spoken by the user.
List: Used to contain multiple phrase elements that all relate to the same thing. For example, a "yes" response could be spoken as "Yes", "Yeah", or "OK".
RuleRef: Used to reference other rules through the URI property. This is useful when you want to reuse the logic in existing rules.
Group: Used to group related elements such as a List, Phrase, or RuleRef.
Wildcard: Used to specify which words in a phrase can be ignored.
Halt: Used to stop the recognition path.
Skip: Used to indicate that a recognition path is optional.
Script Tag: Used to compare the value of the Semantic Markup Language (SML) tag with the matching values.
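The controls in Table 1 correspond to elements in the underlying .grxml file, which follows the W3C Speech Recognition Grammar Specification (SRGS) XML format. As a rough, hand-written sketch (not generated by the Grammar Editor, and the rule name YesRule is hypothetical), the List and Phrase controls for the "yes" example above would map to markup along these lines:

```xml
<!-- Illustrative sketch only; the rule name "YesRule" is an assumption,
     not taken from the downloadable sample. A List control becomes a
     one-of element, and each Phrase control becomes an item element. -->
<rule id="YesRule" scope="public">
  <one-of>
    <item>yes</item>
    <item>yeah</item>
    <item>OK</item>
  </one-of>
</rule>
```

Any of the three phrases will satisfy the rule during recognition, which is exactly the behavior the List control provides on the design surface.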

The PreambleRuleRef contains a list control that identifies several phrases a caller might say at the beginning of a response. For instance, callers are prompted for the order type with the following text-to-speech prompt: "Do you need to hear about pending or shipped orders?" Callers might respond in a variety of ways: They might say, "Give me all shipped orders," or "I want the pending ones please." Either response is fine, because you can use PreambleRuleRef and PostambleRuleRef rules to represent the extra words people use in natural speech. In the example phrases listed above, "Give me all" and "I want the" represent the preamble phrases, while "orders" and "ones please" represent postamble phrases. The meat of the response is the word "shipped" or "pending," which represents the order type the caller wants to retrieve. The OrderTypeRuleRef captures this essential part of the caller's response.
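Structurally, the preamble/core/postamble pattern described above can be sketched in SRGS markup as follows. This is an illustrative hand-written fragment, not the contents of Orders.grxml; the rule names PreambleRule and PostambleRule are assumptions based on the rule references described in the text:

```xml
<!-- Illustrative sketch only; rule names are hypothetical. The preamble and
     postamble rules absorb natural-speech filler, while the one-of element
     captures the essential order type ("shipped" or "pending"). -->
<rule id="OrderTypeRule" scope="public">
  <ruleref uri="#PreambleRule"/>   <!-- e.g. "Give me all", "I want the" -->
  <one-of>
    <item>shipped</item>
    <item>pending</item>
  </one-of>
  <ruleref uri="#PostambleRule"/>  <!-- e.g. "orders", "ones please" -->
</rule>
```

The design choice here is that filler words are matched and discarded by the referenced rules, so only the order type contributes to the semantic result.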

Testing Grammars
The Grammar Editor lets you test grammars by typing a phrase into the Recognition String text box. After typing a phrase, click the Check button, and watch the Output window for results. If the grammar was located and validated successfully, you should see an XML-based string that represents the SML. So, if you were to type in the phrase, "Give me all shipped orders," the output would appear as shown in Figure 2.

Figure 2. Testing Grammars: When testing grammars, the test results appear in the Output window as an XML string that represents the SML.
The SML result returned from the speech recognizer is used to assign values to semantic items. These semantic items are declared in the Default.aspx page as part of the SemanticMap speech control. Each semantic item stores some piece of information collected from the caller. For example, the semantic item named siOrderType stores the value associated with the tag named Type. This value is used in the code when populating the DataTableNavigator speech control with customer orders.

You may notice that the SML contains an attribute named confidence. In Figure 2 this attribute contains a value of 1.00, which indicates 100 percent confidence that the result returned from the speech recognizer is accurate. The confidence score is this high because I typed in the text when testing from the Grammar Editor. If I had used a microphone to speak the order type, the confidence score would have been less than 100 percent. Exactly how much less depends on the quality of the microphone and the speaking tone and rate used by the caller.
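To give a sense of the shape of such a result, here is a hand-constructed sketch of an SML document for the phrase "Give me all shipped orders." The exact attributes and values are illustrative, not copied from Figure 2; what matters is the Type tag (whose value feeds the siOrderType semantic item) and the confidence attribute discussed above:

```xml
<!-- Illustrative sketch of an SML result; attribute names and values are
     representative, not the exact output shown in Figure 2. -->
<SML text="give me all shipped orders" confidence="1.00">
  <Type confidence="1.00">Shipped</Type>
</SML>
```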
