Designing Your Grammar
Items in your grammar files define what words and phrases are recognized. When the Speech engine matches an item from the grammar file, it returns SML, or Semantic Markup Language, which your application uses to extract definitive values from the text that the user spoke. A grammar that is too strict leaves users no flexibility in what they can say; however, too many unnecessary grammar items can lower recognition accuracy.
Preambles and Postambles
Very often, you will want to allow a generic "preamble," text said before the main item, and "postamble," text said after the main item. For instance, if the main command is "Buy Stock," you would want to allow the user to say "May I Buy Stock please."
Typically, you can use one grammar (.grxml) file for your preambles and one for your postambles. Within your other grammar rules, you can then reference the preambles and postambles by using RuleRefs.
Tip: Make the preambles and postambles generic and robust enough that you don't limit your users' experience, but keep them reasonable in size so that you don't risk lowering recognition accuracy for your main elements.
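As a rough illustration of the RuleRef approach described above, the following Python sketch builds a command rule whose preamble and postamble are optional references to rules in separate grammar files. The element and attribute names (rule, item, ruleref, repeat, uri) follow the W3C SRGS XML format on which .grxml files are based. The file names Preambles.grxml and Postambles.grxml, the rule names, and the helper build_command_rule are assumptions made for this example and do not come from the sample application.

```python
import xml.etree.ElementTree as ET


def build_command_rule(rule_id, phrase, preamble_uri, postamble_uri):
    """Return a <rule> that wraps `phrase` in an optional preamble and postamble."""
    rule = ET.Element("rule", {"id": rule_id, "scope": "public"})

    # repeat="0-1" makes the referenced preamble optional.
    pre = ET.SubElement(rule, "item", {"repeat": "0-1"})
    ET.SubElement(pre, "ruleref", {"uri": preamble_uri})

    # The main command phrase is required.
    main = ET.SubElement(rule, "item")
    main.text = phrase

    # The postamble is optional as well.
    post = ET.SubElement(rule, "item", {"repeat": "0-1"})
    ET.SubElement(post, "ruleref", {"uri": postamble_uri})
    return rule


# Hypothetical file and rule names, used only for illustration.
rule = build_command_rule(
    "BuyStock",
    "buy stock",
    "Preambles.grxml#Preamble",
    "Postambles.grxml#Postamble",
)
print(ET.tostring(rule, encoding="unicode"))
```

With a rule shaped like this, "buy stock" and "may I buy stock please" would both match, provided the referenced preamble and postamble rules contain "may I" and "please".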
Use the Grammar Editor tool (see Figure 3) to graphically set up grammar files. The basic task is to set up a text phrase or a list of phrases, and then assign a value that you want your application to use when each phrase is recognized.
Figure 3: Use the Grammar Editor tool to set up grammar files visually.
We found that the following strategies helped us in grammar development:
- Typically, if we only need to recognize that a text phrase has been matched, especially in the case of commands, we fill in the Value field with the empty string rather than a value. For example, if you want to capture when the user says "Help," the grammar can simply return the following SML:
  <SML text="help" confidence="1.0">
  <GoHelp text="help" confidence="1.0"></GoHelp>
  </SML>
  The control associated with this grammar file recognizes the phrase and returns the SML element "GoHelp"; the code-behind or client-side script then makes a decision based on which SML element was returned, rather than on its value.
- Use rule references within grammar files to avoid duplicating the same rule across different speech controls.
Tip: Make sure that any rule you want to reference is a public rule; you can set this in the properties pane.
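The element-name dispatch used for commands such as "Help" can be sketched as follows. This is a minimal Python illustration, not the ASP.NET code-behind from the sample application: it parses an SML string and chooses an action from the name of the returned element, ignoring the element's value. The handle_sml function and the handlers table are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET


def handle_sml(sml_text, handlers):
    """Run the handler registered for the first recognized child element of <SML>."""
    root = ET.fromstring(sml_text)
    for child in root:
        handler = handlers.get(child.tag)
        if handler is not None:
            # The decision depends on the element name, not on its value.
            return handler(child)
    return None  # No element we know how to handle.


sml = (
    '<SML text="help" confidence="1.0">'
    '<GoHelp text="help" confidence="1.0"></GoHelp>'
    "</SML>"
)

# Hypothetical handler table: one entry per command element.
handlers = {"GoHelp": lambda element: "show help"}
result = handle_sml(sml, handlers)
```

Because the handler is selected by element name, a command grammar item can leave its Value field empty, as described above.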
- A common grammars file is included with the Speech SDK, both in an XML file version (cmnrules.grxml) and in a smaller, faster compiled version (cmnrules.cfg). We copied the compiled version into our project and used it for commonly used grammar elements, such as digits and letters in the alphabet.
Creating Grammar Files Programmatically
Because grammar files are simply XML files, it is possible to create grammars programmatically. This was especially helpful when creating the grammar for the stock trading companies, because there were many companies and each one needed at least two grammar phrases. For instance, if the company in question is "Microsoft Corporation," we want the grammar to recognize both "Microsoft" and "Microsoft Corporation."
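The idea of emitting two phrases per company can be sketched in Python as shown below. This is an illustration under stated assumptions rather than the tool described in this article: the suffix list used to derive the short name, the rule name Companies, and the helper names are all hypothetical. The element names follow the W3C SRGS XML format. The text placed in each tag element is only a placeholder, since the exact semantic tag syntax the Speech SDK expects is not shown here; compare against a file produced by the Grammar Editor before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of suffixes that users tend to drop when speaking.
SUFFIXES = ("Corporation", "Incorporated", "Company")


def spoken_forms(name):
    """Return the full company name plus a short form without a trailing suffix."""
    forms = [name]
    for suffix in SUFFIXES:
        if name.endswith(" " + suffix):
            forms.append(name[: -len(suffix)].rstrip())
            break
    return forms


def build_company_grammar(company_names):
    """Build a grammar element with one item per spoken form of each company."""
    grammar = ET.Element(
        "grammar",
        {
            "xmlns": "http://www.w3.org/2001/06/grammar",
            "version": "1.0",
            "xml:lang": "en-US",
            "root": "Companies",
        },
    )
    rule = ET.SubElement(grammar, "rule", {"id": "Companies", "scope": "public"})
    one_of = ET.SubElement(rule, "one-of")
    for name in company_names:
        for phrase in spoken_forms(name):
            item = ET.SubElement(one_of, "item")
            item.text = phrase
            # Placeholder value: every spoken form maps back to the full name.
            tag = ET.SubElement(item, "tag")
            tag.text = name
    return grammar


grammar = build_company_grammar(["Microsoft Corporation"])
print(ET.tostring(grammar, encoding="unicode"))
```

In a real tool the company names would come from the database rather than from a literal list, and the resulting XML would be written to the grammar file.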
We created two Web pages to be used as a tool to dynamically create company grammar from the database, and also as a way to show how this can be done.
CreateCompanyGrammar.aspx: This is the main Web page for creating the company grammar, and it resides in the Tools folder. It consists mainly of a button and a text area. When you run the page and press the button, you should see either a printout of the converted XML, for debugging purposes, or an error message if there was a problem. The XML is automatically saved into the grammar file Companies.grxml, so there is no need to copy and paste the XML.
The second page is also located in the Tools folder. Once you use the CreateCompanyGrammar.aspx page, a link to this page becomes visible. The page exists simply to test the newly created company grammar file. Once you say either a company name or ticker, a DataGrid appears and shows which database matches were found.
Database Stored Procedures
There are two stored procedures and one user-defined function, installed by the database scripts, that relate to dynamic grammar creation. Each handles string manipulation, loading the companies from the database, or both, so that the grammar can be created efficiently.
Markup characters like '&' (ampersand), while common in company names, cannot be used directly within XML strings or within the grammar and prompt tools. Several string replacement functions are performed to normalize these company names for use in the grammar files.
The most common example of this is the ampersand. We replace the ampersand with the string 'amp' in the code-behind, so that the grammar text and the prompt transcriptions match. Our transcriptions in the Prompt Database also read 'amp', again to be sure they match what is being sent in by the prompt functions (see Figure 4). However, when we record the company name, we say 'and', not 'amp'.
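A minimal Python sketch of this normalization step follows. Only the ampersand replacement is taken from the text above; the other replacements the application performs are not listed here, so the REPLACEMENTS table is left as a place to add them. The function name and the sample company names are assumptions made for this example.

```python
# Replacement table: only the ampersand rule comes from the article.
REPLACEMENTS = {
    "&": "amp",
}


def normalize_company_name(name):
    """Replace markup characters so the name can be used in grammar and prompt text."""
    for character, replacement in REPLACEMENTS.items():
        name = name.replace(character, replacement)
    return name


normalized = normalize_company_name("Johnson & Johnson")
```

The same function would need to be applied both when generating the grammar and when looking up prompts, so that the two strings stay identical.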
Figure 4: Note the difference between the Transcription and the Display Text.
Special Semantic Cases
In some rare cases, the speech recognition engine cannot match a company name with its correct pronunciation. We then have to manually add an extra grammar phrase in order to correctly recognize that company. For instance, the speech engine cannot understand 'Novo-Nordisk', but will match correctly to 'No vo nor disk'. We enter a grammar element with the text 'no vo nor disk', with a corresponding value of 'Novo-Nordisk'.
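One way to keep such manual additions separate from the generated phrases is a small override table, sketched here in Python. The table layout and the function name grammar_entries are hypothetical; only the 'no vo nor disk' example comes from the text above.

```python
# Phonetic spellings mapped to the value the application should receive.
PRONUNCIATION_OVERRIDES = {
    "no vo nor disk": "Novo-Nordisk",
}


def grammar_entries(company_names, overrides=PRONUNCIATION_OVERRIDES):
    """Return (phrase, value) pairs, adding manual phonetic phrases where needed."""
    entries = [(name, name) for name in company_names]
    known = set(company_names)
    # Add an extra phrase only for companies that are actually in the list.
    entries += [
        (phrase, value) for phrase, value in overrides.items() if value in known
    ]
    return entries


entries = grammar_entries(["Novo-Nordisk"])
```

Each pair can then be written out as one grammar item, with the phrase as the item text and the value as what the application receives.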