Examining the Multimodal Speech Application
ARobot accepts commands that use the following syntax structure:
- The first character is an exclamation point (!), which indicates a new command.
- The second character is the controller ID, which is always "1".
- The third character indicates the command to perform; for example, "b" indicates a beep.
- All remaining command characters are specific to the type of command you want the robot to perform.
For example, to make ARobot beep twice, you would issue the following command:
!1b2
In this case, the command character is "b" and the "2" tells ARobot to beep twice. While this command structure is relatively simple, typing such commands is hardly a natural way to communicate with your robot. It would be much easier to say the words "Say Hello" and have ARobot turn on its green LED, beep twice, and turn off the LED.
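The syntax rules above can be captured in a small helper. The following C# sketch is illustrative only: the class and method names (ARobotCommand, Build) are hypothetical and do not come from the downloadable program.

```csharp
using System;

// Hypothetical helper (not part of the article's download) that assembles
// a command string from the parts described above.
public static class ARobotCommand
{
    private const char StartChar = '!';     // marks the start of a new command
    private const char ControllerId = '1';  // the controller ID is always 1

    // commandLetter: the action to perform, for example 'b' for beep.
    // arguments: the remaining characters, whose meaning depends on the command.
    public static string Build(char commandLetter, string arguments)
    {
        if (arguments == null)
        {
            throw new ArgumentNullException("arguments");
        }
        return string.Concat(StartChar.ToString(), ControllerId.ToString(),
                             commandLetter.ToString(), arguments);
    }
}
```

Calling ARobotCommand.Build('b', "2") produces the string "!1b2", the beep-twice command.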
Figure 5. Grammar for the Say Hello Command: The diagram shows the structure of the grammar used to identify all the valid phrases associated with the Say Hello command.
The voice-activated remote control mentioned at the beginning of this article used a training process to recognize a user's commands. While that's one method of accomplishing speech recognition, Microsoft Speech Server instead uses a grammar to identify what the user is saying. The grammar includes all the likely word choices that a user might say. For example, a user who wants ARobot to say hello could say "Say Hello," "Say Hi," or "Say Hey." The SASDK provides tools for building a grammar that identifies all the valid alternative phrases (see Figure 5).
So, after identifying all the commands that ARobot can accept (see Table 1), it is just a matter of creating a grammar file for each command and then mapping the spoken command to the one that ARobot needs.
Table 1: The table shows a list of commands that the downloadable program can accept. Each command represents a specific action or sequence of actions that ARobot needs to perform.
| Potential Spoken Command | Command(s) sent to ARobot |
| Say Hello | !1l21 --> !1b2 --> !1l20 |
| | !1r1ff --> pause for 400 ms --> !1r100 |
| | !1r101 --> pause for 400 ms --> !1r100 |
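One way to picture the mapping from a spoken phrase to a command sequence is a lookup table. The sketch below is not the article's code: CommandStep and SpokenCommandMap are hypothetical names, and it assumes that the first sequence in Table 1 (LED on, beep twice, LED off) belongs to the Say Hello command described earlier.

```csharp
using System;
using System.Collections.Generic;

// One step in a sequence: either command text to send or a pause.
public sealed class CommandStep
{
    public readonly string Text;   // command characters, or null for a pause
    public readonly int PauseMs;   // pause length in milliseconds, 0 for a command

    private CommandStep(string text, int pauseMs) { Text = text; PauseMs = pauseMs; }

    public static CommandStep Send(string text) { return new CommandStep(text, 0); }
    public static CommandStep Pause(int ms) { return new CommandStep(null, ms); }
    public bool IsPause { get { return Text == null; } }
}

public static class SpokenCommandMap
{
    // Keys are compared case-insensitively so "say hello" matches "Say Hello".
    private static readonly Dictionary<string, CommandStep[]> Sequences = CreateMap();

    private static Dictionary<string, CommandStep[]> CreateMap()
    {
        Dictionary<string, CommandStep[]> map =
            new Dictionary<string, CommandStep[]>(StringComparer.OrdinalIgnoreCase);
        // Assumed pairing: LED on, beep twice, LED off.
        map.Add("Say Hello", new CommandStep[] {
            CommandStep.Send("!1l21"),
            CommandStep.Send("!1b2"),
            CommandStep.Send("!1l20") });
        return map;
    }

    // Returns the steps for a phrase, or null when the phrase is unknown.
    public static CommandStep[] Lookup(string phrase)
    {
        if (phrase == null) { return null; }
        CommandStep[] steps;
        return Sequences.TryGetValue(phrase, out steps) ? steps : null;
    }
}
```

A timed sequence, such as the ones in Table 1 that pause for 400 ms, would place CommandStep.Pause(400) between two Send steps.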
The ProcessCommand method processes each command recognized by the speech engine. The excerpts that follow walk through its main parts.
You store connection string parameters for the serial port in the Web.config file for the speech application in an appSettings section, as shown below.
<appSettings>
  <add key="BaudRate" value="300" />
  <add key="PortNum" value="1" />
  <add key="ByteSize" value="8" />
  <add key="Parity" value="0" />
  <add key="StopBits" value="1" />
  <add key="DefaultSpeed" value="3" />
</appSettings>
You can then retrieve those settings to open a serial connection for sending a command.
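//Get our serial connection string parms
//from the Web.Config
//(the reads below are reconstructed and assume integer settings)
int nBaudRate = int.Parse(ConfigurationSettings.AppSettings["BaudRate"]);
int nPortNum = int.Parse(ConfigurationSettings.AppSettings["PortNum"]);
int nByteSize = int.Parse(ConfigurationSettings.AppSettings["ByteSize"]);
int nParity = int.Parse(ConfigurationSettings.AppSettings["Parity"]);
int nStopBits = int.Parse(ConfigurationSettings.AppSettings["StopBits"]);
_Rs232 = new Rs232();
_Rs232.BaudRate = nBaudRate;
_Rs232.PortNum = nPortNum;
_Rs232.Parity = nParity;
_Rs232.StopBits = nStopBits;
_Rs232.ByteSize = nByteSize;
//Open the serial connection (Open() is an assumed method name)
_Rs232.Open();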
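Because every value in appSettings arrives as a string, it can help to parse the settings defensively. The following sketch shows one possible approach; SerialSettings and ReadInt are hypothetical names, and the fallback behavior is an assumption rather than something the downloadable program is known to do.

```csharp
using System;
using System.Collections.Specialized;
using System.Globalization;

public static class SerialSettings
{
    // Reads an integer setting, falling back to a default when the key is
    // missing or its value is not a valid integer.
    public static int ReadInt(NameValueCollection settings, string key, int fallback)
    {
        string raw = settings[key];   // null when the key is absent
        int value;
        if (raw != null && int.TryParse(raw, NumberStyles.Integer,
                                        CultureInfo.InvariantCulture, out value))
        {
            return value;
        }
        return fallback;
    }
}
```

In the Web application you would pass the appSettings collection, for example SerialSettings.ReadInt(ConfigurationSettings.AppSettings, "BaudRate", 300). Falling back silently keeps the page working when a key is missing, but it can also hide a typo in Web.config, so logging the fallback may be worthwhile.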
Next, you construct the appropriate command sequence for whatever command was spoken. This section consists of a chain of if/else if statements; each recognized command triggers the writing of its matching set of command characters, as shown below:
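//Determine what command sequence should be sent
string command = inval.ToUpper();
if (command == "STOP") { /* write the stop sequence to the port */ }
else if (command == "BACKWARD") { /* write the backward sequence to the port */ }
else if (command == "FORWARD") { /* write the forward sequence to the port */ }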
The entire process is wrapped in a try/catch/finally block. If an exception occurs, the method writes the exception's description to the Debug window. If the spoken command was not recognized, the method closes the port and, in the finally block, sets the _Rs232 variable to null:
catch (Exception ex) { Debug.WriteLine(ex.Message); }
finally
{
    //Close our serial connection
    _Rs232 = null;
}
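The cleanup pattern described above can be sketched in a self-contained form. This is not the article's ProcessCommand method: ISerialLink is a hypothetical stand-in for the Rs232 class, its member names are assumptions, and RecordingLink exists only so the sketch can run without hardware.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for the article's Rs232 class; only the members
// this sketch needs are declared, and their names are assumptions.
public interface ISerialLink
{
    void Open();
    void Write(string text);
    void Close();
}

// In-memory implementation used to exercise the sketch without hardware.
public sealed class RecordingLink : ISerialLink
{
    public readonly List<string> Written = new List<string>();
    public bool IsOpen;
    public void Open() { IsOpen = true; }
    public void Write(string text) { Written.Add(text); }
    public void Close() { IsOpen = false; }
}

public static class CommandSender
{
    // Sends each command in order. Returns true on success and false when
    // an exception was caught. An opened link is closed in either case.
    public static bool SendAll(ISerialLink link, string[] commands)
    {
        bool opened = false;
        try
        {
            link.Open();
            opened = true;
            foreach (string command in commands)
            {
                link.Write(command);
            }
            return true;
        }
        catch (Exception ex)
        {
            // Report the failure to the Debug output.
            Debug.WriteLine(ex.Message);
            return false;
        }
        finally
        {
            // Runs whether or not an exception occurred.
            if (opened)
            {
                link.Close();
            }
        }
    }
}
```

Placing the Close call in the finally block means the port is released even when a write fails partway through a sequence, which matters because a serial port left open typically cannot be reopened by the next request.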
Of course, the most enjoyable part of this project comes when you see your robot in action. Keep in mind that this can't happen until you use a serial cable or wireless serial receiver, as described earlier, to send the commandmode.bs2 program to the ARobot. After you have done that, you can copy the Web application to a Web folder on your desktop or laptop. At that point, open a browser and browse to the start page (default.aspx) in that folder. When the page loads, you can begin speaking or typing your commands.

What you've probably recognized by now is that even though this particular article deals with creating voice control for a robot, using a multimodal application in this way gives you an alternative method of communicating with any serial device. The device could be a robot, as shown here, but it might just as easily be a Personal Digital Assistant (PDA) or any "smart" device that communicates through a serial port.