Using the IVocoder Interface to Capture Audio
The IVocoder interface gives you real-time access to the underlying hardware audio encoder and decoder. Like the IMedia interface, it's bidirectional: you can use it to both record and play audio. However, the interface is significantly more complex to use, because audio data passes continually between the interface and its client. While this makes simple tasks such as recording an audio clip more difficult (a task better suited to an IMedia instance), it's the only way to create truly interactive audio applications such as push-to-talk telephony or audio chat.
Although all BREW applications must work asynchronously (performing their work in response to either events or function callbacks invoked by interfaces or timers), the real-time nature of the IVocoder interface places additional demands on your application design, for two related reasons: the time-dependent nature of the data from the vocoder and the implementation of the vocoder itself. First, audio data is sensitive to timing; the vocoder returns data to its client in frames, self-contained units of compressed speech. If you drop a frame, you've lost a segment of audio (the result sounds like a digital cell call just as it's about to drop due to poor coverage). Frames can be dropped either because of insufficient buffering (you're not reading from the vocoder fast or often enough, or not writing to it fast or often enough) or because the underlying hardware is occupied by other things. Thus the second challenge: because the vocoder shares processing hardware, it's important to keep other processing to a minimum while using the IVocoder interface, especially on low-end hardware. As the vocoder collects audio from the microphone, it compresses the data in real time based on the options you specify, using some of the hardware's processing resources to do so. (Similarly, replaying audio requires processing to decompress the audio data in real time.)
To use IVocoder, you must:
- Create an instance of the IVocoder interface.
- Configure the instance by selecting the vocoder type along with the compression options and callbacks for data availability and data playback.
- Start and stop the vocoder via the interface according to program control.
- Release the vocoder interface instance when you're through using it.
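One way to keep these steps in order is to track the vocoder's lifecycle explicitly in your application. The following is a minimal sketch of that idea, not BREW code: the state and event names are hypothetical, and the rule that release is only allowed once the vocoder is stopped is an assumption taken from the ordering of the steps above.

```c
#include <assert.h>

/* Hypothetical lifecycle states for an application-side tracker. */
typedef enum {
    VS_NONE,        /* no instance yet */
    VS_CREATED,     /* instance created, not yet configured */
    VS_CONFIGURING, /* configuration requested, ready callback pending */
    VS_READY,       /* ready callback received, vocoder idle */
    VS_RUNNING      /* capture or playback in progress */
} VocState;

/* Hypothetical events, one per step the application performs or observes. */
typedef enum {
    VE_CREATE, VE_CONFIGURE, VE_READY, VE_START, VE_STOP, VE_RELEASE
} VocEvent;

/* Return the next state; an event that is not valid in the current
   state leaves the state unchanged. */
VocState NextState(VocState s, VocEvent e)
{
    assert(s >= VS_NONE && s <= VS_RUNNING);
    switch (e) {
    case VE_CREATE:    return s == VS_NONE ? VS_CREATED : s;
    case VE_CONFIGURE: return s == VS_CREATED ? VS_CONFIGURING : s;
    case VE_READY:     return s == VS_CONFIGURING ? VS_READY : s;
    case VE_START:     return s == VS_READY ? VS_RUNNING : s;
    case VE_STOP:      return s == VS_RUNNING ? VS_READY : s;
    case VE_RELEASE:   return s == VS_RUNNING ? s : VS_NONE;
    }
    return s;
}
```

With a tracker like this, a start request that arrives before the ready callback is simply ignored rather than being passed to an unconfigured vocoder.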
Vocoder configuration is a little trickier than configuring an IMedia interface, due to the nature of the vocoder itself. First, you must choose a vocoder algorithm (the default used by CDMA cellular networks is IS127). In addition to selecting an algorithm, you must select a data rate, that is, how much digital data is generated from the audio channel for a given unit of time. You specify the data rate in terms of the bounds your application requires (the minimum and maximum data rates), and the vocoder dynamically chooses the data rate based on the complexity of the audio stream. This process is adaptive; the vocoder will move to a higher data rate for more complex audio, and throttle back to the minimum data rate when encoding near-silence or audio with fewer compression features. In addition, some audio encoders let you specify an additional factor that allows the vocoder to further trim the size of each frame, resulting in lower bandwidth (often with only slightly diminished quality) for a given vocoder rate.
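The effect of the minimum and maximum bounds can be pictured as a simple clamp. The sketch below is only an illustration under stated assumptions: the vocoder's real rate decision is internal to the hardware, and the SketchRate names are hypothetical stand-ins, not the constants from AEEVocoder.h.

```c
#include <assert.h>

/* Hypothetical rate levels, ordered from lowest to highest bandwidth.
   These are illustrative names, not the constants in AEEVocoder.h. */
typedef enum {
    RATE_EIGHTH = 0,
    RATE_QUARTER = 1,
    RATE_HALF = 2,
    RATE_FULL = 3
} SketchRate;

/* Clamp the rate the encoder would like to use for the current audio
   to the bounds chosen by the client. */
SketchRate ClampRate(SketchRate wanted, SketchRate min, SketchRate max)
{
    assert(min <= max);
    if (wanted < min) return min;
    if (wanted > max) return max;
    return wanted;
}
```

With a minimum of RATE_EIGHTH and a maximum of RATE_HALF, a burst of complex audio that would otherwise be encoded at RATE_FULL is held to RATE_HALF, while near-silence falls to RATE_EIGHTH.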
Here is a snippet that creates a vocoder interface and configures it for a typical audio session:
if (ISHELL_CreateInstance(pMe->a.m_pIShell, AEECLSID_VOCODER,
                          (void **)&pMe->pIVocoder) != SUCCESS)
    DBGPRINTF("Error Creating IVocoder interface");
// Set necessary config data; see the API reference for field descriptions
pMe->vocConfig.needDataCB = NeedDataCB;
pMe->vocConfig.haveDataCB = HaveDataCB;
pMe->vocConfig.playedDataCB = NULL;
pMe->vocConfig.readyCB = ReadyCB;
pMe->vocConfig.usrPtr = pMe;
pMe->vocConfig.max = HALF_RATE;
pMe->vocConfig.min = EIGHTH_RATE;
pMe->vocConfig.overwrite = TRUE;
pMe->vocConfig.txReduction = 0;
pMe->vocConfig.vocoder = VOC_IS127;
pMe->vocConfig.watermark = 24;
// Configure IVocoder
status = IVOCODER_VocConfigure(pMe->pIVocoder, pMe->vocConfig,
The IVocoder interface requires four callbacks set in the vocoder configuration structure:
- The needDataCB, which the vocoder invokes when it needs additional data while playing audio.
- The haveDataCB, which the vocoder invokes when it has additional data while recording audio.
- The playedDataCB, which the vocoder invokes each time it has played one or more frames.
- The readyCB, which the vocoder invokes when it is ready to start or stop audio capture or playback.
To specify both the vocoder algorithm and data rates, Qualcomm provides constants in AEEVocoder.h; this listing caps the vocoder at half its maximum data rate and sets the floor at an eighth of its maximum data rate. The actual data rates depend on the type of codec; in this case the IS127 codec has been selected, with a maximum data rate of approximately 9 kbps, so the data rates range between about 1 kbps and 4.8 kbps.
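To get a feel for what these rates mean per frame, the arithmetic can be sketched as follows. The 20-millisecond frame duration and the round rates used in the usage note (9,600, 4,800, and 1,200 bits per second) are assumptions for illustration; the function names are hypothetical.

```c
#include <assert.h>

/* Assumed frame duration in milliseconds (an assumption for this sketch). */
#define FRAME_MS 20u

/* Bits produced per frame at a given rate in bits per second. */
unsigned BitsPerFrame(unsigned bitsPerSecond)
{
    /* Keep the multiplication well inside the range of unsigned. */
    assert(bitsPerSecond <= 1000000u);
    return (bitsPerSecond * FRAME_MS) / 1000u;
}

/* Whole bytes needed to hold one frame, rounding up. */
unsigned BytesPerFrame(unsigned bitsPerSecond)
{
    return (BitsPerFrame(bitsPerSecond) + 7u) / 8u;
}
```

Under these assumptions, BytesPerFrame(4800) gives 12 bytes and BytesPerFrame(1200) gives 3 bytes, with fifty frames arriving every second regardless of rate.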
Finally, because the vocoder interface provides some buffering of data in the event that you cannot service a callback as soon as the vocoder requires or has data, you must specify the number of frames to accumulate before engaging the audio path, and whether new or old frames should be discarded when the vocoder's buffer overflows. The example just shown specifies that twenty-four frames are required (the watermark field) and that new frames should be discarded in the event the buffer is full (overwrite is TRUE).
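The two buffering choices, a watermark and an overflow policy, are easier to reason about with a small model. The queue below is a sketch under stated assumptions and not the vocoder's actual buffer: FrameQueue, its fields, and both size constants are hypothetical, and the dropOldest flag merely models the two possible overflow policies without claiming which one a given BREW setting selects.

```c
#include <assert.h>
#include <string.h>

#define SLOT_COUNT 32   /* hypothetical queue capacity, in frames */
#define SLOT_BYTES 24   /* hypothetical maximum frame size, in bytes */

typedef struct {
    unsigned char data[SLOT_COUNT][SLOT_BYTES];
    unsigned head;       /* index of the oldest queued frame */
    unsigned count;      /* number of frames currently queued */
    unsigned watermark;  /* frames required before the audio path engages */
    int dropOldest;      /* nonzero: a full queue evicts its oldest frame */
    int engaged;         /* nonzero once the watermark has been reached */
} FrameQueue;

void FrameQueue_Init(FrameQueue *q, unsigned watermark, int dropOldest)
{
    assert(q != NULL && watermark <= SLOT_COUNT);
    memset(q, 0, sizeof *q);
    q->watermark = watermark;
    q->dropOldest = dropOldest;
}

/* Queue one frame. Returns 1 if the frame was stored, 0 if it was dropped. */
int FrameQueue_Push(FrameQueue *q, const void *frame, unsigned len)
{
    unsigned tail;

    assert(q != NULL && frame != NULL && len <= SLOT_BYTES);
    if (q->count == SLOT_COUNT) {
        if (!q->dropOldest)
            return 0;                          /* discard the new frame */
        q->head = (q->head + 1) % SLOT_COUNT;  /* evict the oldest frame */
        q->count--;
    }
    tail = (q->head + q->count) % SLOT_COUNT;
    memcpy(q->data[tail], frame, len);
    q->count++;
    if (q->count >= q->watermark)
        q->engaged = 1;
    return 1;
}
```

A higher watermark buys more tolerance for a late callback at the cost of added delay before audio starts, which matters for interactive applications such as push-to-talk.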
Once the vocoder is configured, it invokes the callback you specify in the configuration's readyCB field. At that point, you can begin using the vocoder for input or output. For input, simply invoke IVOCODER_VocInStart; for output, invoke IVOCODER_VocOutStart. In a similar vein, to stop the vocoder, use IVOCODER_VocInStop (to stop capture) or IVOCODER_VocOutStop:
static void ReadyCB(void *usrPtr)
{
    CApp *pMe = (CApp *)usrPtr;
    // Start reading in
    IVOCODER_VocInStart(pMe->pIVocoder);
    // Set vocoder variable to on
    pMe->bVocOn = TRUE;
}
This callback starts the vocoder as soon as it's ready, and sets an internal flag used to track the status of audio capture. You can also update your application's user interface in response to the callback; I like to do this asynchronously, by sending an event to my application using ISHELL_PostEvent indicating that the UI should be updated in accordance with the new state.
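The reason for posting an event rather than redrawing inside the callback is that the callback returns immediately and the slower work happens later, from the normal event loop. On BREW the shell does the queueing for you; the sketch below only models the idea with a hypothetical EventQueue so the ordering is visible, and none of its names come from the BREW API.

```c
#include <assert.h>
#include <stddef.h>

#define EVT_QUEUE_LEN 8   /* hypothetical queue depth */

typedef struct {
    int codes[EVT_QUEUE_LEN];
    unsigned head;    /* index of the oldest pending event */
    unsigned count;   /* number of pending events */
} EventQueue;

/* Called from the callback: record the event and return at once.
   Returns 1 on success, 0 if the queue is full. */
int PostUiEvent(EventQueue *q, int code)
{
    assert(q != NULL);
    if (q->count == EVT_QUEUE_LEN)
        return 0;
    q->codes[(q->head + q->count) % EVT_QUEUE_LEN] = code;
    q->count++;
    return 1;
}

/* Called from the main event loop: fetch the oldest pending event.
   Returns 1 and stores the code, or 0 if nothing is pending. */
int NextUiEvent(EventQueue *q, int *code)
{
    assert(q != NULL && code != NULL);
    if (q->count == 0)
        return 0;
    *code = q->codes[q->head];
    q->head = (q->head + 1) % EVT_QUEUE_LEN;
    q->count--;
    return 1;
}
```

The callback's cost is a single array store, and the user interface code runs only when the event loop gets around to it.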
If you are going to repeatedly start and stop the vocoder, it's a good idea to reset it between invocations to ensure that it has reset its internal buffers and state.
Once the vocoder starts, it's your responsibility to keep collecting data from it and processing the data when it invokes the callback you specified in the configuration's haveDataCB slot. (In a similar vein, if you're playing data, you need to keep up with serving data to the vocoder when it invokes the callback you passed in the needDataCB slot.) What you do with this data (or where you get it, depending on whether you're recording or playing audio) really depends on your application. For example, if you're just storing the data for playback later, you might write something like this:
static void HaveDataCB(uint16 numFrames, void *p)
{
    CApp *pMe = (CApp *)p;
    // Local types follow the IVOCODER_VocInRead prototype; check AEEVocoder.h
    DataRateType rate;
    uint16 length;
    uint16 i;
    int status;
    // Data integrity checks
    if (!pMe || !pMe->pIVocoder) return;
    // Read in each frame and store it in the file
    for (i = 0; i < numFrames; i++)
    {
        status = IVOCODER_VocInRead(pMe->pIVocoder,
                                    &rate, &length, pMe->arbyFrameData);
        // If we successfully read in data, then write it to the file
        if (status == SUCCESS)
            IFILE_Write(pMe->pIFile, pMe->arbyFrameData, length);
    }
}
These callbacks each take the number of frames available (or that can be accepted, in the case of audio playback) along with the user data pointer you specified in the vocoder's configuration. You should do as little processing as necessary in these callbacks, because they will occur quite often. In this example, I'm simply copying each frame as it arrives to a file on the flash file system; I could just as easily be buffering it in a memory buffer for analysis (perhaps run in the background using BREW's ISHELL_Resume mechanism) or sending it along a network socket to a server. Regardless, note that as a simple optimization I'm copying the frame data into a static memory region, arbyFrameData, in my application structure, rather than creating this buffer with MALLOC each time the vocoder invokes the callback. (The buffer is sized according to the constant MAXFRAMELEN defined in AEEVocoder.h.)
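The same pattern, one preallocated frame buffer and a bounded amount of work per frame, can be tried outside the handset. The following is a minimal sketch with no BREW types: CaptureState, DrainFrames, MockReadFrame, and both size constants are hypothetical, and the in-memory log merely stands in for a file or socket.

```c
#include <assert.h>
#include <string.h>

#define MAX_FRAME_BYTES 24   /* hypothetical stand-in for the SDK's frame limit */
#define LOG_BYTES 4096       /* hypothetical capacity of the capture log */

/* Hypothetical application state. */
typedef struct {
    unsigned char frame[MAX_FRAME_BYTES]; /* reused for every frame */
    unsigned char log[LOG_BYTES];         /* stands in for a file or socket */
    unsigned logUsed;                     /* bytes stored so far */
    unsigned dropped;                     /* frames that did not fit */
} CaptureState;

/* Hypothetical stand-in for reading one frame; returns 0 on success. */
typedef int (*ReadFrameFn)(void *dst, unsigned *len);

/* Mock reader for illustration: produces a 12-byte frame of 0xAB. */
int MockReadFrame(void *dst, unsigned *len)
{
    memset(dst, 0xAB, 12);
    *len = 12;
    return 0;
}

/* Drain numFrames frames, doing only a bounded copy per frame. */
void DrainFrames(CaptureState *s, ReadFrameFn readFrame, unsigned numFrames)
{
    unsigned i, len;

    assert(s != NULL && readFrame != NULL);
    for (i = 0; i < numFrames; i++) {
        len = 0;
        if (readFrame(s->frame, &len) != 0 || len > MAX_FRAME_BYTES)
            continue;            /* skip frames that failed to read */
        if (s->logUsed + len > LOG_BYTES) {
            s->dropped++;        /* count the loss; never block or allocate */
            continue;
        }
        memcpy(s->log + s->logUsed, s->frame, len);
        s->logUsed += len;
    }
}
```

Counting dropped frames instead of waiting for space keeps the callback's running time predictable, which is the property that matters most here.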
Once you're done with the vocoder, you must stop it and release the associated interface. (You can, of course, stop and restart the vocoder repeatedly without needing to release the interface.) To do this, use the appropriate Stop methods and then invoke the interface's Release method:
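// We stop incoming and outgoing data
IVOCODER_VocInStop(pMe->pIVocoder);
IVOCODER_VocOutStop(pMe->pIVocoder);
// Release the interface and clear our pointer to it
IVOCODER_Release(pMe->pIVocoder);
pMe->pIVocoder = NULL;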
You've already seen how to use the Stop methods in a previous snippet; the IVOCODER_Release method is just like that for any other interface.
Both Good Solutions
Qualcomm provides both simple and low-level access to the audio hardware on BREW-enabled handsets via the IMedia and IVocoder interfaces. If you're looking to record audio, your best bet is IMedia, which provides a simple wrapper to package audio in a standard format for later playback, for exchange with servers, or for other uses. On the other hand, if you need low-level access to the real-time audio stream, you can build an application with the correct callbacks around an instance of the IVocoder interface.