

SALT or VoiceXML For Speech Applications?

Competing speech-recognition standards, SALT and VoiceXML, are remarkably similar in what they achieve. But for the developer, there are important distinctions in how each language behaves. Microsoft's Stephen Potter details the technical and philosophical differences between the two so you can choose the right specification for your needs.

SALT and VoiceXML are both markup languages for writing applications that use voice input and/or output. Both languages were developed by industry consortia (the SALT Forum and the VoiceXML Forum, respectively), and both were contributed to the W3C as part of its ongoing work on speech standards.

So why two specifications? Mainly because they were designed to address different needs, and at different stages in the life cycle of the Web. VoiceXML arose out of a need to define a markup language for over-the-telephone dialogs—Interactive Voice Response, or IVR, applications—at a time (1999) when many pieces of the Web infrastructure as we know it today had not matured. SALT arose out of the need to enable speech across a wider range of devices, from telephones to PDAs to desktop PCs, and to allow both telephony (voice-only) and multimodal (combined voice and visual) dialogs. SALT was also designed at a time (2002) when many key Web technologies had become well established (XML, DOM, XPath, etc.).
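The difference in design philosophy shows up immediately in the markup. The snippets below are illustrative sketches only (the grammar URI, field names, and bind XPath are hypothetical): VoiceXML is a standalone dialog document, while SALT adds a small set of speech elements to an existing host page such as HTML.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- VoiceXML: the document itself is the dialog.
     A <form> with a <field> prompts the caller and
     listens against a grammar. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="askCity">
    <field name="city">
      <prompt>Which city are you calling about?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
  </form>
</vxml>
```

```xml
<!-- SALT: speech tags embedded in an HTML page.
     <salt:prompt> speaks, <salt:listen> recognizes, and
     <salt:bind> copies the result into a visual control. -->
<html xmlns:salt="http://www.saltforum.org/2002/SALT">
  <body>
    <input name="cityBox" type="text"/>
    <salt:prompt id="askCity">Which city are you calling about?</salt:prompt>
    <salt:listen id="getCity">
      <salt:grammar src="city.grxml"/>
      <salt:bind targetelement="cityBox" value="//city"/>
    </salt:listen>
  </body>
</html>
```

In the VoiceXML sketch the interpreter drives the dialog flow itself; in the SALT sketch the host page (and its script) decides when to activate the prompt and listen elements.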

I will declare my interest here: I represent Microsoft in the SALT Forum's Technical Working Group. However, I have studied SALT and VoiceXML in depth, and will use this forum to take an objective look at the two specifications, and point out the main technical differences between them in an unbiased way. You can decide for yourself which specification is most suitable for your applications. (See Sidebar: Developer Communities)
