Multimodality: Simple Technologies Drive a New Breed of Complex Application Input

The proliferation of handheld devices means that users increasingly rely on them to perform routine, daily tasks: the cell phone, with its video card and Internet access, is becoming a constant companion. With this ubiquity comes the need to make these devices smaller and more convenient to carry, and that means smaller displays and keypads.

“The age of bulky 3G handsets is over,” says Hiroshi Nakaizumi, Sony-Ericsson’s Head of Design, in the company’s latest press release touting the K600 UMTS handset, which weighs no more than an average 2G handset yet still delivers video telephony, a 1.3-megapixel camera, and high-performance download capabilities.

Phones keep getting smaller while their capabilities keep getting more complex. To support increasingly complex interaction with ever-smaller devices, you’re going to need to learn how to equip mobile applications with input modes besides a keypad or a stylus. You’re going to need to learn about multimodal development.

Multimodal applications allow users to interact with a device through more than one input mode simultaneously. They accept input through speech, a keyboard, keypad, mouse, and/or stylus (even motion sensing) and produce output through synthesized speech, audio, plain text, motion video, and/or graphics.

Kirusa’s Voice SMS (KV.SMS) is a typical example of a multimodal application, integrating voice messaging with text-based SMS and multimedia-based MMS. Using this program, users can dictate and send SMS messages using only their voices, send voice messages to phones without actually ringing the phone, click on an SMS message to hear a voice message, or respond to a voice or SMS message by voice or text. Ultimate convenience.

Another typical example of a multimodal application is the Ford Model U SUV. This car’s multimodal interface uses speech technology to allow drivers to control navigation, make phone calls, operate entertainment features such as the radio or an MP3 player, and adjust the climate control, the retractable roof, and personalized preferences.


The Sony K600 UMTS is representative of future handheld devices.

Another major application of multimodal technology lies in the special needs sector. Speech-enabled technologies, in particular, are of great help to those whose disabilities prevent them from taking full advantage of a GUI. eValues (e-library Voice Application for European Blind, Elderly and Sight-impaired) is a project to use multimodal development for the benefit of those whose disabilities present barriers to reading. This Internet-based service uses advanced text-to-speech conversion to allow blind or sight-impaired users to download any online book or document and listen to it. This capability works not only with PCs, but also with common PDAs.

How Do They Do That?
The widespread adoption of XML and derivative markup languages has, for all intents and purposes, enabled the advent of multimodal development. The existence of an independent translator for stored data frees developers from having to develop for specific devices. XML and, most significantly, VoiceXML make it remarkably easy for developers to create flexible interfaces with which to access varying clients.

The three building-block languages for multimodal development are: SALT (Speech Application Language Tags), X+V (XHTML + Voice), and EMMA (Extensible MultiModal Annotation). All three have been submitted to the W3C for consideration as standards for telephony and/or multimodal applications. Currently, all three are under consideration for the next version of VoiceXML.

SALT: This language is an extension of HTML and other markup languages (cHTML, XHTML, WML). It’s used to add speech interfaces to Web pages, and it’s designed for use with both voice-only browsers and multimodal browsers, meaning cellular phones, tablet PCs, and wireless PDAs.

Microsoft developed SALT specifically to enable speech across a wide range of devices and to allow telephony and multimodal dialogs. Because SALT uses the data models and execution environments of its host environments (HTML forms and scripting), it is more familiar to Web developers. Its event-driven interaction model is useful for multimodal applications.

However, SALT is merely a set of tags for specifying voice interaction that can be embedded into other “containing” environments. Because of this dependency on an external environment, developers using SALT may need to generate differing versions of an application for each device. For instance, an application for use on cell phones will require separate versions for Nokia and Motorola phones.

X+V: This IBM-sponsored language combines XHTML with VoiceXML 2.0, the XML Events module, and a third module containing a small number of attribute extensions to both XHTML and VoiceXML. This allows VoiceXML (audio) dialogs and XHTML (text) input to share multimodal input data.

The fact that X+V is built using previously standardized languages makes it easy to modularize, that is, to break apart its code into modes, where one mode is for speech recognition, one is for motion recognition, etc.

But using the XML Events standard is what really differentiates X+V from SALT. Whereas events drive the creation of X+V, thus defining the environment, SALT merely attaches its tags to events within a pre-existing environment. Because X+V is self-sufficient in this manner, applications written with it are generally more portable.
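The event-driven sharing of input data described above can be pictured with a rough Python analogy. This is not X+V or SALT syntax: the `SharedForm` class, the `voice-result` and `text-change` event names, and the `city` field are all hypothetical, chosen only to show two input modes writing to the same field through event handlers.

```python
from collections import defaultdict
from typing import Callable


class SharedForm:
    """A form whose fields can be filled from any input mode (hypothetical)."""

    def __init__(self) -> None:
        self.fields: dict = {}
        self._listeners = defaultdict(list)

    def on(self, event: str, handler: Callable) -> None:
        # Register a handler for a named event.
        self._listeners[event].append(handler)

    def fire(self, event: str, **data) -> None:
        # Call every handler registered for this event, in order.
        for handler in self._listeners[event]:
            handler(self, **data)


def store_city(form: SharedForm, value: str, mode: str) -> None:
    # Both modes write to the same field, so neither dialog
    # needs to know about the other.
    form.fields["city"] = {"value": value, "mode": mode}


form = SharedForm()
form.on("voice-result", store_city)   # assumed event name for speech input
form.on("text-change", store_city)    # assumed event name for typed input

form.fire("text-change", value="Bost", mode="keypad")
form.fire("voice-result", value="Boston", mode="voice")
print(form.fields["city"])
```

The last event to fire wins here; a real application would need its own policy for reconciling conflicting input from different modes.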

EMMA: This language was developed in order to provide semantic interpretations for speech, natural language text, keyboard, and ink input (a type of stylus input that includes handwriting recognition).

EMMA is a complementary language to SALT and X+V, functioning as a sort of middleman between a multimodal application’s components, that is, between a user’s input and the X+V- or SALT-based interpreter. This frees developers from having to worry about writing code to interpret user input. EMMA simply translates input into a format interpreted by the application language, greatly simplifying the process of adding multiple modes to an application.
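To make the middleman role more concrete, here is a minimal sketch of how an application might read an EMMA-style annotation using Python’s standard XML parser. The sample document is invented for illustration: the `ringtone` payload element and all of the values are hypothetical, and the namespace and attribute names (`emma:mode`, `emma:confidence`, and so on) follow the W3C EMMA vocabulary as published, so check them against whichever version of the specification you target.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C EMMA vocabulary (verify against your target version).
EMMA_NS = "http://www.w3.org/2003/04/emma"

# Hypothetical EMMA-style result for a spoken ring-tone request.
# The <ringtone> payload element is invented for this sketch.
SAMPLE = """\
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1"
      emma:medium="acoustic" emma:mode="voice"
      emma:confidence="0.82" emma:tokens="play the jazz ring tone">
    <ringtone>jazz</ringtone>
  </emma:interpretation>
</emma:emma>
"""


def read_interpretations(document: str) -> list:
    """Return one plain dict per emma:interpretation element."""
    root = ET.fromstring(document)
    prefix = "{" + EMMA_NS + "}"
    results = []
    for interp in root.iter(prefix + "interpretation"):
        # Application payload: child elements outside the EMMA namespace.
        payload = {
            child.tag: (child.text or "").strip()
            for child in interp
            if not child.tag.startswith(prefix)
        }
        results.append({
            "mode": interp.get(prefix + "mode"),
            "confidence": float(interp.get(prefix + "confidence", "0")),
            "payload": payload,
        })
    return results


for item in read_interpretations(SAMPLE):
    print(item["mode"], item["confidence"], item["payload"])
```

The application code only sees the mode, a confidence score, and the payload; whether the request was spoken, typed, or written with a stylus is reduced to one annotation it can inspect or ignore.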

More Bells and Whistles: Who Cares?
Back when basic wireless technology began to take off, analysts, journalists, and vendors alike predicted it would change the way everybody did business. It hasn’t really, except for those who were mobile in the first place (salespeople, UPS drivers, etc.). Multimodality is interesting to be sure, but should we realistically expect it to have a broad-reaching impact on the typical enterprise developer? In other words, you’ll soon be able to tell your iPod to play a certain song while driving in your car without taking your hands off the wheel. So what?

To get our arms around that question, we need to look a little closer at the way mobilized application development proliferated. Earlier this year at SpeechTEK, Intel’s Peter Gavalakis outlined three reasons for the lack of wireless adoption in the enterprise: cost, lack of infrastructure, and lack of standardization.

In fact, it is the cost of standardization and the cost of infrastructure that prevented wireless penetration in the enterprise.

To understand why this is so, it’s important to look at wireless adoption outside the United States. While wireless adoption in the enterprise has been slow in the United States, internationally it has not. “Poorer countries have a higher wireless penetration,” says former W3C Multimodal Working Group member and EMMA author Roberto Pieraccini. This is because poorer countries didn’t have the money for traditional wired infrastructures in the first place.

In countries such as China (which is the second largest mobile market in the world), mobility and multimodality have been adopted rather quickly. Obviously, it is more cost effective for a country to develop global satellite systems in order to accommodate a wireless business culture than develop a wired infrastructure at this late date.

However, greater wireless penetration is not limited to the poorer countries. “Last year,” says Pieraccini, “cell phones outnumbered landlines in Europe.” What’s the reason for this? “Europe adopted the GSM standard,” says Pieraccini, whereas in the United States, phone manufacturers use different standards, leaving many phones unable to interoperate. Thus, each company has an interest in seeing a standard adopted only if it’s the standard they currently use.

Not to worry, theorizes Pieraccini, who invokes Wi-Fi as a potentially “disruptive technology,” capable of eliminating these cost, standardization, and infrastructure issues altogether. If Wi-Fi allows you to use the Internet and VoIP, to talk to anyone you want anywhere for only cents a minute, why do you need your telephone or your cell phone?

The Future of Multimodality
If multimodality can render your phone and your cell phone obsolete, the barriers to total wireless penetration disappear. Will such an event prompt the mobilization phenomenon to finally impact the enterprise with crushing urgency?

“We don’t know, we’re still in the early adoption phase,” says Pieraccini, speaking of multimodal adoption within the framework of Geoffrey Moore’s Chasm Theory. Essentially, the Chasm Theory states that the technology adoption life cycle differs from other adoption cycles because of a “chasm” between the early adopters of the product (the technology enthusiasts and visionaries) and the early majority (the pragmatists). Early adopters may embrace a technology, but that technology may never cross the chasm to the early majority. Multimodal applications have yet to demonstrate significant appeal to the early majority.

Perhaps the reason that wireless development, and thus multimodal development, has had such a hard time trenching first through the youth and niche markets is that wireless and multimodal have generated their own, correlative chasm. The chasm, in this instance, is not one of market awareness, but of developer knowledge and confidence. Because this type of technology is so complicated, projects that “start big usually fail,” says Pieraccini. This warns developers to begin with small, uncomplicated applications: a simple voice recognition app that allows you to select a ring tone, for instance.

SALT, X+V, and EMMA are three nascent, sometimes complementary, languages, all looking to be standardized in the multimodal area, and they’re good places to start. Fear of a lack of standardization is not a reason to avoid getting familiar with the other aspects of multimodal development: application design, architecture, and testing are all aspects of programming that won’t change, even if the language you choose becomes obsolete. When it comes time to develop multimodal applications for global deployment, you’ll need to know which languages comply with which standards, where, and on what devices.

It’s important to remember that mobile and multimodal development have taken off in a big way in the global marketplace. And this trend, combined with the capabilities provided by XML abstraction, has made multiple inputs an obvious destination for a wide range of applications.
