Voice Extensible Markup Language (VoiceXML) Technology and Geospatial Internet Applications
As stated above, voice browsers allow users to interact with telephony applications
in much the same way that graphical browsers allow users to interact with standard
web applications. By extension, VoiceXML is analogous to HTML, the markup language that
developers use to target graphical browsers: VoiceXML is a language for building web
applications for the telephone.
VoiceXML Browser
A typical VoiceXML browser runs on a workstation with voice cards serving as trunk
interfaces and media processors. An IP network interface card (NIC) connects the
platform to a network, which may itself be connected to an intranet or the Internet.
Voice Forms
Voice forms, or voice dialogs, are the essential building blocks of a VoiceXML
application. A form consists of a set of prompts and fields, which serve to collect input
from the user. Once a form has been filled, the voice browser submits the values to the
web or application server. This is very similar to the way a graphical browser submits
the common HTML form used in traditional web-based applications.
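To make the form structure concrete, here is a minimal sketch of a VoiceXML 2.0 form with two fields and a submit step, embedded in a short Python script that parses it with the standard library. The form id, field names, prompts and the example.com URL are all hypothetical and chosen only for illustration; the Python helper is not part of VoiceXML, it simply reports what the voice browser would collect and where it would send it.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical form: ids, field names, prompts and the URL are illustrative.
FORM_DOC = """
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="find_place">
    <field name="zip_code" type="digits">
      <prompt>Please say or key in your five digit zip code.</prompt>
    </field>
    <field name="category">
      <prompt>Say restaurant, gas station, or hotel.</prompt>
      <option>restaurant</option>
      <option>gas station</option>
      <option>hotel</option>
    </field>
    <block>
      <submit next="http://example.com/nearest" namelist="zip_code category"/>
    </block>
  </form>
</vxml>
"""


def summarize_form(doc: str) -> dict:
    """Return the fields a form collects and where it submits them."""
    ns = {"v": VXML_NS}
    root = ET.fromstring(doc)
    form = root.find("v:form", ns)
    submit = form.find(".//v:submit", ns)
    return {
        "form": form.get("id"),
        "fields": [f.get("name") for f in form.findall("v:field", ns)],
        "submit_to": submit.get("next"),
        "namelist": submit.get("namelist").split(),
    }


summary = summarize_form(FORM_DOC)
print(summary)
```

The fields play the role of HTML input elements, and the submit element plays the role of the HTML form action.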
Input/Output
User input is provided through speech and Dual-Tone Multi-Frequency (DTMF), or
“touch-tone”, key input. To be used by an application, speech input is either
digitally recorded and preserved as raw audio or, more likely, processed using speech
recognition technology. To define the speech that a voice dialog will accept as input,
VoiceXML requires the definition of what is known as a “speech grammar”.
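As an illustration of what a speech grammar constrains, the sketch below embeds a small grammar written in the W3C XML grammar format (SRGS) and uses Python to list the phrases it accepts. The rule name and phrases are hypothetical. The accepts function is only a toy stand-in for a real speech recognizer, which matches audio rather than text, and it handles only a flat list of alternatives.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"

# Hypothetical grammar: the rule id and the phrases are illustrative.
GRAMMAR_DOC = """
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="category">
  <rule id="category">
    <one-of>
      <item>restaurant</item>
      <item>gas station</item>
      <item>hotel</item>
    </one-of>
  </rule>
</grammar>
"""


def accepted_phrases(doc: str) -> list:
    """List the alternatives in the root rule of a flat one-of grammar."""
    ns = {"g": SRGS_NS}
    root = ET.fromstring(doc)
    rule = root.find(f"g:rule[@id='{root.get('root')}']", ns)
    return [item.text.strip().lower()
            for item in rule.findall("g:one-of/g:item", ns)]


def accepts(doc: str, utterance: str) -> bool:
    """Toy check: is the recognized text one of the grammar's phrases?"""
    normalized = " ".join(utterance.lower().split())
    return normalized in accepted_phrases(doc)


print(accepted_phrases(GRAMMAR_DOC))
print(accepts(GRAMMAR_DOC, "Gas Station"), accepts(GRAMMAR_DOC, "museum"))
```

Anything outside the grammar, such as "museum" here, is rejected, and the dialog would typically prompt the caller again.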
Likewise, audio output is provided in one of three ways: through the playback of
pre-recorded digital material, from a live, dynamically generated source through
streaming technology, or by generating synthetic audio from textual input, which is often
termed “Text-to-Speech” (TTS).
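Two of these output methods can be sketched in a few lines. In the hypothetical document below, the first prompt plays a pre-recorded file (the file name welcome.wav is an assumption) and carries text that the browser can speak through TTS if the file cannot be played, while the second prompt is plain text rendered by TTS. Streaming sources are not covered. The Python helper only reports which method each prompt would use.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical prompts: the audio file name and wording are illustrative.
PROMPT_DOC = """
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="greeting">
    <block>
      <prompt>
        <audio src="welcome.wav">Welcome to the locator service.</audio>
      </prompt>
      <prompt>The nearest restaurant is two blocks north.</prompt>
    </block>
  </form>
</vxml>
"""


def output_plan(doc: str) -> list:
    """Return (method, audio source, spoken or fallback text) per prompt."""
    root = ET.fromstring(doc)
    plan = []
    for prompt in root.iter(f"{{{VXML_NS}}}prompt"):
        audio = prompt.find(f"{{{VXML_NS}}}audio")
        if audio is not None:
            # Recorded audio; the element text is the TTS fallback.
            plan.append(("recorded", audio.get("src"),
                         (audio.text or "").strip()))
        else:
            # No audio element: the prompt text is synthesized.
            plan.append(("tts", None, "".join(prompt.itertext()).strip()))
    return plan


for step in output_plan(PROMPT_DOC):
    print(step)
```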
Hyperlinks and Client-Side Scripting
Two other important features of VoiceXML are its support for moving to a new voice
dialog and its ability to specify scripts that the voice browser processes before
submitting the form results to the web server. Hyperlinks, which are quite similar to their
HTML-based counterparts, simply transfer execution to a different voice
dialog on the same or a different VoiceXML page. VoiceXML’s support for
ECMAScript (the standardized form of JavaScript) allows the voice dialog to perform
conditional logic and decision-making steps for purposes such as controlling program
execution or validating data.
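The following sketch shows how these two features might appear together. The hypothetical document defines an ECMAScript function in a script element, a link that returns the caller to the main menu, and a field whose filled handler validates the input and either moves to another dialog or clears the field so the caller is asked again. The function name isValidZip, the dialog ids and the phrases are assumptions made for illustration. The Python helper merely lists the transitions and conditions it finds; it does not execute the dialog.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical dialog: ids, the script function and phrases are illustrative.
DIALOG_DOC = """
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <script><![CDATA[
    function isValidZip(z) { return z.length == 5; }
  ]]></script>
  <link next="#main_menu">
    <grammar type="application/srgs+xml" root="menu" version="1.0">
      <rule id="menu">main menu</rule>
    </grammar>
  </link>
  <form id="main_menu">
    <field name="zip_code" type="digits">
      <prompt>Please key in your zip code.</prompt>
      <filled>
        <if cond="isValidZip(zip_code)">
          <goto next="#confirm"/>
        <else/>
          <clear namelist="zip_code"/>
        </if>
      </filled>
    </field>
  </form>
  <form id="confirm">
    <block><prompt>Thank you.</prompt></block>
  </form>
</vxml>
"""


def local(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def navigation_summary(doc: str):
    """Collect dialog transitions and script conditions in document order."""
    root = ET.fromstring(doc)
    transitions = [(local(el.tag), el.get("next"))
                   for el in root.iter() if local(el.tag) in ("link", "goto")]
    conditions = [el.get("cond") for el in root.iter(f"{{{VXML_NS}}}if")]
    return transitions, conditions


transitions, conditions = navigation_summary(DIALOG_DOC)
print(transitions)
print(conditions)
```

Because the validation runs in the voice browser, an invalid entry is caught before any request is sent to the web server.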
Geospatial VoiceXML Applications
Currently, mobile phones and VoiceXML technology have been combined to provide
users with location-enabled tools that can perform tasks such as
finding the nearest restaurant or other point of interest based on supplied location
information. Location-aware and GPS-equipped phone units will make this type of
application even more useful and convenient over time. It is a fairly simple
process to put a VoiceXML front-end on an existing application to provide
voice access to an enterprise’s database or services, and this is likely being done at
some scale already.
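The server-side logic behind a "nearest point of interest" request can be sketched as follows, assuming the caller's location has already been resolved to a latitude and longitude. The points of interest and coordinates are made-up sample data, the function names are hypothetical, and great-circle (haversine) distance is used here as one simple choice; a production system would more likely query a spatial database and account for the street network.

```python
import math
from xml.sax.saxutils import escape

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_poi(lat, lon, pois):
    """Return (name, distance_km) of the closest point of interest."""
    name, plat, plon = min(
        pois, key=lambda p: haversine_km(lat, lon, p[1], p[2]))
    return name, haversine_km(lat, lon, plat, plon)


def nearest_prompt(name, km):
    """Render the answer as a VoiceXML prompt for the voice browser."""
    return (f"<prompt>The nearest restaurant is {escape(name)}, "
            f"about {km:.1f} kilometers away.</prompt>")


# Made-up sample data: (name, latitude, longitude).
SAMPLE_POIS = [
    ("Diner A", 39.7400, -104.9900),
    ("Cafe B", 39.7500, -105.0000),
    ("Grill C", 39.7000, -104.9500),
]

name, km = nearest_poi(39.7392, -104.9903, SAMPLE_POIS)
print(nearest_prompt(name, km))
```

The existing application supplies the search; the VoiceXML layer only has to turn the result into a prompt.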
In the utility industry, issues such as work management, scheduling and dispatch, and
resource management often involve both voice-based and field/mobile client delivery
mechanisms. Using standard telephone equipment and network-distributed thin-client
applications, employees and associates can use VoiceXML applications to access
information and report activities. Collecting data from the field through simple
voice-activated commands or direct spoken input should improve both
productivity and accuracy.
The impact that VoiceXML will have on geospatial internet and intranet applications in the
coming years is hard to predict precisely, but it will most likely be significant. The
ability of mobile users to access sophisticated applications and enterprise databases through
the phone will allow far greater leverage of spatial resources. The cost of
implementing automated and interactive voice systems should fall as the skill sets and
resources needed to build voice-based applications become more readily
available. The capability of internet technology to distribute enterprise
information efficiently and effectively will be magnified by VoiceXML’s ability to
extend its benefits beyond traditional desktop and mobile clients. There are far more
phones in the world than there are computers. The explosion of
wireless phones, particularly outside of North America, has created the most pervasive
potential client platform in history. VoiceXML will allow any ordinary telephone, wherever
it is located, to be used to access Internet services, and will be a great
benefit to anyone with visual or manual impairments, as well as anyone who needs to
access an application while keeping their hands and eyes on other tasks, such as driving
or operating a piece of machinery.
VoiceXML brings the advantages of web-based development and content delivery to
interactive voice response applications, and is already significantly affecting industries
and sectors such as transportation, banking and media. The impact it will have on the
geospatial industry will no doubt be felt for many years to come.