Speech recognition and mobile field solutions
Why speech technology?
The traditional mobile computing data collection process is slow, cumbersome and error-prone. Even
with software specifically developed for field applications, the gains in efficiency are incremental at best.
The problem is the computer itself. You have to constantly stop and look down at the computer and take
your hands away from whatever you're doing to enter data. With speech technology you keep your hands
free and your eyes up.
Speech has been viewed as the next major breakthrough in mobile computing. Hands-free, eyes-up data
entry is seen as the next logical evolution in field automation. Speech-to-database technology offers an
innovative system using proven and emerging technologies for accurate and rapid field data and asset
information collection and maintenance. These technologies include: Speech Recognition, GPS, Multi-Media
Data Collection, GIS, Mobile Communications, and Mobile Computers.
Most organizations are familiar with speech recognition technology. This may include speech engines for
speech commands, hands-free dialing, and direct dictation tools. But field use is more than speech-to-text
or even speech commands. The answer lies in relating spoken information to relational and spatial
databases along with GPS coordinates, digital photos, .wav files, laser measurements and other related
information from the field. The system must be speaker independent, requiring no speech training, and
must include audio prompting.
While mobile computers are flourishing, they are not living up to their full potential or the extensive
needs of users. The objective is to fully empower the mobile computing environment with a natural,
easy-to-use interface that is compatible with the types of activities performed in the field. That objective
has not been met, and will not be until the interface is truly mobile and uninhibiting: namely, speech.
By nature, field activities are physical and do not lend themselves easily to traditional computing
interfaces, which require the full attention of the user’s mind and body—eyes are focused on a screen,
hands are positioned on a keyboard. This may be acceptable for workers sitting at desks, with source
materials organized around them, but in the field, it is awkward, distracting and counterproductive.
By utilizing speech as the computer interface, field professionals are freed from the burdensome limitations
of desk-based interfaces and can use their computers to enhance their jobs and improve their
productivity. The ability to configure systems quickly and easily to handle unique data collection
requirements using speech-enabled applications is the next generation in speech-to-data applications.
Speech offers a natural and intuitive way of interfacing with a computer that lends itself very well to
solving the problems of field-based computing. The computer becomes a useful tool instead of a
distraction. Field professionals’ jobs are complex and physical and they can't afford to spend time each
day working with a temperamental computer. With speech-enabled applications, users have achieved all
the office-based efficiencies of other mobile GIS applications, along with significant
improvements in speed, accuracy, efficiency and safety.
With hands free and eyes up, operators can enter data while walking, driving or riding a bike. They can
input information while they're making a repair or tracing cables through a connection box. Integrated
with GPS or other devices, they can capture location and other measurements automatically. Field
professionals routinely realize order-of-magnitude improvements in productivity, with corresponding cost
savings. Speech-to-database technology makes it just as easy and efficient to record changes and maintain
the integrity of the database over time. The long-term value of a dispatch system and its databases lies in
assuring the currency of the data. With an intuitive interface and direct integration with the database, it
becomes much easier to keep the data accurate, current and immediately accessible to workers in the
field.
Speech recognition enables the use of non-computing devices such as a cell phone for interaction with
server-based applications. The user is able to dial into the server and use speech recognition to retrieve
service orders, update status, and close out orders. The enterprise can choose the best device, from a laptop
computer to a cell phone, to meet its field requirements. Costs may be reduced by increasing productivity
and optimizing the tools in the field based on the job and data requirements.
Speech-To-Database Technology
Speech recognition solutions are being used by organizations across the United States and Canada, such as
Florida Power and Light (FPL), Cingular Wireless, and BellSouth, for rapid and reliable speech-based
data entry to improve field productivity.
Speech recognition gives users the ability to collect field data faster and with greater accuracy
than traditional methods. Speech-to-data capture software records speech input, processes it into a
structured format, can combine it with GPS location information, and delivers a standard database file that
is easily uploaded to any relational or spatial database. Speech technology provides a complete set of modular
tools that are used together to create an adaptable system suited to any data input requirement, in a world
where flexibility and adaptability are required.
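
The capture steps described above (recognized speech, a structured record, a GPS position, a database
row) can be illustrated with a minimal sketch. It is not any vendor's implementation. It assumes the
recognizer has already returned plain text, that utterances follow an "object attribute value" pattern,
and that SQLite stands in for the target database. The vocabulary, table name and function names are
hypothetical.

```python
import sqlite3

# Hypothetical vocabulary: the attributes the grammar accepts for each object.
VOCABULARY = {
    "pole": {"type", "condition"},
    "valve": {"size", "status"},
}


def open_database(path=":memory:"):
    """Open (or create) the observations table. SQLite is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "object TEXT, attribute TEXT, value TEXT, lat REAL, lon REAL)"
    )
    return conn


def parse_utterance(text):
    """Split recognized text of the form '<object> <attribute> <value...>'."""
    words = text.lower().split()
    if len(words) < 3:
        raise ValueError(f"expected object, attribute and value: {text!r}")
    obj, attribute, value = words[0], words[1], " ".join(words[2:])
    if attribute not in VOCABULARY.get(obj, set()):
        raise ValueError(f"unknown object/attribute: {obj} {attribute}")
    return obj, attribute, value


def store_observation(conn, text, lat, lon):
    """Parse one utterance and store it together with its GPS position."""
    obj, attribute, value = parse_utterance(text)
    conn.execute(
        "INSERT INTO observations (object, attribute, value, lat, lon) "
        "VALUES (?, ?, ?, ?, ?)",
        (obj, attribute, value, lat, lon),
    )
    conn.commit()
    return obj, attribute, value
```

Rejecting utterances that fall outside the vocabulary keeps misrecognized input out of the database,
which is one plausible way to get the accuracy benefit the text describes.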
Speech recognition over a cell phone is deployed at many industry leaders in the US and Canada. It allows
field personnel to receive information on dispatch, provisioning, and directions, or to input information to
status and close out orders. The field technician, using a standard cell phone or landline phone, dials
into a company phone number. The server answers the call and prompts the technician for their
name and password. Using a voice verifier, the system can provide an additional level of security by using
the unique voiceprint of the person, much like a fingerprint. Once the technician has completed the security
step, the system asks what they would like to do. Using rich prompts and embedded help, the user
is able to have work dispatched, retrieve required information, run remote tests, and complete orders.
Speaker independence means no voice training is needed, and accents and voice inflections do not affect
the performance of the system. The system is optimized for high background noise so that a hands-free visor
microphone may be used in the vehicle even at highway speeds.
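
The call flow above (name, password, then a menu with embedded help) can be sketched as a small state
machine. This is a simplified illustration only. It assumes the recognizer hands the application plain
text for each utterance, and the user table, order list, prompts and command phrases are all hypothetical.
A real deployment would query the dispatch server and a voice verifier.

```python
from dataclasses import dataclass, field

# Hypothetical credential store; stands in for the server's security check.
USERS = {"pat jones": "4 7 1 1"}


@dataclass
class CallSession:
    state: str = "ASK_NAME"
    name: str = ""
    # Hypothetical service orders keyed by order number.
    orders: dict = field(default_factory=lambda: {"1001": "open"})

    def handle(self, heard):
        """Take one recognized utterance and return the next audio prompt."""
        heard = heard.strip().lower()
        if self.state == "ASK_NAME":
            self.name = heard
            self.state = "ASK_PASSWORD"
            return "Please say your password."
        if self.state == "ASK_PASSWORD":
            if USERS.get(self.name) == heard:
                self.state = "MENU"
                return "What would you like to do?"
            self.state = "ASK_NAME"
            return "Sorry, that did not match. Please say your name."
        if self.state == "MENU":
            if heard == "help":
                return "You can say get order or close order, then the order number."
            words = heard.split()
            if len(words) == 3 and words[1] == "order" and words[2] in self.orders:
                number = words[2]
                if words[0] == "get":
                    return f"Order {number} is {self.orders[number]}."
                if words[0] == "close":
                    self.orders[number] = "closed"
                    return f"Order {number} is now closed."
            return "Sorry, I did not understand. Say help for options."
        raise RuntimeError(f"unknown state {self.state}")
```

Every branch returns a spoken prompt, so the caller never needs to look at a screen. A failed password
sends the session back to the name prompt rather than ending the call.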
FPL has fielded speech recognition as part of their Storm Recovery response strategy. FPL personnel,
foreign crews, and contract staff equipped with a cell phone are able to call in from the field and report
damage from the storm directly into the trouble management server. Damage, location and needed
response are collected using speech recognition over a cell phone. Once the damage has been repaired,
crews may call in and report on the repairs completed, so service is restored faster and with less manpower
than in prior storms. The system has been deployed and used under storm conditions. FPL extended the
system from its original use to include the retrieval of enterprise information when other systems were not
available during the storm.
Cingular Wireless has deployed speech recognition for management of trouble tickets in Remedy. The
need for field technicians to call into the dispatch center was reduced from 6 calls per day to fewer than 2,
providing a return on investment in less than 6 months.
BellSouth has deployed speech recognition for installation field technicians to report on work completed
and to perform remote testing through a speech-cell phone interface. An application for supervisor
personnel to call in and retrieve information on technician location and job status is in development.
Mobile Gas in Alabama is using speech recognition over a cell phone to allow field personnel to retrieve
service orders and close service orders. Accents and speech inflections have not caused any issues with
the system. Real-time update and access from the field are providing productivity gains.
SoCal Edison uses speech recognition and a pen computer equipped with GPS to let field personnel
speak information on light poles, identifying the location, type, and condition of its more than two million
light poles. The field technician walks or drives to the pole and speaks the object, attribute, and value
information for the pole. The speech is recognized, the GPS position is correlated to the speech, and the
information is loaded directly into the GIS for later analysis. Day burners, burned-out lights, and
lights that are not in the system are identified, so that revenue items as well as cost items are flagged for
action. Data collection speed improved by over 60% compared with SoCal Edison’s traditional approach, and
back-end processing of the data improved by over 900%, resulting in a return on investment in less than 4
months.
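
One plausible way to correlate GPS with speech, as described above, is to match each utterance to the
GPS fix closest to it in time. The sketch below assumes that both the recognizer and the GPS receiver
timestamp their output in seconds on a shared clock. The function name, the tuple layout and the
5-second staleness limit are hypothetical choices, not details from the SoCal Edison system.

```python
from bisect import bisect_left


def nearest_fix(fixes, spoken_at, max_gap=5.0):
    """Return the (time, lat, lon) fix closest in time to an utterance.

    fixes must be sorted by time. Returns None when the closest fix is
    more than max_gap seconds away, so a stale position is never used.
    """
    if not fixes:
        return None
    times = [fix[0] for fix in fixes]
    i = bisect_left(times, spoken_at)
    # Only the fix just before and the fix just after can be the closest.
    candidates = fixes[max(i - 1, 0): i + 1]
    best = min(candidates, key=lambda fix: abs(fix[0] - spoken_at))
    return best if abs(best[0] - spoken_at) <= max_gap else None
```

Returning None for a stale fix lets the application prompt the technician to wait for a position
instead of silently attaching the wrong coordinates to a pole.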
Speech recognition uses proven and emerging technologies for accurate and rapid field data and asset
information collection. These technologies include:
- Speech Recognition: A system for converting a person's speech input into words and commands
in a computer. The speech can be either live or recorded.
- GPS: A constellation of satellites that broadcast position data to a GPS (Global Positioning
System) receiver. DGPS is Differential GPS and refers to a system using a reference site with a
known location to "correct" other GPS receivers in the field. This approach can easily improve
accuracy to less than 1 meter in error.
- GIS: A system for maintaining, displaying, and analyzing geographically oriented data. Speech
recognition is used with a GIS (Geographic Information System) such as ArcView to display and manage
spatial data and use the inherent GIS capabilities.
- Mobile Data Communications: A radio based or cellular system for transferring computer data
between the field and office. Speech recognition can be used with mobile communications for
transferring data in real time back to the office.
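
The differential correction mentioned in the GPS entry above can be illustrated with a deliberately
simplified sketch. It assumes the correction is applied directly to latitude and longitude: the
reference site compares its measured position with its known position, and the same offset is removed
from the field receiver's reading. Real DGPS services typically correct the individual satellite range
measurements instead, so treat this only as an illustration of the idea. The function name is
hypothetical.

```python
def dgps_correct(reference_known, reference_measured, rover_measured):
    """Apply a position-domain differential correction.

    Each argument is a (lat, lon) pair in decimal degrees. The error seen
    at the reference site is assumed to affect the rover equally.
    """
    error_lat = reference_measured[0] - reference_known[0]
    error_lon = reference_measured[1] - reference_known[1]
    return (rover_measured[0] - error_lat, rover_measured[1] - error_lon)
```

The assumption that both receivers see the same error only holds when the field receiver is reasonably
close to the reference site and tracking the same satellites.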
So what is new? The market is familiar with voice recognition technology. Speech recognition today does
more than speech-to-text or even voice commands. Speech recognition can enter spoken information into
relational and spatial databases along with digital photos, .wav files, laser measurements, and other
related information. The system is speaker independent, requiring no voice training, and includes audio
prompting.
With the advances in mobile computing and the resulting portability, the devices continue to get smaller.
The major obstacle to true mobile computing is input/output. Speech recognition addresses the issue of
input/output.
Speech recognition may be packaged to address different business needs, ranging from asset inventory and
field data collection to completing routine reports and forms.