Speech recognition and mobile field solutions
Steve Bourgeois
Datria Systems, Inc.
7211 South Peoria Street, #260
Englewood, CO 80112

Introduction

Telecommunications companies and utilities are using mobile computing with PDAs, new wearable computers, and cell phones for faster customer response, quicker outage management, storm recovery, and improved customer service. Speech recognition as the user interface and lightweight wearable computers have progressed to the point where proven solutions are now on the market, allowing hands-free, eyes-up speech data collection and data entry, whether collecting asset location and condition or inputting information through a defined forms template. Combining speech and a cell phone with Internet connectivity, dispatching, and routing allows real-time mobile dispatching and access by the field for reporting. This paper discusses speech recognition and wearable computing/cell phone applications in the telecommunications and utility markets as a critical component of an enterprise-wide information management system that creates a competitive advantage. It also discusses specific examples of how companies such as BellSouth, Cingular Wireless, Florida Power and Light, Mobile Gas, and SoCal Edison are using speech recognition as part of their mobile workforce strategy with real business benefits.

Abstract

As organizations strive to become more efficient, they continue to explore and apply advanced technologies to improve their operational processes and become more responsive to their customers and constituents. They invest millions in powerful enterprise systems, including Relational Database Management Systems (RDBMS), Work Force Automation Systems (WFA), Enterprise Resource Planning (ERP), and Geographic Information Systems (GIS). They expect to leverage these complex, data-intensive systems to achieve better management and maintenance of physical assets and improved operational efficiencies within their business processes.
However, these systems are meaningless without data and without extension of the information into the field. The enterprise-wide solution comprises three main components: 1) database, 2) data exploitation infrastructure, and 3) data collection applications. The enterprise solution has no value without the database, and the database loses value as the quality of the data decreases. The issues surrounding data are: 1) currency of the data, 2) cost of collection, and 3) accuracy of the data. It takes a constant investment to keep current and accurate data in the system. The enterprise is constantly changing, and maintaining an awareness of assets is critical to most organizations. Data accuracy and the cost of making bad decisions weigh heavily on management. The cost of collecting the data prohibits many organizations from having sufficient information to make informed decisions. This leads to the issue of data availability. Without data, the enterprise cannot capitalize on the functional performance of its information systems.

Interestingly, while many of these enterprise systems have provided cost benefits by improving field support functions in the office, few have had any real impact on field productivity. Most field applications provide benefit by improving or eliminating office functions, such as paperwork and data input processes, or by reducing overtime through better scheduling. But they rarely deliver major efficiencies in the actual fieldwork. An enterprise-wide solution without data is just more interesting software.

So, what if you could replace labor-intensive data collection methods with a faster, more accurate, and more intuitive method of collecting data and completing forms while saving time and money? What if there were a voice-enabled system that allowed you to create a virtual office wherever your work took you? Advances in the field of data collection have progressed to the point where you can collect the data using speech, with your hands free and your eyes up.
This applies both to collecting data on the location and condition of assets and to completing routine work reports through speech-enabled form templates. Field professionals now have the freedom to travel within their environment while actively populating their database with speech-to-data products. Field professionals have current and accurate information immediately available for on-the-spot decisions. This paper discusses the benefits of using speech technology in a mobile computing environment to collect the data required to implement enterprise-wide solutions and manage your business effectively.

Field Professionals Go Mobile

Mobile computing has become a major growth market in the last year, as the price-performance point of hardware has reached a level acceptable to mainstream corporate users. For many of these corporate users, transferring data to and from the field is one of the largest challenges. Many telecommunications companies and utilities have invested in expensive laptop-based solutions with wireless data connectivity to allow near-real-time communications that keep the enterprise current. At a cost of $10,000 per fully equipped field technician, many millions of dollars are spent to enhance field access to data. The goal is to improve customer service with fewer full-time equivalents (FTEs).

Hardware alternatives to the rugged laptop and pen-based computer have arrived on the market, such as WAP-enabled cell phones, CE and Palm Personal Digital Assistants (PDAs) and, more recently, wearable computers that are lightweight and offer innovative screen options. The smaller, lower-cost PDAs reduce the cost of ownership but bring challenges in the field. The awkwardness of trying to enter data while driving, walking, or being bounced around, while looking at a screen that washes out in the sunlight, has frustrated many field professionals. Safety concerns about mobile computers, and the need to carry a device in hand while navigating a difficult path, have caused many to wait.
Resistance to technology by field professionals is a common excuse given for not investing in field automation. The reality is that managers still need reliable information, and often that information must come from observations in the field. Whether it is the management of assets, in-situ data for a design project, construction reports, asset inventories, or condition assessments, managers depend on the field professional’s ability to collect accurate, dependable, and thorough information. In large measure, an organization's effectiveness can be directly tied to how efficiently it controls and manages its assets for the well-being of its customers or the public.

In the past, field professionals would go into the field, gather important data, write down the critical information, and enter it into the computer back at the office. Before leaving the field, field professionals must be certain that they have everything they need, and they must be able to find and understand their notes when they get back to the office. The ability to take neat, thorough, and organized notes becomes critical. In addition, many field technicians carry a cell phone. They frequently call dispatch centers and central offices to get a piece of information needed to complete the job or to report results. This adds costs to operations that must be absorbed as part of the cost of doing business in today’s economy.

Field professionals doing the data entry are so consumed with just trying to get the information documented that critical information is missed, requiring revisits to the site. Notes are poorly written or incomplete, or photos cannot be matched to the written notes. In some cases the missed information cannot be recovered, such as phone service installed or a gas line repaired. Going back to the field is time-consuming, expensive, and frustrating.
What is really needed is a faster and easier way to collect the data and enter it into a database, and then cross-reference and tag all information, locations, and measurements. That would leave the field professional free to inspect, free to make assessments, and free to study the situation, notice the unusual, and think without being distracted by note taking or struggling with an interface that is intrusive and distracting. Orders need to be completed faster, with fewer live interactions with the central office.

In data-intensive computing environments, organizations are desperately seeking data collection solutions that reduce the financial and operational drag on their enterprises. The complexities of large-scale data collection, the need for ongoing data input, and the integration of media types make it vital to develop innovative ways to input data quickly, efficiently, and accurately.

Advances in mobile computing and integrated devices, such as cell phones with CE or Palm and enhanced displays, have provided the ‘state of the art’ for field data collection: the “mobile messenger”, whether it is a laptop, pen-based computer, or cell phone. Mobility is key, and taking the computer into the field was thought to be the answer. Many organizations have implemented wide-scale mobile computing solutions but have not seen the gains they are seeking, and so the search continues.

As computers continue to become smaller, the major obstacle to mobile computing is input/output. Keyboards, pen interfaces, and icon-driven systems require relatively large input/output devices. For this reason, mobile computers remain a niche industry. Typing or pen-entering field information while walking or driving is difficult at best, error-prone, and dangerous. These devices, in addition to being awkward, are hard to see in the sunlight. This is an interim step, not the solution needed for field data collection.
Simply throwing mobile computers into the field has not offered a solution. But what if data input and output were as simple as speaking to your computer? Or what if the computer could be eliminated by using a cell phone for short transactions, reducing the cost of ownership of mobile field solutions?

Field professionals who normally write information down and enter it later, or who use a pen-based collection tool, would benefit from some assistance. The answer is what we do every day to communicate: speak. Speech technology enables field professionals to see and say. They simply speak to their computer, directly or over a cell phone, just as they have been speaking to each other for years. The concept is simple: use speech-to-data technology to speed data input, increase efficiency, improve accuracy, and reduce the overall cost of data collection.

Why speech technology?

The traditional mobile computing data collection process is slow, cumbersome, and error-prone. Even with software specifically developed for field applications, the gains in efficiency are incremental at best. The problem is the computer itself. You have to constantly stop, look down at the computer, and take your hands away from whatever you're doing to enter data. With speech technology you keep your hands free and your eyes up.

Speech has been viewed as the next major breakthrough in mobile computing. Hands-free, eyes-up data entry is seen as the next logical evolution in field automation. Speech-to-database technology offers an innovative system using proven and emerging technologies for accurate and rapid collection and maintenance of field data and asset information. These technologies include speech recognition, GPS, multimedia data collection, GIS, mobile communications, and mobile computers.

Most organizations are familiar with speech recognition technology. This may include speech engines for speech commands, hands-free dialing, and direct dictation tools.
But field use is more than speech-to-text or even speech commands. The answer lies in relating spoken information to relational and spatial databases, along with GPS coordinates, digital photos, .wav files, laser measurements, and other relevant information from the field. The system must be speaker independent, requiring no speech training, and must include audio prompting.

While mobile computers are flourishing, they are not living up to their full potential or meeting the extensive needs of users. The objective is to fully empower the mobile computing environment with a natural, easy-to-use interface that is compatible with the types of activities performed in the field. Mobile computing has not met that objective, and will not until the interface is truly mobile and uninhibiting, namely speech.

By nature, field activities are physical and do not lend themselves easily to traditional computing interfaces, which require the full attention of the user’s mind and body: eyes are focused on a screen, and hands are positioned on a keyboard. This may be acceptable for workers sitting at desks, with source materials organized around them, but in the field it is awkward, distracting, and counterproductive. With speech as the computer interface, field professionals are freed from the burdensome limitations of desk-based interfaces and can use their computers to enhance their jobs and improve their productivity. The ability to configure systems quickly and easily to handle unique data collection requirements using speech-enabled applications is the next generation in speech-to-data applications.

Speech offers a natural and intuitive way of interfacing with a computer that lends itself very well to solving the problems of field-based computing. The computer becomes a useful tool instead of a distraction. Field professionals’ jobs are complex and physical, and they can't afford to spend time each day working with a temperamental computer.
With speech-enabled applications, users have shown all the office-based efficiencies of other mobile GIS applications and have also shown significant improvements in speed, accuracy, efficiency, and safety. With hands free and eyes up, operators can enter data while walking, driving, or riding a bike. They can input information while they're making a repair or tracing cables through a connection box. Integrated with GPS or other devices, they can capture location and other measurements automatically. Field professionals routinely realize order-of-magnitude improvements in productivity, and corresponding cost savings.

Speech-to-database technology makes it just as easy and efficient to update changes and maintain the integrity of the database over time. The long-term value of a dispatch system and its databases lies in assuring the currency of the data. With an intuitive interface and direct integration with the database, it becomes much easier to keep the data accurate, current, and immediately accessible to workers in the field.

Speech recognition enables the use of non-computing devices such as a cell phone for interaction with server-based applications. The user is able to dial into the server and use speech recognition to retrieve service orders, update status, and close out orders. The enterprise can choose the best device, from laptop computer to cell phone, to meet its field requirements. Costs may be reduced by increasing productivity and by optimizing the tools in the field based on the job and data requirements.

Speech-To-Database Technology

Speech recognition solutions are being used by organizations across the United States and Canada, such as Florida Power and Light (FPL), Cingular Wireless, and BellSouth, for rapid and reliable speech-based data entry to improve field productivity. Speech recognition gives users the ability to collect field data faster and with greater accuracy than traditional methods.
This speech-to-data capture software records speech input, processes it into a structured format, can combine it with GPS location information, and delivers a standard database file that is easily uploaded to any relational or spatial database. Speech provides a complete set of modular tools that are used together to create an adaptable system suited to any data input requirement, in a world where flexibility and adaptability are required.

Speech recognition over a cell phone, which allows field personnel to receive information on dispatch, provisioning, and directions, or to input information to status and close out orders, is deployed at many industry leaders in the US and Canada. The field technician uses a standard cell phone or landline phone to dial a company phone number. The server answers the call and prompts the technician for their name and password. Using a voice verifier, the system can provide additional levels of security by using the unique voiceprint of the person, much like a fingerprint. The technician completes the security enrollment and is then asked what they would like to do. Using rich prompts and embedded help, the user is able to have work dispatched, retrieve required information, run remote tests, and complete orders. Speaker independence means no voice training is needed, so accents and voice inflections do not affect the performance of the system. The system is optimized for high background noise, so a hands-free visor microphone may be used in the vehicle even at highway speeds.
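
The call flow described above (name, password, then a spoken menu) can be pictured as a small dialog state machine. The following is only a minimal sketch under stated assumptions: the recognizer is assumed to have already turned speech into text, voiceprint verification is omitted, and every name, password, order number, and menu phrase is hypothetical rather than taken from any deployed system.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory stand-ins for the enterprise back end.
TECHNICIANS = {"pat jones": "4321"}               # name -> spoken password (assumed)
ORDERS = {"pat jones": ["SO-1001", "SO-1002"]}    # name -> open service orders (assumed)

PROMPTS = {
    "ASK_NAME": "Please say your name.",
    "ASK_PASSWORD": "Please say your password.",
    "MENU": "What would you like to do?",
    "DONE": "Goodbye.",
}


@dataclass
class CallSession:
    """Tracks one phone call through login and the main menu."""
    state: str = "ASK_NAME"
    name: str = ""
    closed: list = field(default_factory=list)

    def prompt(self) -> str:
        """Return the audio prompt text for the current state."""
        return PROMPTS[self.state]

    def hear(self, utterance: str) -> str:
        """Advance the dialog with one recognized utterance and return the reply."""
        text = utterance.strip().lower()
        if self.state == "ASK_NAME":
            if text in TECHNICIANS:
                self.name, self.state = text, "ASK_PASSWORD"
                return self.prompt()
            return "Name not recognized. " + self.prompt()
        if self.state == "ASK_PASSWORD":
            if TECHNICIANS[self.name] == text:
                self.state = "MENU"
                return self.prompt()
            return "Password not recognized. " + self.prompt()
        if self.state == "MENU":
            if text == "get orders":
                return "Open orders: " + ", ".join(ORDERS[self.name])
            if text.startswith("close order "):
                order = text[len("close order "):].upper()
                if order in ORDERS[self.name]:
                    ORDERS[self.name].remove(order)
                    self.closed.append(order)
                    return f"Order {order} closed."
                return "Order not found."
            if text == "goodbye":
                self.state = "DONE"
                return self.prompt()
            return "Say get orders, close order, or goodbye."
        return self.prompt()  # call already finished
```

A call would then run as `session = CallSession()`, play `session.prompt()`, and pass each recognized utterance to `session.hear(...)`. Keeping the dialog state separate from the recognizer is one way to let the same flow serve a cell phone, a landline, or a local device.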
FPL has fielded speech recognition as part of its storm recovery response strategy. FPL personnel, foreign crews, and contract staff equipped with a cell phone are able to call in from the field and report damage from the storm directly into the trouble management server. Damage, location, and needed response are collected using speech recognition over a cell phone. Once the damage has been repaired, crews may call in and report on repairs completed, so that service is restored faster and with less manpower than in prior storms. The system has been deployed and used under storm conditions. FPL extended the system from its original use to include the retrieval of enterprise information when other systems were not available during the storm.

Cingular Wireless has deployed speech recognition for management of trouble tickets in Remedy. The number of calls field technicians needed to make to the dispatch center was reduced from 6 per day to fewer than 2, providing a return on investment in less than 6 months.

BellSouth has deployed speech recognition for installation field technicians to report on work completed and to perform remote testing through a speech-cell phone interface. An application for supervisory personnel to call in and retrieve information on technician location and job status is in development.

Mobile Gas in Alabama is using speech recognition over a cell phone to allow field personnel to retrieve and close service orders. Accents and speech inflections have not caused any issues with the system. Real-time update and access from the field is providing productivity gains.

SoCal Edison uses speech recognition and a pen computer equipped with GPS so that field personnel can speak information about light poles to identify the location, type, and condition of its more than two million light poles. The field technician walks or drives to the pole and speaks the object, attribute, and value information for the pole.
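
The object, attribute, and value pattern, paired with a GPS position for each utterance, can be sketched as follows. This is an illustrative sketch only, not SoCal Edison's or any vendor's implementation: the vocabulary, the `GpsFix` type, the timestamp-based matching, and the coordinates in the usage note are all assumptions made for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass

# Hypothetical predefined vocabulary: object -> attribute -> allowed values.
VOCABULARY = {
    "pole": {
        "type": {"wood", "steel", "concrete"},
        "condition": {"good", "leaning", "damaged"},
        "light": {"working", "burned out", "day burner"},
    }
}


@dataclass
class GpsFix:
    time: float  # seconds since the start of the survey (assumed clock)
    lat: float
    lon: float


def parse_utterance(text):
    """Split 'object attribute value...' and check it against the vocabulary."""
    words = text.lower().split()
    if len(words) < 3:
        raise ValueError(f"expected object, attribute, value: {text!r}")
    obj, attr, value = words[0], words[1], " ".join(words[2:])
    allowed = VOCABULARY.get(obj, {}).get(attr)
    if allowed is None or value not in allowed:
        raise ValueError(f"not in vocabulary: {text!r}")
    return obj, attr, value


def nearest_fix(fixes, t):
    """Return the fix closest in time to t; fixes must be non-empty and sorted by time."""
    times = [f.time for f in fixes]
    i = bisect_left(times, t)
    candidates = fixes[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda f: abs(f.time - t))


def to_record(text, spoken_at, fixes):
    """Build one database-ready row from an utterance and the GPS track."""
    obj, attr, value = parse_utterance(text)
    fix = nearest_fix(fixes, spoken_at)
    return {"object": obj, "attribute": attr, "value": value,
            "lat": fix.lat, "lon": fix.lon}
```

For example, with made-up fixes `[GpsFix(0.0, 34.0500, -118.2500), GpsFix(10.0, 34.0501, -118.2502)]`, calling `to_record("pole light day burner", 12.0, fixes)` yields a row tagged with the second fix, while an utterance outside the vocabulary raises `ValueError` so the operator can be prompted to repeat it.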
The speech is recognized, the GPS position is correlated to the speech, and the information is loaded directly into the GIS for later analysis. Day burners, burned-out lights, and lights that are not in the system are identified, so that revenue items as well as cost items are flagged for action. Data collection speed improved by over 60% compared with SoCal Edison’s traditional approach, and back-end processing of the data improved by over 900%, resulting in a return on investment in less than 4 months.

Speech recognition uses proven and emerging technologies for accurate and rapid field data and asset information collection. As noted earlier, these technologies include speech recognition, GPS, multimedia data collection, GIS, mobile communications, and mobile computers.
With the advances in mobile computing and the resulting portability, devices continue to get smaller. The major obstacle to true mobile computing is input/output, and speech recognition addresses that obstacle. Speech recognition may be packaged to address different business needs, from asset inventory and field data collection to completing routine reports and forms.

Asset Inventory and Field Data Collection

Speech recognition used with sensors such as GPS, laser range finders, barcode readers, and others allows users to quickly and accurately identify the location of outside plant assets along with their attributes. This information is invaluable in managing the infrastructure that delivers services. Accurate infrastructure databases result in improved customer service and reduced operations costs.

Routine Reports and Forms

For environments that require a more direct interaction between the data collection application and the host database, speech recognition fits the bill. With speech-enabled data access, users can query database records using voice commands, view data in predesigned screen forms, and then edit or add to that information by voice commands and a variety of input devices, including keyboards, pens, touch-screens, bar-code readers, and other devices. Speech-enabled forms provide a more standard interface to existing database applications. Speech-enabled forms may be run locally on a Windows Intel computer or CE PDA, or over a cell phone or landline phone.

Speech Recognition Developers’ Kit

Tools exist in the market that allow developers to build speech-enabled forms and vocabularies that work with back-office systems.

Conclusion

Mobile computing was meant to be the means to reduce field data collection costs, increase speed, and improve overall data reliability. In real-world user environments, mobile computing has not proven to be the ‘solution’ that answers all of these objectives.
However, mobile computing combined with speech has demonstrated significant benefits in operational efficiency and cost savings, as seen among leaders in the telecommunications, utility, and public sectors. The systems themselves are designed from the ground up with efficiency in mind:

Natural, human interface. Speech is the first mode of communication humans developed. Written language, the basis for most other communication, is learned much later. Consequently, people tend to be quite comfortable with verbal language (notwithstanding fear of public speaking) and adapt to using it in a computer environment quickly and easily.

Short learning curve. Because speech is a natural user interface, operators can learn to use it quickly and can be productive in a critical data-collection environment in a short time frame. Even operators with little or no previous computer experience have been able to learn the system quickly and use it productively in a matter of hours or days, depending on the complexity of the application and the data they must collect.

Greater safety. Speech-enabled input frees the hands and the eyes, which must be fully involved to accomplish written input. With free hands and undistracted vision, operators can concentrate on the world around them, not the words they are writing, which is particularly important for field data collection. When driving a car to inventory street signs, for example, operators can keep their hands on the steering wheel and their eyes on the road and still collect pertinent, accurate data without stopping at each object. This freedom reduces fatigue from looking up and down at a computer monitor. In addition, field crews can operate in adverse weather conditions. For example, they can wear gloves in winter without slowing down.

Increased speed. Without the constant stopping and looking down at the computer screen to input data, operators can move much faster from object to object.
Whether they are counting merchandise on warehouse shelves or inspecting the condition of utility poles in a neighborhood, they are concentrating primarily on the assets they need to observe, not on the computer. In addition, speech recognition applications may be configured so that single verbal phrases populate multiple fields, further increasing the speed at which data can be collected.

Optimum accuracy. With input of data based on a predefined vocabulary, operators not only collect data quickly, but accuracy is also assured. Using an optional interactive mode, a rules base can assure quality and provide immediate feedback to the operator if a speech statement is misunderstood or is inconsistent with the allowable criteria for a particular data object.

All these benefits equate to reduced costs, improved productivity, and more reliable data, making the end-use application more functional for its intended use. The real value of information systems lies in the data they manage and their ability to leverage that data to improve core business activities. Speech-enabled data collection makes it possible to collect more data, faster, with higher accuracy and lower costs to support applications and vital business processes.
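
Two of the configuration ideas mentioned above, single phrases that populate multiple fields and a rules base that gives the operator immediate feedback, can be illustrated with a short sketch. The phrases, field names, and rule limits below are hypothetical and chosen only for illustration; the paper does not specify how any product implements them.

```python
# Hypothetical phrase macros: one spoken phrase fills several form fields.
PHRASE_MACROS = {
    "standard wood pole": {"object": "pole", "material": "wood", "height_ft": 40},
    "light out": {"light_status": "burned out", "needs_repair": True},
}


def height_in_range(record):
    """Rule: a recorded height must fall inside an assumed plausible range."""
    if "height_ft" in record and not 20 <= record["height_ft"] <= 120:
        return "height_ft must be between 20 and 120"
    return None


def burned_out_needs_repair(record):
    """Rule: a burned out light must also be flagged for repair."""
    if record.get("light_status") == "burned out" and not record.get("needs_repair"):
        return "a burned out light must be flagged for repair"
    return None


# Hypothetical rules base: each rule returns an error message or None.
RULES = [height_in_range, burned_out_needs_repair]


def apply_phrase(record, phrase):
    """Merge the fields for a recognized phrase into a copy of the record."""
    fields = PHRASE_MACROS.get(phrase.strip().lower())
    if fields is None:
        raise KeyError(f"phrase not in vocabulary: {phrase!r}")
    return {**record, **fields}


def check(record):
    """Run every rule and return the feedback messages for the operator."""
    messages = []
    for rule in RULES:
        message = rule(record)
        if message is not None:
            messages.append(message)
    return messages
```

Saying "standard wood pole" followed by "light out" would fill five fields from two phrases, and `check` would return an empty list; a record that breaks a rule returns the messages that an interactive mode could read back to the operator.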