GISdevelopment.net ---> GITA 2002 ---> Municipal Perspective

Getting it right: 99.9% pure OMS data

Langley Willauer
Integrated Mapping Services, Inc.
58 Bayview St.
Camden, ME 04843


Abstract
No application is harder on data than an Outage Management System (OMS). At any time, any customer can call with a report of no power. And periodically, the system is stressed with massive volumes of trouble calls. When the data aren’t clean, dispatchers are quickly overwhelmed with unlocated calls, and worse, OMS functionality failures due to incorrect data. Utility companies continue to find it challenging to keep OMS data current and clean, but much has been learned in recent years.

Outage systems employ specialty data models optimized for rapid prediction and switching. Utilities periodically refresh the OMS database from the GIS. Each update is an opportunity for inconsistent and incorrect data from the GIS to infect the OMS. Data model changes in the GIS to support changing business needs further complicate the updating process.

This paper reviews the issues inherent in OMS data, then gives a number of practical strategies for maintaining these data, including ways of keeping GIS data from becoming inconsistent, incremental vs. complete updates to OMS, offline data exception discovery, and using metadata to manage the frequency, type, and resolution of any errors.

Objectives
  1. Overview of Outage System data issues
  2. Specific real-world examples illustrating the issues
  3. Strategies for managing OMS data
OMS Basics
An outage management system allows utilities to better support the detection and restoration of service interruptions. Mature products and custom solutions support the following functionality:
  1. Graphical display of trouble calls, outages and crews
  2. Rapid Prediction of Outages from trouble calls
  3. Creation of Outages from SCADA information
  4. Separate call-taking application or interface to billing system
  5. Reporting & archiving capabilities
  6. Efficient user interface
OMS operations are characterized by long periods of light activity, punctuated by storms. During storms, any deficiencies with the software or the underlying data are magnified, potentially rendering the system useless.

For a typical utility, the most expensive part of an OMS is the data. However, because the database is often in place before an OMS is deployed, this cost can be hidden. Much of the cost of deploying an OMS can be attributed to making the GIS data model support OMS, finding and fixing inconsistencies in the GIS data, reconciling customer information between GIS and the billing system, and bringing all the data up to an acceptable standard.

An OMS depends heavily on data completeness and quality. The OMS is typically very good at revealing problems with the underlying GIS data, but not so good at fixing them. The OMS may be particularly sensitive to certain kinds of errors, especially errors that result from uncommon combinations of processes.

Separate dataset
GIS in utilities grew out of facilities management and automated mapping. The data and modeling requirements of these systems differ from the requirements of OMS. Therefore, every OMS relies on a separate database for its operation. OMS databases are highly optimized to support prediction and switching operations, at the expense of maintainability. One of the requirements utilities face in deploying these systems is to build a bridge between their operational data and the OMS data structure. This bridge is used often.

Even utilities in stable areas face changing facilities and customer locations. Utilities in growth areas (e.g. Phoenix, Arizona) add thousands of new customers each month. The highly-tuned OMS data structure must be refreshed regularly from the underlying facility data. Some vendors offer an incremental update tool for accomplishing this, while others require that the OMS data be rebuilt for any change in the facility data.

High Volume
In early 1998 the Northeast U.S. experienced a massive ice storm. Tens of thousands of customers lost power, some for more than two weeks. During this event, call centers were at their maximum throughput for days, taking thousands of calls per hour. And utilities that had contracted with outside call centers were flooded with data from those services. The modern call center has given customers the expectation that they can report information and get feedback. This customer expectation translates into a high volume of data throughput for the OMS. When done right, the OMS can keep up with call center and Interactive Voice Response (IVR) system volumes, providing the operational crews with synthesized information on the state of the network.

Mission Critical
In the days of smaller utilities and lower customer expectations, operational personnel did not consider OMS to be important. Now, with a fire hose of data pointed at them from offshore call centers and IVRs, and performance-based rates on the horizon, OMS has become mission critical.

Kinds of good data
Good data is no accident. There is no substitute for the painstaking effort required to figure out what makes it good and how to keep it that way. These efforts can be divided into three areas: systematic, random and automation.

Systematic—underlying software works
A systematic error is one that is caused by a design flaw in the data model or software. For example, a primary meter might not be modeled correctly, in that the primary tap does not connect to the primary conductor. This problem will affect every primary meter. In some cases those responsible for the software may have fixed the cause of the systematic error, but the data have not been corrected. Sometimes these errors can occur for a very specific set of criteria, and therefore only affect a few customers. In this case the problem may not even be detected until the data-cleanup process is well on its way.

Random—users are careful
Random errors are minimized when well-trained users enter data carefully into a well-designed system. At one successful project I was involved in, the manager deliberately “overtrained” the people who were making updates. The idea was that they would know the full range of what was possible, even if they couldn’t use all that they learned. For the most part, people want to do the right thing; they just get frustrated by not having enough information.

Automation—software is designed to support users
There’s a lot of poorly designed software in the world. This is not because people don’t care; it’s more because they don’t know. Good software is hard to design. You have to live with your users, have lunch with them, follow them around. But once you do that, you can make the tool fit their needs.

Good software design makes clean updates possible, despite constant changes in the state of the network. It is always better to take the time to make incorrect data combinations impossible, rather than figuring out how to fix them after they do happen. Here database triggers, validators and application logic are all your friends, for with them you can capture business rules and enforce them.
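
As one illustration of capturing a business rule in the database, here is a minimal sketch using Python's built-in sqlite3 module. The schema, the table and column names, and the rule itself (a transformer may only use phases its conductor carries) are assumptions made for the example, not taken from any particular GIS or OMS product. A production GIS would express the same idea in its own database and trigger dialect.

```python
import sqlite3

# Hypothetical schema: every table, column and trigger name is illustrative.
PHASES = "('A','B','C','AB','AC','BC','ABC')"
SCHEMA = f"""
CREATE TABLE conductor (
    id    INTEGER PRIMARY KEY,
    phase TEXT NOT NULL CHECK (phase IN {PHASES})
);
CREATE TABLE transformer (
    id           INTEGER PRIMARY KEY,
    conductor_id INTEGER NOT NULL REFERENCES conductor(id),
    phase        TEXT NOT NULL CHECK (phase IN {PHASES})
);
CREATE TRIGGER transformer_phase_check
BEFORE INSERT ON transformer
WHEN EXISTS (
    SELECT 1 FROM conductor c
    WHERE c.id = NEW.conductor_id
      AND ((instr(NEW.phase, 'A') > 0 AND instr(c.phase, 'A') = 0)
        OR (instr(NEW.phase, 'B') > 0 AND instr(c.phase, 'B') = 0)
        OR (instr(NEW.phase, 'C') > 0 AND instr(c.phase, 'C') = 0))
)
BEGIN
    SELECT RAISE(ABORT, 'transformer phase not carried by conductor');
END;
"""


def make_db():
    """Create an in-memory database with the illustrative rules installed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, conductor_id, phase):
    """Return True if the transformer row is accepted, False if a rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO transformer (conductor_id, phase) VALUES (?, ?)",
                (conductor_id, phase),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With a conductor of phase AB in place, a transformer on phase A is accepted, while one on phase C, or one attached to a conductor that does not exist, is refused at entry time. The sketch only guards inserts; an equivalent trigger on updates would be needed to close the gap.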

People are all different, and behave in random ways. This is a good thing. But software is the interface between the randomness of people, and the precise requirements of application logic. A user interface must therefore be a filter between randomness and precision. People should be allowed to press any button, at any time, and the software should only allow operations that make sense. The software should always respond with meaningful information to every user action. The software should provide the user with an indication of how long something is taking, or will take.

Steps to clean data
Now that we know what good data are about, we have something to strive for. To get there, you have to start at the beginning.

Source Data
Most OMS data are derived from the GIS. There are many aspects of GIS data that affect OMS.

Link to Billing System
An OMS needs a source for customer information. Although some systems may have their own customer information, most depend on a view, snapshot, copied table or real-time interface to a billing system. This link gives rise to several issues:
  • For non-real-time links, what about customers who are getting power, but their record hasn’t made it to the OMS?
  • Active customers with an incorrect link?
  • Seasonal customers?
  • Disconnects for non-payment?
Each of these questions gives rise to a potential problem.
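
The questions above can be turned into an offline reconciliation report. The following is a minimal sketch only: the status values ("active", "seasonal", "disconnected"), the field names and the shape of the OMS link table are assumptions for illustration, since every billing system models these differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BillingAccount:
    account_id: str
    status: str  # assumed values: "active", "seasonal", "disconnected"


def reconcile(billing, oms_links, known_transformers):
    """Compare billing accounts with the OMS customer-to-transformer links.

    billing: iterable of BillingAccount
    oms_links: dict of account_id -> transformer_id as held by the OMS
    known_transformers: set of transformer ids present in the OMS network
    """
    by_id = {a.account_id: a for a in billing}
    return {
        # Receiving power according to billing, but not yet known to the OMS.
        "missing_from_oms": sorted(
            aid for aid, a in by_id.items()
            if a.status != "disconnected" and aid not in oms_links
        ),
        # Linked to a transformer that does not exist in the OMS network.
        "bad_link": sorted(
            aid for aid, tx in oms_links.items() if tx not in known_transformers
        ),
        # Known to the OMS but disconnected in, or absent from, billing.
        "not_active_in_billing": sorted(
            aid for aid in oms_links
            if aid not in by_id or by_id[aid].status == "disconnected"
        ),
        # Flagged so dispatchers can treat calls from these accounts with care.
        "seasonal": sorted(aid for aid, a in by_id.items() if a.status == "seasonal"),
    }
```

Each list in the report corresponds to one of the bullet points, so the counts can be tracked from one OMS refresh to the next.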

Phase Attributes
There are three ways to track phasing, each requiring more data than the last: (1) ignore it, i.e. don’t consider phasing at all; (2) use A, B, C, AB, AC, BC, ABC; and (3) also consider the order of the physical configuration in the field, e.g. AC is not the same as CA. The more detail a utility captures, the more time data cleanup personnel will spend getting it right.
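
The difference between the second and third approaches can be shown in a few lines. This is a sketch under the assumption that phase codes are stored as short strings of the letters A, B and C; the function names are hypothetical.

```python
def normalize_unordered(code):
    """Approach 2: collapse field order to a canonical code, e.g. 'CA' -> 'AC'."""
    upper = code.upper()
    letters = set(upper)
    # Reject empty codes, letters other than A/B/C, and repeats such as 'AA'.
    if not letters or not letters <= set("ABC") or len(letters) != len(upper):
        raise ValueError(f"invalid phase code: {code!r}")
    return "".join(sorted(letters))


def same_phasing(a, b, ordered=False):
    """Compare two phase codes; ordered=True models approach 3, where AC != CA."""
    canonical_a, canonical_b = normalize_unordered(a), normalize_unordered(b)
    if ordered:
        return a.upper() == b.upper()
    return canonical_a == canonical_b
```

Under approach 2, "AC" and "CA" are the same record; under approach 3 they are different, which is exactly the extra detail that cleanup personnel must verify in the field.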

Connectivity
A map edit that causes a discontinuity in the primary necessarily affects all downstream customers. The solution is to either not allow such conditions through automation, or provide personnel who make edits with tools to check continuity (trace routine).

Relational Integrity
Some systems provide built-in checks for data integrity. These tools are very useful for discovering underlying errors in the source data. If not caught, these problems become exacerbated once in the OMS.

Creating OMS Data
An OMS requires a separate data structure from the GIS. Therefore, every OMS has a mechanism to create this data structure from the GIS data. These mechanisms vary widely in their sophistication and features.

Some generalizations are useful. The process for an initial build is time intensive. Once a structure is built, source-data (GIS) changes can be incorporated either with an incremental build for the affected areas, or by repeating the full build. The build can reveal GIS errors, or errors within the special OMS structure.

Systematic errors
To bridge GIS and OMS, some logic is required. Sometimes, this bridge is built using “glue code,” i.e. specific logic which tells the OMS about the GIS. However, errors in the glue code lead to errors in the OMS’s understanding of the GIS. For example, I investigated a glue-code error where move-in/move-out customers were not being updated properly, leading to missing links between customers and OMS.

Another problem is a single-phase device on a multi-phase line. If these are not modeled correctly by the glue code, outages can be overstated.
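
To see how the overstatement arises, consider the following sketch. It assumes a simplified model in which each downstream customer is tagged with the phase(s) of its service and is counted as out if any of those phases is interrupted; the function and parameter names are hypothetical and no vendor's prediction logic is implied.

```python
def customers_out(downstream, device_phases, line_phases):
    """Customers de-energized when a protective device opens.

    downstream: dict of customer_id -> phase code of the customer's service
    device_phases: phases the opened device actually interrupts, e.g. 'B'
    line_phases: phases of the line the device sits on, e.g. 'ABC'
    Returns (correct, naive). The naive set is what results when the glue
    code treats the device as interrupting every phase of the line.
    """
    def affected(interrupted):
        # Assumption: a customer is out if any phase it uses is interrupted.
        return {c for c, p in downstream.items() if set(p) & set(interrupted)}

    return affected(device_phases), affected(line_phases)
```

For four customers served from phases A, B, C and B, a B-phase fuse on a three-phase line takes out two of them, while the mis-modeled device reports all four.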

Incremental
An incremental update process can save a great deal of processing time; however, the update process itself may introduce errors. These errors can be unique to the particular incremental update process. One of our customers found the random problems associated with processing incremental updates harder to manage than just rerunning the full update each time, and so abandoned incremental updates.

OMS Data
Once built, OMS data need to be checked in several ways.

Tests for completeness
One of the most important tests is whether the OMS view of the world matches that of the GIS. Any errors in the network can lead to a few or many thousands of customers not appearing in OMS. When this occurs before a storm, the OMS application is overwhelmed with “unlocated” calls. It is critical for a utility to have a mechanism to find and fix areas where the OMS data are incomplete.
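
A completeness test of this kind can be as simple as a set comparison run after every build. The sketch below assumes that customer identifiers are shared between the GIS and the OMS and that the GIS can report each customer's feeder; both assumptions, and all names, are illustrative.

```python
from collections import defaultdict


def missing_by_feeder(gis_customers, oms_customer_ids):
    """Group customers that are in the GIS but absent from the OMS by feeder.

    gis_customers: dict of customer_id -> feeder_id from the GIS
    oms_customer_ids: set of customer ids present in the OMS after the build
    """
    missing = defaultdict(list)
    for customer_id, feeder in gis_customers.items():
        if customer_id not in oms_customer_ids:
            missing[feeder].append(customer_id)
    return {feeder: sorted(ids) for feeder, ids in missing.items()}


def completeness(gis_customers, oms_customer_ids):
    """Fraction of GIS customers that made it into the OMS (1.0 if none exist)."""
    if not gis_customers:
        return 1.0
    present = sum(1 for cid in gis_customers if cid in oms_customer_ids)
    return present / len(gis_customers)
```

Grouping by feeder matters because a single network error tends to drop a whole block of customers at once, so a feeder with many missing customers points to one cause rather than many.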

Strange errors
The nature of the OMS data structures makes them intolerant of errors. If errors from the GIS go unchecked into OMS, then strange things can happen. Also, if anything can go wrong in a production environment, it will. Certain situations within the OMS itself can lead to internal inconsistencies in the OMS data.

At one client site we discovered two such problems. One involved an anomaly in the program’s in-memory data structures triggered by bad GIS data. Through analysis of the error trace, we were able to build a tool to detect these problems in the future. A second problem resulted from erroneous loops in the primary. These loops, if undetected, caused the prediction engine to fail, leading to a system failure during a storm.
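
Loops of the second kind can be found offline, before the data reach the prediction engine. The sketch below is not the tool built at that site; it is a generic union-find check, assuming the primary can be exported as a list of segments with two end nodes each and that normally-open devices have already been excluded.

```python
def find_loops(segments):
    """Return the ids of segments that close a loop in an undirected network.

    segments: iterable of (segment_id, node_a, node_b) tuples.
    A segment closes a loop when its two end nodes are already connected
    through segments seen earlier in the list.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    closing = []
    for segment_id, node_a, node_b in segments:
        root_a, root_b = find(node_a), find(node_b)
        if root_a == root_b:
            closing.append(segment_id)
        else:
            parent[root_a] = root_b  # merge the two connected groups
    return closing
```

A radial primary returns an empty list; any segment reported is a candidate for review by whoever edits the map.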

Measuring Progress
If the people responsible for clean data have a reliable way to measure their progress, they are going to be a lot more interested in that progress. It is therefore vital to the entire OMS data effort to first set an achievable goal for data quality, and then provide a means to measure progress towards that goal.
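
As a worked example of turning a goal into a number, the sketch below assumes that purity is defined as the share of records with no known open error. That definition is an assumption for illustration; each utility has to settle on its own.

```python
import math
from fractions import Fraction


def purity(total_records, records_in_error):
    """Share of records with no known open error (assumed definition)."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return 1 - records_in_error / total_records


def max_errors_allowed(total_records, target="0.999"):
    """Largest error count that still meets the target purity.

    The target is passed as a string so that Fraction keeps it exact.
    """
    return math.floor(total_records * (1 - Fraction(target)))
```

Under this definition, a utility with 100,000 customer records can have at most 100 of them in error and still call its data 99.9% pure.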

What's ahead
The last five years have been marked by success at many utilities in deploying OMS. Many of those successful companies are bruised and scarred, both from dealing with these systems for the first time and from the immaturity of some of the systems. Now that the science of OMS is better understood, we can expect further refinements.

More automation
Successful OMS implementations are about quality data, good work flow, and an effective sales effort to the organization. Once the systems are in place, utilities can incrementally build on what’s working to include more automation. Such growth depends on high-quality data. Better interfaces to other systems are critical, including Supervisory Control and Data Acquisition (SCADA), billing/customer care, field units, etc.

Push errors back
The best situation for data errors is to have the person creating the error discover and correct the problem on the spot. The worst situation is for the error to go undetected until it’s three o’clock in the morning and then to have it take down OMS. The effort, therefore, for utilities wanting 99.9% pure data, is to push errors back to their origin. Start with the GIS. Use database triggers, connectivity rules, business rules and common sense. Make it impossible for a transformer’s phase to be incompatible with the conductor’s phase. Make it impossible for a device to exist if not connected to a conductor. Make it impossible to close out a construction job if a trace hasn’t been successfully completed from the service point to a power source.
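
The last rule, a successful trace before job close-out, might be sketched as follows. The adjacency dictionary, the node names and the can_close_job gate are all hypothetical; a real GIS would run the trace against its own network model and honor the open or closed state of each device.

```python
from collections import deque


def trace_to_source(start, connections, sources):
    """Breadth-first trace from a service point toward any power source.

    connections: dict of node -> iterable of directly connected nodes
                 (assumed to contain closed devices only)
    sources: set of nodes that count as power sources
    Returns the path as a list of nodes, or None if no source is reachable.
    """
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in sources:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in connections.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def can_close_job(service_points, connections, sources):
    """Allow close-out only if every service point in the job traces to a source."""
    return all(
        trace_to_source(sp, connections, sources) is not None
        for sp in service_points
    )
```

Returning the path rather than a simple yes or no gives the person making the edit something to look at when the trace succeeds by an unexpected route.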

Scrub the GIS. For errors that you can’t block, build tools in the GIS to find them and either automatically fix them, or drive an operator to them. Do this regularly. Periodically review the list of tools, and see if any can be pushed back to data creation.

Manage the creation of OMS data. Any errors that slip through GIS may affect OMS. When creating the OMS data, build tools that tell you if anything was missed between GIS and OMS. For every error encountered in OMS, analyze the problem and either build a tool for blocking it, or better yet, push the error detection back to the GIS.

Use Metadata to Manage Everything. Metadata (data about data) can help manage this entire process. Instead of simply detecting an error and then fixing it, detect and RECORD in the DATABASE the fact of the error. When it’s fixed, RECORD that. Such a strategy allows managers to (1) see the quality of the data at any point in time, (2) generate statistics about how many errors are occurring and how quickly they’re being resolved, and (3) have a synoptic understanding of the health of the data.
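
A minimal sketch of such an error log, using Python's built-in sqlite3 module, is shown here. The table name, columns and error-type labels are assumptions for illustration; the point is only that detection and resolution are both recorded, so that the statistics fall out of a query.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the hypothetical error-log table."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS data_error (
            id          INTEGER PRIMARY KEY,
            feature_id  TEXT NOT NULL,
            error_type  TEXT NOT NULL,
            detected_at TEXT NOT NULL,  -- ISO 8601 date or timestamp
            resolved_at TEXT            -- NULL while the error is still open
        )""")
    return conn


def record_error(conn, feature_id, error_type, detected_at):
    """Record that an error was detected; returns the new error id."""
    with conn:
        cur = conn.execute(
            "INSERT INTO data_error (feature_id, error_type, detected_at) "
            "VALUES (?, ?, ?)",
            (feature_id, error_type, detected_at),
        )
    return cur.lastrowid


def record_fix(conn, error_id, resolved_at):
    """Record that a previously detected error has been resolved."""
    with conn:
        conn.execute(
            "UPDATE data_error SET resolved_at = ? WHERE id = ?",
            (resolved_at, error_id),
        )


def summary(conn):
    """Per error type: (type, total recorded, still open, mean days to resolve)."""
    return conn.execute("""
        SELECT error_type,
               COUNT(*),
               SUM(resolved_at IS NULL),
               AVG(julianday(resolved_at) - julianday(detected_at))
        FROM data_error
        GROUP BY error_type
        ORDER BY error_type""").fetchall()
```

Because fixed errors stay in the table, the same data answer both "how clean are we today" and "how quickly do we resolve what we find". The mean is computed over resolved errors only and is None for a type with nothing resolved yet.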

Conclusion
Some utilities have achieved 99.9% pure OMS data. Getting there requires efforts along many fronts. Required is a good understanding of the various software systems in the mix, and the resources to make them meet the businesses’ needs. Required also is the ability to create effective business processes to keep errors from happening twice. Equally important is the ability to measure quality: when given a tool to measure their success, employees are much more motivated to achieve the goal. And finally, patience: this work takes time and effort.

© GISdevelopment.net. All rights reserved.