GITA 2001


Direction for Data


Foundation of the quality assurance odyssey


A sample of maps from each unique set should be fully reviewed. If map standards or a legend exist, they provide a good starting point. Where a unique symbol, line, text or combination of these elements exists, it should be noted, copied and matched to an entity/attribute/value in the target data model. In theory, the sample maps should be reviewed to the point where no new unique symbology is being found. It is important to use a photocopy from the source map itself when conveying the symbology, because this ensures the full context of the source symbology is conveyed. For example, an 'X' along a Water Main might mean Closed Valve but represent an entirely different entity along a Sewer Main. In addition to a photocopy of the source symbology, any specifications or comments describing how the asset should be interpreted and converted should be included.
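
The context dependence described above can be sketched as a lookup keyed on both the symbol and the line it sits on. This is only an illustration: the table name, the function names and, in particular, the Sewer Main meaning are hypothetical, since the paper does not say what an 'X' on a Sewer Main represents.

```python
# Hypothetical source-interpretation table. The key is (symbol, host line),
# so the same drawn symbol can map to different target entities.
# Value is (entity, attribute, value) in the target data model.
SYMBOL_TABLE = {
    ("X", "Water Main"): ("Valve", "status", "Closed"),
    # Assumed purely for illustration; the real meaning must come
    # from the utility's source experts.
    ("X", "Sewer Main"): ("Plug", None, None),
}


def interpret(symbol, host_line):
    """Return (entity, attribute, value), or None if the pair is unmapped."""
    return SYMBOL_TABLE.get((symbol, host_line))


def unmapped(observations):
    """Return the distinct (symbol, host_line) pairs with no mapping yet."""
    pending = []
    for obs in observations:
        if interpret(*obs) is None and obs not in pending:
            pending.append(obs)
    return pending
```

Running `unmapped` over the symbols noted on each sample map gives a rough stopping rule: once new maps stop adding entries, the review has probably reached the point where no new unique symbology is being found.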


Figure 2 shows an example of a Source Interpretation Document for a Hard Copy Map.

Side Note: There is an additional benefit of performing this exercise: it can assist in determining the map standards for the GIS/AM/FM system. Determining what standardized symbology should exist in the new system begins with knowing exactly what is currently being used for manual mapping.

For Tabular Source Data
Interpretation of tabular data, whether from a hard copy format, a database, or comma-delimited files, is less complicated than the interpretation of maps, but it is still important and should be documented. Evaluating tabular data typically consists of a column mapping from the tabular data to the target entities, attributes, and values, along with any specifications or migration rules, such as parsing of fields, that might pertain to the target system. Depending on the amount of data that is coming from tabular sources, this could be included in the mapping document or source matrix.
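
A column mapping with one parsing rule might look like the following minimal sketch. The column names, the target attribute names and the '6-GATE' field layout are all assumptions made for illustration, not taken from any real source system.

```python
import csv
import io

# Hypothetical mapping: source column -> (target entity, target attribute).
COLUMN_MAP = {
    "VLV_NO": ("Valve", "valve_id"),
    "SIZE_TYPE": ("Valve", "size_and_type"),  # split by the rule below
    "INSTALLED": ("Valve", "install_year"),
}


def parse_size_type(raw):
    """Assumed migration rule: split a field like '6-GATE' into size and type."""
    size, _, kind = raw.partition("-")
    return {"size": int(size), "type": kind.strip().title()}


def convert_row(row):
    """Apply the column mapping and the parsing rule to one source record."""
    target = {}
    for source_col, (_entity, attribute) in COLUMN_MAP.items():
        value = row[source_col].strip()
        if source_col == "SIZE_TYPE":
            target.update(parse_size_type(value))
        elif source_col == "INSTALLED":
            target[attribute] = int(value)
        else:
            target[attribute] = value
    return target


# A one-record comma-delimited sample standing in for the real source file.
SAMPLE = "VLV_NO,SIZE_TYPE,INSTALLED\nV-101,6-GATE,1987\n"
records = [convert_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

Keeping the mapping in a single table like `COLUMN_MAP` means the same structure can be pasted into the mapping document or source matrix and reviewed by the source experts.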


For Legacy GIS Source Data
Where source data exists in a legacy GIS/AM/FM system, the source interpretation document typically evolves into a data mapping document where the entities, attributes and values from the legacy system are mapped to the target system. However, the source interpretation goes much further and also must cover the migration of graphic elements, including x and y coordinates, as well as connectivity.

Map Standards Document
Prior to starting conversion, it is important to have a clear understanding as to what the converted data should look like graphically. Specifically, there is a need to define the symbols, lines, and text that will be used to represent the various combinations of features, attributes, and values. The size of the symbols/text and the thickness of lines must also be defined. In addition, the frequency of annotation placement must be specified. The sum of these specifications results in a Map Standard, much of which is typically determined by or becomes the target system's rulebase. A Map Standard such as the one in Figure 3 compiles the graphic information and allows for an easier understanding of what the final map product should look like.


Figure 3: Map Standards Example

Mechanism for resolving source/specification queries
No matter how much detail is established through the source matrix and source interpretation document, the unknown always arises; that is, there are bound to be unique instances of symbology on the source data that are not covered by, or perhaps conflict with, the specifications in the Source Interpretation Document.

Another typical scenario occurs when a source document is simply not legible. These conflicts essentially represent occurrences outside the reasonable constraints of the Source Matrix/Source Interpretation Documents.

The customer will feel more comfortable with the delivered data knowing that 'guessing' is not involved in handling source anomalies, but rather they are being queried and resolved. Therefore, a mechanism for resolving these source conflicts, and updating the specifications with the resolution, should be part of the planning process.

Source conflicts are raised by the team performing the conversion. That team sends queries to the experts at the utility who best understand the source data; the source expert answers each query and returns it to the conversion team. In this process, it is important that all parties use the common terminology of the target data model, since the use of 'local' terminology often leads to confusion. For example, if the model contains specific types of valves such as Flow Regulating Valve, Pressure Regulating Valve, or Control Valve, it is important that both the query and the resolution address exactly which type of valve is in question. Otherwise, the answer might refer to an 'isolation valve', leaving the recipient guessing as to exactly which valve is being discussed.

Ideally, all specification and source queries should be logged in a database, organized by entity, attribute, and value. This helps ensure the same query is not asked twice (perhaps from a different map or source material), and it organizes the queries and resolutions in a manner that facilitates updates to future versions of the conversion specifications.
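
One possible shape for such a query log is sketched below with Python's built-in sqlite3 module. The table layout, the column names and the helper functions are hypothetical; a real project would add dates, owners and a link to the specification version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the sketch
conn.execute(
    """
    CREATE TABLE source_query (
        id INTEGER PRIMARY KEY,
        entity TEXT NOT NULL,
        attribute TEXT NOT NULL,
        value TEXT NOT NULL,
        question TEXT NOT NULL,
        source_ref TEXT NOT NULL,
        resolution TEXT,
        UNIQUE (entity, attribute, value, question)
    )
    """
)


def log_query(entity, attribute, value, question, source_ref):
    """Log a query; if it was already asked, return the existing record.

    Returns (query_id, resolution), where resolution is None if unanswered.
    """
    row = conn.execute(
        "SELECT id, resolution FROM source_query "
        "WHERE entity = ? AND attribute = ? AND value = ? AND question = ?",
        (entity, attribute, value, question),
    ).fetchone()
    if row is not None:
        return row[0], row[1]
    cur = conn.execute(
        "INSERT INTO source_query "
        "(entity, attribute, value, question, source_ref) "
        "VALUES (?, ?, ?, ?, ?)",
        (entity, attribute, value, question, source_ref),
    )
    return cur.lastrowid, None


def resolve(query_id, resolution):
    """Record the source expert's answer against a logged query."""
    conn.execute(
        "UPDATE source_query SET resolution = ? WHERE id = ?",
        (resolution, query_id),
    )
```

With this layout, a second operator who hits the same question on a different map gets the earlier resolution back instead of raising a duplicate query.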

Specifications for addressing source queries should also specify a time frame for their resolution. Many a project schedule has suffered due to delays in answering queries. In the event turn-around time becomes a project risk, it might be wise to consider implementing defaults for specific types of queries that can help control production if a fixed response period has expired. These pre-defined, mutually agreed defaults for specific types of queries could be determined by examining past queries and their associated resolutions. For example, if a common query was that the size of a valve was not shown on the source, and the answer to that query nine out of ten times was "set it to 4", then that answer could be used as the default. The ability to track and identify these query/resolution trends is yet another benefit to managing them through a database.
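
The "nine out of ten" reasoning can be sketched as a small trend check over past query/resolution pairs. The function name, the 90 percent share and the minimum of ten past queries are assumptions chosen to mirror the example above; any default proposed this way would still need to be mutually agreed before use.

```python
from collections import Counter, defaultdict


def propose_defaults(history, min_share=0.9, min_count=10):
    """Suggest a default resolution per query type from past resolutions.

    history: iterable of (query_type, resolution) pairs.
    A default is proposed only when a query type has at least min_count
    past resolutions and one resolution accounts for at least min_share.
    """
    by_type = defaultdict(Counter)
    for query_type, resolution in history:
        by_type[query_type][resolution] += 1

    defaults = {}
    for query_type, counts in by_type.items():
        total = sum(counts.values())
        resolution, hits = counts.most_common(1)[0]
        if total >= min_count and hits / total >= min_share:
            defaults[query_type] = resolution
    return defaults


# Illustrative history: nine of ten size queries were answered the same way.
history = [("valve size not shown", "set it to 4")] * 9 + [
    ("valve size not shown", "set it to 6"),
    ("valve type unclear", "Control Valve"),
    ("valve type unclear", "Flow Regulating Valve"),
]
defaults = propose_defaults(history)
```

Here only the valve size query qualifies for a default; the valve type query has too few past resolutions and no dominant answer.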

Acceptance Criteria
A lot of attention is focused on defining the acceptance criteria with regard to the percent accuracy that should be met. However, many RFPs and proposals refer to meeting different categories of accuracy without fully defining them or discussing precisely how they should be measured. A set of acceptance criteria should be established that defines not only the error percentage by category, but also how errors in each category will be identified and how they will be measured.

The methodology for defining how an error percentage should be measured depends heavily on the category of accuracy that is being measured. Accuracy requirements for categories that are set at 100 percent typically only need a definition. For example, if the expectation is that connectivity and programmatically derived data must always be correct, then measuring those against the number of attributes or entities is not critical. However, if an accuracy requirement for a category is less than 100 percent, it becomes critical for the measurement approach to be defined. For example, completeness, attribute accuracy and positional accuracy might be measured as follows:

Attribute Accuracy
This measurement criterion is based on the total number of attribute errors found divided by the total number of possible attributes for those entities that are sampled. An attribute error is recorded when any attribute value for a given entity is not correct as reflected in the source and conversion specifications. X% of all attributes must correspond to the value that is reflected in the source, conversion specifications, etc.
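
As a minimal sketch of this measurement, the function below counts one possible attribute per expected attribute of each sampled entity and one error per value that does not match. The dictionary layout and the names are assumptions for illustration.

```python
def attribute_accuracy(converted, expected):
    """Compare converted attribute values with the expected ones.

    converted, expected: {entity_id: {attribute: value}} for the sample.
    expected holds the values reflected in the source and the conversion
    specifications. Returns (errors, possible, accuracy_percent).
    """
    errors = 0
    possible = 0
    for entity_id, expected_attrs in expected.items():
        actual_attrs = converted.get(entity_id, {})
        for attribute, expected_value in expected_attrs.items():
            possible += 1
            if actual_attrs.get(attribute) != expected_value:
                errors += 1
    accuracy = 100.0 * (possible - errors) / possible if possible else 100.0
    return errors, possible, accuracy


# Two sampled valves with two attributes each; one size was keyed wrongly.
expected = {
    "V-101": {"size": 6, "type": "Control Valve"},
    "V-102": {"size": 4, "type": "Control Valve"},
}
converted = {
    "V-101": {"size": 6, "type": "Control Valve"},
    "V-102": {"size": 8, "type": "Control Valve"},
}
errors, possible, accuracy = attribute_accuracy(converted, expected)
```

The sample yields 1 error out of 4 possible attributes, or 75 percent accuracy, which would then be compared with the agreed X percent.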

Positional Accuracy
This measurement criterion is based on the number of graphic features whose symbols or lines have positional errors divided by the total number of graphic features having symbols or lines that are contained in a given sample. A positional accuracy error will be recorded when any symbol or line is not correctly located relative to the position shown on the source or to the Map Standard placement rules. X percent of all sampled graphic features with symbols or lines must correspond to the locational placement rules as defined in the Map Standards and/or the relative location reflected on the source document or material.

Entity Completeness Accuracy
This measurement criterion is based on the number of entities that are found to be missing (or extra) divided by the total number of entities that were sampled. A completeness error will be recorded when any entity depicted on the source document or material is not reflected in the data. The count of the entities captured must be X percent of the number of entities in the sample.
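
A hedged sketch of the completeness count follows. It assumes each sampled entity can be keyed by a hypothetical (entity type, identifier) pair and that the denominator is the number of entities depicted on the sampled source, which is one reading of "the total number of entities that were sampled".

```python
from collections import Counter


def completeness(source_entities, captured_entities):
    """Count missing and extra entities in a sample.

    Each argument is an iterable of (entity_type, source_id) pairs.
    Returns (missing, extra, error_percent), where missing and extra are
    Counters and error_percent is relative to the source entity count.
    """
    source = Counter(source_entities)
    captured = Counter(captured_entities)
    missing = source - captured  # on the source but not in the data
    extra = captured - source    # in the data but not on the source
    errors = sum(missing.values()) + sum(extra.values())
    total = sum(source.values())
    error_percent = 100.0 * errors / total if total else 0.0
    return missing, extra, error_percent


# Ten entities on the sampled source; one valve missed, one hydrant invented.
source_sample = [("Valve", i) for i in range(8)] + [("Hydrant", 0), ("Hydrant", 1)]
captured_sample = [("Valve", i) for i in range(7)] + [
    ("Hydrant", 0),
    ("Hydrant", 1),
    ("Hydrant", 2),
]
missing, extra, error_percent = completeness(source_sample, captured_sample)
```

Note that the captured count here still equals the source count, so comparing totals alone would hide both errors; matching entity by entity is what exposes the missing valve and the extra hydrant.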
