Supporting the National Spatial Data Infrastructure at the United States Census Bureau
Randy J. Fusaro
Chief, TIGER Operations Branch, Geography Division
United States Census Bureau, Washington, DC 20233-7400

Abstract

The United States Census Bureau built its national geospatial digital data base, the Topologically Integrated Geographic Encoding and Referencing (TIGER) System, during the 1980s to support the 1990 Decennial Census of Population and Housing. The TIGER data base is a seamless digital map of the United States in the public domain. Not only did it spawn the multibillion dollar Geographic Information Systems (GIS) industry that exists in the United States today, but perhaps more importantly, it provided a solid foundation for the United States National Spatial Data Infrastructure (NSDI) and a model for the NSDI vision.

Background

Geospatial digital data proliferated rapidly in the United States (US) in the early 1990s. Numerous federal agencies were building geospatial data sets, often duplicating information already in existence or being created elsewhere in the country. The U.S. Office of Management and Budget (OMB) issued a document, Circular A-16, mandating that all U.S. federal agencies coordinate the development and maintenance of their geospatial data sets according to specific guidelines. The OMB established the Federal Geographic Data Committee (FGDC) to coordinate activities across the federal community to ensure the five basic tenets of A-16 would be met by all federal agencies creating geospatial data. The five key components for the US NSDI are:
When the FGDC was first established, only federal agencies were involved in defining the data themes for the Framework Data. The federal agencies had set ideas of what content and standards were necessary to support national programs. State, local, and tribal governments across the United States had their own notions as to how this should be accomplished and forced their way to the table. As federal agencies were looking to state, local, and tribal governments to assist in updating and populating the national data sets, their input and cooperation were vitally necessary.

The TIGER/Line files, the public version of the internally formatted TIGER data base, were a great success when the U.S. Census Bureau first released them to the public in 1989. Many state, local, and tribal partners quickly adopted the TIGER/Line files as the foundation of new GISs to support their jurisdictions. The fact that TIGER data are in the public domain made them the logical starting point for any local or tribal government entering the GIS arena. Users began improving the geopositional accuracy of the coordinates and updating the network and attribute data as soon as TIGER information was loaded into their GISs. A natural progression in the maintenance of the TIGER data base was that state, local, and tribal partners using and improving the data wanted to feed their corrections and improvements back to the Census Bureau. They became stakeholders in the TIGER data base, its content, and its future.
With all that was invested in updating and maintaining their coverage of the data, they insisted they be included in the processes defining standards and metadata. This was true of their relationships with a number of federal agencies. One of the most compelling factors that finally got nonfederal stakeholders to the table for defining the framework layers was the fact that, in many cases, the federal agencies were depending on them to assist in maintaining and updating the data for their jurisdiction. Unfortunately, in a number of cases, the federal government did not need the information at the fine level of resolution that the locals needed. Nonfederal stakeholders refused to maintain data at 1:100,000 scale merely because that was what the federal agencies needed for consistency, when they themselves needed data at 1:12,000 for local purposes. It became readily apparent that this was beyond a solely federal task: a National Spatial Data Infrastructure, including all potential players, was the only way to achieve this vision.

The Framework Layers

Seven initial data themes of common national significance and coverage were determined necessary to build the foundation for the NSDI. The layers are:
Subcommittees of the FGDC were created to define the content and standards for each framework layer. Each layer contains more than just the seven themes listed above for the subcommittees to define. Each subcommittee is led by the federal agency with which the main responsibility for that theme lies. The U.S. Census Bureau leads the Cultural and Demographic Data Subcommittee. The governmental units layer is within the purview of the Cultural and Demographic Data Subcommittee.

The U.S. Census Bureau is considered to be the national repository of governmental unit boundaries for the federal government. Governmental unit boundary information is a natural fit with the Census Bureau’s mission, as it has been collecting boundaries from local and tribal governments through the Boundary and Annexation Survey (BAS) since the 1940s to support census and household survey data tabulation activities. The BAS requires local and tribal government officials to certify the correctness of the boundaries for their entity as shown on maps derived from the TIGER data base, or correct the boundaries and certify the corrections. Additionally, numerous statistical area boundaries are defined throughout the decade for various programs and operations, including school district boundaries, voting district boundaries, census tract boundaries, etc. Correct boundaries and concise criteria are crucial when it comes to census and statistical data; it is in the best interests of any governmental unit to have its boundaries represented correctly when over $85 billion is disbursed for federal programs nationwide over the course of a decade.
In addition to boundary information, the subcommittee also works on defining standards and metadata for other cultural and demographic themes. Currently, the subcommittee is working on the Governmental Unit Boundary Standard; it recently released the draft Address Data Content Standard for comment. Consensus among subcommittee members is more difficult to reach as time goes on; participants already have and maintain the data being defined, and every item different from what is in an existing data set requires work. Additional data will need to be collected to meet the standard, metadata will need to be documented, and data bases will need to be revised to accommodate the augmented data set.

The National Spatial Data Clearinghouse

According to Executive Order 12906, which directed the coordination of geographic data acquisition and access in the NSDI, “clearinghouse” means a distributed network of geospatial data producers, managers, and users linked electronically.

Although the U.S. Census Bureau was not always linked to the world-at-large electronically, it certainly had a distributed network of geospatial data producers, managers, and users. For the 1980 decennial census, the Census Bureau supplied State Data Centers with census data and maps for data users. This was arranged as a partnership whereby the states received free copies of all products as a repository of census information, and the State Data Centers were afforded the opportunity to have input on data products produced by the Census Bureau. The setup was not technologically sophisticated by today’s standards, but it was very effective in getting libraries of census data and maps out to every state for data users to access for free. In 1990, there were still State Data Centers, and information distribution improved as technology moved forward. Data and files were transmitted via floppy disk, CD-ROM, or tape. Now, after Census 2000, there are still State Data Centers, and there is also a parallel set of Census Information Centers
(CICs) focused on various constituency groups, but they can download their TIGER/Line files from a public FTP site or off the internet before their DVDs containing TIGER/Line files arrive in the mail. U.S. Census Bureau data from surveys, population estimates, etc. are all a click away on the U.S. Census Bureau’s web site, accessible through the NSDI Clearinghouse, of course.

In keeping with the mandate to make geospatial and statistical data accessible, the Census Bureau has been working with private sector contractors this decade to create what it calls the “American FactFinder.” Beyond just serving data tables of statistics and TIGER/Line files for downloading on the internet or on some other medium, when the first Census 2000 data are released this year, data users will actually have basic thematic mapping capabilities at the American FactFinder web site. Users will be able to select an area, select the data they wish to display for that area, and create thematic maps to send to their printer at home. They will not need a GIS at home to perform basic thematic mapping.
Standards and Metadata

The U.S. Census Bureau does not use commercial software to store or update the TIGER data base. There was no commercial software available at the time of its inception that would meet all of the needs of the TIGER System, and so the Geography Division built a “home grown” system. A system that nobody else in the world has inherently prohibits a “black box” answer to data exchange, but there was never any question that local geospatial digital data files would be used to update the data base. As more and more users adopted TIGER data for their own GIS base (in the form of TIGER/Line files), it became clear that the only way to exchange data would be to establish standards and define metadata that would allow us to understand each other’s data. The Geography Division knew it would need to be able to work with commercial GIS export formats to make exchanging geospatial files painless for its partners, and if not painless, at least possible. As early as 1991, the Geography Division conducted meetings at national geographic conferences to speak with TIGER data users and investigate how best to incorporate local geospatial data into the TIGER data base. Format, content, and topological changes were discussed in order to come to a common understanding and agreement of what was needed to make data exchange work.

The TIGER data base is a planar graph, with all features intersecting in one layer. The GIS files that many local partners maintain are based on separate layers for roads, hydrography, railroads, and boundaries. Some of the most basic differences between the internal TIGER data base and the geospatial files maintained by local and tribal partners occur in determining what constitutes a feature, how the data are stored, and issues involving topology. In the TIGER data base, topology must be preserved at all costs.
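As a rough illustration of the difference between layered files and a single planar graph, the following Python sketch merges segments from separate layers into one edge list, splitting every segment wherever it crosses another so that each crossing becomes a shared node. The function names, layer names, and coordinates are hypothetical and chosen only for illustration; the sketch does not reflect the actual TIGER System software, and it ignores collinear overlaps and crossings at endpoints.

```python
from itertools import combinations


def intersect(seg_a, seg_b):
    """Return the point where two segments properly cross, or None."""
    (x1, y1), (x2, y2) = seg_a
    (x3, y3), (x4, y4) = seg_b
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:  # parallel or collinear: not handled in this sketch
        return None
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 < t < 1 and 0 < u < 1:  # crossing strictly inside both segments
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


def planarize(layers):
    """Merge layered segments into one list of (start, end, layer) edges,
    split at every crossing so that all features share nodes."""
    tagged = [(seg, name) for name, segs in layers.items() for seg in segs]
    cuts = {i: [] for i in range(len(tagged))}
    for (i, (a, _)), (j, (b, _)) in combinations(enumerate(tagged), 2):
        point = intersect(a, b)
        if point is not None:
            cuts[i].append(point)
            cuts[j].append(point)
    edges = []
    for i, (seg, name) in enumerate(tagged):
        start, end = seg
        # Order the cut points by distance from the segment's start node.
        ordered = sorted(
            cuts[i],
            key=lambda p: (p[0] - start[0]) ** 2 + (p[1] - start[1]) ** 2,
        )
        chain = [start, *ordered, end]
        edges.extend((a, b, name) for a, b in zip(chain, chain[1:]))
    return edges


if __name__ == "__main__":
    # Hypothetical layers: one road and one stream that cross each other.
    layers = {
        "roads": [((0, 0), (4, 0))],
        "hydrography": [((2, -2), (2, 2))],
    }
    for edge in planarize(layers):
        print(edge)
```

In the layered input the road and the stream know nothing about each other; after merging, both are split at their crossing, which is the kind of shared node that lets features of different types jointly bound an area such as a census block.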
Census blocks, the smallest geographic unit for which census data are collected, are bounded by intersecting features that are not necessarily the same type of feature. If a local government were to remove a feature that is a census block boundary, or realign a feature in a way that changes the topology of a census block, it was unclear how the TIGER System should handle the change. Census blocks are maintained for ten years to provide the framework upon which the decennial census data rest. Discussion with partners focused heavily on these types of situations. At the urging of local partners, 1980 geography was removed from the TIGER/Line files in 1994 to make it easier for them to provide files to the U.S. Census Bureau and incorporate subsequent TIGER/Line file information back into their own systems.

The TIGER data base has standards and very specific rules for inclusion/exclusion of features, attributes, and content. Feature portrayal and geopositional accuracy, however, are not in accordance with any standard, because TIGER information is compiled from so many different sources. Since TIGER was initially built, numerous features have been added to the data base by field personnel (who merely estimated their location or position) or from paper maps of unknown accuracy, positioned only relative to other features. Many features can be, and are, many meters off from ground truth. When the TIGER/Line files first became public, this was an accepted fact: TIGER was good enough to take a census but was not of cadastral accuracy. Now, however, with so many partners having a highly accurate GIS and much more invested in the TIGER data base, the U.S. Census Bureau is looking at setting coordinate accuracy standards for the TIGER data base and redesigning the TIGER data base to accommodate metadata for features and attributes.
Updates come from a variety of sources just days apart and can cover the same governmental jurisdiction. Having a complete inventory of streets with correct feature names and address information is crucial to taking a census. Perfect positional accuracy of those streets is not as crucial. With a redesign of the TIGER data base that can accommodate metadata describing attribute information, rather than just the feature, we will be able to tag the street name and address information as field verified. The feature itself will be tagged as having relative positional accuracy, which will be updated at some point with imagery, GPS, or a highly accurate local file. Non-census users of TIGER data would appreciate that kind of information, particularly when emergency response depends on a feature being exactly where it appears to be. Establishing standards and metadata is the only way to track where data came from and the veracity of its spatial and attribute accuracy. Standards and metadata will be an important mechanism in keeping U.S. Census Bureau partnerships intact.

PARTNERSHIPS FROM THE BEGINNING

There were many contributing partners to the construction of the U.S. Census Bureau’s TIGER data base, and indeed, it could not have been built without assistance. The United States Geological Survey (USGS) and numerous local and tribal governments contributed geospatial data files to build the TIGER data base. A number of local and tribal governments also helped the Census Bureau update and maintain its predecessor, the Geographic Base File/Dual Independent Map Encoding (GBF/DIME) files used during the 1980 census, and made their information available to incorporate into the TIGER data base. State, local, and tribal governments willingly provided reference sources covering their jurisdictions to add new features and attribute data to the feature base provided by the USGS.
The concept of using existing data that met the U.S. Census Bureau’s needs to expedite build time, eliminate duplication of effort, and reduce costs was firmly ensconced in the model for building and maintaining the TIGER data base right from the beginning. The methods used to build the TIGER data base proved that multiple ways of building and populating a national data base could work, and they were a precursor to the NSDI vision. This partnership model is still firmly entrenched in the Census Bureau’s culture and is included in all current and future plans for TIGER data base update and maintenance operations and census programs.

Throughout the 1990s, the U.S. Census Bureau utilized literally hundreds of non-census geospatial digital files to update the TIGER data base, topological issues notwithstanding. The wealth of information available from partners was readily shared to update and maintain the TIGER data base. The Census Bureau recognizes that the best information to use for TIGER update and maintenance comes from partners at the local level. They live there, maintain their data, and work with it every day. Who better to provide updates for any given area than those responsible for creating the data in the first place?

The next natural progression in its efforts to satisfy its partners was to make it easier for U.S. Census Bureau partners to participate in the many boundary delineation and street update programs. As previously mentioned, a major partnership program has been the Boundary and Annexation Survey (BAS). Paper maps traditionally were sent (with a color pencil!) to local and tribal government officials to certify the boundaries as shown on the maps, or correct them and certify the corrections. The paper map approach has strained some Census Bureau partnerships over the past decade. The positional accuracy of the coordinates in the TIGER data base is not at the accuracy level of the data bases used by many partners. Unless the local data base is from TIGER/Line
files with no coordinate improvement, the local file may not be used to correct the boundaries in the TIGER data base. Importing a local file with very accurate coordinates to overlay features in the TIGER data base would produce a result that has nothing to do with reality and could easily result in misallocation of housing units to the wrong government. The dilemma is that the partners understand the problem, but they don’t like it. And they will not be willing to draft boundaries onto paper maps for very much longer.

In addition to the TIGER data base, the U.S. Census Bureau maintains a Master Address File, a file that contains every address in the United States where people live or might live. Addresses are linked spatially to census blocks in order to allocate census data. A large program conducted prior to Census 2000 was the Local Update of Census Addresses Program. Local government participants were sent a copy of the Census Bureau’s address list prior to Census 2000 for verification of the completeness of the list and the location of each address. The Census Bureau learned rather quickly that addresses are all maintained differently, and any slight difference can make an address look like it is a new addition. Blindly adding addresses could introduce duplicate addresses: different, but the same. We are hoping the new U.S. Address Data Content Standard can alleviate these vast differences in the future when conducting similar programs. The more “high tech” we get, the more standardized we must become. Machines do not have the ability to reason in situations such as the slightly different addresses previously mentioned. The sooner standards are adopted, the easier it becomes to exchange data, and hopefully get to a transactional system.

NSDI Pattern Established

The basic tenets of the NSDI have been easy for the U.S. Census Bureau to follow. Historically, the Census Bureau has followed these established practices for years.
Partnerships are crucial. No one agency can do it all, and do it all correctly. Standards and metadata are the “hook” by which data can flow to other users and give them confidence in what the data mean. The Census Bureau is connected and is committed to distributing its data so that it reaches everyone who desires to access it. We are especially proud of enabling data users to graphically display their data without the need of a GIS with the American FactFinder. It makes sense to coordinate the subcommittee work through a federal agency; often such an agency has a national view of all the levels of participants in the public and private sectors, and a broad perspective on the unique differences that occur in data across a nation. The institutional culture at the U.S. Census Bureau has supported the NSDI every step of the way, since before the term NSDI was coined.