Dynamic Linking of ArcView, XGobi and XploRe for Multivariate Spatial Data: Linked Brushing for Points, Polygons, and Lines
3 Epidemiological Example Using Polygon Data
The data set in this example has been taken from the Atlas of United States Mortality (Pickle et al. 1996) and consists of 798 Health Service Areas (HSAs), which are aggregations of adjacent US counties that share similar health service reporting characteristics. The variables used in this example are: a) the percentage of the population that is Hispanic; b) the per capita income; c) the percentage of households with a female head of household; d) the percentage of the population that is unemployed; and e) the mortality rate for all cancer types surveyed, as deaths per 100,000. We want to investigate the relationships among these 5 variables in their spatial context.
First, we look at some basic features of the data. In the basic link between ArcView and XGobi, we can examine the univariate distribution of each variable and bivariate relationships between pairs of variables. We note that the median percentage of Hispanics is 1.06. HSA units where the percentage of Hispanics falls below the median percentage (brushed in light grey) are in the Midwest or Deep South, as seen on the accompanying ArcView map (see Figure 1). The values above the 75th percentile are all in the western US. Examining this relation further, we divide the HSA units into two classes, above and below the median, and note a positive relationship among an above-median Hispanic population, unemployment, and the percentage of households with a female head. There is also a negative relationship between the percentage of the population that is Hispanic and the per capita income. The areas with the highest unemployment and highest proportion of Hispanics are, not surprisingly, southern Texas and central California.
Figure 1. Percentage of the Hispanic population in each of the 798 US HSAs. Values above the median percentage are brushed in dark grey, while values below the median percentage are brushed in light grey in the XGobi view. The ArcView map view shows that areas where the percentage of Hispanics falls below the median percentage are in the Midwest or Deep South.
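The median split used for brushing in Figure 1 can be sketched in a few lines of Python. This is only an illustration of the classification step, not the ArcView/XGobi link itself; the function name `brush_by_median` and the six sample percentages are hypothetical and are not taken from the atlas data. Values exactly equal to the median are assigned to the lower class here, which is an assumption.

```python
import statistics

def brush_by_median(values):
    """Return the median and one brush colour per area (hypothetical helper)."""
    med = statistics.median(values)
    # Areas strictly above the median are dark grey, all others light grey.
    labels = ["dark grey" if v > med else "light grey" for v in values]
    return med, labels

# Hypothetical percentages of Hispanic population for six areas.
pct_hispanic = [0.4, 0.9, 1.0, 1.2, 8.5, 30.0]
med, labels = brush_by_median(pct_hispanic)
```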
The relationship between unemployment and the percentage of Hispanics shows a very striking spatial pattern (see Figure 2). Lower unemployment is distributed through the central United States, while the northern and southern United States have higher unemployment. The Hispanic population seems to be distributed throughout both types of areas, with more of the population spread through high unemployment areas in the western United States.
Figure 2. Relationship between unemployment and the percentage of the population that is Hispanic. For each of the two variables, a point is mapped to 0 if the value is below the corresponding median and to 1 if it is above. After jittering the points, i.e., adding a small random error to each, the four possible outcomes appear as four blocks. In areas where the Hispanic population is above the median, above median unemployment is darkest grey and below median unemployment is medium dark grey. In areas where the Hispanic population is below the median, above median unemployment is medium light grey and below median unemployment is lightest grey.
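The coding and jittering described in the caption of Figure 2 can be illustrated with a minimal Python sketch. It is not XGobi's implementation: the function `code_and_jitter`, the jitter amount of 0.2, the uniform jitter distribution, and the four sample areas are all assumptions made for illustration.

```python
import random
import statistics

# Grey levels indexed by group = 2 * (Hispanic above median) + (unemployment above median).
GREYS = ["lightest grey", "medium light grey", "medium dark grey", "darkest grey"]

def code_and_jitter(hispanic, unemployment, amount=0.2, seed=0):
    """Map each variable to 0/1 around its median, then add a small random error."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    med_h = statistics.median(hispanic)
    med_u = statistics.median(unemployment)
    points = []
    for h, u in zip(hispanic, unemployment):
        bh = 1 if h > med_h else 0
        bu = 1 if u > med_u else 0
        group = 2 * bh + bu
        # Jitter spreads the four coincident outcomes into four visible blocks.
        jh = bh + rng.uniform(-amount, amount)
        ju = bu + rng.uniform(-amount, amount)
        points.append((jh, ju, group))
    return points

# Hypothetical values for four areas (not taken from the atlas data).
hispanic = [0.5, 0.8, 3.0, 12.0]
unemployment = [4.0, 9.0, 5.0, 10.0]
points = code_and_jitter(hispanic, unemployment)
colours = [GREYS[g] for _, _, g in points]
```

Because the jitter amount is below 0.5, the four blocks cannot overlap, so each jittered point can still be traced back to its outcome.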
To probe deeper into the relationships, we need to see the spatial pattern through the spatial noise. Using XGobi's built-in smoothing functions, we look at some of the bivariate relations with respect to our four previously created groups. The first panel in Figure 3 shows that cancer rates increase as the percentage of Hispanics decreases, for both high and low unemployment, reinforcing our earlier assumption that cancer rates have some tendency to rise as the per capita income increases. We also note that areas with a higher percentage of Hispanics tend to have lower per capita incomes. The last panel shows the complex relationship between the percentage of households with a female head and the per capita income. We note that there are opposite trends for the high and low unemployment classes and that there are further differences within the low unemployment classes between high and low proportions of Hispanics.
Figure 3. Spline-smoothed plots of bivariate relationships. For areas where the Hispanic population is above the median, above median unemployment is dark grey and below median unemployment is medium dark grey, while in areas where the Hispanic population is below the median, above median unemployment is medium light grey and below median unemployment is lightest grey. (Left) Percent Hispanic against cancer rates. (Middle) Per capita income against cancer rates. (Right) Female head of household against per capita income.
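Smoothing each of the four groups separately, as done for Figure 3, can be sketched as follows. XGobi's spline smoother is not reproduced here; a centered running mean is substituted as a simpler stand-in, and the function name `running_mean_by_group`, the window size, and the sample values are hypothetical.

```python
from collections import defaultdict

def running_mean_by_group(x, y, group, window=3):
    """Smooth y against x within each group using a centered running mean."""
    by_group = defaultdict(list)
    for xi, yi, gi in zip(x, y, group):
        by_group[gi].append((xi, yi))
    half = window // 2
    smoothed = {}
    for gi, pts in by_group.items():
        pts.sort()  # order the points of this group by x
        curve = []
        for i, (xi, _) in enumerate(pts):
            # The window is truncated at the ends of the group's x range.
            lo, hi = max(0, i - half), min(len(pts), i + half + 1)
            ys = [p[1] for p in pts[lo:hi]]
            curve.append((xi, sum(ys) / len(ys)))
        smoothed[gi] = curve
    return smoothed

# Hypothetical data: two groups with opposite trends.
x = [1, 2, 3, 1, 2, 3]
y = [10, 20, 30, 30, 20, 10]
group = [0, 0, 0, 1, 1, 1]
smoothed = running_mean_by_group(x, y, group)
```

Fitting one curve per group, rather than one curve for all areas, is what allows opposite trends between classes to become visible.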
In general, we have seen that areas with a higher percentage of Hispanics tend to have lower per capita incomes and lower cancer rates.