An integration of stratified sampling designs and Geographic Information Systems - An application in Educational Research
Methodology
Compilation of Spatial Data
Digital thematic coverages showing the boundaries of the Southern Province, administrative boundaries such as Divisional Secretary (DS) divisions, and the distribution of Type A, B, C and D roads, railway stations, hospitals, police stations and schools were obtained for the study. School locations were recorded in an extensive GPS survey carried out in the province, as no proper records of school locations were available.
In addition, available statistics on poverty status and population density were transferred to the corresponding maps, with DS division boundaries forming the basic spatial unit for data integration.
The first task was to identify a suitable grid resolution at which all these spatial variables could be represented in the proper spatial context. The grid resolution needed to be fine enough that each grid cell contained at most one school, so that the variables derived for each school were not spatially aggregated.
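The resolution criterion described above can be illustrated with a small sketch. It assumes school locations are available as projected coordinates in km; the coordinates, the candidate cell sizes and the function names are hypothetical and are not taken from the study.

```python
from collections import Counter


def max_schools_per_cell(points, cell_size_km):
    """Return the largest number of schools that fall in any single grid cell."""
    counts = Counter(
        (int(x // cell_size_km), int(y // cell_size_km)) for x, y in points
    )
    return max(counts.values()) if counts else 0


def coarsest_valid_cell_size(points, candidate_sizes_km):
    """Pick the largest candidate cell size with at most one school per cell."""
    for size in sorted(candidate_sizes_km, reverse=True):
        if max_schools_per_cell(points, size) <= 1:
            return size
    return None  # no candidate separates all schools


# Hypothetical school coordinates in km, for illustration only.
schools = [(0.2, 0.3), (1.7, 0.4), (3.1, 2.6), (3.9, 4.2), (6.5, 1.1)]
chosen = coarsest_valid_cell_size(schools, [5.0, 2.0, 1.0, 0.5])
```

Searching from the coarsest candidate downward returns the coarsest grid that still keeps each school in its own cell, which avoids carrying more cells than the criterion requires.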
A key spatial variable identified in this evaluation is the distance from schools to Type A roads. In Sri Lanka, roads are categorized into four groups: Type A, B, C and D. Type A roads are wide, tarred main roads, and most resources are thought to be distributed around them. One-km line buffers were created to identify the distance of each grid cell from the Type A roads. Distances to other infrastructure facilities were likewise derived from point and line buffers, and the corresponding values were assigned to the grid cells. The calculated distances are straight-line (aerial) distances; in reality, on-road distances along the other types of connecting roads need to be considered, and network analysis (Green et al., 1998; Naude, 1995) needs to be employed to obtain the true road distance to a destination.
The distributions of poverty and population density were also transferred onto the grid maps. In addition, average Year 5 scholarship examination marks were computed and transferred to the cells where schools are located. The Year 5 scholarship examination is a national-level examination that assesses the comparative performance of students on a common numerical scale. The thematic coverage of each variable was transferred to a common scale to identify areas that are homogeneous in terms of each variable. These processed grid values were then converted to a database file, with the corresponding location identifiers, for the multivariate stratification.
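As a rough illustration of the straight-line distance calculation, the sketch below computes the distance from a grid-cell centre to a road represented as a polyline and assigns it to a 1 km buffer ring. Reading the buffers as successive 1 km rings is an assumption, and the road geometry and names used here are hypothetical. As noted above, this does not capture on-road distance, which would require network analysis.

```python
import math


def point_segment_distance(p, a, b):
    """Straight-line distance from point p to the segment a-b (all in km)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_road(p, road):
    """Shortest distance from p to any segment of a polyline road."""
    return min(
        point_segment_distance(p, road[i], road[i + 1])
        for i in range(len(road) - 1)
    )


def buffer_ring(distance_km, width_km=1.0):
    """Index of the buffer ring: 0 for 0-1 km, 1 for 1-2 km, and so on."""
    return int(distance_km // width_km)


# Hypothetical Type A road and grid-cell centre, in km.
road_a = [(0.0, 0.0), (10.0, 0.0)]
cell_centre = (4.5, 2.5)
d = distance_to_road(cell_centre, road_a)
ring = buffer_ring(d)
```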
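The source does not say how the variables were brought to a common scale. One simple possibility, shown here purely as an assumption, is min-max rescaling of each variable to the range 0 to 1, followed by writing one record per grid cell with its location identifier. The cell identifiers and values are hypothetical.

```python
def rescale_to_unit_range(values):
    """Min-max rescale a list of numbers to [0, 1] (assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant variable: no spread to rescale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical grid-cell identifiers and population densities.
cell_ids = ["r0c0", "r0c1", "r1c0", "r1c1"]
population_density = [120.0, 480.0, 300.0, 1020.0]

scaled_density = rescale_to_unit_range(population_density)

# One record per cell, ready to be written to a database file.
records = [
    {"cell_id": cid, "pop_density_scaled": s}
    for cid, s in zip(cell_ids, scaled_density)
]
```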
Attribute Variables
Database files from the Ministry of Education were acquired for the attribute data. They cover the availability of libraries, computing facilities, drinking water, electricity and sanitary facilities; total revenues received; the qualifications and professional background of teachers; the number of students in Year 5 and in all other grades; and the total number of classes in Year 5 and in all other grades. Travel time from each school to the nearest urban centre was also treated as an attribute variable, although it is strictly a spatial variable.
Stratification Methodology
Traditionally, stratification is done with respect to one or two variables for administrative and computational convenience. However, this does not guarantee that the stratification covers all the important effects that could influence the variable under study. In this study, two multivariate stratification techniques were employed without affecting the characteristics of stratified sampling.
Stratification Procedure – Multivariate Analysis
Both factor analysis and cluster analysis were employed in this study. After the correlations were analyzed, the variables proposed for the stratification were submitted to factor analysis, which identified the variables explaining the patterns of correlation within the set of observed variables. Factor analysis is commonly used for data reduction: it identifies a small number of factors that account for most of the variation observed in a much larger set of variables (Unwin et al., 1996). A sufficient number of factors were extracted, and the factor scores were computed and submitted to cluster analysis, which identified relatively homogeneous groups of cases based on the factor scores. These groups were then used as strata. By applying these procedures to the spatial variables and the attribute data, the Southern Province and the schools were stratified.
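The clustering step can be sketched as follows. The sketch assumes the factor scores have already been computed by a statistical package, and it uses plain k-means because the source does not name the clustering algorithm; both the scores and the choice of k-means are assumptions made for illustration.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain k-means; initial centres are the first k points (deterministic)."""
    centres = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each case to its nearest centre.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c]))
            for p in points
        ]
        # Recompute each centre as the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centres.append(centres[c])  # keep an empty cluster's centre
        if new_centres == centres:  # converged
            break
        centres = new_centres
    return labels, centres


# Hypothetical scores on two factors for six cases (cells or schools).
factor_scores = [
    (-1.2, -0.8), (1.1, 0.9), (-1.0, -1.1),
    (0.9, 1.2), (-0.9, -0.9), (1.3, 1.0),
]
strata, centres = kmeans(factor_scores, k=2)
```

Each label in `strata` is the stratum assigned to the corresponding case. In practice the number of clusters and the initial centres would need to be chosen with more care than in this toy example.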
Stratification Procedure – Extended Ekman Rule
The Ekman rule is extensively used for univariate stratification. It stratifies a population according to the values of the stratification variable so as to minimize the standard error of the estimate. Dan (2000) extends this rule to multivariate stratification.
Here, a linear combination of several stratification variables, rather than a single variable, was used as the stratification variable. Correlation analysis and regression analysis were used to identify the factor most significantly related to the performance of schools. This factor was then used as the stratification variable, and the stratum boundaries were determined by the extended Ekman rule, using the implementation of this algorithm available in SAS software.
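The sketch below illustrates the general idea under stated assumptions; it is not the SAS algorithm or the extension of Dan (2000), neither of which is described in the source. It uses the univariate Ekman rule as it is usually stated, namely choosing boundaries so that the product of stratum size and stratum width is approximately equal across strata, and applies it to a composite score formed as a weighted sum of variables. The weights, data and brute-force search are hypothetical choices suited only to small examples.

```python
from itertools import combinations


def composite_scores(rows, weights):
    """Linear combination of the stratification variables for each unit."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


def ekman_boundaries(scores, n_strata):
    """Brute-force cut points making N_h * width_h as equal as possible."""
    xs = sorted(scores)
    lo, hi = xs[0], xs[-1]
    # Candidate cut points: midpoints between consecutive distinct values.
    distinct = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best, best_spread = None, float("inf")
    for cuts in combinations(candidates, n_strata - 1):
        edges = [lo, *cuts, hi]
        products = []
        for i, (left, right) in enumerate(zip(edges, edges[1:])):
            last = i == n_strata - 1
            count = sum(
                1 for x in xs if left <= x < right or (last and x == right)
            )
            products.append(count * (right - left))
        spread = max(products) - min(products)
        if spread < best_spread:
            best, best_spread = list(cuts), spread
    return best


# Hypothetical units with two variables each, and hypothetical weights.
rows = [(1, 0), (0, 1), (1, 1), (2, 1), (1, 2), (2, 2), (3, 2), (2, 3), (4, 8)]
scores = composite_scores(rows, weights=(1.0, 2.0))
boundaries = ekman_boundaries(scores, n_strata=2)
```

The exhaustive search grows combinatorially with the number of distinct scores and strata, so a production implementation would use an iterative solution instead.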
Results and Discussion
Grid Resolution
The number of schools falling into a cell decreases rapidly as the grid resolution becomes finer. In this study, a grid resolution of 1 km was chosen, as it results in a maximum of one school per cell. A finer resolution was not required, since further spatial disaggregation would not add information at a finer spatial scale to the process.
Stratification Scenarios
The correlation analysis revealed that the average marks are highly correlated with almost all the spatial variables. The spatial variables are also significantly correlated with each other at the 95% confidence level, with a few exceptions. Accordingly, the spatial variables were submitted for factor analysis. The variable for distance from the forest was excluded on the basis of the communalities and factor loading statistics.
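For reference, the kind of correlation check described above can be sketched as a Pearson correlation coefficient together with its t statistic on n - 2 degrees of freedom, which would then be compared with the critical value for the chosen confidence level. The source does not state which correlation measure or test was used, so this is an assumption, and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical values: distance to a Type A road (km) and a second variable.
distance_km = [1.0, 2.0, 3.0, 4.0, 5.0]
other_variable = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(distance_km, other_variable)
t = t_statistic(r, len(distance_km))
```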