Sensor Orientation and Ortho-Rectification of High Resolution Satellite Images:
Review and Application with FORMOSAT-2 Thierry Toutin Natural Resources Canada, Canada Centre for Remote Sensing 588 Booth Street, Ottawa, Ontario, Canada K1A 0Y7 thierry.toutin@ccrs.nrcan.gc.ca INTRODUCTION Why ortho-rectify remote sensing images? Raw images usually contain such significant geometric distortions that they cannot be used directly with map base products in a geographic information system (GIS). Consequently, multi-source multi-format multi-date data integration for applications in geomatics geoscientific requires geometric and radiometric processing adapted to the nature and characteristics of the data in order to keep the best information from each image in the composite ortho-rectified products. One must admit that the new input data, the method and algorithms, the output processed data, their analysis and interpretation introduced new needs and requirements for geometric corrections, due to a drastic evolution with large scientific and technology improvements since the early time of earth observation (EO) satellites. In fact, each image acquisition system produces unique geometric distortions in its raw images, which vary considerably with different factors, mainly the platform (airborne versus satellite), its orbit, and the sensor (rotating or push-broom scanners, visible/infrared (VIR) or microwave, low to high resolution). Consequently to integrate different earth observation data but also cartographic vector data into a GIS, the geometry of each raw image must be separately converted to an ortho-image so that each component ortho-image of the data set can be registered, compared and combined into the end-user cartographic database. These geometric distortions (Table 1), including the map deformations, are predictable or systematic and generally well understood (Light et al. 1980). Some, especially those related to the instrumentation, are generally corrected at ground receiving stations and others, for example those related to the atmosphere, are generally not taken into account and corrected because they are too specific (acquisition time and location, lack of information on the atmosphere, etc.). All the remaining geometric distortions thus require mathematical models and functions to perform the geometric corrections of imagery (sensor orientation and ortho-rectification): either with heuristic and probabilistic models or with physical and deterministic models (Robertson, 2003; Toutin, 2004; Chen et al., 2006).
The 3D rational function model (RFM), as recently re-introduced by Madani (1999) for HR sensors, could be considered as a generalization of the heuristic and probabilistic models for EO sensor orientation. In fact, RFM included well-known solutions:
On the other hand, the physical and deterministic models fully reflect the physical reality of the viewing geometry: they should mathematically model all distortions of the platform (position, velocity, attitude for VIR sensors), the sensor (lens, view direction, panoramic effect), the Earth (ellipsoid and relief), and the deformations of the cartographic projection. The final mathematical functions, which integrate the ephemeris and attitude data or the global positioning system (GPS) and inertial navigation system (INS) data, differ depending on the sensor, the platform and its acquisition geometry (instantaneous acquisition systems, rotating or oscillating scanning mirrors, push-broom scanners, radar). However, they are generally based on the collinearity condition for VIR (Light et al. 1980) and Doppler/range or radargrammetric equations for radar (Curlander 1982; Leberl, 1990). In addition, other conditions with pseudo-observations by assigning either relax a priori variances to the satellite’s osculatory and sensor parameters in relation to the expected accuracy of the orbit or weighted constraint equations on model parameters to conform physical or geometrical characteristics of the modeling can be added to take into account the knowledge and the accuracy of the meta-data. SENSOR ORIENTATION AND ORTHO-RECTIFICATION Whatever the geometric models and its associated mathematical functions used, the method is more and less the same, and can be summarized in three processing steps:
With VIR-HR images, different types of image data with different levels of pre-processing can be obtained, but the different image providers unfortunately use a range of terminology to denominate the same type of image data. Table 2 gives the terminology of the different pre-processed products for some HR images. Table 2. Terminology of the different pre-processed products for some HR images
The raw level-1A images with only normalization and calibration of the detectors without any geometric correction are satellite-track oriented. For radar the equivalent is the slant range images. In addition, full metadata related to sensor, platform and image are provided. These level-1A images being the best and most preferred level were largely processed using physical models and achieved sub-pixel accurate results with all sensor types (Toutin, 2004). With empirical models, relatively bad results were achieved with QuickBird Basic images using 3D polynomial functions (Noguchi et al., 2004) but better results using RFM with QuickBird (Robertson, 2003) and Formosat (Chen et al., 2006), while Wolniewicz (2004) found better results using a physical model than RFM with QuickBird Basic images. Vendor-supplied RFM is also available with Cartosat and will be evaluated, as well as physical models, during the Cartosat-1 Scientific Assessment Program co-organized by ISPRS and ISRO. Finally, no RFM is supplied with SPOT-5 or RADARSAT-2/TerraSAR data due their inability to model high-frequency distortions inherent to raw images with large FOV (Madani, 1999). The georeferenced level-1B images corrected for systematic distortions due to the sensor, the platform and the Earth rotation and curvature are satellite-track oriented. For radar the equivalent is the ground range images. Generally, few metadata related to sensor and satellite are provided and some are related to 1B or slant-to-ground processing. Since they have been systematically corrected and georeferenced, the ‘level 1B’ images just retain the terrain elevation distortion, in addition to a rotation-translation related to the map reference system. A 3D 1st order polynomial model with two Z-elevation parameters could thus be efficient depending of the requested final accuracy but RFM supplied with QuickBird Standard data achieved also good results (Robertson, 2003). The map-oriented level-2A/B images are corrected for the same distortions as geo-referenced images are North oriented while ground control points (GCPs) were used more accurate positioning of 2B images. Generally, very few metadata related to sensor and satellite are provided; most of metadata are related to the 2A/B processing and the ellipsoid/map characteristics. Since these images only retain the elevation distortion a 3D 1st-order polynomial model with Z-elevation parameters in both axes can thus be efficient, even for SAR data, depending of the requested final accuracy (Ahn et al. 2001, Fraser et al. 2002, Vassilopoulou et al. 2002) but 3D 4th-order polynomial functions (Kersten et al. 2000, Vassilopoulou et al. 2002) without improvements. However, vendor-supplied RFM (Grodecki 2001) were also evaluated (Fraser et al. 2002, Tao and Hu 2002) using also GCPs either to remove bias (Fraser et al. 2002) or to improve the original RF parameters (Lee et al. 2002). Only few results were published in high relief terrain using 3D polynomial functions (Kersten et al. 2000, Vassilopoulou et al. 2002). Most of these experiments and results with pixel accuracy or better were achieved with very accurate cartographic data (0.10 m) in an academic environment where errors and processing are well controlled. On the other hand, operational studies obtained larger errors of few pixels (Davis and Wang 2001, Kristóf et al. 2002, Kim and Muller 2002, Tao and Hu 2002; Di et al., 2003). Finally although level-2 images are no more in their original geometry, physical models can be still approximated and developed for using basic information of the metadata (Toutin, 2003a) and has been proven to achieve better results than RFM in operational environments (Wolniewicz, 2004). Sensor orientation Depending of the type of images and the final requested accuracy, the sensor orientation should include the internal orientation (principal point displacement, focal length variation, radial symmetric and decentering lens distortions, line scale variation, line rotation) and the external orientation (position, attitude, line of sight), as well as the relative orientation when more than one image is processed. In some cases, the sensor does not need an internal orientation because the parameters have been determined with well-controlled test-range imagery (Grodecki and Dial, 2003) or because they have already been corrected, such as for level-1B and 2. In other cases, some of the parameters and their impact are negligible in relation to the sensor resolution and the final accuracy. The sensor orientation can be performed step by step or integrated into a combined modelling. In fact, it is better to consider the total geometry of viewing (platform + sensor + Earth + map) in the sensor orientation, because some of their distortions are correlated and have the same type of impact on the ground. It is theoretically more precise to compute one ‘combined’ parameter only than each component of this ‘combined’ parameter, separately, avoiding also over-parameterization and correlation between terms. Some examples of ‘combined’ parameters include:
When more than one image is processed, a spatio-triangulation, generally a block bundle adjustment, can be performed to reduce the number of GCPs. All model parameters of each image/strip are determined by a common least-squares adjustment so that the individual models are properly tied in and the entire block is optimally oriented in relation to the GCPs. With the spatio-triangulation process, the same number of GCPs is theoretically needed to adjust a single image, an image strip or a block. However, some tie points (TPs) between the adjacent images have to be used to link the images and/or strips. This process has been successfully applied using either physical models (Toutin, 2003b) or empirical models (Grodecki and Dial, 2003, Fraser et al. 2002). They thus prevent the adjustment from diverging and they also filter the input errors. Since there are always redundant observations to reduce the input error propagation in the geometric models a least-square adjustment is generally used. When the mathematical equations are non-linear, which is the case for physical and 2nd and higher order empirical models, some means of linearization (series expansions or Taylor’s series) must be used. A set of approximate values for the unknown parameters in the equations must be thus initialized:
Ortho-rectification The last step of the geometric processing is the image ortho-rectification. To rectify the original image into a map image, there are two processing operations:
- Geometric operation The geometric operation requires the mathematical functions of the sensor orientation with the previously-computed unknowns and terrain elevation information, such as a DEM, to create precise ortho-rectified images. But if no DEM is available, different altitude levels can be input for different parts of the image (a kind of ‘rough’ DEM) to minimize this elevation distortion. It is then important to have a quantitative evaluation of the DEM impact on the ortho-rectification process, both in term of elevation accuracy for the positioning accuracy and grid spacing for the level of details: a poor grid spacing when compared to the image spacing could generate artefacts for linear features (wiggly roads or edges). Figures 1 and 2 give the relationship between the DEM accuracy (including interpolation in the grid), the viewing angles with the resulting positioning error on VIR and SAR ortho-images, respectively (Toutin 2003a). One of the advantages of these curves is that they can be used to find any third parameter when the two others are known. It can be useful not only for quantitative evaluation of the ortho-rectification, but to forecast the appropriate input data, DEM or the viewing/look angles, depending of the objectives of the project. ![]() Figure 1. Relationship between the DEM accuracy (in metres) the viewing angle (in degrees) of the VIR image, and the resulting positioning error (in metres) generated on the ortho-image For example (Figure 1), with a QuickBird image acquired with a viewing angle of 27° and having a 5-m accurate DEM, the planimetric error generated on the ortho-image is 3 m. Inversely, if a 2-m final planimetric accuracy for a SPOT5 ortho-image is required and having a 10-m accurate DEM, the SPOT5 image should be acquired with a viewing angle less than 10°. The same error evaluation can be applied to SAR data using the curves of Figure 2. ![]() Figure 2. Relationship between the DEM accuracy (in metres), the look angle (in degrees) of the SAR image, and the resulting positioning error (in metres) generated on the SAR ortho-image. The different boxes at the bottom represent the range of look angles for each RADARSAT beam mode Finally, for any map coordinates (X, Y), with the Z-elevation extracted from a DEM, the original image coordinates (column and line) is computed from the equations of the sensor orientation model. However, the computed image coordinates of the map image coordinates will not directly overlay in the original image; in other word, the column and line computed values will be rarely, if never, integer values. - Radiometric operation To compute the DN to be assigned to the map image cell the radiometric operation uses a resampling kernel applied to original image cells: either the DN of the closest cell (called nearest neighbour resampling) or a specific interpolation or deconvolution algorithm using the DNs of surrounding cells. In the first case, the radiometry of the original image and the image spectral signatures are not altered, but the visual quality of the image is degraded and has a disjointed appearance. A geometric error of up to half pixel is also introduced. In the second case, different interpolation algorithms (bilinear interpolation or sinusoidal function) can be applied. The bilinear interpolation, which weights DNs of four surrounding cells as a function of the cell distance, creates a smoothing in the final image. The piecewise cubic convolution, as being an approximation of the sinusoidal function, which computes 3rd-order polynomial functions using a 4?4 cell window, does not smooth, but enhances and generates some contrast in the map image. The theoretically ideal deconvolution function is the sin(x)/x function using 8 x 8 or 16 x 16 cell window. The finale image is, of course, sharper with more details on features (Figure 3). ![]() Figure 3. Examples of applications of geometric resampling kernels used in the rectification process with a QuickBird image. The sub-images are 350 by 350 pixels with 0.10-m spacing. Letters A, B, C and D refer to different geometric resampling kernels (nearest neighbour, bilinear, cubic convolution, sin(x)/x with 16?16 window), respectively. QuickBird Image © Digital Globe, 2001 These previous functions are geometric resampling kernels, not very well adapted to SAR images. Instead, it is better to use statistical functions based on the characteristics of the radar, such as existing adaptive filters using local statistics. Combining the filtering with the resampling also avoids multiple radiometric processing and transformation, which largely degrades the image content and its interpretation (Figure 4). ![]() Figure 4. Examples of applications of geometric resampling kernels used in the rectification process with RADARSAT-SAR fine mode (F5) image. The sub-images are 600 by 600 pixels with 1.00-m spacing. Letters A, B, C and D refer to different geometric resampling kernels (nearest neighbour, bilinear, cubic convolution, sin(x)/x with 16?16 window), respectively and Letters E and F refer to statistical SAR filters (Enhanced Lee and Gamma), respectively. RADARSAT Images © Canadian Space Agency, 2001 - True ortho-image In urban context, surface heights (mainly buildings and bridges) generate discontinuities, leaning and hidden/shadowed areas. While there are different methods to correct them (Ettarid et al., 2005), the one generally used is (i) to separately generate two ortho-images with a DTM in conjunction with a 3D building model (Amhar et al., 1998) or a digital surface model generated from stereo HR images, and (ii) to merge them. Finally, hidden/shadowed areas can only be eliminated through the superimposition of other true ortho-images acquired from different viewpoint. APPLICATION TO FORMOSAT2 RSI In-track panchromatic stereo-pairs were acquired on a Quebec study site, Canada (47º N, 71º 30’ W) (Figure 5) by the Taiwanese Formosat2 Remote Sensing Instrument (RSI) as a courtesy of the Taiwanese National Space Program Office (NSPO) and SPOT-Image, France (as distributor). Because the CCRS geometric modelling used in the processing already performs the self-calibration of most of internal orientation parameters (principal point displacement, focal length and line-scale variation and CCD line rotation) and because the decentering lens distortion is negligible, only the radial symmetric distortions is additionally addressed. The mathematical functions of these remaining lens distortions are well known (Brown, 1966), and can be computed from the GCP residuals of the sensor orientation when a large number of GCPs regularly distributed in column direction are used (Figure 6). ![]() Figure 5. Forward panchromatic Formosat2 image (24 km by 24 km; 2-m pixel spacing) acquired December 28, 2004 on Quebec, Canada. Formosat2 ? National Space Program Office, Taiwan 2004; Courtesy of SPOT-Image, France ![]() Figure 6. Computation of the radial symmetric distortion for Formosat2 RSI lens by separating random errors (from GCPs) to systematic error (radial distortion) from the GCP residuals: the radius r (in pixel) in x-axis and residual/error (in pixel) in y-axis. In order to verify the impact of self-calibration, sensor orientation, DEM stereo-extraction and ortho-rectification were thus performed on RSI images. The DEM was generated using an area-based image correlation process and compared to 0.2-m accurate lidar data. Because the stereo-pairs and the Lidar data were acquired at different seasons with different planimetric resolutions, the compared elevations do not always exactly correspond to the same ground point and elevation; in fact the height, or a part, of the different surfaces (trees, buildings, etc.) is differently included in the elevation (Figure 8). The Linear Error with 68% level of confidence (LE68) was thus computed for the total overlap area and also for bare soils, where there are no height differences between the two compared ground-point elevations. Table 3. Impact of the self-calibration on sensor orientations for three Formosat2 RSI images with the RMS residuals at GCPs in the image space (x-column and y-row in pixel) and the improvement (in percentage)
Table 3 shows RMS errors of the sensor orientation of three RSI images: large improvements were obtained with the self-calibration, around 50% in the x-direction and more than 25% in the y-direction. Because the panchromatic CCDs are off-centred in y-axis of the focal plane results in radial symmetric distortion in both directions. These 1-2 pixel accurate results are mainly due to the 2-3 m error of the input data, which were not precise enough when compared to RSI sensor resolution (2 m). Even if sub-pixel accuracy could have been achieved, such as the results obtained using CCRS multi-sensor geometric model with other sensors (Toutin, 2003a), these 1.5-pixel accurate results are similar from previous experiments (2 pixels) (Chen et al., 2006). These results demonstrated (i) the range of the radial distortions to be around 1-2 and 0.5 resolutions of the RSI sensor in x-column and y-row directions respectively, and (ii) the necessity to apply lens calibration to achieve (sub-)pixel accuracy for the sensor orientation. Table 4 shows different results during the DEM generation process: improvements when using the self-calibration in the sensor orientation of three RSI images. Less mismatched areas are due to a better sensor orientation for generating the epipolar stereo-images. All elevations errors were also largely decreased with the self-calibration, and the largest improvement was for the bare soils comparison where no height difference occurred between the two compared elevations (stereo and lidar). Consequently without self calibration, the DEM error generated 8-m planimetric error on the ortho-image (Figure 1), which was acquired with 30º viewing angle, while the planimetric error was less than 3 m with self-calibration.
CONCLUDING REMARKS This review on sensor orientation and ortho-rectification of HR images has addressed different key issues: sources and modelling of geometric distortions during the image acquisition; types of delivered image-products; 3D physical end empirical modelling of these distortions and their applicability to the image-products; internal, relative and external sensor orientations; impacts of DEM and resampling kernel on the ortho-rectification. Applications were thus realized with stereo-images acquired on Quebec, Canada by the new Tawainese instrument RSI of Formosat2. Results show that accurate sensor orientation is a requisite to achieve sub-pixel accuracy in the geometric modelling, to stereo-extract 3-4 m accurate DEM and to generate 3-m accurate ortho-image. REFERENCES
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|