Using Programmable GPU Hardware for Increased Visual Effects and Photorealism in Large-Scale Terrain Visualization

Anupam Agrawal Indian Institute of Information Technology
Deoghat, Jhalwa,
Allahabad – 211 011.
anupam@iiita.ac.in
M. Radhakrishna Indian Institute of Information Technology
Deoghat, Jhalwa,
Allahabad – 211 011.
R.C. Joshi Dept. of Electronics & Computer Engg.,
Indian Institute of Technology,
Roorkee – 247 667.
Abstract:
Real-time photorealistic visualization of large terrain datasets is an active
area of research. We have developed an efficient hardware-accelerated view-dependent
level-of-detail rendering technique for out-of-core visualization of large terrain data. To
achieve photorealism in visualization, the corresponding satellite or aerial imagery
texture has been applied over the terrain geometry. Also, to display a large collection of
static objects such as buildings, trees, roads etc. over the terrain while maintaining the
real-time frame rate, efficient object handling methods have been developed. Considering
the recent developments in graphics hardware, we propose a technique that uses the
performance and the programmable vertex and fragment features of current GPUs for
advanced visual effects and increased photorealism in terrain visualization. The Cg-based
vertex and fragment programs are integrated with the OpenGL 3D API and Visual
C++ code and execute on the GPU of the computer. The above algorithms
have been successfully tested on different real-world height maps and satellite
phototextures of sizes up to 16K*16K coupled with thousands of static objects, polyline
vector overlays and special visual effects on a 2.4 GHz Pentium IV computer with 512 MB
RAM and Nvidia’s XFX GeForce 6800GT 256MB graphics card.
1. Introduction
Terrain visualization is widely used in 3D applications such as GIS, cartography,
games, flight simulation and virtual reality. The biggest problem in real-time
terrain visualization is how to manage the large-scale datasets, including geometry
(height map) and texture information. Multiresolution level-of-detail (LOD) models
provide a mechanism to represent and manipulate terrain data by optimizing the tradeoff
between complexity and accuracy of representation. Unfortunately, current LOD
algorithms [1,2,3] are highly CPU intensive, and the geometry throughput to the
graphics hardware remains a bottleneck. These algorithms try to produce an ‘optimal’
simplified mesh with the smallest number of triangles that has little perceptual difference
from the full mesh, and hence are computationally expensive. There is a need for a simple and
effective solution that takes advantage of the graphics boards or Graphics Processing Units
(GPUs) available in current consumer PCs and minimizes the CPU overhead.
To bring photorealism to the visualization, it is necessary to drape the corresponding
geospecific satellite or aerial phototexture over the 3D digital terrain, to place multiple
collections of point-location based static objects such as buildings, trees etc., and to
overlay polyline vector objects such as roads, district boundaries etc. on top of the terrain
surface. This further complicates the requirement of interactive frame rates during navigation
over the terrain.
For more realistic 3D visualization at close range, it is necessary to combine a detailed
geotypical texture such as grass or stone with the large geospecific texture. Furthermore, it is
useful to include some special effects such as fog and lighting. Adding fog has more than
one purpose. Besides giving the user the impression of actual fog, it can be used to
obscure objects in the distance and have them gradually become clearer as the viewer gets
closer. Graphics hardware has recently evolved from a fixed-function graphics pipeline to
a graphics pipeline with programmable vertex and fragment stages. High-level shading
languages such as Cg (from Nvidia), HLSL (from Microsoft) and GLSL (from OpenGL
ARB) provide higher-level abstraction for GPUs and make the GPU programs truly
portable. Developing algorithms and techniques that effectively use programmable
GPU capabilities for increased realism and visual effects, as well as for general-purpose
computation in real-time applications, is currently an active area of research.
This paper summarizes our research work towards development of efficient algorithms
for real-time 3D visualization of large textured terrain and object management over the
terrain on desktop PCs. It further elaborates on the proposed approach to exploit the
programmable graphics pipeline features of current GPUs for advanced visual effects (e.g.
fog) and increased photorealism (e.g. multitexturing for detailed texturing) in
visualization by developing suitable shader programs. The proposed methodology and
algorithms have been tested on several real-world large terrain data sets with real-time
rendering performance.
2. Rendering Massive Terrain and Object Data in Real-time
In subsection 2.1, we briefly discuss our methodology for handling and real-time
visualization of large terrain raster data consisting of terrain geometry (height map or grid
DEM) and geospecific satellite imagery texture. In subsection 2.2, we briefly describe
our methodology for displaying a large collection of point-location based objects
such as buildings, trees etc., as well as for overlaying polyline vector data on the 3D LOD
terrain while maintaining the real-time frame rate. References to the authors' earlier
published papers are provided for further details.
2.1 Hardware-accelerated LOD Rendering Technique for Large Terrains
With a multiresolution level-of-detail mesh simplification scheme, regions close to the
viewer are approximated more accurately than regions that are far away, such that the resulting
image shows no noticeable visual difference. We have developed a view-dependent
dynamic block-based LOD modeling scheme for mesh simplification and have used
corresponding tiled geospecific textures to display the details of the high-resolution
satellite imagery with real-time rendering [4]. The terrain geometry and texture data are
organized in tiles of size (2^n + 1) * (2^n + 1) and 2^n * 2^n respectively, where n = 8. A
one-pixel overlap is kept between adjacent geometry tiles to ensure proper stitching of tiles.
At any instant, only nine tiles are kept in main memory, and the viewer is assumed to be
inside the center tile. Further, both the height map and the texture image data are
preprocessed and stored at multiple resolutions to also support the fly-through mode. In the
walk-through mode, the system chooses the highest-resolution data. During the fly-through
mode, on the other hand, the system chooses the appropriate resolution based on the
viewer’s height above the terrain surface.
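The tile bookkeeping described above can be sketched as follows. This is only an illustrative sketch, not the TREND implementation: the names `residentTiles` and `resolutionLevel`, the clamping at the dataset border, and the rule of one coarser level per doubling of viewer height are all assumptions made for the example. Only the tile sizes (n = 8), the shared border samples and the nine-tile window come from the text.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Tile sizes from the text (n = 8). Adjacent geometry tiles share their
// border samples, so consecutive tiles start 2^n samples apart.
constexpr int kTileExp = 8;
constexpr int kTileStride = 1 << kTileExp;          // 2^n     = 256 (texture tile side)
constexpr int kGeometryTileSize = kTileStride + 1;  // 2^n + 1 = 257 (geometry tile side)

struct TileId {
    int col;
    int row;
};

// Returns the (up to nine) tiles of the 3x3 window centred on the tile that
// contains the viewer. Tiles outside the dataset are skipped (assumed policy).
std::vector<TileId> residentTiles(int viewerX, int viewerY, int tileCols, int tileRows) {
    assert(viewerX >= 0 && viewerY >= 0 && tileCols > 0 && tileRows > 0);
    // Clamp so that a viewer on the last border sample maps to the last tile.
    const int centerCol = std::min(viewerX / kTileStride, tileCols - 1);
    const int centerRow = std::min(viewerY / kTileStride, tileRows - 1);
    std::vector<TileId> tiles;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            const int col = centerCol + dx;
            const int row = centerRow + dy;
            if (col >= 0 && col < tileCols && row >= 0 && row < tileRows) {
                tiles.push_back({col, row});
            }
        }
    }
    return tiles;
}

// Picks a stored resolution level from the viewer's height above the terrain.
// Level 0 is the finest. Assumed rule: at or below baseHeight (walk-through)
// use level 0, then go one level coarser for each doubling of the height.
int resolutionLevel(double heightAboveTerrain, double baseHeight, int coarsestLevel) {
    assert(baseHeight > 0.0 && coarsestLevel >= 0);
    int level = 0;
    double limit = baseHeight;
    while (level < coarsestLevel && heightAboveTerrain > limit) {
        limit *= 2.0;
        ++level;
    }
    return level;
}
```

For a 16K*16K height map this gives a 64 x 64 tile grid; a viewer at sample (300, 300) keeps tiles (0..2, 0..2) resident, and a paging thread would load or evict tiles whenever the centre tile changes.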
Each geometry tile is organized as a quadtree whose leaves correspond to blocks
or patches of size (2^m + 1) * (2^m + 1) with m = 4, which speeds up view-frustum culling.
Instead of manipulating individual triangles, the algorithm operates on clusters of
geometry called blocks of aggregate triangles. Hence the amount of work the CPU must
perform is greatly reduced. Considering the multiresolution representation of each patch,
the algorithm employs a variable screen-space threshold (t) to limit the maximum error
of the projected image, taking into account the terrain complexity and the viewer’s position
as the viewer navigates the terrain. It is important to note that in a view-dependent framework,
the resolution of adjacent patches might change at every frame. Hence, cracks occur on the
borders of adjacent patches of different levels of detail. The geometry of the boundary
triangles of the higher-resolution patch is modified to produce crack-free joins. Further, the
formation of long triangle strips for LOD blocks addresses the CPU-to-card bandwidth
problem and avoids redundant 3D vertex T&L calculations [5].
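A screen-space error test of the kind described above can be sketched as follows. This uses the standard perspective projection of a world-space error to pixels and is not taken from [4] or [5]; the function names, the per-level error table and the camera parameters are hypothetical. The threshold is passed per call, so a caller can vary it with terrain complexity or viewer position.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Projects a world-space geometric error to an error in pixels for a
// perspective camera with vertical field of view fovYRadians.
double projectedErrorPixels(double worldError, double distance,
                            double fovYRadians, int viewportHeightPx) {
    assert(distance > 0.0 && fovYRadians > 0.0 && viewportHeightPx > 0);
    // Pixels covered by one world unit at unit distance from the eye.
    const double pixelsPerUnit = viewportHeightPx / (2.0 * std::tan(fovYRadians / 2.0));
    return worldError * pixelsPerUnit / distance;
}

// levelErrors[i] is the maximum world-space error of LOD level i of a patch.
// Level 0 is the full-resolution patch (error 0) and the errors are assumed
// to be non-decreasing towards coarser levels.
// Returns the coarsest level whose projected error stays within thresholdPx.
int selectPatchLevel(const std::vector<double>& levelErrors, double distance,
                     double fovYRadians, int viewportHeightPx, double thresholdPx) {
    assert(!levelErrors.empty());
    int chosen = 0;
    for (int level = 0; level < static_cast<int>(levelErrors.size()); ++level) {
        const double errorPx =
            projectedErrorPixels(levelErrors[level], distance, fovYRadians, viewportHeightPx);
        if (errorPx > thresholdPx) {
            break;  // this level and all coarser ones exceed the threshold
        }
        chosen = level;
    }
    return chosen;
}
```

With m = 4, a 17 * 17 patch has up to five levels (17, 9, 5, 3 and 2 samples per side), so the table holds five error values per patch. Neighbouring patches that end up at different levels still need the boundary fix described above.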
2.2 Object Management over Digital Terrain
To display point-location based objects, an efficient object handling method has been
developed using a paging technique and object instantiation [6]. We have considered two
categories of static objects. The first category is simple objects, which have simple geometry
and can be drawn with the help of OpenGL primitive functions. These objects do not
need to be loaded into memory. Examples of simple objects include skyscraper
buildings drawn as an elongated cube with texture mapping over its exposed faces, and
objects created using the billboarding technique. The second category is complex
objects (drawn using the AutoCAD 3D modeler), which have complex geometry and a large number
of triangles. These objects must be loaded into main memory together with their
vertex and topology information. Examples of such objects include complex 3D
geometrical models of buildings, trees etc. The user is allowed to control the locations,
scales and orientations of the objects.
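Object instantiation with user-controlled location, scale and orientation can be illustrated with the sketch below. It is a minimal example under stated assumptions, not the method of [6]: the `Model` and `ObjectInstance` layouts are hypothetical, the terrain is assumed to use a z-up coordinate system, and the orientation is reduced to a single heading angle about the vertical axis.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 {
    double x, y, z;
};

// A complex model is loaded once and shared by all of its instances.
struct Model {
    std::vector<Vec3> vertices;  // model-space vertices
};

// Per-instance placement data; many instances can refer to one model.
struct ObjectInstance {
    int modelIndex;         // index into the shared model table
    Vec3 position;          // point location on the terrain (z is up)
    double scale;           // uniform scale factor
    double headingRadians;  // rotation about the vertical (z) axis
};

// Transforms a model-space vertex to world space: scale, rotate, translate.
Vec3 toWorld(const ObjectInstance& inst, const Vec3& v) {
    assert(inst.scale > 0.0);
    const double c = std::cos(inst.headingRadians);
    const double s = std::sin(inst.headingRadians);
    const double sx = v.x * inst.scale;
    const double sy = v.y * inst.scale;
    const double sz = v.z * inst.scale;
    return {inst.position.x + c * sx - s * sy,
            inst.position.y + s * sx + c * sy,
            inst.position.z + sz};
}

// World-space vertices of one instance of a shared model.
std::vector<Vec3> worldVertices(const Model& model, const ObjectInstance& inst) {
    std::vector<Vec3> out;
    out.reserve(model.vertices.size());
    for (const Vec3& v : model.vertices) {
        out.push_back(toWorld(inst, v));
    }
    return out;
}
```

In an OpenGL renderer the same placement would normally be expressed with the model-view matrix rather than on the CPU, so each shared model is stored once however many instances are paged in.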
Display of polyline vector data over multiresolution 3D terrain has been accomplished
using an efficient geometry-based mapping approach [7]. Our algorithm for
multiresolution modeling of vector data allows the system to adapt the visual mapping
to the context and user needs without rendering artifacts, while maintaining interactive
frame rates.
3. GPU Programming for Increased Visual Effects and Photorealism
In recent years, the performance of graphics hardware has increased more rapidly than
that of CPUs. CPU designs are optimized for high performance on sequential code, and it
is becoming increasingly difficult to use additional transistors to improve performance on
this code. In contrast, programmable graphics hardware is optimized for highly parallel
vertex and fragment shading code. As a result, GPUs can use additional transistors much
more effectively than CPUs, and thus sustain a greater rate of performance improvement
as semiconductor fabrication technology advances.
Modern GPUs provide both vertex-level and pixel-level programmability. Such
capabilities permit increasingly realistic and accurate visuals to be rendered on
consumer-level hardware, far beyond what traditional fixed-function rendering pipelines can
provide. Under a fixed-function pipeline, data can be sent to the pipeline and pipeline
states can be set, but the vertex and fragment-processing stages cannot be altered
directly. A common example is the lighting model used by most 3D pipelines. Both
DirectX and OpenGL use a Phong-based lighting formula, as it can be easily computed on
a per-vertex basis. Even though many other lighting models exist, such as Global
Illumination and BRDF [8], developers were restricted to the one available model.
Similar restrictions existed throughout the pipeline.
A programmable pipeline allows for complete control over specific stages of the pipeline
through the creation of shaders: small snippets of code that, when compiled, dynamically
replace the standard vendor-supplied implementation of a particular stage. Vertex
shaders are programs that perform the vertex and normal transformations, texture
coordinate generation, and per-vertex lighting computations normally computed in the
geometry processing stage. Fragment or pixel shaders are programs that perform the
computations in the pixel processing stage of the graphics pipeline and determine exactly
how each pixel is shaded, how textures are applied, and if a pixel should be drawn or not.
A vertex program is executed on the vertices before primitive assembly, while a fragment
program is executed after rasterization. These small shader programs are sent to the
graphics hardware from the user program (see Figure 1), but they are executed on the
graphics hardware.
 Figure 1: Programmable Graphics Pipeline
[Figure 1 labels: 3D Application (CPU); 3D API: OpenGL or Direct3D; GPU: GPU front end,
Vertex processor, Primitive assembly, Rasterization, Fragment processor, Raster operations,
Frame buffer; Shaders]
Until recently, most GPU programs were written in assembly language. As the number of
assembly commands and available features increased, it became clear that
assembly-like languages would prove inadequate as shaders continued to grow in size
and complexity. Presently, there are two primary graphics APIs used in real-time 3D
graphics: Microsoft's proprietary DirectX, and OpenGL, maintained by the Architecture
Review Board (ARB). Each API currently provides a high-level shading language, HLSL
and GLSL respectively, with the graphics hardware vendor NVIDIA providing a third, Cg,
which can be used with either API. These high-level shading languages are designed around a
familiar framework and syntax that closely resembles a subset of the C programming
language. However, in contrast to conventional programming languages, these shading
languages are based on a data-flow computational model and explicitly exploit the stream
aspect (Single-Instruction Multiple-Data or SIMD control) of GPUs. In our research
work, we have explored a Cg-based GPU programming approach [9], together with the
OpenGL 3D API [10], for multitexturing as well as for fog effects. The Cg language combines
a number of features from both of the other two shading languages. Through
the use of an external runtime translator, Cg’s string-based vertex and fragment shaders
are translated to either DirectX or OpenGL assembly language programs that are then run
through the respective API’s shader assemblers. Figure 2 details this approach.
 Figure 2: NVIDIA ‘Cg’ Compilation and Execution Process
This cross API support is accomplished through the use of translation “profiles” that
allow for the shaders to target a wide variety of hardware, including the standard DirectX
shader levels, OpenGL’s shading extensions, and specific NVIDIA hardware features.
When a shader is sent to the translator the user specifies a profile flag and the translator
will attempt to generate code that runs on hardware with the corresponding level of
physical support.
We have developed Cg-based vertex and fragment programs, which are integrated with
the OpenGL 3D API and Visual C++ code and execute on the GPU of the
computer. The GPU programming code consists of three parts: initialization and setup,
the vertex program and the fragment program. The first part initializes the variables that
provide a connection between variables in the main program and variables in the shaders.
The vertex program computes a fog exponent from the shortest distance to the eye and
the uniform ‘fogDensity’ parameter. Finally, the fragment program samples a decal
texture, modulates the decal color with the interpolated color, and fogs the textured
fragment color using the interpolated fog exponent and a uniform fog color. The
results are shown in the next section.
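The arithmetic of the two shader stages described above can be traced with a CPU-side reference in C++. This is a sketch under assumptions, not the authors' Cg source: it assumes the common exponential fog form, in which the fog factor is 2 raised to the negative fog exponent, and the names `fogExponent`, `shadeFragment`, `modulate` and `mix` are chosen for the example. Only the `fogDensity` parameter and the sample, modulate and fog sequence come from the text.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 {
    double x, y, z;
};

inline double length(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Vertex stage: fog exponent from the eye-space distance of the vertex and
// the uniform fogDensity. The value is then interpolated across the triangle.
inline double fogExponent(const Vec3& eyeSpacePosition, double fogDensity) {
    assert(fogDensity >= 0.0);
    return length(eyeSpacePosition) * fogDensity;
}

// Component-wise product, used to modulate the decal with the vertex color.
inline Vec3 modulate(const Vec3& a, const Vec3& b) {
    return {a.x * b.x, a.y * b.y, a.z * b.z};
}

// Linear interpolation: returns a when t = 0 and b when t = 1.
inline Vec3 mix(const Vec3& a, const Vec3& b, double t) {
    return {a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t};
}

// Fragment stage: decal is the sampled texture color. The fog factor is 1
// for an unfogged fragment and approaches 0 far from the eye (assumed form).
Vec3 shadeFragment(const Vec3& decal, const Vec3& interpolatedColor,
                   double interpolatedFogExponent, const Vec3& fogColor) {
    const double fogFactor = std::exp2(-std::fabs(interpolatedFogExponent));
    const Vec3 textured = modulate(decal, interpolatedColor);
    return mix(fogColor, textured, fogFactor);
}
```

In the GPU version the first function corresponds to the vertex program and the second to the fragment program; the initialization part would bind `fogDensity` and the fog color from the C++ application to the corresponding uniform shader parameters.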
4. Results and Performance Analysis
The software TREND (acronym for Terrain Rendering) was developed in Microsoft
Visual C++ using the OpenGL 3D API libraries for a Win32 environment. The Cg
language has been used for GPU programming. We have tested the software with a 4K*2K
terrain raster dataset of the Grand Canyon and a 16K*16K terrain dataset of the Puget Sound
area obtained from the Georgia Institute of Technology website. We have also generated the
height map of the Dehradun (India) area using contours digitized from a topographic map
supplied by the Survey of India (SOI). The corresponding geo-referenced IRS-1D FCC satellite
imagery has been used for image draping.
The images in Figure 3(a) and Figure 3(b) show the height map (DEM) of the Grand Canyon
area and the corresponding FCC satellite imagery respectively. Figures 4(a) and 4(b) show
the overlay of polyline vector data on top of the LOD phototextured view of the terrain without
and with multiresolution modeling of the polyline vector data respectively. Figure 4(c) shows
a view with point-location based simple objects (OpenGL and billboard drawn objects)
over phototextured 3D terrain. A complex object (a building designed using AutoCAD) is
shown over the terrain in Figure 4(d). Figure 4(e) shows the detailed geotypical grass
texture. Figure 4(f) shows the effect of fog and multitexturing (used for combining
the texture of grass with the satellite image texture) to increase photorealism in the scene.
 (a) DEM (Height Map) (b) Corresponding FCC Satellite Imagery
Figure 3: Terrain Raster data of Grand Canyon area (size 4K*2K)
 Figure 4: Display of vector polyline and point-location based objects
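The multitexturing used for Figure 4(f) combines a geotypical detail texture with the geospecific satellite texture. The paper does not give the blend equation, so the sketch below is purely an assumed example of one common choice: the detail texel modulates the base texel with a gain of 2 (so mid-grey detail leaves the base unchanged), and the detail contribution fades out linearly with distance. The names `combineTextures`, `detailWeight` and `fadeDistance` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

struct Rgb {
    double r, g, b;  // color channels in [0, 1]
};

inline double clamp01(double v) {
    return std::min(1.0, std::max(0.0, v));
}

// Weight of the detail texture: 1 at the eye, 0 at fadeDistance and beyond.
inline double detailWeight(double distance, double fadeDistance) {
    assert(fadeDistance > 0.0);
    return clamp01(1.0 - distance / fadeDistance);
}

// Blends one channel: the gain moves from 1 (no detail) to 2 * detail.
inline double blendChannel(double base, double detail, double weight) {
    const double gain = (1.0 - weight) + weight * 2.0 * detail;
    return clamp01(base * gain);
}

// Combines a geospecific (satellite) texel with a geotypical (detail) texel.
Rgb combineTextures(const Rgb& geospecific, const Rgb& geotypical,
                    double distance, double fadeDistance) {
    const double w = detailWeight(distance, fadeDistance);
    return {blendChannel(geospecific.r, geotypical.r, w),
            blendChannel(geospecific.g, geotypical.g, w),
            blendChannel(geospecific.b, geotypical.b, w)};
}
```

In a fragment program the two texels would come from two texture units, with the detail texture coordinates tiled many times across each terrain tile so that the grass pattern stays sharp at close range.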
The performance of the software has been evaluated on a 2.4 GHz Pentium IV computer
with 512 MB RAM and an Intel 82865G onboard graphics controller on an 865GL motherboard.
Table 1 shows the performance analysis of the adaptive LOD algorithm for 3D visualization
of the raster dataset. The results of testing the same adaptive LOD algorithm using
triangle strips (with indexed vertex arrays) in conjunction with the object management
algorithm are shown in Table 2. Further, Nvidia’s XFX GeForce 6800GT 256MB
programmable graphics card was used to test the vertex and fragment programs for
increased visual effects and photorealism by incorporating the multitexturing and fog
effect features.
Table 1: Performance Analysis of the Adaptive LOD Algorithm (without Objects & Vector Data)
Table 2: Performance Analysis of the Adaptive LOD Algorithm (with Objects & Vector Data)
5. Conclusions and Future Work
The main objective of our multiresolution terrain rendering research has been to create
photorealistic visual simulations of very large terrains on inexpensive desktop PCs. The
rendering must be done in real time, ensuring frame rates above 30 frames per second. To
achieve this objective, we have developed an efficient tile-based out-of-core view-dependent
LOD mesh simplification algorithm for real-time rendering of large terrains. It
is augmented with an efficient object-handling algorithm to display a large collection of
static objects such as buildings, trees, roads etc. over the terrain while maintaining the
real-time frame rate.
To achieve advanced visual effects and improved photorealism in 3D terrain
visualization, we have proposed to exploit the programmable graphics pipeline features of
current GPUs by developing suitable vertex and fragment shader programs. We are currently
investigating several other visual effects, such as environment mapping and
animation, through GPU programming, as well as off-loading some general-purpose
computation from the CPU to the GPU, considering its stream data processing model, for
performance efficiency. Programmable processing on GPUs is the future of Graphics,
Visual and Cinematic Computing.
References
[1] M. Hesse and M.L. Gavrilova, “An Efficient Algorithm for Real-Time 3D Terrain
Walkthrough,” International Journal of CAD/CAM, Vol. 3, No. 2, pp. 111-117, 2003.
[2] M. Duchaineau et al., “ROAMing Terrain: Real-time Optimally Adapting Meshes,”
Proceedings IEEE Visualization ‘97, 1997, pp. 81-88.
[3] P. Lindstrom et al., “Real-Time, Continuous Level of Detail Rendering of Height
Fields,” Proceedings ACM SIGGRAPH ‘96, Aug 1996, pp. 109-118.
[4] Anupam Agrawal et al., “Dynamic Multiresolution Level-of-Detail Mesh
Simplification for Real-time Rendering of Large Digital Terrain Models,” Proceedings of
the IEEE INDICON-2004, IIT Kharagpur, India, Dec. 20-22, 2004, pp. 278-282.
[5] Anupam Agrawal et al., “An Approach to Improve Rendering Performance of Large
Multiresolution Phototextured Terrain Models using Efficient Triangle Strip Generation,”
Paper presented at IEEE IGARSS-2005 Conference held in Seoul, Korea, Jul. 25-29,
2005 (Proceedings on DVD).
[6] Anupam Agrawal et al., “Out-of-Core Construction and 3D Visualization of Level-of-
Detail Terrains Populated with Large Collection of Heterogeneous Objects,” Paper
accepted for International Conference on Computer Graphics Theory and Applications
(GRAPP-2006), 25 - 28 February, 2006, Setúbal, Portugal.
[7] Anupam Agrawal et al., “Geometry-based Mapping and Rendering of Vector Data
over LOD Phototextured 3D Terrain Models,” Paper accepted for 14th International
Conf. in Central Europe on Computer Graphics, Visualization and Computer Vision
(WSCG’2006), Jan 30 – Feb 03, 2006, Plzen, Czech Republic.
[8] Wolfgang Engel, “ShaderX3: Advanced Rendering with DirectX and OpenGL,”
Charles River Media, Nov. 2004.
[9] R. Fernando and M.J. Kilgard, “The Cg Tutorial: The Definitive Guide to
Programmable Real-Time Graphics,” Addison Wesley, 2003.
[10] Mason Woo et al., “OpenGL Programming Guide: The Official Guide to Learning
OpenGL, Ver 1.2, Third Ed.,” Addison-Wesley, 2000.