




The use of Artificial Neural Networks and GIS for flood modeling


Purnama B Santosa
Department of Geodetic Engineering
Faculty of Engineering, Gadjah Mada University
Yogyakarta, Indonesia


Abstract
This research deals with flood prediction modeling using Artificial Neural Networks (ANNs) and Geographic Information Systems (GIS). It addresses two problems: filling gaps in an incomplete river discharge record, and evaluating the GIS-based hydraulic flood modeling process.

In this research, river discharges of the Mitchell River in Gippsland, Australia, were used as the main input data for flood simulations. Because the river discharge record was incomplete, ANNs were applied to predict the data missing for certain periods. Once the missing river discharge data had been filled, the return period floods of 10, 25, 50, and 100 years could be calculated. These return period floods were then used as input data for hydraulic flood modeling. The geometric input data required to run the hydraulic flood model were developed in a GIS.

The results show that ANNs can be successfully used to predict river flows. Results from hydraulic modeling show only slight differences in water surface elevation among the four flood profiles (10, 25, 50, and 100 year return period floods). One possible cause is the quality of the cross sections used in the model: cross sections extracted from a topographic map at scale 1:25,000 are not accurate enough to represent terrain conditions.

1. Introduction
Because of their devastating nature, floods pose serious hazards to human populations in many parts of the world, and the economic damage from floods has increased considerably. Droegemeier et al. (2000) state that among the many natural disasters that disrupt human and industrial activity in the United States each year, including tornadoes, hurricanes, extreme temperatures, and lightning, floods are among the most devastating and rank second highest in causing loss of life.

Because floods have such serious consequences, predicting and evaluating the floods that may occur in a particular area is important for reducing potential damage. This process is not trivial, since it involves complex methods and requires a lot of data. Consequently, flood modeling is very difficult to perform in the traditional manual way, that is, without computers and based on non-digital data. This research builds a flood simulation model using ANNs and GIS-based hydraulics. ANNs are applied to develop an upstream river flow-downstream river flow relationship from historic flow data, specifically to predict missing daily river flow data for certain periods. The GIS-based hydraulic model is applied to simulate water surface profiles along river channels, as well as for floodplain mapping and visualization.

The problem to be solved is the need for more data, which is met by predicting and filling in the missing daily river flow data, especially the data before 24 April 1991. The approach is to apply ANNs to develop a model of rainfall-runoff and runoff-runoff relationships from historic rainfall and runoff data. Once the missing daily river flow data at station 224219 have been filled, T-year return period floods can be calculated. The results are then used as the basis for the GIS-based hydraulic modeling environment to determine water surface profiles at specific locations along the stream network.

2. Basic Principles

2.1. Artificial Neural Networks

Artificial Neural Networks (ANNs) are systems or mathematical models that work in a way that imitates the human brain (Lin and Lee, 1996; Thurston, 2002). They solve problems in a manner that resembles human intelligence: ANNs learn by example to extract the information contained in a data set. The neural network model resembles the network of synapses in the human brain, consisting of a series of processing units that are collectively connected (Thurston, 2002).

ANNs have been touted as the future wave in computing (Anderson and McNeill, 1992). They differ from other systems in an important way. Some traditional AI and statistical solutions rely on a priori information to solve problems. ANNs, by contrast, work through self-learning mechanisms and do not need a priori assumptions to solve a problem, nor do they require the traditional skills of a programmer. They learn any regularities or patterns that may exist in the available data set to form a relationship.

One of the aspects that differentiates one neural network from another is its architecture. The architecture represents the pattern of connection between nodes, the method of determining the connection weights, and the activation function (Haykin, 1999). Neural network architectures can be classified in two ways: by the number of layers (single-layer, bi-layer, and multi-layer), and by the direction of information flow and processing (feed-forward and recurrent). One network that performs well for input-output function approximation, such as forecasting, is the multi-layer perceptron (MLP). This is a multi-layer feed-forward network trained with a back-propagation learning algorithm. A typical multi-layer feed-forward network (as seen in Figure 1) has input layers, hidden layers, and output layers (Lin and Lee, 1996). The input layers receive the inputs and buffer the input signals. The signals from the input layers are then transmitted to the hidden layers, specifically to the hidden neurons or hidden units. The function of the hidden neurons is to intervene between the external input and the network output in some useful manner.



Figure 1. Multi-layer feed-forward network with one hidden layer and one output layer (Haykin, 1999).


The processing elements in each layer are called nodes or units. Each of the nodes is connected to the nodes of neighboring layers. The parameters associated with each of these connections are called weights. The architecture of a typical node (in the hidden or output layer) is shown in Figure 2.



Figure 2. Nonlinear model of a node (Haykin, 1999)


Specifically, a signal xj at the input of synapse j connected to neuron k is multiplied by the synaptic weight wkj. The adder sums the input signals, each weighted by the respective synapse of the neuron. The activation function limits the amplitude of the neuron's output to a finite range.
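
The node model described above can be sketched in a few lines of Python. This is a minimal illustration, not the network used in the study: the logistic (sigmoid) activation, the bias term, and the example input values are all assumptions made here for demonstration.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Output of one node: weighted sum of inputs passed through an activation.

    The logistic activation and the bias term are assumptions for this sketch;
    the source does not state which activation function was used.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Adder: each input signal x_j is multiplied by its synaptic weight w_kj.
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation: limits the output amplitude to the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-v))


# Hypothetical input signals and weights, for illustration only.
y = neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, 0.1])
print(y)
```

Whatever the weighted sum, the output stays between 0 and 1, which is the amplitude-limiting role of the activation function.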

2.2. Water Surface Profile Calculations
The US Army Corps of Engineers (2002) describes the basic steady-flow water surface profile calculations. Water surface profiles are computed from one cross section to the next by solving the energy equation with an iterative procedure called the standard step method. The energy equation is written as follows:
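
In the notation of the cited HEC-RAS manual, the energy balance between two adjacent cross sections (subscripts 1 and 2) is commonly stated in the form below. The symbol names follow that reference and are not defined elsewhere in this paper: Z is the elevation of the main channel invert, Y is the water depth at the cross section, V is the average velocity, a is the velocity weighting coefficient, g is the gravitational acceleration, and h_e is the energy head loss between the two sections.

```latex
Z_2 + Y_2 + \frac{a_2 V_2^2}{2g} = Z_1 + Y_1 + \frac{a_1 V_1^2}{2g} + h_e
```

The standard step method iterates on the unknown water surface elevation at one section until both sides of this balance agree within a tolerance.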





3. Procedure

3.1. Methods

A two-stage approach was employed to predict the areas likely to be inundated by river floods. In the first stage, the ANN was applied to perform hydrologic modeling, developing the relationship between upstream and downstream river flows from historic river flow data. In particular, the ANN was used to fill in the missing daily river flow data at station 224219 on the Mitchell River from 1978 until 1990 and from 1998 until 2000. These predicted daily river flow data were then used to calculate return period floods at station 224219 using the Gumbel method.
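
A return period flood calculation of this kind can be sketched as follows. This is a minimal example under stated assumptions: it uses the frequency-factor form of the Gumbel (extreme value type I) distribution fitted by the method of moments, which may differ from the exact variant used in the study, and the annual maximum flows are hypothetical placeholders, not data from station 224219.

```python
import math
import statistics


def gumbel_return_flood(annual_maxima, return_period):
    """Estimate the T-year flood as mean + K_T * standard deviation.

    K_T is the Gumbel frequency factor in its large-sample form
    (an assumption; small-sample corrections are not applied here).
    """
    if return_period <= 1:
        raise ValueError("return_period must be greater than 1")
    mean = statistics.mean(annual_maxima)
    std = statistics.stdev(annual_maxima)  # sample standard deviation
    # 0.5772 is the Euler-Mascheroni constant, rounded.
    k = -(math.sqrt(6) / math.pi) * (
        0.5772 + math.log(math.log(return_period / (return_period - 1)))
    )
    return mean + k * std


# Hypothetical annual maximum flows in m3/s, for illustration only.
annual_maxima = [320.0, 410.0, 275.0, 560.0, 390.0, 480.0, 300.0, 620.0]
floods = {t: gumbel_return_flood(annual_maxima, t) for t in (10, 25, 50, 100)}
for t, q in floods.items():
    print(f"{t}-year flood: {q:.1f} m3/s")
```

The estimate grows with the logarithm of the return period, so each step from 10 to 25 to 50 to 100 years adds a progressively smaller relative increase in discharge.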

In the second stage, spatial data were prepared in the GIS environment to create the spatial input data for flood calculation. Together with the predicted river flow data, the GIS data were then processed in a GIS-based hydraulic modeling environment to determine water surface profiles at specific locations along the stream network. The simulation was based on the four return period floods calculated earlier, namely the 10 year, 25 year, 50 year, and 100 year return period floods. Finally, water surface data were generated to yield flood maps of the areas inundated at each water level.

3.2. Case Study Site
The Mitchell River was chosen as the case study area. It is situated in the Mitchell catchment in East Gippsland, Victoria, Australia. It is the major river in the basin, formed by many tributaries that combine in the upper reaches of the catchment. The catchment lies between longitude 146.7° E and 147.7° E and between latitude 37° S and 38° S, and has an approximate area of 4700 km². Around 75% of the catchment is mountainous, mostly on the northern side. The remaining 25% or so is mostly lowland on the southern side. Two of the biggest cities in the catchment, Glenaladale and Bairnsdale, are located in this lowland area, as are most of the settlements and the farming and horticultural activities.

There are three meteorological stations within the catchment, stations 84012, 85050, and 85279, which measure daily rainfall. There are at least seven discharge gauging stations in the basin which measure daily river discharges. Stations 224201 and 224206 are located on the Wonnangatta River. Stations 224203, 224218, and 224219 are on the Mitchell River. Station 224213 is located on the Dargo River, and station 224214 is on the Wentworth River. The details of both the rainfall and discharge stations are presented in the attachments.

4. Results

4.1. River Flow Prediction Results

In the selection of a data normalization method, Method 3, which uses activation function values ranging from 0.1 to 0.9, yields by far the best results. Across the seven tests conducted, the average RMS error of this method was 120.21, much lower than that of the other two methods.
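
One common way to obtain values in the range 0.1 to 0.9 is linear min-max scaling, sketched below. The source does not spell out how Method 3 was implemented, so the formula, the function names, and the sample flows are assumptions for illustration only.

```python
def scale_to_range(values, low=0.1, high=0.9):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        raise ValueError("values must not all be identical")
    scaled = [low + (high - low) * (v - v_min) / span for v in values]
    # v_min and v_max are returned so predictions can be converted back later.
    return scaled, v_min, v_max


def unscale(scaled, v_min, v_max, low=0.1, high=0.9):
    """Invert scale_to_range to recover values in the original units."""
    return [v_min + (s - low) * (v_max - v_min) / (high - low) for s in scaled]


# Hypothetical daily flows in m3/s, for illustration only.
flows = [12.0, 48.5, 150.0, 980.0, 33.2]
scaled, f_min, f_max = scale_to_range(flows)
restored = unscale(scaled, f_min, f_max)
print(scaled)
```

Keeping the targets away from 0 and 1 avoids asking a sigmoid-type output node to reach its asymptotes, which is a frequent motivation for this range.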

A network architecture selection process was then conducted to find the most suitable architecture, in particular the best number of hidden nodes. Seven tests were conducted on five different network architectures, with 3, 5, 7, 9, and 11 hidden nodes. All of these networks have one hidden layer and one output layer. The results show that the network with 7 hidden nodes yields the best predictions, as indicated by the minimum values of total, average, and standard deviation of RMS error for both training and verification results. It is followed by the network with 9 hidden nodes, whereas the network with 3 hidden nodes shows the worst results.
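
The comparison described above can be sketched as an RMS error function plus a selection step. This is a simplified illustration that ranks architectures by average RMS error only, whereas the study also considered the total and standard deviation; the error values in the dictionary are hypothetical placeholders, not the figures from Table 3.

```python
import math


def rms_error(observed, predicted):
    """Root mean square error between two equally long series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def best_architecture(errors_by_nodes):
    """Return the hidden-node count with the lowest average RMS error."""
    return min(
        errors_by_nodes,
        key=lambda n: sum(errors_by_nodes[n]) / len(errors_by_nodes[n]),
    )


# Hypothetical RMS errors per test for each hidden-node count.
errors_by_nodes = {
    3: [140.0, 155.0, 150.0],
    5: [130.0, 128.0, 135.0],
    7: [118.0, 121.0, 119.0],
}
print(best_architecture(errors_by_nodes))
```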

The training results of the network are shown in Figure 3, which presents the observed river flows and the predicted river flows obtained from the network training process as time series plots. As can be seen, the predicted values are, in general, in good agreement with the observed ones. The extreme (very low and very high) values are well forecast.



Figure 3. The observed and predicted daily river flow data from the training process.


The good agreement between the observed and forecast series can also be seen in the scatter diagram shown in Figure 4. The quality of the training results is indicated by the coefficient of determination (R²), which measures the strength of the relationship between the observed and the predicted river flows.



Figure 4. Scatter plot of observed and predicted river flows for training data.
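
A coefficient of determination for such a scatter plot can be computed as sketched below. The definition used here, one minus the ratio of residual to total sum of squares, is an assumption; the study may instead have reported the squared correlation of a fitted trend line. The sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values must not all be identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted flows in m3/s, for illustration only.
observed = [20.0, 35.0, 120.0, 410.0, 60.0]
predicted = [24.0, 30.0, 131.0, 395.0, 66.0]
print(r_squared(observed, predicted))
```

A value close to 1 means the predictions explain nearly all of the variance in the observed flows, while a value of 0 means they do no better than always predicting the observed mean.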


Once the network had been trained, it was ready to be used for predicting river flows. It was then used to predict the missing river flow data from 1978 until 1990 and from 1998 until 2000. This was achieved by feeding the network with rainfall data from three rainfall stations and river flow data from five river gauge stations recorded during those periods. For the purpose of calculating return period floods, the river flow prediction was made on an annual basis. One of the plots of the predicted river flows is presented in Figure 5.



Figure 5. Predicted daily river flows at station 224219 for the year 1979.


4.2. Flood Maps
The flood maps created in this study are aimed at predicting the areas that are susceptible to inundation by four probable flood events. These four flood events are the 10 year, 25 year, 50 year, and 100 year return period floods. One of the maps is shown in Figure 6.



Figure 6. Map of 100 years flood probability.


Table 1 below shows the differences in flood heights between the four flood events. For the upper reach, the maximum increases in water surface elevation between profiles are only around 0.87 m, with average increases ranging from 0.258 m to 0.302 m. Considering that the terrain of the upper reach is mostly steep and mountainous, it is reasonable to assume that such slight increases in water surface elevation do not have much impact on the extent of the flooded areas. For the lower reach, on the other hand, the maximum increases in water surface elevation between profiles are bigger than those of the upper reach, ranging from 1.1 m to 1.91 m, with average increases ranging from 0.765 m to 1.314 m.

Table 1. Water surface differences between profiles.



PF1 = Profile 1 (10 years return period flood)
PF2 = Profile 2 (25 years return period flood)
PF3 = Profile 3 (50 years return period flood)
PF4 = Profile 4 (100 years return period flood)

5. Conclusion
For the flow prediction modeling using ANNs, the results show that the neural network model achieved an acceptable fit to its training and verification data sets, indicating that it is possible to predict river flows with reasonable accuracy. This demonstrates the strength of ANNs in predicting river flows: a process that is difficult to perform using traditional statistical techniques could be achieved quite easily with this method.

The use of GIS to support hydraulic modeling software (HEC-RAS) for flood modeling has shown significant advantages. In data preparation, GIS has shown its power in the data development process. For results visualization, it was found to be very beneficial in generating floodplain maps as well as in analyzing the impact of the floods. Since the quality of the modeling depends to a large degree on the quality of the topographic maps used to generate cross section data, the use of more accurate topographic data is highly recommended. The slight differences in water surface elevation between the four flood events suggest that the terrain model used in this study was inaccurate.

6. References
  • Anderson, D., & McNeill, G. (1992). Artificial neural networks technology. A DACS State-of-the-Art Report. New York: Kaman Sciences Corporation.
  • Droegemeier, K., Smith, J. D., Businger, S., Doswell III, C., Doyle, J., Duffy, C., et al. (2000, November). Hydrological aspects of weather prediction and flood warning: report of the ninth prospectus development team of the U. S. weather research program. Bulletin of the American Meteorological Society, 81(11), pp. 2665.
  • Haykin, S. (1999). Neural networks: A comprehensive foundation. 2nd ed. New Jersey: Prentice Hall Inc.
  • Lin, C. T., & Lee, C. S. G. (1996). Neural fuzzy systems. New Jersey: Prentice Hall Inc.
  • Thurston, J. (2002). GIS & artificial neural networks: Does your GIS think? Retrieved April 5, 2003, from: http://www.integralgis.com/articles/Neural%20Networks.pdf
  • US Army Corps of Engineers. (2002). HEC-RAS, River analysis system, Hydraulic reference manual version 3.1. CA, USA: Hydrologic Engineering Centre.

7. Attachments




Figure 7. Study site location: Mitchell River catchment.


Table 2. Variation of RMS error of training data resulting from different normalization methods (scaled by 10 )



Table 3. Variation of RMS error of training data resulting from different numbers of hidden nodes (m3/s)



Table 4. Maximum flow of observed and predicted river flow at station 224219.



Table 5. Differences in area between profiles (m2).


