Drought Assessment in North-East Thailand using Correlation and Regression analysis of NDVI and Rainfall

W. Thavorntam1 and C. Mongkolsawat1
1 Center of Geoinformatics for the Development of Northeast Thailand, Khon Kaen University
Email: watsao@kku.ac.th, charat@kku.ac.th
Abstract
North-East Thailand is frequently subject to drought due to erratic rainfall distribution patterns, critical dry periods within the rainy season, and low soil water holding capacities.
The Normalized Difference Vegetation Index (NDVI) is a powerful indicator for monitoring vegetation cover over wide areas, so that the frequent occurrence and persistence of droughts can be detected. The objectives of this study were to assess drought events and drought severity in North-East Thailand by correlation and regression analysis of NDVI and rainfall. The Standardized Precipitation Index (SPI) was used to examine the severity and spatial pattern of droughts in the region. SPI is a meteorological drought index based on the probability of rainfall deviating from the median of the long-term rainfall record. Monthly rainfall data for the period 1976-2004 were collected from 308 rainfall stations distributed across the region and digitally encoded into a GIS database. Mean annual rainfall maps for the entire region were generated through spatial interpolation of station data based on kriging. Station-based 3-month, 6-month, and annual SPI values were similarly kriged to assess spatial patterns of drought severity. The correlations of changes in vegetation greenness with rainfall and SPI each year were examined through regression analysis. The results indicated that inter-annual variability in rainfall was greatest in the drier southwest areas, where a recent 5-year trend of decreasing rainfall was found. The SPI analysis revealed high and moderate drought risk areas in the southwest, extending to the northwest part of the region, while lower drought risk areas were found in the northern and northeastern parts of the region, along the Mekong River. A high correlation between NDVI and rainfall was found in areas of high vegetation cover, which could be used to identify drought events in the region.
1. Introduction
Drought has increasingly affected the northeastern part of Thailand even though the total amount of rainfall is relatively high. Water deficiency for agriculture, the major economic sector in the region, is the most serious consequence of drought. The rainy season in Northeast Thailand runs from May through October and is controlled by the Asian monsoon, with two distinct rainfall periods, one from the southwest in the early wet season and the other from the northeast in the latter part. An erratic distribution of rainfall and the low water holding capacity of the soils of the region are the primary causes of drought, resulting in critical dry spells during the rainy season, particularly from June to July and in the last two weeks of September (Siriporn and Mongkolsawat, 2000).
Assessments of rainfall and water resource availability are used to identify the pattern and intensity of drought. The standardized precipitation index (SPI), developed by McKee et al. (1993), is used to quantify precipitation deficits at several time scales, based on the cumulative probability of recording certain amounts of precipitation at a station. Drought happens when rainfall is below normal and yields a low probability on the cumulative probability function. The output of SPI is in units of standard deviation from the median of the time series record (Trnka et al., 2003). The SPI has been used to monitor the intensity and spatial extension of droughts at different time scales in South Africa (Rouault and Richard, 2003) and in Europe (Lloyd-Hughes and Saunders, 2002).
The Normalized Difference Vegetation Index (NDVI) measures the spectral contrast between reflected near-infrared and visible radiation, and this contrast is higher for vegetation than for bare soil (Kassa, 1999). NDVI has been widely used for monitoring vegetation conditions, for example in the central United States, Sudan, India and Mongolia (Albert et al., 2002; Kassa, 1999; Bhuiyan, 2004; Bayarjargal et al., 2000).
NDVI has some limitations, such as radiometric disturbance from the atmosphere and dependence on sensor view angle and solar zenith angle. The relationship between NDVI and rainfall may also differ between climate zones. However, many applied studies have indicated a meaningful relationship between NDVI and rainfall. In the arid Kalahari, Botswana, NDVI was used to estimate rainfall effectively on monthly and annual time scales (Grist et al., 1996). Linear regression relationships between NDVI, vegetation and rainfall in Namibia were estimated, and the results indicated the ability of NDVI and the maximum value composite (MVC) to predict green vegetation cover (du Plessis, 1999). In addition, monthly rainfall could be explained by linear correlation with NDVI, and the best correlation was found when rainfall of the preceding two months was included (Hess et al., 1995).
The objectives of this study were to analyze the spatial and temporal patterns of drought severity in North-East Thailand using rainfall data and the meteorologically based standardized precipitation index (SPI), and to assess the relationship between NDVI and rainfall by linear regression analysis.
2. Study Area
The study area is in North-East Thailand, with an approximate area of 25,000 km², between 15° 27´ N and 16° 49´ N and between 103° 31´ E and 104° 53´ E (Fig. 1). Most of the area is dominated by rice paddy fields and other crops, with isolated patches of remnant forest. The topography consists of gently undulating terrain and small hills, and the area is underlain by a thick sequence of Mesozoic rocks, the Maha Sarakam Formation, which consists of sandstone, siltstone and interbedded rock salt. The soils are inherently low in fertility and have sandy textures with low water holding capacities.

Figure 1. Study area.
3. Methodology
3.1 Rainfall analysis
Monthly rainfall data from 176 Thailand Meteorological Department stations, distributed throughout the study area, were collected for the years 1976 to 2004 and digitally encoded into a GIS database. Rainfall frequency distributions and spatial autocorrelation patterns over different locations in North-East Thailand were examined. The mean annual rainfall of each station was calculated for the 29 years, and the median value of the annual rainfall data was selected as the representative value of each station during the study period. Kriging was applied to spatially interpolate the mean annual rainfall from 308 stations and create a region-wide mean annual rainfall map for the 29 years of the study period. Cross validation was used to measure the residuals between the point-based and kriged-surface rainfall data.
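The cross-validation step can be illustrated with a leave-one-out loop: each station is withheld in turn, its value is predicted from the remaining stations, and the residual is recorded. The sketch below is a minimal illustration only. It substitutes inverse distance weighting for kriging to stay short, and the function names, station coordinates and rainfall values are hypothetical and not taken from the study.

```python
import math

def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) tuples.

    Stand-in interpolator: the study used kriging, not IDW.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # the point coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def loo_residuals(stations, interpolate=idw):
    """Withhold each station, predict it from the others, return observed - predicted."""
    residuals = []
    for i, (sx, sy, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        residuals.append(value - interpolate(sx, sy, others))
    return residuals

def rmse(residuals):
    """Root mean square of the cross-validation residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Hypothetical stations: (easting km, northing km, mean annual rainfall mm)
stations = [(0, 0, 900.0), (50, 0, 1100.0), (0, 50, 1000.0), (50, 50, 1300.0)]
residuals = loo_residuals(stations)
error = rmse(residuals)
```

Any interpolator with the same call signature, including a kriging routine, could be passed as `interpolate` to obtain the corresponding residuals.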
3.2 SPI calculation
SPI is calculated by fitting a gamma distribution function to a given frequency distribution of precipitation totals for a given station, and then transforming the gamma distribution to a normal distribution with a mean of zero and a variance of one (Loukas and Vasiliades, 2004). The SPI output values are in units of standard deviation from the long-term median and provide the corresponding probabilities of occurrence of each drought category relative to the normal probability density function (Rouault and Richard, 2003).
In this study, SPI values were derived both temporally and spatially for quantitative comparisons of drought incidence over the 29 year period and 176 different rain gauge locations. SPI values in the wet monsoon season were calculated based on a 3-month SPI for the beginning (May to July) and a 6-month SPI for the entire wet monsoon period (May to October). A 12 month SPI was also calculated to characterize long-term drought occurrences. SPI values equal to or less than -1.0 were used to define drought intensity according to Table 1. The frequency of occurrence of drought years was determined by counting the rain gauge locations with SPI < -1 since 1976. Spatial maps of SPI were derived through kriging of SPI values from each station across North-East Thailand.
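The two steps described above, a gamma fit followed by a transformation to the standard normal distribution, can be sketched as follows. This is a minimal illustration under stated assumptions and not the procedure used by the authors: the gamma parameters are estimated with Thom's approximation to maximum likelihood, all precipitation totals are assumed to be positive and not all identical (zero-rainfall periods are not handled), and the function names and rainfall totals are hypothetical.

```python
import math
from statistics import NormalDist

def fit_gamma(totals):
    """Approximate maximum-likelihood gamma shape and scale (Thom's formula)."""
    n = len(totals)
    mean = sum(totals) / n
    a = math.log(mean) - sum(math.log(t) for t in totals) / n
    shape = (1.0 + math.sqrt(1.0 + 4.0 * a / 3.0)) / (4.0 * a)
    return shape, mean / shape

def gamma_cdf(x, shape, scale):
    """Gamma cumulative probability via the series for the incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0
    total = 1.0
    for n in range(1, 10000):
        term *= z / (shape + n)
        total += term
        if term < 1e-12 * total:
            break
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape + 1.0))

def spi(totals):
    """SPI of each total relative to the gamma distribution fitted to all totals."""
    shape, scale = fit_gamma(totals)
    std_normal = NormalDist()
    values = []
    for t in totals:
        p = gamma_cdf(t, shape, scale)
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep the probability inside (0, 1)
        values.append(std_normal.inv_cdf(p))
    return values

# Hypothetical 6-month rainfall totals (mm) for one station over ten years
totals = [820.0, 950.0, 1100.0, 1230.0, 1400.0, 1010.0, 890.0, 1600.0, 1150.0, 1050.0]
spi_values = spi(totals)
```

Negative values mark periods drier than the fitted median and positive values mark wetter periods, which matches the interpretation of SPI as a deviation in standard deviation units.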
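As a small illustration of the counting step, the sketch below tallies, for each year, the stations whose SPI falls below -1. The function name, years and SPI values are hypothetical placeholders rather than results from this study.

```python
def drought_station_counts(spi_by_year, threshold=-1.0):
    """Map each year to the number of stations whose SPI is below the threshold."""
    return {
        year: sum(1 for value in values if value < threshold)
        for year, values in spi_by_year.items()
    }

# Hypothetical SPI values at four stations in three years
spi_by_year = {
    1: [-1.6, -1.2, -0.4, 0.3],
    2: [-2.1, -1.1, -1.3, -0.9],
    3: [0.2, 0.8, -0.5, 1.1],
}
counts = drought_station_counts(spi_by_year)
```

Years with a large count can then be flagged as widespread drought years.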

Table 1. Drought classification based on SPI output.
3.3 NDVI from Landsat TM data
The NDVI data were obtained from Landsat TM images acquired during the dry season, from January to February, in 2001, 2002, 2003, 2004 and 2005. Band 3 (Red) and Band 4 (NIR), with spectral ranges of 0.63-0.69 µm and 0.75-0.90 µm respectively, were used to calculate NDVI. The formula for the NDVI calculation is:
NDVI = (NIR - Red) / (NIR + Red) (1)
From formula (1), vegetation can be mapped because chlorophyll absorbs red light more strongly than it absorbs longer wavelengths. Hence, vegetation appears darker in the Red band than in the NIR band because of its lower red reflectance (Williams, 1995). NDVI was calculated pixel by pixel, and the result ranges from -1 to 1. Radiometric and geometric corrections were applied to improve the accuracy of each image.
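Formula (1) can be applied pixel by pixel as in the following minimal sketch. It uses plain Python lists so that it runs without extra libraries; in practice an array library would normally be used. The function names and reflectance values are hypothetical, and pixels where NIR + Red is zero are returned as None, which is an assumption of this sketch.

```python
def ndvi(nir, red):
    """NDVI for one pixel from formula (1); None where NIR + Red is zero."""
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator

def ndvi_grid(nir_band, red_band):
    """Apply formula (1) to two equally sized 2-D grids of band values."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]

# Hypothetical reflectances: a vegetated pixel, a bare soil pixel, and a no-data pixel
nir_band = [[0.50, 0.20, 0.0]]
red_band = [[0.10, 0.20, 0.0]]
result = ndvi_grid(nir_band, red_band)
```

The vegetated pixel gives a clearly positive value, while equal NIR and Red reflectance gives zero.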
3.4 Correlation between NDVI data and Rainfall data
NDVI and rainfall data were prepared for spatio-temporal regression analysis. For the multi-temporal regression analysis, NDVI values and rainfall for every month in 2005 were examined. NDVI was calculated monthly for the North-East region, and the mean value was selected as the representative value of each month. Monthly rainfall at every station across the whole region was summed over the concurrent month and the 3 previous months. This period gave the highest coefficient of determination (R2), about 0.764 (Wattanakij, 2008). There were 12 pairs of these data from which to observe the correlation.
For the spatial regression analysis between NDVI and rainfall, the study sites were located in an area of about 25,000 km² in the eastern part of North-East Thailand. Sites with widely differing vegetation cover, ranging from bare soil to high vegetation cover, were chosen. Raster cell values of the NDVI grid from 2001 to 2005 in each study site were extracted into a table. The median rainfall of the rainy period at every station was selected and interpolated to obtain the spatial pattern of rainfall over the eastern part of the region. Raster cell values of the rainfall grid were also extracted into a table for comparison with the raster cell values of NDVI.
The best correlation between NDVI and rainfall was found when rainfall of the preceding months was included. In each study site, NDVI was compared with the median rainfall of the previous year. For example, NDVI from the Landsat TM image acquired in January 2001 was compared with median rainfall in 2000. The relationship between NDVI and rainfall was evaluated using the correlation coefficient calculated in this way.
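The pairing of monthly NDVI with rainfall summed over the concurrent month and the 3 previous months can be sketched as below. The helper names and all data values are hypothetical. The rainfall series is assumed to start 3 months before the first NDVI month so that every NDVI value has a complete 4-month sum.

```python
import math

def lagged_sum(rain, lag=3):
    """Sum each month with the `lag` preceding months; the first `lag` months are dropped."""
    return [sum(rain[i - lag:i + 1]) for i in range(lag, len(rain))]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical monthly rainfall (mm): 3 lead-in months followed by 6 analysis months
rain = [150.0, 40.0, 5.0, 2.0, 10.0, 45.0, 90.0, 180.0, 210.0]
# Hypothetical mean NDVI for the same 6 analysis months
ndvi_monthly = [0.38, 0.35, 0.34, 0.37, 0.45, 0.56]

rain_4_month = lagged_sum(rain)
r = pearson_r(ndvi_monthly, rain_4_month)
```

Changing `lag` makes it possible to compare accumulation periods and keep the one with the highest coefficient of determination.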
4. Results & Discussion
4.1 Rainfall record for the 29-year period
The spatial and temporal variability in annual rainfall is shown in Fig. 2 for stations in the southwest (SW), central (C) and northeast (NE) portions of the study area. Over this period, rainfall amounts in the northeast were much higher than in the southwest as a result of heavy rain from the northeast during the September and October monsoon period. Precipitation in the southwest declined strongly in the last 5 years, from 2000 to 2004, but there were no significant long-term trends in mean annual precipitation in any of the three zones. Overall, annual rainfall varied from a minimum of about 800 mm in the southwest and western parts of the region to a maximum of over 3000 mm in the northern part.
The spatial pattern of rainfall was studied by applying ordinary kriging as a stochastic method for interpolating rainfall. From this interpolation, rainfall over the 29-year period in the northeast of Thailand increases significantly from the southwest to the northeast portion of the region.

Figure 2: Mean annual precipitation for 3 spatially distributed zones in N-E Thailand.

Figure 3. The spatial pattern of mean annual rainfall over the 29-year period.
4.2 SPI calculation
4.2.1 Multi-temporal SPI
The deviations of SPI values from the median were used to identify anomalously dry and wet years over the study period. The 3, 6 and 12 month SPI values for three stations, in the northeast, the middle and the southwest portions of the study area, are shown in Fig. 4. At the northeast station, the worst dry years occurred in 1982, 1983, 1987, 1988, 1991, 1998 and 2004, with dry periods appearing more frequently in the 3 month, early wet season, SPI values. The worst dry years occurred in 1977, 1992 and 1993 at the middle station. At the southwest station, the worst dry years were in 1981, 1992, 1998, 2002 and 2003, with drought periods occurring more often in the 6 month SPI. The 3 month SPI shows a pattern of increasing droughts, with the 2001-2002 period having the lowest SPI values over the 29 year record.
4.2.2 Spatial SPI
The spatial pattern and intensity of drought from the 6 month SPI values are shown in Fig. 5 for the period 1976 through 2004. The areas most affected by drought were primarily in the western and southern parts of the region, with the worst drought years in 1981, 1986, 1993, 1997, and 2001. These are correlated with the El Niño events which occurred in 1982-83, 1986-87, 1991-92, 1993-1994, and 1997-1998. The El Niño event of 1997-1998 was particularly strong, resulting in drought throughout the West Pacific (NOAA Office of Communications, 2007).

Figure 4: (a) 3, 6, 12 month SPI at the northeast station; (b) 3, 6, 12 month SPI at the southwest station; (c) 3, 6, 12 month SPI at the central station
4.3 Spatial analysis of correlation between NDVI and Rainfall
This study uses Pearson's correlation coefficient to test the hypothesis that the correlation between NDVI and rainfall is significantly different from zero. For the bare soil site in 2001, the mean and standard deviation of NDVI are 0.128 and 0.65 respectively, and the mean and standard deviation of rainfall in that year are 237.2 mm and 2.84 mm respectively. The output p-value of .000, reported as p < .001, leads to the conclusion that NDVI and rainfall in the bare soil site are correlated (r = 0.128, p < .001). Table 2 shows the correlation coefficients between NDVI and rainfall for each study site from 2001 to 2005.
The correlation coefficient varies between study sites. In 2001, the highest correlation coefficient was found in moderate vegetation cover (about 0.19) and the lowest in the mountainous area (about -0.05). Sites with higher vegetation cover tended to have higher correlation coefficients throughout 2001 to 2005. Across the study period, for moderate vegetation cover, the highest correlation coefficient was in 2001 and the lowest in 2004. The strongest negative correlation, about -0.54, was found in moderate vegetation cover. This study site was in the western part of the study area, where precipitation is relatively low but NDVI is high because irrigation is available for agriculture.
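The significance test for a correlation coefficient can be sketched with the standard t statistic, t = r·sqrt(n - 2)/sqrt(1 - r²). The example below is a minimal illustration: it uses the bare soil values reported in this paper (r = 0.128 and 65536 grid cells) and approximates the two-sided p-value with the normal distribution, which is an assumption that is reasonable only for large n. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def correlation_t_test(r, n):
    """t statistic for H0: correlation = 0, with a large-sample two-sided p-value."""
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    # Normal approximation to the t distribution; adequate only when n is large.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p

# Bare soil site, 2001: r and the number of grid cells as reported in the text
t_stat, p_value = correlation_t_test(0.128, 65536)
```

With so many grid cells even a weak correlation yields a very large t statistic, which is why a small r can still be reported as p < .001.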

Figure 5: Spatial extent of the 6 month SPI since 1976
NDVI and rainfall were compared at the different study sites using regression analysis. For example, in 2001 for the bare soil site, the mean and standard deviation of NDVI are 0.120 and 0.659, while the mean and standard deviation of rainfall are 237.211 mm and 2.846 mm. The number of observations is the number of grid cells, 65536. The regression equation with NDVI as the independent variable predicting rainfall as the dependent variable for bare soil is:
Rainfall = 5.509*NDVI + 236.549 (2)

Table 2. Correlation coefficients (r) between NDVI and rainfall from 2001 to 2005
The coefficient of determination (R2) for NDVI predicting rainfall is 0.016. The standard error of the rainfall estimate from the NDVI equation is 2.823. The coefficient of the predictor variable, NDVI, is significantly different from zero (p < 0.001), so NDVI contributes to the prediction. The observed F-ratio of 1084.752 in the ANOVA table is compared with the critical value of 3.84 from the F-table, using α = 0.05 and 1 and 65534 degrees of freedom for the numerator and denominator respectively. Because the observed F exceeds the critical value of 3.84, NDVI explains a statistically significant amount of the variability in rainfall. The Durbin-Watson statistic of 0.036 indicates autocorrelation of the residuals. In the scatter plot of standardized residuals against standardized predicted values, the residuals are randomly scattered, consistent with the regression assumption of homoscedasticity, that is, that the standard deviation of the residuals is the same everywhere. The normality of the residuals was checked with the scatter plot of Studentized deleted residuals against standardized predicted values. The result shows that the residuals are approximately normal, with 95% of the residuals in the range of -2 to 2 and a standard deviation of 1. Table 3 shows the linear regression equations and statistics of NDVI compared with rainfall for the different study sites.
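The regression statistics discussed above (slope, intercept, R2, standard error of the estimate, F-ratio and Durbin-Watson statistic) can be computed for a simple linear regression as in the following sketch. It is an illustration only: the function name and the five data pairs are hypothetical and are not the study data, and the study presumably used a statistics package rather than code of this kind.

```python
import math

def ols_fit(x, y):
    """Least-squares fit of y = slope * x + intercept with basic diagnostics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (slope * a + intercept) for a, b in zip(x, y)]
    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1.0 - ss_res / ss_tot,
        # Standard error of the estimate, with n - 2 degrees of freedom
        "std_error": math.sqrt(ss_res / (n - 2)),
        # F-ratio with 1 and n - 2 degrees of freedom
        "f_ratio": (ss_tot - ss_res) / (ss_res / (n - 2)),
        # Durbin-Watson: near 2 suggests independent residuals, near 0 positive autocorrelation
        "durbin_watson": sum(
            (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)
        ) / ss_res,
    }

# Hypothetical NDVI (x) and rainfall (y) pairs
x = [0.1, 0.2, 0.3, 0.4, 0.5]
y = [1.1, 1.9, 3.2, 3.9, 5.0]
fit = ols_fit(x, y)
```

For a single predictor the F-ratio equals R2·(n - 2)/(1 - R2), so it can be checked directly against the reported R2 and sample size.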

Table 3. Linear regression equations and statistics of NDVI compared with rainfall.
4.4 Multi-temporal analysis of correlation between NDVI and Rainfall
NDVI values computed for the whole region were plotted against monthly precipitation in 2005 (Figure 6(a) and Figure 6(b)). From the curves of NDVI and rainfall, NDVI values are related to the amount of precipitation, which is relatively high in the rainy period from May to October.

Figure 6. (a) Monthly NDVI; (b) Monthly rainfall
The regression analysis between NDVI and rainfall in this year shows that the correlation between these values is very high, with a correlation coefficient (r) of 0.984. The mean and standard deviation of NDVI are 0.5403 and 0.11095, while those of rainfall are 446.397 mm and 368.658 mm respectively. The number of observations is 12. The regression equation with NDVI as the independent variable predicting rainfall as the dependent variable is:
Rainfall = 2855*NDVI - 1548 (3)
The coefficient of determination (R2) for NDVI predicting rainfall is 0.966. The standard error of the rainfall estimate from the NDVI equation is 68.44. The coefficient of the predictor variable, NDVI, is significantly different from zero (p < 0.001). The observed F-ratio of 309.168 in the ANOVA table is compared with the critical value of 4.96 from the F-table, using α = 0.05 and 1 and 10 degrees of freedom for the numerator and denominator respectively. Because the observed F exceeds the critical value of 4.96, NDVI explains a significant amount of the variability in rainfall. The residuals can be treated as independent because the Durbin-Watson statistic is 1.959, close to 2. The normality of the residuals was also tested, with 95% of the residuals in the range of -2 to 2 and a standard deviation of 1.
5. Conclusion
The decadal analysis of rainfall variability since 1976 showed that, on an annual time scale, drought in the region was most severe in the 10 year period from 1988 to 1999. The spatial patterns showed drought affected areas in the western and southern parts of the region, with drought severity, which is associated with the pattern of rainfall, decreasing from the drier southwest to the more humid northeast of the region. The spatial and temporal analyses of drought using SPI were found useful in characterizing spatial patterns and temporal frequencies of drought, and in evaluating drought affected areas. Multi-temporal SPI values at various time scales were useful in assessing drought occurrences and severity.
Studying the correlation between NDVI and rainfall can improve the evaluation of drought affected areas and drought severity. The correlation between NDVI and rainfall was assessed using the correlation coefficient in both the multi-temporal and the spatial analysis. In the spatial analysis, from 2001 to 2005, the strongest correlation was a negative correlation between NDVI and rainfall in moderate vegetation cover, especially in irrigated areas, which can be interpreted to mean that lower rainfall has no influence on green vegetation cover there. Across the different land cover sites, rainfall in the rainy season still has an effect on vegetation development, which is more sensitive on bare soil and low vegetation cover. The correlation coefficient is much higher in the multi-temporal analysis. The pattern and intensity of drought can be identified through assessments of rainfall, and the results of this study show that a significant amount of the variability in rainfall could be explained by NDVI.
The authors would like to acknowledge the advice and assistance on remote sensing and programming provided by Nagon Wattanakij, lecturer at the Department of Computer Science, Khon Kaen University, and Ukrit Khuenkaw, computer system analyst at the Computer Center, Khon Kaen University. The authors also thank the Center of Geoinformatics for the Development of Northeast Thailand, Khon Kaen University, Thailand, for its continuous support of this research.
References
Albert J. Peters, Elizabeth A. Walter-Shea, Lei Ji, Andres Vina, Michael Hayes and Mark D. Svoboda, 2002, Drought Monitoring with NDVI-Based Standardized Vegetation Index. Photogrammetric Engineering & Remote Sensing, 68(1): 71-75.
Bayarjargal, Y., Adyasuren, T. and Munkhtuya, S., 2000, Drought and Vegetation Monitoring in the Arid and Semi-Arid Regions of Mongolia using Remote Sensing and Ground Data. Proceedings of the 21st Asian Conference on Remote Sensing.
Bhuiyan, C., 2004, Various Drought Indices for Monitoring Drought Condition in Aravalli Terrain of India. The XXth ISPRS Congress (Istanbul), July 12-23, 2004.
Kassa Alemayehu, 1999, Drought Risk Monitoring for the Sudan using NDVI 1982- [Online, accessed 22 December 2003] http://www2.soas.ac.uk/Geography/WaterIssues/OccasionalPapers/AcrobatFiles/OCC25.PDF
Lloyd-Hughes, B. and Saunders, M.A., 2002, A Drought Climatology for Europe. International Journal of Climatology, 22: 1571-1592.
Loukas, A. and Vasiliades, L., 2004, Probabilistic Analysis of Drought Spatiotemporal Characteristics in Thessaly Region, Greece. Natural Hazards and Earth System Sciences, 4: 719-731.
McKee, T.B., Doesken, N.J. and Kleist, J., 1993, The Relationship of Drought Frequency and Duration to Time Scales. Proceedings of the Eighth Conference on Applied Climatology, 17-22 January 1993, Anaheim, California.
du Plessis, W.P., 1999, Linear Regression Relationships between NDVI, Vegetation and Rainfall in Etosha National Park, Namibia. Journal of Arid Environments, 42: 235-260.
Rouault, M. and Richard, Y., 2003, Intensity and Spatial Extension of Drought in South Africa at Different Time Scales. Water SA, 29(4), October 2003.
Siripon, K. and Mongkolsawat, C., 2000, Spatial and Temporal Analysis of Rainfall Pattern in Northeastern Thailand : Application of GIS. Journal of Remote Sensing and GIS Association of Thailand, Vol 1(1), 1-18.
Trnka M., Semeradova D., Eitzinger J., Dubrovsky M., Wilhite D., Svoboda M., Hayes M., Zalud Z., 2003, Selected methods of drought evaluation in South Moravia and Northern Austria. in: XI. International poster day ("Transport of water, chemicals and energy in soil-crop-atmosphere system"), Bratislava, Institute of Hydrology, Slovak Academy of Sciences, Slovakia.
Wattanakij, N., 2008, Drought Detection in Northeast Thailand using Vegetation Indices of Multi-temporal Satellite Data. Master of Science Thesis in Remote Sensing and Geographic Information System, The Graduate School, Khon Kaen University.