A Fast Method for Obtaining Spatial Distribution of a Variant Using Dendrograms of its Influencing Parameters using RSD


N. Balasubramanya Raju
CGPL
Department of Aerospace Engineering
Indian Institute of Science,
Bangalore
raju@cgpl.iisc.ernet.in

Sunil B. Inamdar
CGPL
Department of Aerospace Engineering
Indian Institute of Science,
Bangalore
sunil@cgpl.iisc.ernet.in

G. S. Sheshagiri
CGPL
Department of Aerospace Engineering
Indian Institute of Science,
Bangalore
sheshagiri@cgpl.iisc.ernet.in

N. K. S. Rajan
CGPL
Department of Aerospace Engineering
Indian Institute of Science,
Bangalore
nksr@cgpl.iisc.ernet.in


Abstract:
In efforts to estimate the potential of biomass resources as a renewable energy source, the computing time needed to process remote sensed data (RSD) to an acceptable level of detail has been found to be large, to the extent that either a high-end computer is required or the user experiences a slow response. The biomass resource, as the spatial variant, has multiple influencing parameters that make the analysis more complex and time consuming. In the current approach, a method has been developed that reduces this time significantly with only a modest loss of accuracy, so that it can be used interactively to give first-cut estimates of the variant considered in an affordable computing environment.

In the application considered, crops (the variant under analysis) are classified parametrically using dendrograms so that they can be distributed spatially. A dendrogram is drawn from the probability values of each variant (crops, in this case) according to the influencing parameters (NDVI and rainfall ranges, in this case) attributed to the different variants. The lower and upper thresholds are obtained from the dendrogram. The method can serve as a quick first-cut estimate of the variant for later use in a neural analysis of the classifications, or it can support a fast-responding “on-line” query. The GIS tool developed has proved useful in generating dynamic query responses in the GIS application developed for the Biomass Resource Atlas for India. One of its most valuable outcomes is an on-screen graphic query that estimates all the biomass residues generated within a user-selected radius around any chosen point on the map. Because the response time of the tool is short, the method is well suited to answering users' dynamic queries.

1 Introduction
Biomass as a renewable source of energy has to be assessed geographically across the country, either for budgetary purposes or for feasibility studies. Crop distribution involves mapping crops over different agricultural regions of known vegetation and rainfall in the GIS map. Remote sensed data from satellites, containing agricultural land use, rainfall and NDVI (Normalized Difference Vegetation Index), are available as separate layers in the form of vector polygons. These layers are integrated into the GIS and intersected into a single layer carrying all the related parameters.

A dendrogrammatic method of classifying agricultural polygons is used to distribute crops, using the influencing parameters, such as NDVI and rainfall, provided by the agricultural layer. A dendrogram is formed from the computed similarity (the degree of proximity) between the known values of the influencing crop parameter, which groups the crops into clusters. The similarity is decided by a threshold distance for each influencing parameter. The known ranges of the influencing parameters for the crops are taken from surveys and prior analysis. The unknown pattern provided by the GIS agricultural layer is compared with the dendrogram by computing a standard deviation for crop classification. In this way the unknown pattern of NDVI and rainfall is attributed either to a single crop or to a cluster of crops. The comparison is done in succession for NDVI and rainfall to resolve the conflict that arises when the two clusters select multiple crops: the crop or crops common to the two selected clusters are linked to the spatial polygon. The region considered is the Kotanaduru taluk of East Godawari district in Andhra Pradesh.

NDVI (Normalized Difference Vegetation Index) measures the density of green vegetation on a patch of land. Healthy vegetation absorbs most of the visible light that hits it and reflects a large portion of the near-infrared light; unhealthy or sparse vegetation reflects more visible light and less near-infrared light. Vegetation indices use a difference formula based on this principle to quantify the density of plant growth on the Earth:

NDVI = (NIR - VIS) / (NIR + VIS)    (1)

where NIR = near-infrared radiation and VIS = visible radiation.
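Equation (1) can be evaluated directly. The following is a minimal sketch in Python; the function name `ndvi` and the reflectance values are hypothetical and chosen only for illustration.

```python
def ndvi(nir: float, vis: float) -> float:
    """Evaluate equation (1): (NIR - VIS) / (NIR + VIS)."""
    total = nir + vis
    if total == 0:
        # The index is undefined when no radiation is recorded in either band.
        raise ValueError("NIR + VIS must be non-zero")
    return (nir - vis) / total


# Hypothetical reflectance values, not taken from the paper's data.
dense_vegetation = ndvi(nir=0.50, vis=0.08)   # strong NIR reflection
sparse_vegetation = ndvi(nir=0.40, vis=0.30)  # NIR and VIS closer together
print(dense_vegetation > sparse_vegetation)
```

The index lies between -1 and +1 for non-negative inputs, with denser green cover giving values closer to +1.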

Water content in the soil is the most decisive factor for a crop. The amount of rainfall received in an area is therefore a crucial parameter for identifying the crop that can be grown most efficiently there. This also applies to irrigated areas, since dams are sited where rainfall provides good catchments.

2 The Methodology for the Classification using Dendrogram
A Crop-NDVI dendrogram is a computer-generated structure of clusters, defined by an NDVI threshold among crops that lie close together. Most commonly, a dendrogram is drawn in a Cartesian layout as an upright tree. Dendrograms are often used for displaying relationships among clusters. A dendrogram shows the multidimensional distances between objects in a tree-like structure. The objects that are closest to each other in the multidimensional data space are connected by a horizontal line, forming a cluster which can be regarded as a "new" object. The new cluster and the remaining original data are again searched for the closest pair, and so on. The distance between a particular pair of objects (or clusters) is reflected in the height of the horizontal line.
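The repeated "find the closest pair and merge" procedure can be sketched as follows. This is an illustration only: it assumes a single influencing parameter and single-linkage distance (the smallest gap between any two members), which the paper does not specify, and the function name, crop labels and values are hypothetical.

```python
def build_dendrogram(values: dict[str, float]) -> list[tuple[tuple, tuple, float]]:
    """Merge the closest pair of clusters until one remains.

    Returns the merges in order as (cluster_a, cluster_b, distance);
    the distance is the height of the connecting line in the dendrogram.
    Assumption: single-linkage distance on one parameter.
    """
    clusters = [(name,) for name in sorted(values)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Smallest gap between any member of cluster i and cluster j.
                d = min(abs(values[a] - values[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # The merged pair is treated as a "new" object in the next search.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical average NDVI values per crop, for illustration only.
demo = {"crop_a": 0.10, "crop_b": 0.12, "crop_c": 0.30, "crop_d": 0.33}
for left, right, height in build_dendrogram(demo):
    print(left, right, round(height, 3))
```

For larger data sets a library routine would normally replace the quadratic search used here.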


Fig.1 Generation of Dendrogram


The agricultural polygons that can be used for cultivation have to be classified using the influencing parameters as input to the dendrogram classifier. The next focus is on the crops and their rainfall and NDVI ranges. The lookup tables below give the known NDVI and rainfall ranges of the different crops. The dendrogram can be stored in an array on the basis of similarity, given by the computed averages of the NDVI and rainfall ranges listed in lookup Tables 1 and 2.

Table.1 Lookup for NDVI


Table.2 Lookup for Rainfall


The dendrogram for NDVI is formed by computing the distance among the crops at different thresholds. Accordingly, a table is created with a different number of clusters for each threshold value. A narrower threshold produces more clusters, each containing fewer crops. To enable proper crop classification later, the classifier algorithm has to be run by comparing the unknown NDVI with its dendrogram at different levels of similarity. A dendrogram for rainfall is constructed in the same way. Internally, the dendrogram is defined by the threshold, the cluster, the parameter average and the crop names.
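A cluster table of this kind might be built as sketched below. The sketch assumes that each crop is represented by the average of its lookup range and that neighbouring crops are grouped when their averages differ by no more than the threshold; the function name, crop labels and ranges are hypothetical and not taken from Tables 1 to 4.

```python
def cluster_table(ranges: dict[str, tuple[float, float]],
                  thresholds: list[float]) -> list[tuple[float, int, float, list[str]]]:
    """Return rows of (threshold, cluster id, parameter average, crop names)."""
    averages = {crop: (low + high) / 2 for crop, (low, high) in ranges.items()}
    ordered = sorted(averages, key=averages.get)
    rows = []
    if not ordered:
        return rows
    for threshold in thresholds:
        groups = [[ordered[0]]]
        for previous, crop in zip(ordered, ordered[1:]):
            if averages[crop] - averages[previous] <= threshold:
                groups[-1].append(crop)   # close enough: same cluster
            else:
                groups.append([crop])     # gap exceeds threshold: new cluster
        for cluster_id, group in enumerate(groups, start=1):
            mean = sum(averages[c] for c in group) / len(group)
            rows.append((threshold, cluster_id, mean, group))
    return rows


# Hypothetical NDVI ranges per crop, for illustration only.
demo_ranges = {
    "crop_a": (0.10, 0.20),
    "crop_b": (0.15, 0.25),
    "crop_c": (0.40, 0.50),
    "crop_d": (0.45, 0.60),
}
for row in cluster_table(demo_ranges, thresholds=[0.06, 0.30]):
    print(row)
```

With these illustrative values the narrower threshold yields three clusters and the wider one a single cluster, matching the behaviour described above.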

Table 3. Cluster table for NDVI Dendrogram


Table 4. Cluster table for Rainfall Dendrogram


Fig.2 below shows the flow diagram for dendrogrammatic spatial crop classification. All the unclassified agricultural polygons carrying the crop-influencing parameters, NDVI and rainfall, are selected into an array.


Fig.2 Flow diagram for dendrogrammatic spatial crop classification


If the two dendrograms are compared by choosing a similarity level in each, the unknown pattern of NDVI and rainfall can be resolved to a crop. For biomass assessment, where the unknown agricultural area has to be classified into crops, the standard deviation is computed for each of the clusters. It is computed as per equation (2) from the known value of the influencing crop parameter, such as NDVI, and the unknown value. The cluster with the least deviation is chosen to classify the polygon into a crop or crops. This is then resolved successively with rainfall. A higher dissimilarity level means a larger deviation and hence a larger crop cluster being selected, which leads to more conflicts. This is why rainfall is also used as an influencing parameter for dendrogrammatic classification.
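The least-deviation selection for one parameter can be sketched as follows. Since the exact form of equation (2) is not reproduced here, the sketch assumes the deviation is the root-mean-square difference between the unknown value and the known crop values in a cluster; the function names, cluster labels and values are hypothetical.

```python
import math


def deviation(known_values, unknown: float) -> float:
    """Assumed form: root-mean-square difference from the unknown value."""
    known = list(known_values)
    return math.sqrt(sum((k - unknown) ** 2 for k in known) / len(known))


def select_cluster(clusters: dict[str, dict[str, float]],
                   unknown: float) -> tuple[str, list[str]]:
    """Pick the cluster whose known values deviate least from the unknown.

    clusters maps a cluster label to {crop name: known parameter value}.
    """
    best = min(clusters, key=lambda cid: deviation(clusters[cid].values(), unknown))
    return best, sorted(clusters[best])


# Hypothetical NDVI clusters, for illustration only.
ndvi_clusters = {
    "C1": {"crop_a": 0.15, "crop_b": 0.20},
    "C2": {"crop_c": 0.45},
}
print(select_cluster(ndvi_clusters, unknown=0.22))
```

The same function can be applied to the rainfall clusters, after which the two selected crop lists are compared.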



Crop distribution is done on a sample area, the East Godawari district of Andhra Pradesh state, India. Each polygon, representing a small independent unit of agricultural area, is identified in the Map Table by:
  • Id – The polygon identifier.
  • State – The state name.
  • District – The district name.
  • Taluk – The taluk to which the polygon belongs.
  • Taluk code – A unique code for the taluk.
  • Area {kHa} – The geographical area covered by the polygon.
  • LU_Code – A land use code which determines whether the land is available for cultivation.
  • Rainfall – The rainfall in the polygon.
  • Veg vigour – The NDVI value of the polygon.
  • Crop – The crop name, assigned during distribution after classification.
For every unknown pattern represented by a polygon, the dendrogrammatic classifier selects a crop cluster or a single crop. A crop color table is used as a lookup to color the map, resulting in a graphical distribution of crops. The spatial Area attribute holds the unit area distributed in each taluk, which can be aggregated to further compute biomass and power potential.
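The aggregation of classified polygon areas can be sketched as follows. The record holds only a subset of the Map Table attributes, and the class name, function name, identifiers, taluk names and areas are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Polygon:
    """Subset of the Map Table attributes needed for aggregation."""
    id: int
    taluk: str
    area_kha: float   # Area {kHa}
    crop: str         # assigned after classification


def area_by_taluk_and_crop(polygons: list[Polygon]) -> dict[tuple[str, str], float]:
    """Sum the polygon areas for each (taluk, crop) pair."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for polygon in polygons:
        totals[(polygon.taluk, polygon.crop)] += polygon.area_kha
    return dict(totals)


# Hypothetical classified polygons, for illustration only.
demo = [
    Polygon(1, "taluk_x", 1.5, "Wheat"),
    Polygon(2, "taluk_x", 2.5, "Wheat"),
    Polygon(3, "taluk_x", 1.0, "Tobacco"),
]
print(area_by_taluk_and_crop(demo))
```

The per-crop totals are the quantities that would feed a later biomass or power-potential calculation.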

3 Results and Discussion
Tables 6 and 7 show the intermediate results. The matching crop is the one best suited to that particular geographical area according to the crop parameters. For the polygons bearing the identifiers 2703378 and 2703388, Small Millet, Tobacco, Sunflower and Wheat are the crops predicted by NDVI processing, whereas rainfall predicts Sorghum, Tobacco, Sunflower and Wheat. Sorghum and Small Millet are therefore eliminated by the exclusive rule, leaving Tobacco, Sunflower and Wheat.
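The elimination described here amounts to a set intersection. A minimal sketch using the crop names quoted above is given below; the function name `resolve` is hypothetical.

```python
def resolve(ndvi_cluster: set[str], rainfall_cluster: set[str]) -> set[str]:
    """Keep only the crops selected by both the NDVI and rainfall clusters."""
    return ndvi_cluster & rainfall_cluster


ndvi_crops = {"Small Millet", "Tobacco", "Sunflower", "Wheat"}
rainfall_crops = {"Sorghum", "Tobacco", "Sunflower", "Wheat"}

common = resolve(ndvi_crops, rainfall_crops)
eliminated = (ndvi_crops | rainfall_crops) - common

print(sorted(common))      # ['Sunflower', 'Tobacco', 'Wheat']
print(sorted(eliminated))  # ['Small Millet', 'Sorghum']
```

When more than one crop remains, as here, the polygon is still linked to a cluster of crops rather than a single crop.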

Table.6 Crop cluster selection based on standard deviation between the unknown and the known in the dendrogram clustering


Table.7 Rainfall Dendrogram output


The crops predicted for each polygon are then compared with ground truth. If further conflicts or outliers are found, as in the case shown, then either the lookup tables have to be refined or the polygons have to be reclassified using additional influencing parameters such as societal habits and soil condition.

4 Concluding Remarks
This approach classifies the crops with acceptable results and can be used for a quick user analysis of biomass availability, giving first-cut estimates of the variant considered in an affordable computing environment. It also shows that extending the method to other crop-influencing parameters can narrow the polygon classification down to a specific crop.
