[ieee 2014 third international conference on agro-geoinformatics - beijing, china...



[IEEE 2014 Third International Conference on Agro-Geoinformatics - Beijing, China (2014.8.11-2014.8.14)] 2014 The Third International Conference on Agro-Geoinformatics - Estimating

Estimating Leaf Chlorophyll Concentration in Soybean Using Random Forests and Field Imaging Spectroscopy

Jie Lv College of Geomatics

Xi’an University of Science and Technology Xi’an, China

[email protected]

Zhenguo Yan School of Energy Engineering

Xi’an University of Science and Technology Xi’an, China

[email protected]

Abstract—Accurate quantitative estimation of crop chlorophyll content is of great importance for monitoring crop growth and health and for estimating biomass. Radiative transfer models are complex because of the nonlinear relationship between crop spectra and chlorophyll content and the uncertainties in land surface systems, so traditional inversion techniques cannot meet the demand for accurate estimation of chlorophyll content. Random forests, by contrast, can cope with the strong nonlinearity of the functional dependence between a biophysical parameter and the observed reflected radiance, and may therefore be better suited to estimating crop biochemical parameters by inversion of a radiative transfer model. Applying random forests to the inversion of a radiative transfer model makes it possible to construct a hyperspectral estimation model for crop chlorophyll content. The aim of this paper is to explore the feasibility of using random forests and field imaging spectroscopy to estimate leaf chlorophyll concentration in soybean.

Field spectroscopy was carried out with an ASD FieldSpec3 in summer 2009 on farmland in the city of Changchun, Jilin Province. The measured spectral range was 350–2500 nm, with a sampling interval of 1.4 nm in the 350–1000 nm range and 2 nm in the 1000–2500 nm range; the 350–1250 nm range was used for the retrieval of leaf chlorophyll concentration. Leaf chlorophyll concentration in soybean was measured with a SPAD-502. The position of each sampling site was recorded with a Global Positioning System (GPS) receiver.

First, a training data set was generated with PROSPECT to link soybean spectra to the corresponding chlorophyll content. Second, random forests were trained on this data set to establish a leaf chlorophyll content estimation model. Third, a validation data set was built from proximal hyperspectral data, and the estimation model was applied to it to estimate the leaf chlorophyll content of soybean in the study area. The estimation model yielded a coefficient of determination of 0.9317 and a mean square error (MSE) of 74.2569. The results indicate that a model based on random forests and field imaging spectroscopy can estimate the leaf chlorophyll content of soybean accurately, and that it can solve the soybean chlorophyll content inversion problem even with few samples. Random forests and field imaging spectroscopy can thus serve as a new, rapid, and nondestructive method for estimating the chlorophyll content of soybean.

Future work will concentrate on scaling up the field estimation model to the satellite remote sensing level, so that soybean health can be monitored over large areas.

Keywords—random forests; chlorophyll concentration; soybean; hyperspectral reflectance; field imaging spectroscopy

I. INTRODUCTION

Leaf chlorophyll content is an indicator of the growth and health of soybean and of its yield. It is therefore important to estimate chlorophyll content in soybean accurately. Traditional methods for retrieving chlorophyll content in plants are complex, time-consuming, and expensive. Data collected by field imaging spectroscopy facilitate quantitative and qualitative characterization of both the surface and the atmosphere, using geometrically coherent spectral measurements [1]. Imaging spectroscopy has great potential for estimating the chlorophyll content of soybean dynamically and rapidly.

Chlorophyll content can be estimated through a regression function or a model based on the relationship between hyperspectral reflectance and plant chlorophyll content. In recent years, hyperspectral estimation models have been established to retrieve the chlorophyll content of wheat [2-3], boreal trees [4], soybean [5-6], corn [7-9], and vegetables [10]. Most current research builds regression functions between vegetation indices and chlorophyll content. These statistical functions are simple and fast, but they depend on the sampling site and sampling environment and change over time and space, so they lack generality.

Radiative transfer models are based on clear physical laws and can be used to explain the transmission and interaction of light within the plant canopy. The algorithms commonly used to invert radiative transfer models are look-up tables (LUT) [11-13], numerical optimization [14], and artificial neural networks (ANN) [15-17], but these inversion methods take a long time to compute and overfit easily.

In recent years, random forests has been widely applied to the classification of remote sensing images [18-19]. The objective of this paper is to combine random forests and field imaging spectroscopy to estimate chlorophyll content in soybean at the leaf scale.

II. MATERIALS AND METHODS

A. Study site

The study site is located in Changchun City, Jilin Province, China. The chlorophyll content of soybean was the research target. Three fields were selected for measuring soybean spectra and chlorophyll content in July 2009.

B. Field campaign data

Field campaigns in the soybean fields were carried out under clear-sky, windless conditions. Soybean reflectance was collected with the ASD FieldSpec3 spectrometer over the 350–2500 nm spectral range. The measurements were acquired with a bare fiber optic with an angular field of view of 25°. For every measurement, 10 scans were averaged and stored as a single spectrum.

The chlorophyll content of soybean was measured with SPAD-502 hand-held chlorophyll meters; 10 readings from soybean leaves were averaged and stored on disk.

C. PROSPECT model

PROSPECT is a radiative transfer model that simulates leaf reflectance spectra over the range of 400 to 2500 nm. Its input parameters are chlorophyll content, water content, dry matter content, and a leaf structure parameter. PROSPECT has been applied to many plant types and, compared with other leaf models, it needs only a small number of input parameters.
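As a rough illustration of how a simulated training set can be built from a leaf model, the sketch below samples the four inputs named in this subsection and stores each simulated spectrum together with its chlorophyll value. The function `toy_leaf_reflectance` is a hypothetical stand-in written for this example and is not the PROSPECT model; the parameter ranges, the wavelength grid, and the sample count are likewise assumptions, not values from this study.

```python
import math
import random

# Coarse wavelength grid in nm (an assumption made for this sketch).
WAVELENGTHS_NM = list(range(400, 1251, 50))


def toy_leaf_reflectance(cab, n_struct, water, dry_matter):
    """Hypothetical stand-in for a leaf forward model. NOT PROSPECT.

    Reflectance falls as absorption rises; chlorophyll absorbs near
    450 nm and 670 nm, water near 1200 nm.
    """
    spectrum = []
    for wl in WAVELENGTHS_NM:
        k_cab = (math.exp(-((wl - 670) / 40.0) ** 2)
                 + math.exp(-((wl - 450) / 40.0) ** 2))
        k_water = math.exp(-((wl - 1200) / 60.0) ** 2)
        absorption = 0.03 * cab * k_cab + 20.0 * water * k_water + 10.0 * dry_matter
        spectrum.append(0.5 * math.exp(-absorption / n_struct))
    return spectrum


def build_training_set(n_samples, seed=0):
    """Sample model inputs at random and pair each spectrum with its chlorophyll value."""
    rng = random.Random(seed)
    spectra, chlorophyll = [], []
    for _ in range(n_samples):
        cab = rng.uniform(10.0, 80.0)       # chlorophyll content (assumed range)
        n_struct = rng.uniform(1.0, 2.5)    # structure parameter (assumed range)
        water = rng.uniform(0.005, 0.03)    # water content (assumed range)
        dry = rng.uniform(0.002, 0.01)      # dry matter content (assumed range)
        spectra.append(toy_leaf_reflectance(cab, n_struct, water, dry))
        chlorophyll.append(cab)
    return spectra, chlorophyll


X_train, y_train = build_training_set(500)
print(len(X_train), "spectra with", len(X_train[0]), "bands each")
```

Replacing the stand-in with a real PROSPECT implementation would leave the sampling loop unchanged; only the forward-model call would differ.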

D. Random forests

Random forests is a statistical machine learning method introduced by Breiman [20]. It can achieve results comparable to those of boosting algorithms and support vector machines. Random forests has been applied in many remote sensing studies to classify hyperspectral data [21-22], SAR data [23], LiDAR data [24], and multi-source data [25].

A random forest is a classifier consisting of a collection of tree-structured classifiers {h(x, Θ_k), k = 1, …}, where the Θ_k are independent identically distributed random vectors and each tree casts a unit vote for the most popular class at input x.

There are three important parameters in random forests: nodesize is the minimum number of samples in each terminal node; treen is the number of decision trees grown as part of the regression tree ensemble; and trym is the number of predictor variables randomly sampled as candidates at each decision tree node split. trym is calculated as follows:

$$\mathrm{trym} = \left\lfloor p/3 \right\rfloor \qquad (1)$$

where p is the total number of predictor variables.
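As a small worked example, Eq. (1) can be evaluated for p = 126, the feature vector size reported in Section III. The helper name `default_trym` and the lower bound of 1 are assumptions added for this sketch; the paper states only the floor of p/3.

```python
def default_trym(p: int) -> int:
    """Eq. (1): trym = floor(p / 3), kept at 1 or more (the bound is an added guard)."""
    return max(1, p // 3)


p = 126  # feature vector size reported in Section III
print(default_trym(p))  # 126 // 3 = 42 candidate predictors per split
```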

III. RESULTS

The leaf chlorophyll content (LCC) of soybean in the sampling fields is summarized in Table I. The mean chlorophyll content in sampling field 2 is higher than in sampling fields 1 and 3.


TABLE I. STATISTICAL DESCRIPTION OF SOYBEAN AT SAMPLING SITES

Sampling field   Crop      Mean of LCC (µg·cm⁻²)   Standard deviation
1                soybean   47.2398                 14.5461
2                soybean   49.1446                 13.9745
3                soybean   48.7689                 15.2591

B. Effect of number of trees for estimating chlorophyll content

The length of the feature vector was set to 126. The effect of different values of treen in the RF-PROSPECT model is shown in Fig. 1, where the horizontal axis is the number of trees in the random forest and the vertical axis is the coefficient of determination between the estimated and the measured chlorophyll concentration. As Fig. 1 shows, the coefficient of determination stays between 0.8839 and 0.9317 across the tested numbers of trees, which suggests that the RF-PROSPECT model does not overfit when estimating leaf chlorophyll concentration in soybean.

Fig. 1. Effect of number of trees in RF-PROSPECT model for estimating chlorophyll content.


C. Result of estimated chlorophyll content

Fig. 2. Validation of the soybean chlorophyll content estimation model.

Fig. 2 shows the validation of the soybean chlorophyll content estimation model constructed with random forests. The coefficient of determination between the measured and the estimated chlorophyll content is 0.9317, and the model achieved an MSE of 74.2569, which indicates that the soybean chlorophyll content estimation model achieved high accuracy.
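The two validation statistics can be computed as in the sketch below. The paper does not say which definition of the coefficient of determination was used, so both common ones are shown: the squared Pearson correlation and one minus the ratio of residual to total sum of squares. The function names and the five measured/estimated pairs are hypothetical and are not data from this study. For reference, an MSE of 74.2569 corresponds to a root mean square error of about 8.6 in the units of the chlorophyll values.

```python
def mean_squared_error(measured, estimated):
    """Average squared difference between measured and estimated values."""
    return sum((m - e) ** 2 for m, e in zip(measured, estimated)) / len(measured)


def squared_pearson_r(measured, estimated):
    """Squared linear correlation; insensitive to a constant bias or a rescaling."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov ** 2 / (var_m * var_e)


def r2_residual_ratio(measured, estimated):
    """1 - SS_res / SS_tot; penalizes bias and can be negative."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical values: every estimate is 8 units too high.
measured = [40.0, 45.0, 50.0, 55.0, 60.0]
estimated = [m + 8.0 for m in measured]

print(mean_squared_error(measured, estimated))   # 64.0
print(squared_pearson_r(measured, estimated))    # 1.0
print(round(r2_residual_ratio(measured, estimated), 2))  # -0.28
```

The toy case shows that a high squared correlation and a sizeable MSE can coexist when the estimates carry a systematic offset, which is why the two statistics are best read together.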

Fig. 3. Comparison of actual leaf chlorophyll content (LCC) and inverted leaf chlorophyll content (LCC).

The soybean chlorophyll estimation model was applied to the validation data set to compare the measured chlorophyll content (cab) with the retrieved chlorophyll content (cab) (Fig. 3); the X axis is the sampling site and the Y axis is the inverted chlorophyll content. As Fig. 3 shows, the deviation between the retrieved and the measured chlorophyll content is small at the vast majority of sampling points when the estimation model based on random forests is used. Large errors occur at sampling points 2, 10, and 17, however, because the estimation model was inadequately trained on the training data set; such errors can be avoided by training on a larger data set.

IV. CONCLUSIONS

This research established a soybean chlorophyll content estimation model based on random forests and PROSPECT and applied it to retrieve the chlorophyll content of soybean. The estimation model can be used as an effective tool for the rapid retrieval of soybean chlorophyll content and can be adopted in precision agriculture management. Future work will concentrate on scaling up the field estimation model to the satellite remote sensing level, so that soybean health can be monitored over large areas.

ACKNOWLEDGMENT

This research was funded by the Jiangxi Province Key Lab for Digital Land (DLLJ201305), the Open Fund of the Hunan Province Engineering Laboratory of Geospatial Information (2013GSIJJ002), the Open Fund of the Key Laboratory of Agricultural Information Technology, Ministry of Agriculture (2013006), the Xi’an University of Science and Technology research engagement fund (201206), and the PhD Start-up Fund of Xi’an University of Science and Technology (A5030819).

REFERENCES

[1] M.E. Schaepman, S.L. Ustin, and A.J. Plaza, “Earth system science related imaging spectroscopy—An assessment,” Remote Sensing of Environment, vol. 113, pp. S123-S137, 2009.

[2] J. Delegido, L. Alonso, and L.G. Gonzalez, “Estimating chlorophyll content of crops from hyperspectral data using a normalized area over reflectance curve (NAOC),” International Journal of Applied Earth Observation and Geoinformation, vol. 13, pp. 165-174, 2010.

[3] N.H. Broge and J.V. Mortensen, “Deriving green crop area index and canopy chlorophyll density of winter wheat from spectral reflectance data,” Remote Sensing of Environment, vol. 81, pp. 45-57, 2002.

[4] K. Gökkaya, V. Thomas, T. Noland, H. McCaughey, and P. Treitz, “Testing the robustness of predictive models for chlorophyll generated from spaceborne imaging spectroscopy data for a mixedwood boreal forest canopy,” International Journal of Remote Sensing, vol. 35, pp. 218-233, 2014.

[5] C.H. Koger, L.M. Bruce, and D.R. Shaw, “Wavelet analysis of hyperspectral reflectance data for detecting pitted morningglory (Ipomoea lacunosa) in soybean (Glycine max),” Remote Sensing of Environment, vol. 86, pp. 108-119, 2003.

[6] C. Walthall, W. Dulaney, and M. Anderson, “A comparison of empirical and neural network approaches for estimating corn and soybean leaf area index from Landsat ETM+ imagery,” Remote Sensing of Environment, vol. 92, pp. 465-474, 2004.

[7] M. Schlemmer, A. Gitelson, J. Schepers, R. Ferguson, Y. Peng, and J. Shanahan, “Remote estimation of nitrogen and chlorophyll contents in maize at leaf and canopy level,” International Journal of Applied Earth Observation and Geoinformation, vol. 25, pp. 47-54, 2013.

[8] C.S.T. Daughtry, C.K. Walthall, and M.S. Kim, “Estimating corn leaf chlorophyll concentration from leaf and canopy reflectance,” Remote Sensing of Environment, vol. 74, pp. 229-239, 2000.

[9] M.F. Noomen, A.K. Skidmore, and F.D. van der Meer, “Continuum removed band depth analysis for detecting the effects of natural gas, methane and ethane on maize reflectance,” Remote Sensing of Environment, vol. 105, pp. 262-270, 2006.


[10] L.H. Xue and L.Z. Yang, “Deriving leaf chlorophyll content of green-leafy vegetables from hyperspectral reflectance,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 64, pp. 97-106, 2009.

[11] R. Darvishzadeh, A. Skidmore, M. Schlerf, and C. Atzberger, “Inversion of a radiative transfer model for estimating vegetation LAI and chlorophyll in a heterogeneous grassland,” Remote Sensing of Environment, vol. 112, pp. 2592-2604, 2008.

[12] V.C. Laurent, M.E. Schaepman, W. Verhoef, J. Weyermann, and R.O. Chávez, “Bayesian object-based estimation of LAI and chlorophyll from a simulated Sentinel-2 top-of-atmosphere radiance image,” Remote Sensing of Environment, vol. 140, pp. 318-329, 2014.

[13] M. Vohland, S. Mader, and W. Dorigo, “Applying different inversion techniques to retrieve stand variables of summer barley with PROSPECT + SAIL,” International Journal of Applied Earth Observation and Geoinformation, vol. 12, pp. 71-80, 2010.

[14] R. Houborg, H. Soegaard, and E. Boegh, “Combining vegetation index and model inversion methods for the extraction of key vegetation biophysical parameters using Terra and Aqua MODIS reflectance data,” Remote Sensing of Environment, vol. 106, pp. 39-58, 2007.

[15] G. Duveiller, M. Weiss, F. Baret, and P. Defourny, “Retrieving wheat Green Area Index during the growing season from optical time series measurements based on neural network radiative transfer inversion,” Remote Sensing of Environment, vol. 115, pp. 887-896, 2011.

[16] C. Bacour, F. Baret, D. Beal, M. Weiss, and K. Pavageau, “Neural network estimation of LAI, fAPAR, fCover and LAI×Cab, from top of canopy MERIS reflectance data: Principles and validation,” Remote Sensing of Environment, vol. 30, pp. 313-325, 2006.

[17] A. Verger, F. Baret, and F. Camacho, “Optimal modalities for radiative transfer-neural network estimation of canopy biophysical characteristics: Evaluation over an agricultural area with CHRIS/PROBA observations,” Remote Sensing of Environment, vol. 115, pp. 415-426, 2011.

[18] A. Stumpf and N. Kerle, “Combining Random Forests and object-oriented analysis for landslide mapping from very high resolution imagery,” Procedia Environmental Sciences, vol. 3, pp. 123-129, 2011.

[19] X.W. Yu, J. Hyyppä, and M. Vastaranta, “Predicting individual tree attributes from airborne laser point clouds based on the random forests technique,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 66, pp. 28-37, 2011.

[20] L. Breiman, “Random forests,” Machine Learning, vol. 45, pp. 5-32, 2001.

[21] M. Pal, “Random forest classifier for remote sensing classification,” International Journal of Remote Sensing, vol. 26, pp. 217-222, 2005.

[22] J. Ham, Y. Chen, and M. Crawford, “Investigation of the random forest framework for classification of hyperspectral data,” IEEE Transactions on Geoscience and Remote Sensing, vol. 43, pp. 492-501, 2005.

[23] B. Waske and J. Benediktsson, “Fusion of support vector machines for classification of multisensor data,” IEEE Transactions on Geoscience and Remote Sensing, vol. 45, pp. 3858-3866, 2007.

[24] L. Guo, N. Chehata, and C. Mallet, “Relevance of airborne lidar and multispectral image data for urban scene classification using Random Forests,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 66, pp. 56-66, 2011.

[25] P.O. Gislason, J.A. Benediktsson, and J.R. Sveinsson, “Random Forests for land cover classification,” Pattern Recognition Letters, vol. 27, pp. 294-300, 2006.