
Remote Sensing of Environment 101 (2006) 534–553
www.elsevier.com/locate/rse

Exploiting synergies of global land cover products for carbon cycle modeling

Martin Jung a,⁎, Kathrin Henkel a, Martin Herold b, Galina Churkina a

a Max Planck Institute for Biogeochemistry, Jena, Germany
b GOFC-GOLD Land Cover Project Office, Department of Geography, FSU Jena, Germany

Received 16 September 2005; received in revised form 18 January 2006; accepted 19 January 2006

Abstract

Within the past decade, several global land cover data sets derived from satellite observations have become available to the scientific community. They offer valuable information on the current state of the Earth's land surface. However, considerable disagreements among them and classification legends not primarily suited for specific applications such as carbon cycle model parameterizations pose significant challenges and uncertainties in the use of such data sets.

This paper addresses the user community of global land cover products. We first review and compare several global land cover products, i.e. the Global Land Cover Characterization Database (GLCC), Global Land Cover 2000 (GLC2000), and the MODIS land cover product, and highlight individual strengths and weaknesses of mapping approaches. Our overall objective is to present a straightforward method that merges existing products into a desired classification legend. This process follows the idea of convergence of evidence and generates a ‘best-estimate’ data set using fuzzy agreement. We apply our method to develop a new joint 1-km global land cover product (SYNMAP) with improved characteristics for land cover parameterization of the carbon cycle models that reduces land cover uncertainties in carbon budget calculations.

The overall advantage of the SYNMAP legend is that all classes are properly defined in terms of plant functional type mixtures, which can be remotely sensed and include the definitions of leaf type and longevity for each class with a tree component. SYNMAP is currently used for parameterization in a European model intercomparison initiative of three global vegetation models: BIOME-BGC, LPJ, and ORCHIDEE.

Corroboration of SYNMAP against GLCC, GLC2000 and MODIS land cover products reveals improved agreement of SYNMAP with all other land cover products and therefore indicates the successful exploration of synergies between the different products. However, given that we cannot provide extensive validation using reference data we are unable to prove that SYNMAP is actually more accurate. SYNMAP is available on request from Martin Jung.
© 2006 Elsevier Inc. All rights reserved.

Keywords: Land cover; Carbon cycle; Fuzzy logic; Remote sensing; Global

1. Introduction

Assessing and monitoring the state of the Earth surface is a key requirement for global change research. A suite of global land cover maps has been produced by the remote sensing community (Friedl et al., 2002; Hansen et al., 2000; JRC, 2003; Loveland et al., 2000) and is readily available for a variety of applications. The use of such satellite-derived land data sets in modelling studies constitutes a major advance in Earth system science, either for improving spatially explicit model parameterization or for model evaluation. However, intercomparisons of land cover products (Fritz & See, 2005; Giri et al., 2005; Hansen & Reed, 2000; Latifovic & Olthof, 2004) show significant disagreements among them and reveal that the products contain uncertainties. At present, potential users have little guidance on which data set to use and why. Such problems have been recognized and are currently being addressed, especially by initiatives like GOFC-GOLD (Global Observation of Forest and Land Cover Dynamics). Driven by international conventions and implementation activities (GCOS, 2004; GEOSS, 2005), GOFC-GOLD, in conjunction with the Food and Agriculture Organization (FAO) and the Global Terrestrial Observing System (GTOS), has fostered land cover harmonization and strategies for interoperability and synergy between existing and future land mapping products (Herold et al., in press). See and Fritz (in press) proposed to generate an improved hybrid land cover map by fusion of GLC2000 and the MODIS product by taking individual strengths and weaknesses carefully into account. The release of the ENVISAT-based GLOBCOVER data set will further enhance the availability of accurate and precise land cover data sets.

⁎ Corresponding author. Hans-Knöll, Str. 10, 07745 Jena, Germany. E-mail address: [email protected] (M. Jung).

0034-4257/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.rse.2006.01.020

Although the land cover community is working hard to supply more data sets with increasing accuracy, their products are not optimized for direct use in dynamic vegetation and biogeochemical models. Vegetation modellers face a general problem: the classes of a land cover product cannot always be translated into what the models need without introducing uncertainties. Some of the land classes in the classification legends have ambiguous definitions and have to be adjusted before these classes can be parameterized in the models. For example, the essential properties of the land cover classes necessary for vegetation model parameterization include degree of woodiness, leaf type, canopy seasonality, and photosynthetic pathway (C3 or C4). Except for the different photosynthetic pathways of grasses, these properties are usually definable from remotely sensed data. The land cover legends of existing land cover products, however, have classes that are not easily translated into this scheme. Essential information is missing, e.g. the IGBP-DISCover class ‘woody savanna’ states that it has 30–60% trees, while it defines neither which leaf type and phenology are present nor which other land cover class is subdominant. The information about the leaf type and seasonality in particular is crucial for the vegetation model. Since ecophysiological parameters driving the exchange of mass and energy are associated with land cover class in the model, it is vital to minimize uncertainty related to land cover parameterization.

In this paper, we present a synergetic land cover product (SYNMAP) with improved characteristics for land cover parameterization of the terrestrial carbon cycle models. SYNMAP provides a relatively simple solution to both problems: disagreements and unsuitable classification legends of existing data sets.

2. Aims and objectives

The first goal of our study is to emphasize individual advantages and limitations of available land cover products and show that none of them is perfectly suited for carbon cycle model parameterization. A review of the individual land cover mapping approaches highlights their major strengths and shortcomings in Section 3. Next, we compare the different land cover products in Section 4 using agreement maps and indicate considerable disagreement between the data sets that cannot be explained as an artefact of the different legends or acquisition periods. The review and comparison of land cover products show that none of them is much better than another and serve as justification for producing a land cover product that improves the signal-to-noise ratio by exploring synergies of different land cover mapping approaches. Section 5 introduces our method, which produces a best-estimate map with a user-defined legend based on land cover products from the AVHRR, MODIS, and VEGETATION satellite sensors using a fuzzy logic approach. The legend we choose is currently optimized for the biogeochemistry process model BIOME-BGC (Churkina et al., 2003; Running & Hunt, 1993; Thornton, 1998) and adapted to the dynamic vegetation models LPJ (Sitch et al., 2003) and ORCHIDEE (Krinner et al., 2005). Our new data set SYNMAP is evaluated in Section 6 by corroboration with existing land cover products in conjunction with their published validation results. Section 7 discusses remaining limitations of our data fusion method and emphasizes advantages of our derived SYNMAP land cover for carbon cycle modelling. In Section 8, we propose briefly what the land cover community can do to better satisfy users of global land cover products. Section 9 summarizes the main findings of this paper.

3. Overview of existing global land cover data sets

3.1. General principle

Mapping land cover on global scales is a complex challenge and has profited from recent developments in computer science, digital image processing, and satellite technology. So far, high-resolution global land cover data sets are available from three optical satellite sensors: NOAA-AVHRR (Hansen et al., 2000; Loveland et al., 2000), TERRA-MODIS (Friedl et al., 2002), and SPOT-VEGETATION (JRC, 2003). The general approach of global land cover mapping is to produce temporal, usually monthly, composites from daily or weekly mosaics to minimize cloud cover and data noise due to, e.g., atmospheric or viewing angle distortions. Core information originates from multi-temporal spectral reflectance measurements and especially vegetation indices (Normalized Difference Vegetation Index, NDVI; Enhanced Vegetation Index for MODIS, EVI) that capture the cycle of plant productivity throughout the year. Monthly composites are then used in conjunction with ancillary data to produce land cover categories according to a defined classification scheme, either on a regional basis (e.g. continental windows) or for the whole globe. Major differences exist between the above-mentioned mapping efforts that are related to:

(1) Sensor capabilities, i.e. spatial and spectral properties and resolution, repetition rate, and recording of information for data correction and calibration;

(2) Raw data processing, i.e. algorithms for image compositing including cloud detection, directional reflectance calibration, corrections for atmospheric distortions, viewing angle, and geographic position;

(3) Acquisition year(s);

(4) Classification system (land cover legend);

(5) Selection of input data for classification;

(6) Classification procedure;

(7) Validation of the final product.

The next sections will give a very brief overview and evaluation of the development of the different land cover products that are available at the global scale with a spatial resolution of 1×1 km and used in this paper (Table 1). Please consult the references for detailed information and the review of Cihlar (2000) for general issues of large-scale land cover mapping. Validation issues are separately considered in Section 3.6.

Table 1
Global land cover products with 1-km spatial resolution used in this study

Product   Version   Satellite and sensor   Acquisition period            Download
GLCC      2.0       NOAA-AVHRR             April 1992–March 1993         http://edcdaac.usgs.gov/glcc/glcc.asp
GLC2000   1.0       SPOT-VGT               November 1999–December 2000   http://www-gvm.jrc.it/glc2000/
MODIS     V003      TERRA-MODIS            January 2001–December 2001    http://duckwater.bu.edu/lc/mod12q1.html

3.2. NOAA-AVHRR (GLCC)

The development of the Global Land Cover Characterization Data Base (GLCC) pioneered global land cover mapping, motivated by the International Geosphere-Biosphere Program (IGBP) in 1992. Global 10-day 1-km resolution AVHRR composites for the period April 1992–March 1993 were recomposited to monthly NDVI data sets (Loveland et al., 2000). Due to the navigation properties of the satellite, the geometric accuracy is only on the order of 3 km. Masks for non-vegetated areas (barren, snow, and ice) were produced using thresholds for the maximum NDVI values; water and urban classes were not mapped from the satellite data but overlaid from the Digital Chart of the World (DCW, Danko, 1992). Unsupervised clustering of the multitemporal NDVI data was used to separate vegetated areas into 961 land cover regions globally, reflecting properties of similar plant productivity and phenological behaviour. All 961 clusters were assigned manually to one of the 94 classes of the Olson Global Ecosystems legend by local experts using a suite of ancillary data such as land use, elevation, and ecoregion maps or high-resolution satellite images. Where necessary, individual clusters were split by overlay analysis with additional data, on-screen digitizing, or spectral reclustering. To ensure objectivity, several interpreters worked within the same area. The final map in the Olson Global Ecosystems legend was reclassified into six additional classification schemes, including IGBP-DISCover and Anderson USGS Land Use/Land Cover, to meet individual needs of intended applications.
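The recompositing and masking steps described above can be illustrated with a minimal sketch. It assumes maximum-value compositing, a common choice for NDVI that is not confirmed here as the GLCC procedure, and an arbitrary threshold; the function names, the array layout, and the value 0.1 are hypothetical.

```python
import numpy as np

def monthly_max_composite(dekads):
    """Collapse the 10-day NDVI composites of one month, shaped
    (n_composites, rows, cols), into one monthly layer by keeping the
    highest value per pixel. NaN marks missing or cloudy observations.
    Maximum-value compositing is an assumption of this sketch."""
    return np.nanmax(dekads, axis=0)

def non_vegetated_mask(monthly_ndvi, threshold=0.1):
    """Flag pixels whose annual maximum NDVI stays below `threshold`.
    monthly_ndvi is shaped (n_months, rows, cols). The default of 0.1
    is illustrative only, not the threshold used for GLCC."""
    annual_max = np.nanmax(monthly_ndvi, axis=0)
    return annual_max < threshold
```

Pixels flagged by such a mask would be excluded from the clustering of vegetated areas; which non-vegetated class they receive would require further rules not covered here.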

3.3. SPOT-VEGETATION (GLC2000)

The Global Land Cover 2000 (GLC2000) project is a European initiative, coordinated by the Joint Research Centre (JRC) in Ispra, Italy. The project strategy was to define 19 spatial windows for the globe, with a separate group of regional experts responsible for mapping each window. The individual groups were free in how they produced their product, except that the VEGA2000 data set had to be used and the classification scheme had to follow the Land Cover Classification System (LCCS) developed by the FAO (Bartholomé & Belward, 2005; Fritz et al., 2003). LCCS is a hierarchical classification structure (Di Gregorio & Jansen, 2000) that allows straightforward and flexible class definitions and aggregation from the regional to the global legend with 22 classes. The VEGA2000 data set consists of daily 1-km SPOT4-VEGETATION data (blue, red, near infrared, and short wave infrared bands, NDVI) from November 1999 to December 2000. They have been radiometrically, atmospherically, and geometrically corrected, while no standardization of bidirectional reflectance had been applied. Because of a lack of extensive training data, unsupervised clustering in conjunction with various ancillary data was widely used for classification purposes (see e.g. Bartalev et al., 2003; Eva et al., 2004; Mayaux et al., 2004). Regional land cover mapping strategies and the mosaicking to the global product are described in Fritz et al. (2003). Detailed information is available at the GLC2000 web page (http://www-gvm.jrc.it/glc2000/publications.htm).

3.4. TERRA-MODIS

The MODIS land cover product is based on monthly composites from MODIS Levels 2 and 3 data between January and December 2001, and includes EVI and spectral bands 1–7; spatial texture, land surface temperature, snow cover, and elevation will be used additionally in upcoming versions (Friedl et al., 2002; Strahler et al., 1999). The categorization algorithm (MLCCA, MODIS land cover classification algorithm) is based on supervised artificial neural network classification in conjunction with decision tree classifiers. A global network of training sites (STEP database, Muchoney et al., 1999) is used to train 10 decision trees until maximal accuracies are attained. All 10 decision trees are then used as experts to vote on classes based on the input data, so that the probability that a pixel belongs to each class can be estimated. In addition to the voting process, prior knowledge is included in the classification procedure that specifies how likely a class is to appear at a geographic location, based on ancillary land cover maps, statistics of the class distribution of training sites, and the overall global distribution of classes. Class label assignment combines classification and prior probabilities into posterior probabilities, and the class with the highest posterior probability wins. Prior knowledge becomes decisive only if the spectral signature is ambiguous (Friedl et al., 2002). The MODIS land cover product is available with different legends, including IGBP-DISCover, and is intended to be updated annually.
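The interplay of votes and prior knowledge can be sketched as follows. This is an illustration under simple assumptions, not the MLCCA implementation: the vote share of each class is taken as its classification probability and multiplied by a location-specific prior; the function name and all numbers are hypothetical.

```python
import numpy as np

def label_pixel(tree_votes, prior, n_classes):
    """tree_votes: class index chosen by each decision tree for one pixel.
    prior: assumed prior probability of each class at the pixel location.
    Returns the winning class index and the posterior probabilities."""
    votes = np.bincount(tree_votes, minlength=n_classes)
    likelihood = votes / votes.sum()      # vote share used as class probability
    posterior = likelihood * np.asarray(prior, dtype=float)
    total = posterior.sum()
    if total == 0:                        # votes and prior are incompatible
        posterior = likelihood
    else:
        posterior = posterior / total
    return int(np.argmax(posterior)), posterior

prior = [0.2, 0.7, 0.1]
# Ambiguous signature: the trees split 5:5, so the prior decides (class 1).
ambiguous_label, _ = label_pixel([0] * 5 + [1] * 5, prior, 3)
# Clear signature: 9 of 10 trees vote for class 0, which wins despite its lower prior.
clear_label, _ = label_pixel([0] * 9 + [1], prior, 3)
```

The two calls mirror the statement that prior knowledge becomes decisive only when the spectral evidence is ambiguous.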

3.5. Advantages and shortcomings of land cover mapping approaches

Individual advantages and limitations of the different approaches relate to the points listed in Section 3.1 and are briefly evaluated here based on the consultation of the literature. In terms of the quality and amount of satellite data used, we recognize a progression from GLCC to GLC2000 and the MODIS land cover product. GLCC is based on poorly corrected or uncorrected raw data, using only monthly NDVI composites that also have some geometric problems. The VEGA2000 data set of GLC2000, with daily composites of calibrated spectral bands and NDVI, offers significantly improved data and more flexibility for classification. A further advantage of VEGA2000 is its effective geometric correction procedures (Bartholomé & Belward, 2005). The input data sets of the MODIS product surpass those of GLCC and GLC2000 in terms of the spectral properties of the MODIS instrument, which was specifically designed for land surface mapping. Also, the MODIS data are based on a higher spatial resolution of the raw data (250 m/500 m) and on comprehensive strategies of data correction and calibration using additional data collected by the instrument, and they include more spectral bands and additional information (Strahler et al., 1999).

Regarding the applied classification methods, MLCCA clearly seems the most sophisticated algorithm. In contrast to the methods used for GLCC and GLC2000, it is purely objective, reproducible, and operational for the whole globe, and thus seems most suitable for change detection. The main limitation of MLCCA is its sensitivity to the training data. Friedl et al. (2002) note: “classification results produced from MODIS data are heavily dependent on the integrity and representation of global land cover in the site data, and substantial ongoing efforts are devoted to maintaining and augmenting the STEP database” (p. 300). However, given the large variances and sometimes ambiguous signatures of land cover classes in a global context, manual cluster assignment and manipulation by experts, as done for GLCC and GLC2000, may produce a better map product. The bottom-up approach of GLC2000, i.e. the individual choice of data pre-processing, usage of ancillary data (e.g. other satellite data), classification method, and regional classification legend by project participants, allows accounting for region-specific characteristics and landscape complexity at the expense of internal consistency of the final global product. Therefore, the quality of the regional products of GLC2000 also varies according to the quality of the regional experts, as does the quality of the zones where the different regional products were merged. However, for regional studies, the individual tiles of GLC2000 seem to offer the most elaborate representation of land coverage.

In relation to the classification legends and classification systems of land cover products, two aspects are important: availability of the products with different legends to meet the needs of diverse applications, and consistency of the classification system itself. GLCC offers the most flexibility for users in terms of available reclassifications, including the Olson classification with 94 classes. MODIS is also available in different legends, which is not the case for GLC2000. LCCS of GLC2000 is the most advanced and flexible classification system, with a clear rationale and standardized definition of the classes. However, none of the legends of the three global land cover products is easily translated into the land cover classes of vegetation models without introducing uncertainty due to poor definition of mixed classes or a lack of information about leaf type and phenology. For example, the BIOME-BGC model distinguishes between seven vegetation classes: evergreen needleleaf trees, evergreen broadleaf trees, deciduous needleleaf trees, deciduous broadleaf trees, shrubs, C3 grasses, and C4 grasses. In terms of carbon cycle modelling, an accurate representation of tree coverage and its leaf characteristics is required, given that the trees largely determine the carbon budget of an area. In this respect, it is not sensible to regard a pixel as forest if it is covered by only 15% trees, as defined in LCCS.

In conclusion, the main advantage of GLCC is its availability with many different legends and high thematic resolution, while the relatively poor input data used for classification constitute its major shortcoming. Although GLC2000 benefits strongly from the use of LCCS and its regional bottom-up approach, its global map lacks some internal consistency because of the individual mapping initiatives by different project participants. The advantage of the MODIS product is its basis in a large amount of high-quality earth observation data and an advanced and operational classification algorithm, but, in contrast to the regionally tuned GLC2000 approach, it relies heavily on the quality of a comprehensive global training database. However, while it is worth evaluating mapping approaches, a map product should be judged according to how good it actually is. Therefore, the next section deals with validation efforts of land cover products.

3.6. Validation of global land cover products

Determining the accuracy of land cover maps is essential but poses a challenge at global scales. Four approaches are used to quantitatively estimate the accuracy of land cover classifications: confidence values of the classifier, comparison with other maps, cross-validation with training data sets, and statistically robust spatial sampling and acquisition of ground reference information. The latter is regarded as most reliable and is discussed briefly below. A thorough review of accuracy issues of land cover maps is available in Foody (2002), who emphasizes that: “Despite the apparent objectivity of quantitative metrics of accuracy, it is important that accuracy statements be interpreted with care” (p. 196).

Table 2
Translation between the SIMPLE legend and the IGBP-DISCover (for GLCC and MODIS) and LCCS (GLC2000) legends

SIMPLE                            IGBP-DISCover                        LCCS
Trees                             Evergreen needleleaf forest          Tree cover, broadleaved, evergreen
                                  Evergreen broadleaf forest           Tree cover, broadleaved, deciduous, closed
                                  Deciduous needleleaf forest          Tree cover, broadleaved, deciduous, open
                                  Deciduous broadleaf forest           Tree cover, needleleaved, evergreen
                                  Mixed forests                        Tree cover, needleleaved, deciduous
                                  Woody savannas                       Tree cover, mixed leaf type
                                  Savannas                             Mosaic: tree cover/other natural vegetation
                                                                       Tree cover, burnt
Shrubs                            Closed shrublands                    Shrub cover, closed-open, evergreen
                                  Open shrublands                      Shrub cover, closed-open, deciduous
Grasses                           Grasslands                           Herbaceous cover, closed-open
Wetlands                          Permanent wetlands                   Tree cover, regularly flooded, fresh water
                                                                       Tree cover, regularly flooded, saline water
                                                                       Regularly flooded shrub and/or herbaceous cover
Barren                            Barren or sparsely vegetated         Sparse herbaceous or sparse shrub cover
                                                                       Bare areas
Snow                              Snow and ice                         Snow and ice
Crops                             Croplands                            Cultivated and managed areas
Crops/natural vegetation mosaic   Cropland/natural vegetation mosaic   Mosaic: cropland/tree cover/other natural vegetation
                                                                       Mosaic: cropland/shrub or grass cover
Urban                             Urban and built-up                   Artificial surfaces and associated areas

Validation reference data for global data sets usually originate from the interpretation of high-resolution satellite images (e.g. Landsat and SPOT). The common approach is to calculate error (or contingency) matrices between reference and map data. Three measures are commonly used to describe the map- and class-specific accuracies: overall, user's, and producer's accuracy. Overall accuracy is simply the percentage of correctly classified pixels, commonly calculated as area-weighted estimates for the different classes. Class-specific accuracies can be reported from two points of view, either from the map or from the validation side. The producer's accuracy of a class is defined as the percentage of validation sites classified correctly, while the user's accuracy of a class relates to the percentage of map area classified correctly. Each neglects one type of error: the producer's accuracy ignores commission errors (pixels misclassified as that class), while the user's accuracy ignores omission errors (pixels that should belong to that class but are classified as another). A problem here is that the area of the individual land cover class affects the class-specific accuracies (MODIS land cover team, 2003). The classification scheme and class distance (which depends on the application) should be considered but usually are not (Mayaux et al., in press). For the purpose of this study, confusion between ‘trees’ and ‘barren’ is more crucial than between ‘woody savanna’ and ‘savanna’, but it counts the same in classical confusion statistics. Accuracy statements are therefore sensitive to the level of class aggregation, and maps with a high thematic resolution are more likely to appear less accurate according to confusion calculations.
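The three accuracy measures can be computed from an error matrix as in the following sketch. It assumes that rows hold the mapped classes and columns the reference classes, and it uses plain sample counts without the area weighting mentioned above; the function name and the example counts are hypothetical.

```python
import numpy as np

def accuracy_measures(error_matrix):
    """error_matrix[i, j]: number of samples mapped as class i (rows)
    whose reference class is j (columns). Assumes every class occurs
    at least once in both the map and the reference data."""
    m = np.asarray(error_matrix, dtype=float)
    correct = np.diag(m)
    overall = correct.sum() / m.sum()
    users = correct / m.sum(axis=1)       # correct share of each mapped class
    producers = correct / m.sum(axis=0)   # correctly mapped share of each reference class
    return overall, users, producers

# Two-class example: 40 + 45 of 100 samples lie on the diagonal.
overall, users, producers = accuracy_measures([[40, 10],
                                               [5, 45]])
```

In this example the overall accuracy is 0.85, the user's accuracies are 0.80 and 0.90, and the producer's accuracies are about 0.89 and 0.82, which shows that the two class-specific views can differ for the same class.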

The validation initiatives for the GLCC (Scepan, 1999), GLC2000 (Mayaux et al., in press), and MODIS (MODIS land cover team, 2003) land cover products have reported overall area-weighted accuracies of 67%, 69%, and 71%, respectively. However, since different databases and approaches were used, it must be emphasized that the reported accuracy measures are not comparable and should not be regarded as truly robust quantitative estimates. While the GLCC and GLC2000 validations used design-based sampling schemes, the MODIS product accuracy assessment is based on cross-validation, i.e. using several subsets of the training data (which have not been used for the training process) as reference information. It is therefore not possible to judge which product is better than another in overall or class-specific performance.

Given the daunting and expensive task of global land cover product validation, the working group on calibration and validation of CEOS (Committee on Earth Observation Satellites) has recently prepared a ‘best practice document’ to provide thorough validation standards for global land cover data sets (CEOS, in press). In this respect, any consistent and operational validation algorithm has to consider the standardized acquisition of reference information to allow for comparative assessment of the validity, strengths, and weaknesses of individual data sets (Herold et al., in press).

In the next section, we show to what extent GLCC, GLC2000, and MODIS agree or disagree with each other by presenting agreement maps. We will show that there is significant disagreement between the products, which is related to their different land cover mapping procedures rather than to different legends or periods of satellite data acquisition.

4. (Dis)Agreements of GLCC, GLC2000, and MODIS land cover products

4.1. Methods

To facilitate overlaying of the different maps, the data sets need to be co-registered and homogenized (“cross-walked”) to a common legend. As base projection, we chose the simple geographic (latitude/longitude, Plate Carrée) projection with a spatial resolution of 30″×30″ (0.008333°) since all data sets were available in this projection. GLC2000 has a slightly deviating spatial resolution of 0.008929° and therefore had to be resampled to 0.008333° using nearest neighbour. We checked the georeference information of the products and found it to be precise; additional co-registration would not improve the spatial match. We clipped all maps to a subset with 43,200 columns and 17,500 rows to exclude Antarctica, which is not covered by GLC2000. Resampling and subsetting of the data were done in ENVI 4.0; the remaining image processing and data modelling outlined below were coded in IDL 6.0.
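As an illustration of the resampling step, the following minimal sketch shows one way nearest-neighbour resampling of a categorical grid could be written. It is not the ENVI routine used for the study: the function name `nearest_neighbour_resample`, the list-of-lists grid, and the toy 14×14 input are assumptions made for this example, and a real 1-km global grid would call for an array library.

```python
def nearest_neighbour_resample(grid, src_res, dst_res):
    """Resample a 2-D list of class codes from src_res to dst_res (degrees per pixel)."""
    rows, cols = len(grid), len(grid[0])
    # Number of target pixels needed to cover the same extent.
    out_rows = round(rows * src_res / dst_res)
    out_cols = round(cols * src_res / dst_res)

    def source_index(k, n):
        # Centre of target pixel k expressed in source pixel units, clamped to the grid.
        return min(int((k + 0.5) * dst_res / src_res), n - 1)

    return [[grid[source_index(r, rows)][source_index(c, cols)]
             for c in range(out_cols)]
            for r in range(out_rows)]


# Toy example: a 14x14 grid at the GLC2000 resolution becomes 15x15 at 30 arc seconds.
coarse = [[r * 14 + c for c in range(14)] for r in range(14)]
fine = nearest_neighbour_resample(coarse, 0.008929, 0.008333)
```

Nearest neighbour is the natural choice for class codes because it never creates values that are absent from the source grid, unlike interpolating methods.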

We have defined a simplified legend with nine major classes that accommodate all land cover categories on an aggregated level according to the occurrence of major life forms (SIMPLE legend). A legend translation table for the original legends is given in Table 2. Lumping classes, such as combining all forest and savanna types, is necessary to account for the diverse classification schemes since, e.g., GLC2000 defines forest with >15% tree cover, while IGBP-DISCover distinguishes between savannas (10–30%), woody savannas (30–60%), and forest (>60%). Equally, LCCS splits shrublands (>15% shrub coverage) according to leaf longevity into deciduous and evergreen; in contrast, IGBP-DISCover separates open (10–60% coverage) from closed (>60%) shrubland. This lumping of classes increases the agreement between the data sets at the expense of thematic precision.

The reclassified data sets of GLCC, GLC2000, and MODIS are then overlaid to produce a map of agreement, revealing where all three, two or none of the maps show equal representation of the land surface. To assess how much of the discrepancy between the maps may be related to land cover change during the acquisition periods of 1992–1993 and 2000/2001, the case that MODIS agrees with GLC2000 but both disagree with GLCC is treated separately and named ‘potential land cover change’ with a strong emphasis on ‘potential’. This exercise aims to identify whether land cover change is an important factor in explaining discrepancy between maps. To get a further indication of whether land cover change is responsible for the discrepancy between the land cover products, we produce an agreement map of the GLCC-IGBP and MODIS-IGBP data sets. This is independent of potential artefacts of the reclassification procedure, including its artificial increase of accuracies. The ‘potential land cover change’ case from the comparison on the SIMPLE legend basis is used as a mask and overlaid, which allows an approximation of the impact of varying acquisition periods on the discrepancy between GLCC-IGBP and MODIS-IGBP.
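The agreement cases described above can be sketched per pixel as follows. This is an illustrative reading of the procedure, not the IDL code used for the study; the function names, the case labels, and the four-pixel example maps are assumptions.

```python
def agreement_case(glcc, glc2000, modis):
    """Classify one pixel by how many of the three SIMPLE-legend maps agree."""
    if glcc == glc2000 == modis:
        return "all agree"
    if glc2000 == modis:
        # The two 2000/2001 maps agree with each other but not with the 1992-1993 map.
        return "potential land cover change"
    if glcc == glc2000 or glcc == modis:
        return "two agree"
    return "none agree"


def agreement_shares(glcc_map, glc2000_map, modis_map):
    """Return the percentage of pixels falling into each agreement case."""
    counts = {}
    for triple in zip(glcc_map, glc2000_map, modis_map):
        case = agreement_case(*triple)
        counts[case] = counts.get(case, 0) + 1
    total = sum(counts.values())
    return {case: 100.0 * n / total for case, n in counts.items()}


# Hypothetical flattened maps with four pixels each.
glcc_px = ["trees", "shrubs", "shrubs", "barren"]
glc2000_px = ["trees", "grasses", "shrubs", "grasses"]
modis_px = ["trees", "grasses", "grasses", "shrubs"]
shares = agreement_shares(glcc_px, glc2000_px, modis_px)
```

As in Fig. 1, such shares are pixel counts on the latitude/longitude grid and are not area estimates.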

Fig. 1. Maps of agreement and disagreement between land cover products. (a) GLCC, GLC2000, and MODIS converted to SIMPLE legend. (b) GLCC and MODIS on IGBP-DISCover legend. The pie charts and numbers therein give percentages of the individual cases. Please note that these numbers are not area estimates since the analysis is based on Plate Carrée projection.

4.2. Results

The agreement of the GLCC, GLC2000, and MODIS land cover maps, reclassified to the SIMPLE legend, is presented in Fig. 1a. All three maps agree for 41% of pixels, mainly in areas with extensive tree coverage (e.g. tropical and boreal forest zones), barren land (e.g. Sahara and Gobi desert), cropland (e.g. central Europe, India), and snow/ice coverage (Greenland). A further 45% is related to the agreement of only two maps, while the contribution of the ‘potential land cover change’ case to that number is 12%. The remaining 14% are pixels where all three land cover maps disagree. Areas where all maps disagree or only two maps agree seem to be associated mainly with transitional ecozones with mixtures of the three main components trees, shrubs and grasses, such as tropical savannas including the Sahel, Mediterranean Europe, and tundra.


The direct comparison of the GLCC and MODIS products on the IGBP-DISCover legend in Fig. 1b reveals that both maps agree for only 51% of land pixels. Their disagreement cannot be explained by land cover change between the acquisition periods 1993 and 2001 since the ‘potential land cover change’ case from Fig. 1a overlaid on the IGBP-DISCover agreement map contributes only 12%.

4.3. Discussion

While large-scale homogeneous areas seem to be represented reliably in all land cover products, large discrepancies are apparent in heterogeneous landscapes. There are several reasons for that. Firstly, mapping a continuum of, e.g., trees, shrubs, and grasses into discrete categories is problematic (Foody, 2002). For instance, if one map shows grassland and another shrubland, they disagree although both may be right. This ‘mixed unit’ problem seems a major challenge for all coarse-scale land cover mapping efforts because the heterogeneity of the landscape structure is more detailed than the resolution of the satellite sensor (Smith et al., 2002). When several land cover types are present within a pixel, the signature becomes ambiguous and the classification very sensitive to the method. Sensors with fine spatial resolution (e.g. IKONOS or even LANDSAT) are capable of resolving the landscape structure, but are still less effective for mapping large land areas. It is hoped that consistent global land cover information from Landsat-type data may be developed and made available. For map intercomparison, geographic misregistration also becomes crucial in heterogeneous terrain (Foody, 2002). Another general problem is the low separability between certain classes such as grass- and shrublands, whose statistical signatures overlap in the multidimensional space; if both land cover types are present, separation becomes even more challenging.

From this simple comparison of land cover products, we draw two conclusions. First, there is significant discrepancy between them that cannot be explained by different classification schemes or acquisition dates. We therefore disagree with Giri et al. (2005), who relate the main disagreements between GLC2000 and MODIS land cover products to different classification schemes. Land cover change between 1993 and 2000 cannot explain the discrepancies between GLCC and GLC2000 or MODIS. There is common agreement within the land cover mapping community that the accuracies of the individual land cover products are insufficient to detect land cover change reliably. The deviation of the three products seems to be caused largely by the methodology and input data. Second, a land cover map derived by blending the original products would be a significant improvement, given that much information and confidence is gained in areas where at least two maps agree. Land cover change between the acquisition periods is not an issue as demonstrated, especially because two thirds of the land cover information would originate from 2000/2001, with GLCC from 1993 as an additional contributor of important information.

5. Land cover data fusion: exploring synergies between land cover products

5.1. General principle

Since the aim is to combine several land cover classifications into a best-estimate land cover map for a new user-defined legend, a flexible method is needed that is capable of handling differing classification schemes and their fuzziness. For each product, land cover has been classified into a limited number of classes, while the boundaries between thematically adjacent classes are, to some extent, arbitrarily drawn and a question of definition and accuracy, which vary between land cover products. Also, the separability of adjacent or mixed classes is very limited due to overlapping signatures, so that class assignment becomes very sensitive to the classification algorithm. The basic idea of fuzzy logic seems a welcome rationale here because it blurs the boundaries of land cover classes (Ahlqvist et al., 2003). In principle, our method consists of two steps: first, the definition of the desired classification legend and, second, the linking of the defined legend classes with the legend classes of the original maps by assigning affinity scores between them. This provides a score for each pixel and defined class, and the class with the highest score wins; in principle, this can be understood as a voting process among the input data sets. The next sections describe how this principle has been implemented to produce a land cover map with an optimized legend for terrestrial carbon cycle modelling: SYNMAP.

5.2. Definition of the target legend

There are two important requirements for the definition of the target legend. First, it is dependent on the application of the land cover map, in our case land cover parameterization of the global vegetation model, where it is important to have leaf type and longevity of trees specified. Second, the information for all desired classes must be available from the input data sets.

We have chosen the target legend to consist of the vegetation types used by BIOME-BGC, because it is both directly translatable into the ecophysiological parameters crucial for the carbon cycle and can be easily classified using remotely sensed data (Running et al., 1995). To account for the co-occurrence of vegetation types, we define a class as a single life form or a combination of at most two vegetation life forms (16 categories). For each land cover class that has a tree component, leaf type and longevity are to be specified. This results in 48 classes, 36 of which are associated with tree coverage (see Table 3). We assume that the indicated SYNMAP class covers more than 50% of the pixel; in the case of mixed classes the indicated class combination is maximal relative to all other class possibilities. The legend of SYNMAP is, in contrast to existing products, flexible and ideal for upscaling to a coarser grid cell size with fractional estimates of the vegetation types used by BIOME-BGC as demonstrated in Section 7.

5.3. Selection and pre-processing of input data sets

To allow for the most suitable land cover characterization, it is advantageous to use different classification schemes to perform cross-mapping of classes. We chose five different data sets: the USGS and IGBP legends for GLCC and the PFT and IGBP legends for the MODIS product; GLC2000 is only available with the LCCS legend but enters the calculation twice. We decided not to use the land cover product from the University of Maryland (UMD) based on AVHRR 1992–1993 data (Hansen et al., 2000) because we wanted to keep the majority of information from 2000/2001, and preliminary visual inspection and statistical comparison with other land cover products suggested that only little additional information can be gained from this data set. Furthermore, the UMD classification has not been validated and is less widely used in the research community.

Table 3. SYNMAP legend defined by dominant life form assemblage and tree leaf attributes

| Life form | Tree leaf type | Tree leaf longevity | SIMPLE |
|---|---|---|---|
| Trees | Needle | Evergreen | Trees |
| Trees | Needle | Deciduous | |
| Trees | Needle | Mixed | |
| Trees | Broad | Evergreen | |
| Trees | Broad | Deciduous | |
| Trees | Broad | Mixed | |
| Trees | Mixed | Evergreen | |
| Trees | Mixed | Deciduous | |
| Trees | Mixed | Mixed | |
| Trees and shrubs | Needle | Evergreen | |
| Trees and shrubs | Needle | Deciduous | |
| Trees and shrubs | Needle | Mixed | |
| Trees and shrubs | Broad | Evergreen | |
| Trees and shrubs | Broad | Deciduous | |
| Trees and shrubs | Broad | Mixed | |
| Trees and shrubs | Mixed | Evergreen | |
| Trees and shrubs | Mixed | Deciduous | |
| Trees and shrubs | Mixed | Mixed | |
| Trees and grasses | Needle | Evergreen | |
| Trees and grasses | Needle | Deciduous | |
| Trees and grasses | Needle | Mixed | |
| Trees and grasses | Broad | Evergreen | |
| Trees and grasses | Broad | Deciduous | |
| Trees and grasses | Broad | Mixed | |
| Trees and grasses | Mixed | Evergreen | |
| Trees and grasses | Mixed | Deciduous | |
| Trees and grasses | Mixed | Mixed | |
| Trees and crops | Needle | Evergreen | Crops/natural vegetation mosaic |
| Trees and crops | Needle | Deciduous | |
| Trees and crops | Needle | Mixed | |
| Trees and crops | Broad | Evergreen | |
| Trees and crops | Broad | Deciduous | |
| Trees and crops | Broad | Mixed | |
| Trees and crops | Mixed | Evergreen | |
| Trees and crops | Mixed | Deciduous | |
| Trees and crops | Mixed | Mixed | |
| Shrubs and crops | – | – | |
| Grasses and crops | – | – | |
| Crops | – | – | Crops |
| Shrubs | – | – | Shrubs |
| Shrubs and grasses | – | – | |
| Shrubs and barren | – | – | |
| Grasses | – | – | Grasses |
| Grasses and barren | – | – | |
| Barren | – | – | Barren |
| Urban and built-up | – | – | Urban |
| Permanent snow and ice | – | – | Snow |

The last column gives a translation to the SIMPLE legend.

IGBP-DISCover and LCCS have 17 and 23 classes, respectively. The USGS classification scheme has 24 classes with several ‘mixed’ classes, while the PFT legend consists of only 11 classes, defined clearly by the dominant life form and leaf attributes for forest classes, which contributes important information in cases where several desired ‘mixed classes’ are possible. For example, if the GLCC and MODIS maps on the IGBP legend indicate a ‘savanna’ type land cover, the PFT map may specify whether it is a ‘trees and shrubs’ or ‘trees and grasses’ mosaic by indicating ‘shrubland’ or ‘grassland’, respectively.

In addition to the land cover products, we use AVHRR-CFTC (Continuous Fields of Tree Cover) data sets to contribute information on leaf type and phenology for tree classes. The CFTC products give fractional information on leaf type and leaf longevity for each pixel with tree coverage, with two layers for each attribute: needleleaf/broadleaf and evergreen/deciduous, where each pair sums to percentage tree cover. They have been derived from monthly AVHRR NDVI composites of the 1992–1993 period (see Section 3) using spectral unmixing (DeFries et al., 1999). Using CFTC data in addition to the land cover classification products has the advantage that information on leaf type and longevity can be estimated for areas with tree coverage that are below the forest threshold (‘savanna’ or ‘woodland’ classes) and hence lack important information on leaf characteristics of the trees present. Therefore, the AVHRR CFTC data were converted into a leaf type and a leaf longevity map by dividing the data into three discrete classes, i.e. needleleaf, broadleaf, mixed and evergreen, deciduous, mixed, respectively. The data were rescaled to 100% so that, e.g., percentage of needleleaf plus percentage of broadleaf equals 100% and not percentage of tree cover as in the original data layers. The three classes are equally spaced; hence, mixed is assigned if neither needleleaf nor broadleaf (or neither evergreen nor deciduous) exceeds 66%.
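The conversion of the continuous fields into three discrete classes can be illustrated with a small sketch. The function name `leaf_class`, its arguments, and the handling of pixels without tree cover (returning `None`) are assumptions for this example; only the rescaling to 100% and the 66% threshold come from the text.

```python
def leaf_class(first_pct, second_pct, first_name, second_name, threshold=66.0):
    """Turn a pair of CFTC fractions (e.g. needleleaf, broadleaf) into a discrete class.

    The two fractions sum to percentage tree cover, so they are first rescaled
    to sum to 100%. 'mixed' is returned if neither share exceeds the threshold.
    """
    tree_cover = first_pct + second_pct
    if tree_cover <= 0:
        return None  # assumption: no tree cover means no leaf information
    first_share = 100.0 * first_pct / tree_cover
    second_share = 100.0 - first_share
    if first_share > threshold:
        return first_name
    if second_share > threshold:
        return second_name
    return "mixed"


# 30% needleleaf and 10% broadleaf cover rescale to 75% and 25%.
leaf_type = leaf_class(30.0, 10.0, "needleleaf", "broadleaf")
# 12% evergreen and 8% deciduous cover rescale to 60% and 40%.
longevity = leaf_class(12.0, 8.0, "evergreen", "deciduous")
```

The same function serves both attributes because leaf type and leaf longevity are discretized with identical rules.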

For overlaying, all data sets were prepared at 30″ spatial resolution in Plate Carrée projection as outlined in Section 3, and the final product (SYNMAP) likewise has a resolution of 30″.

5.4. Definition of affinity scores

Affinity scores link our defined legend classes with the legend classes of the original products and therefore approximate the thematic distance of the classes. Affinity scores are defined for life form, leaf type, and leaf longevity separately. Each class of each original land cover data set is assigned to one or more ‘target’ classes with a score between zero and four according to semantic rules (Table 4). A land cover class X is assigned a zero for target class Y if both are independent of each other, such as ‘barren’ and ‘trees’. It contributes two points if land cover class X is a component of class Y, such as ‘grassland’ and ‘shrubs and grasses’, or ‘evergreen needleleaf forest’ and ‘mixed leaf type’. Four points are given if class X matches class Y, for instance ‘evergreen needleleaf forest’ and ‘trees’ or ‘open shrubland’ and ‘shrubs and grasses’. One or three points are given in some cases when flexibility is needed and indicate minor or major components of class X in class Y. The definition of scores requires knowledge of the original classification schemes and is to some degree subjective. All tables with affinity scores are presented in Appendix A.

Table 4. Definition of affinity scores according to semantic rules

| Land cover class example | Semantic rule | Affinity score | Target class example |
|---|---|---|---|
| Woody savanna | ‘Is not’ | 0 | ‘Barren’ |
| | ‘Has minor parts of’ | 1 | ‘Grasses’ |
| | ‘Has parts of’ | 2 | ‘Trees’ |
| | ‘Has major parts of’ | 3 | ‘Trees and grasses’ |
| | ‘Is’ | 4 | ‘Trees and shrubs’ |

The example uses the IGBP-DISCover class ‘woody savanna’.

Table 5. Calculation example for the best estimate of life form assemblage along the pixel vector of the land cover data sets

| Data set | Original land cover class | Trees and shrubs | Trees and grasses | Shrubs | Grasses | Shrubs and grasses |
|---|---|---|---|---|---|---|
| GLCC-IGBP | ‘Open shrubland’ | 1 | 0 | 2 | 2 | 4 |
| GLCC-USGS | ‘Mixed shrubland/grassland’ | 1 | 1 | 2 | 2 | 4 |
| MODIS-IGBP | ‘Woody savanna’ | 4 | 3 | 1 | 1 | 1 |
| MODIS-PFT | ‘Shrub’ | 2 | 0 | 4 | 0 | 2 |
| GLC2000 | ‘Herbaceous cover’ | 0 | 2 | 0 | 4 | 2 |
| GLC2000 | ‘Herbaceous cover’ | 0 | 2 | 0 | 4 | 2 |
| Total score | | 8 | 8 | 9 | 13 | 15 |

GLC2000 is taken twice in the calculation so that each product supplies the same amount of information. The class indicated by each layer contributes scores to the target legend classes. The target legend class with the highest score wins, here ‘shrubs and grasses’.
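The voting along the pixel vector can be reproduced with the scores of the calculation example in Table 5. The sketch below is only an illustration of the principle for a single pixel; the data layout and the function name `vote` are assumptions, and the scores are copied from Table 5 rather than looked up from the full affinity tables in Appendix A.

```python
# Column order of the affinity scores below.
TARGETS = ["trees and shrubs", "trees and grasses", "shrubs",
           "grasses", "shrubs and grasses"]

# (data set, original land cover class, affinity scores per target class)
PIXEL_VECTOR = [
    ("GLCC-IGBP", "open shrubland", [1, 0, 2, 2, 4]),
    ("GLCC-USGS", "mixed shrubland/grassland", [1, 1, 2, 2, 4]),
    ("MODIS-IGBP", "woody savanna", [4, 3, 1, 1, 1]),
    ("MODIS-PFT", "shrub", [2, 0, 4, 0, 2]),
    ("GLC2000", "herbaceous cover", [0, 2, 0, 4, 2]),
    ("GLC2000", "herbaceous cover", [0, 2, 0, 4, 2]),  # GLC2000 is counted twice
]


def vote(pixel_vector, targets):
    """Sum the affinity scores per target class and return the totals and the winner."""
    totals = [sum(column) for column in zip(*(row[2] for row in pixel_vector))]
    # max() returns the first maximum, so a tie falls to the class listed first.
    winner = targets[max(range(len(totals)), key=totals.__getitem__)]
    return totals, winner


totals, winner = vote(PIXEL_VECTOR, TARGETS)
print(totals, winner)  # [8, 8, 9, 13, 15] shrubs and grasses
```

Counting a product twice is simply a matter of repeating its row, which is how a higher weight for one input would be expressed in this layout.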

5.5. Calculation of SYNMAP

The calculation is done in two steps: the first step is related to dominant life forms, and the second step performs estimation of leaf attributes if a tree component is present in the life form assemblage. Applying the score tables for all five data sets, a total score is calculated for each target land cover class for each pixel. So that each product contributes the same amount of information, the GLC2000 data set was used twice. In addition to the GLCC, GLC2000, and MODIS land cover products, information on leaf type and leaf longevity is added from the reclassified CFTC maps. A pixel is assigned the class for which the total score is maximal. The concept is illustrated in Fig. 2 in conjunction with Tables 5 and 6.

Fig. 2. Principle of the data fusion method. The legends of land cover products are linked with the target legend using affinity scores for life forms, leaf type and leaf longevity. Leaf type and leaf longevity maps from CFTC data contribute additionally to the calculation of leaf attributes. Leaf attributes are calculated if trees are present in the life form assemblage. Each land cover product contributes the same amount of information; for MODIS and GLCC, two different reclassifications are used, while GLC2000 is only available with one global legend and therefore counts double in the calculation. Fuzzy agreement of the different maps is calculated for a 3×3 pixel window with the center pixel being weighted by eight according to Eq. (1).

Instead of calculating the total scores for each target class along the pixel vector across the different land cover products (six addends, seven addends for leaf attributes), we place a 3×3 window around each pixel. Hence, information for each target class accumulates from 54 addends (3×3×6) (63 for leaf attributes), while the center pixel is weighted by eight to make it equally important to the sum of all the remaining window pixels. Including neighbouring pixels in the calculation has several advantages. First, it accounts to some extent for potential inaccuracies of the georeferencing of the original products and artefacts from resampling of the original sensor data from which the land cover maps were produced. Secondly, it improves the reliability of the resulting map in very heterogeneous areas such as the Mediterranean (see Section 4). It further reduces the chance that two or more classes receive the same maximal score and therefore acts to force a decision.

Table 6. Calculation example for leaf type

| Data set | Original land cover class | Needle | Broad | Mixed |
|---|---|---|---|---|
| GLCC-IGBP | ‘Mixed forest’ | 2 | 2 | 4 |
| GLCC-USGS | ‘Mixed forest’ | 2 | 2 | 4 |
| MODIS-IGBP | ‘Evergreen needleleaf forest’ | 4 | 0 | 2 |
| MODIS-PFT | ‘Evergreen needleleaf forest’ | 4 | 0 | 2 |
| GLC2000 | Tree cover, broadleaf, deciduous | 0 | 4 | 2 |
| GLC2000 | Tree cover, broadleaf, deciduous | 0 | 4 | 2 |
| CFTC | Broadleaf | 0 | 4 | 2 |
| Total score | | 12 | 16 | 18 |

Note that CFTC data contribute information in addition to the land cover products. The calculation for leaf longevity (not shown) operates the same way.

Table 7. Decision matrix for leaf type (below diagonal) and longevity (above diagonal) in case two leaf classes receive the same score

| Leaf longevity/leaf type | ‘Mixed’ | ‘Deciduous/broadleaf’ | ‘Evergreen/needleleaf’ |
|---|---|---|---|
| ‘Mixed’ | – | ‘Deciduous’ | ‘Evergreen’ |
| ‘Deciduous/broadleaf’ | ‘Broadleaf’ | – | ‘Mixed’ |
| ‘Evergreen/needleleaf’ | ‘Needleleaf’ | ‘Mixed’ | – |

The choice of the SYNMAP class is therefore made according to the following equation, which calculates the total score for each life form (T) of the SYNMAP legend for the grid cell with coordinates i and j of SYNMAP:

$$
\begin{aligned}
S^{T}_{\mathrm{Total}}(i,j) = {} & 8\sum_{M=1}^{6} S^{T}_{M}(i,j) + \sum_{M=1}^{6} S^{T}_{M}(i-1,j) + \sum_{M=1}^{6} S^{T}_{M}(i+1,j) \\
& + \sum_{M=1}^{6} S^{T}_{M}(i,j-1) + \sum_{M=1}^{6} S^{T}_{M}(i,j+1) \\
& + \sum_{M=1}^{6} S^{T}_{M}(i-1,j-1) + \sum_{M=1}^{6} S^{T}_{M}(i+1,j+1) \\
& + \sum_{M=1}^{6} S^{T}_{M}(i-1,j+1) + \sum_{M=1}^{6} S^{T}_{M}(i+1,j-1)
\end{aligned}
\tag{1}
$$

Eq. (1): calculation of total score for each life form of the SYNMAP legend.

$S^{T}_{\mathrm{Total}}(i,j)$: total score for life form T of SYNMAP legend
$S^{T}_{M}(i,j)$: affinity score for the life form in the grid cell (i,j) of the existing land cover map M assigned to class T of SYNMAP legend (Appendix A)
T: life form of SYNMAP legend (Table 3)
M: existing land cover maps (GLCC-USGS, GLCC-IGBP, MODIS-PFT, MODIS-IGBP, 2×GLC2000)
i: current row of pixels
j: current column of pixels

The life form with the maximum total score $S^{T}_{\mathrm{Total}}(i,j)$ is chosen as the best estimate of the life form in grid cell (i,j) of SYNMAP. In case ‘trees’ are present in the life form assemblage, leaf type and leaf longevity are estimated according to the same principle, but integrating over seven land cover maps because CFTC data are included.
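A minimal sketch of the weighted 3×3 window sum of Eq. (1) for one target class is given below. The nested-list layout `scores[m][r][c]` and the function name `window_total` are assumptions for illustration. The treatment of pixels at the grid edge is also an assumption: the text does not state how missing neighbours are handled, and they are simply skipped here.

```python
def window_total(scores, i, j):
    """Total score of Eq. (1) for one target class at grid cell (i, j).

    scores[m][r][c] holds the affinity score of input map m for the target
    class at row r and column c (six layers for life forms).
    """
    n_rows, n_cols = len(scores[0]), len(scores[0][0])
    total = 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            r, c = i + di, j + dj
            if not (0 <= r < n_rows and 0 <= c < n_cols):
                continue  # assumption: neighbours outside the grid are skipped
            # The center pixel counts eight times, each neighbour once.
            weight = 8 if (di, dj) == (0, 0) else 1
            total += weight * sum(layer[r][c] for layer in scores)
    return total


# Six identical toy layers in which only the center pixel scores 1.
layers = [[[0, 0, 0], [0, 1, 0], [0, 0, 0]] for _ in range(6)]
center_total = window_total(layers, 1, 1)  # 8 * 6 = 48
corner_total = window_total(layers, 0, 0)  # the center pixel is one neighbour: 6
```

With a weight of eight, the center pixel contributes as much as its eight neighbours combined, which matches the stated intent of the weighting.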

In case two or more life form classes receive the same maximal score, the decision as to which life form class wins is made by a priority rule; in our case it is simply the ascending order of class values. If leaf type or leaf longevity cannot be assigned because more than one leaf class has the same maximal score, a decision matrix defines the winning leaf attributes (Table 7). If no information for leaf attributes is available (i.e. maximal score is zero), both leaf type and leaf longevity are set to ‘mixed’. This compromise introduces uncertainty, which is fortunately small since this case is very rare and applies only to vegetation mosaics with some tree coverage, so that only part of the leaf attribute information of that class is biased.
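The tie rules for leaf attributes can be sketched as follows. The function name `resolve_leaf_class` and the dictionary input are assumptions. Table 7 only covers ties between two classes, so returning ‘mixed’ for a three-way tie with a non-zero score is a further assumption of this sketch.

```python
def resolve_leaf_class(scores):
    """Pick the winning class of one leaf attribute from its three total scores.

    scores maps class names to totals, e.g.
    {"needleleaf": 12, "broadleaf": 16, "mixed": 18}.
    """
    best = max(scores.values())
    winners = {name for name, score in scores.items() if score == best}
    if len(winners) == 1:
        return winners.pop()
    if best == 0 or len(winners) == 3:
        # No information (stated in the text) or a three-way tie (assumption).
        return "mixed"
    pure = winners - {"mixed"}
    # Table 7: 'mixed' tied with a pure class gives the pure class,
    # two pure classes tied give 'mixed'.
    return pure.pop() if len(pure) == 1 else "mixed"


# Totals of the leaf type example in Table 6: no tie, 'mixed' wins with 18.
table6_result = resolve_leaf_class({"needleleaf": 12, "broadleaf": 16, "mixed": 18})
# 'evergreen' tied with 'mixed' resolves to 'evergreen'.
tie_result = resolve_leaf_class({"evergreen": 10, "deciduous": 3, "mixed": 10})
```

Because the matrix treats leaf type and longevity symmetrically, one function covers both attributes.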

The next section presents the SYNMAP data set that we have derived from our fuzzy logic-based method and evaluates the success of the data fusion process.

6. SYNMAP evaluation

On a qualitative basis, SYNMAP agrees with what we would expect, reproducing the present-day ecological zones on Earth (Fig. 3). The indication of mixed leaf types of the tree and grass savanna at the fringe of the Sahel to the Sahara is, however, erroneous and should be deciduous broadleaf (raingreen trees). Here, the life form assemblage ‘trees and grasses’ with no information about leaf attributes was indicated by the input data sets, so that leaf type and longevity were set to mixed to minimize the error. For model parameterization, the difference between ‘deciduous broadleaf trees and grasses’ and ‘evergreen needleleaf and deciduous broadleaf trees and grasses’ is rather small and acceptable. We assess the success of the data fusion method, and hence the reliability of SYNMAP, quantitatively through intermap comparison with its input data (the GLCC, GLC2000, and MODIS land cover data sets) and link the results to the validation efforts of the original data sets.

To make GLCC, GLC2000, MODIS, and SYNMAP land cover products comparable in terms of classification schemes, we reclassified the data sets into the SIMPLE legend (see Tables 2 and 3). Minor bias due to different original classification legends remains for some classes after this step since, e.g., the SYNMAP class ‘shrubs and barren’ is considered as ‘open shrubland’, and hence included in ‘shrubs’, while it also overlaps with the class ‘sparsely vegetated’, which is included in ‘barren’. We then calculate pixel-based confusion matrices between each pair of maps; water and wetland areas have been excluded here. From the confusion matrices, we derive overall consistency for life forms and leaf attributes separately, which we define as the percentage of pixels where both maps agree on the class. To get an estimate of the mean overall consistency of a map, we simply average the calculated consistency estimates where the considered map was a comparison partner (Eq. (2)). The consistency for the tree leaf attributes (evergreen needleleaf, evergreen broadleaf, deciduous needleleaf, deciduous broadleaf, and mixed) is related to confusion within forest classes only (where both maps indicate forest) that have information on leaf type and longevity, since forests were defined according to different tree cover thresholds by IGBP-DISCover (>60%) and LCCS (>15%). We use the term ‘consistency’ between land cover products to avoid ‘accuracy’, which would not be entirely correct given that we do not provide validation against reference data. The consistency measure is calculated analogously to accuracy from confusion tables as in common remote sensing practice.

$$
\overline{C}_{a} = \frac{C_{ab} + C_{ac} + C_{ad}}{3}
\tag{2}
$$

Eq. (2): calculation of mean consistency for map a. Indices a–d are the maps (GLCC, GLC2000, MODIS, SYNMAP).

Fig. 3. The SYNMAP data set. (a) Life form assemblages. ‘Shrubs and crops’, ‘grasses and barren’, and ‘urban’ have too little extent and are invisible on that scale. (b) Leaf attributes of trees. All areas are displayed where a tree component is present. ‘Needle leafed, mixed longevity’, ‘mixed leaf types, evergreen’, and ‘mixed leaf types, deciduous’ have too little extent and are invisible on that scale.

Fig. 4. Overall consistency between GLCC, GLC2000, MODIS, and SYNMAP based on major life forms (SIMPLE legend) (a) and leaf attributes (b). Average overall consistencies of the maps are given along the diagonal.


Fig. 4 presents the calculated consistency between data sets for major life forms and leaf attributes. Among all map pairs, SYNMAP agrees best with each original land cover product regarding the occurrences of major life forms and leaf attributes. Average map-specific consistencies are presented along the diagonal.

We further calculate class-specific consistencies for each land cover product based on the SIMPLE legend, which we define here as the percentage of ‘right’ classified pixels of a class (where both maps agree) relative to the number of pixels that have been classified as that class by either of the maps (Eq. (3)). Therefore, the class consistency includes both omission and commission errors and will be much lower than usually reported user's or producer's accuracies. Average class-specific consistencies are calculated according to Eq. (2).

$$
CC_{ab} = \frac{n_{ab}}{n_{a} + n_{b} - n_{ab}} \times 100
\tag{3}
$$

Eq. (3): calculation of class consistency.

$CC_{ab}$: class consistency of a class between the two maps a and b
$n_{ab}$: number of pixels mapped as the respective class by both maps (a and b)
$n_{a}$: number of pixels mapped as the respective class by map a
$n_{b}$: number of pixels mapped as the respective class by map b

Indices a–d are the maps (GLCC, GLC2000, MODIS, SYNMAP).

Fig. 5. Average class-specific consistencies for GLCC, GLC2000, MODIS, and SYNMAP derived from intermap comparison (filled markers) and ‘ground truth’ class accuracies derived from published confusion matrices for GLCC, GLC2000, and MODIS based on validation data (not filled markers). The area extent (in lat/lon pixels) within SYNMAP is given as percentages in brackets. ‘Snow’ and ‘urban’ classes were not sampled by all validation exercises. Note that no area weighting has been done to calculate overall accuracies from the published confusion matrices with ground truth. Area weighting would shift overall accuracies up.
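The consistency measures can be illustrated with a short sketch operating on flattened class maps. The function names and the toy four-pixel maps are assumptions for this example; the formulas follow the definition of overall consistency given in the text and Eqs. (2) and (3).

```python
def overall_consistency(map_a, map_b):
    """Percentage of pixels where both maps agree on the class."""
    agree = sum(1 for a, b in zip(map_a, map_b) if a == b)
    return 100.0 * agree / len(map_a)


def class_consistency(map_a, map_b, cls):
    """Eq. (3): agreement on cls relative to pixels mapped as cls by either map."""
    n_ab = sum(1 for a, b in zip(map_a, map_b) if a == cls and b == cls)
    n_a = sum(1 for a in map_a if a == cls)
    n_b = sum(1 for b in map_b if b == cls)
    union = n_a + n_b - n_ab
    return 100.0 * n_ab / union if union else float("nan")


def mean_consistency(name, maps, measure=overall_consistency):
    """Eq. (2): average of the measure over all pairs that include map `name`."""
    others = [m for key, m in maps.items() if key != name]
    return sum(measure(maps[name], other) for other in others) / len(others)


# Hypothetical maps on the SIMPLE legend, four pixels each.
maps = {
    "A": ["trees", "trees", "shrubs", "grasses"],
    "B": ["trees", "shrubs", "shrubs", "grasses"],
    "C": ["trees", "trees", "shrubs", "shrubs"],
}
overall_ab = overall_consistency(maps["A"], maps["B"])     # 3 of 4 pixels agree
trees_ab = class_consistency(maps["A"], maps["B"], "trees")  # 1 / (2 + 1 - 1)
mean_a = mean_consistency("A", maps)
```

Because the denominator of Eq. (3) is the union of both maps' pixels for the class, the measure penalizes omission and commission alike.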

In addition to the consistencies derived from intermap comparison, we calculate ‘reference’ accuracies for the GLCC, GLC2000, and MODIS products from published confusion matrices in their individual validation analysis papers (Mayaux et al., in press; MODIS land cover team, 2003; Scepan, 1999). We determine these land cover product-specific reference accuracies in the same way and using the same SIMPLE reclassification as in the map corroboration analysis. Linking the accuracies from the original validation data of the land cover products to the consistency estimates derived from intermap comparisons gives insights into whether it is possible to approximate the class accuracies of the land cover products by map-to-map corroborations and further provides a benchmark of the reliability of a particular class.

Fig. 5 shows class-specific consistencies derived from intermap comparison and reference accuracies calculated from published validation exercises of GLCC, GLC2000, and MODIS. The map corroborations suggest that SYNMAP has the highest average class consistencies for all classes except ‘deciduous needleleaf’, where MODIS is slightly higher. For ‘trees’, ‘crops’, ‘grasses’, and ‘deciduous broadleaf’, and to a lesser extent the ‘shrubs’ and ‘crops/natural vegetation mosaic’ classes, SYNMAP exhibits particularly enhanced agreement with the other maps.

For major life forms, the range of class consistencies calculated from map-to-map comparisons and class accuracies from validation agrees well on a qualitative basis, except for ‘barren’, indicating that map corroboration approximates overall and class accuracies. The accuracies for leaf attributes derived from validation data for GLCC, GLC2000, and MODIS appear to be higher than the corresponding consistencies from intermap comparisons. The discrepancy for ‘barren’ and leaf attributes may be an artefact of the sensitivity of ground truth sites to their location (see Section 3.6).

Fig. 5 also reveals a general pattern of class reliability of current global remote sensing-based land cover maps. Areas with trees, snow, and barren land are accurately recognized. Croplands are mapped acceptably. Shrublands, grasslands, and urban areas are problematic; croplands in association with natural vegetation are very problematic as well. Regarding leaf classes of trees, evergreen broadleaf is very well represented in maps, evergreen and deciduous needleleaf are well reproduced, while deciduous broadleaf and mixed seem more uncertain.

7. Limitations and advantages of SYNMAP

A major concern about SYNMAP is that we cannot provide rigorous validation against reference data at the moment. However, we have shown that our data fusion method was successful, so that SYNMAP represents the best agreement between GLCC, GLC2000, and the MODIS land cover product, all of which had been validated individually. We further cannot state strict definitions of the SYNMAP classes involving thresholds, such as the exact percentage of tree coverage at which a forest class is mapped. This is a critical point, but to our knowledge and in our experience it is more of a theoretical issue; in practice, the current capabilities of global land cover mapping cannot provide such precise information accurately anyway. One may also see the definition of affinity scores as a further weak point. We assigned the affinity scores between the original land cover product classes and the SYNMAP classes ad hoc according to semantic rules and our knowledge; they have not been derived in a purely objective, quantitative way, and do not take individual strengths and weaknesses of the products into account.

However, we have provided a simple and useful solution to a common problem with land cover data sets. The proposed data fusion method can be applied to produce a map for a specific purpose with a desired legend from existing maps. It is essential that the legends of the existing products are studied and appropriately linked to the target legend via affinity scores using semantic rules. If one is confident that a certain input data set is of significantly higher quality, then a higher weight can be given to that product, e.g. it enters the calculation of class-specific scores twice. However, giving weight to individual classes of input products is risky and should be avoided, since higher weighted classes have a higher probability of being mapped in the product and thus will erroneously occur too often. Knowledge about class-specific accuracies of the products can, in conjunction with the calculated fuzzy agreement, be used to generate a map of confidence of the final product. Especially when the target legend contains classes that are different from the legends of the input data, several target legend classes may get an equal score for a pixel, i.e. there is no unambiguous class to be mapped. To tackle this problem, it is advantageous to use many data sets with different legends and to include neighbouring pixels with a smaller weight in the calculation to increase the number of estimates and therefore confidence.
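The effect of product weights and the occurrence of tied scores can be illustrated for a single pixel. The following is a minimal sketch, not the production procedure: it assumes that the score of a target class is the weighted sum of the affinity scores of the classes mapped by each product, uses three rows of life form scores as listed in Appendix A, and introduces the hypothetical names `TARGETS`, `AFFINITY`, and `fuse_pixel`.

```python
TARGETS = [
    "Water", "Trees", "Trees and shrubs", "Trees and grasses",
    "Trees and crops", "Shrubs", "Shrubs and grasses", "Shrubs and crops",
    "Shrubs and barren", "Grasses", "Grasses and crops",
    "Grasses and barren", "Crops", "Barren", "Urban", "Snow",
]

# Three rows of life form affinity scores as listed in Appendix A.
AFFINITY = {
    ("IGBP", "Woody savannas"): [0, 2, 4, 3, 2, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    ("IGBP", "Savannas"): [0, 1, 3, 4, 2, 2, 2, 1, 1, 2, 1, 0, 0, 0, 0, 0],
    ("LCCS", "Mosaic: tree cover/other natural vegetation"):
        [0, 2, 4, 4, 2, 2, 2, 1, 1, 2, 1, 1, 0, 0, 0, 0],
}


def fuse_pixel(labels, weights=None):
    """Sum weighted affinity scores for one pixel.

    labels : dict product name -> (legend, class name) mapped at the pixel
    weights: optional dict product name -> weight (default 1 per product)
    Returns (list of top-scoring target classes, list of total scores).
    """
    weights = weights or {}
    totals = [0] * len(TARGETS)
    for product, key in labels.items():
        w = weights.get(product, 1)
        for k, score in enumerate(AFFINITY[key]):
            totals[k] += w * score
    best = max(totals)
    winners = [t for t, s in zip(TARGETS, totals) if s == best]
    return winners, totals


# Hypothetical pixel: each product maps a different savanna-like class.
pixel = {
    "GLCC": ("IGBP", "Woody savannas"),
    "MODIS": ("IGBP", "Savannas"),
    "GLC2000": ("LCCS", "Mosaic: tree cover/other natural vegetation"),
}
print(fuse_pixel(pixel)[0])                # ['Trees and shrubs', 'Trees and grasses']
print(fuse_pixel(pixel, {"MODIS": 2})[0])  # ['Trees and grasses']
```

With equal weights the pixel has no unambiguous class; letting one product enter the calculation twice resolves the tie in this example, which also shows how strongly a product weight can steer the result.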

Despite the remaining limitations of SYNMAP, we are confident that we have produced a data set that is better suited to parameterizing carbon cycle models than existing ones. The SYNMAP legend is well suited, and uncertainties resulting from cross-walking the map classes to the model vegetation classes are reduced. Particularly important is that, in contrast to existing products, leaf characteristics are defined for mixed forest and mosaic classes of trees with other vegetation or cropland, given that biophysical parameters are associated with these traits in the model. We believe SYNMAP to be more accurate than existing land cover products since it makes use of synergies between different land cover mapping approaches that all have their individual strengths and limitations; a blend of the different maps should enhance the signal-to-noise ratio, which is indeed indicated by intermap comparisons.

While SYNMAP is specifically developed for carbon cycle modelling, it may be suitable for other applications. Regardless of the accuracy of a land cover product, its applicability for a specific purpose depends on which classes are considered with respect to the application requirements. SYNMAP resembles the information content of the legends of its input data sets but has more distinct definitions of which vegetation types are present in mixed classes. Thus, SYNMAP can be aggregated to a coarser model grid cell, which contains fractions of the globally most relevant plant functional types without having mixed classes: evergreen needleleaf tree, evergreen broadleaf tree, deciduous needleleaf tree, deciduous broadleaf tree, shrub, grass, crop, and also urban, barren, and water. Hence, if such information is sufficient for a particular purpose, SYNMAP may be used for this. If, however, information with greater thematic detail is needed, other sources such as the regional tiles of GLC2000 with region-specific legends or data sets from higher resolution satellite imagery (e.g. Landsat) should be consulted. Currently, there is no ‘wetland’ class in SYNMAP because global vegetation models do not deal with wetlands explicitly at present. Furthermore, the ‘wetland’ class in land cover products from optical remote sensing is rather uncertain since the sensor is sensitive to vegetation coverage rather than wetness.
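The aggregation of a categorical map to plant functional type fractions on a coarser grid can be sketched as follows. This is an illustrative example only: the class codes, the function name `pft_fractions`, and in particular the equal split of a mixed class between its two components are assumptions made here, not definitions taken from the SYNMAP legend.

```python
def pft_fractions(class_map, block, composition):
    """Aggregate a categorical map to PFT fractions on a coarser grid.

    class_map  : 2-D list of class codes at fine resolution
    block      : number of fine pixels per coarse cell along each axis
    composition: dict class code -> dict PFT name -> share (shares sum to 1)
    Returns a 2-D list of dicts mapping PFT name to fraction of the cell.
    """
    n_rows, n_cols = len(class_map), len(class_map[0])
    coarse = []
    for r0 in range(0, n_rows, block):
        coarse_row = []
        for c0 in range(0, n_cols, block):
            totals, n_pixels = {}, 0
            for r in range(r0, min(r0 + block, n_rows)):
                for c in range(c0, min(c0 + block, n_cols)):
                    n_pixels += 1
                    for pft, share in composition[class_map[r][c]].items():
                        totals[pft] = totals.get(pft, 0.0) + share
            coarse_row.append({p: v / n_pixels for p, v in totals.items()})
        coarse.append(coarse_row)
    return coarse


# Hypothetical codes: 0 = evergreen needleleaf trees, 1 = grasses,
# 2 = evergreen needleleaf trees and grasses (mixed class).
# ASSUMPTION: the mixed class is split 50/50 between its components.
composition = {
    0: {"evergreen needleleaf tree": 1.0},
    1: {"grass": 1.0},
    2: {"evergreen needleleaf tree": 0.5, "grass": 0.5},
}
class_map = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 1],
]
coarse = pft_fractions(class_map, block=2, composition=composition)
print(coarse[0][0])  # {'evergreen needleleaf tree': 0.875, 'grass': 0.125}
```

Because every class, including the mixed ones, carries an explicit composition, the coarse cell contains only pure plant functional type fractions that sum to one.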

The SYNMAP data set is currently used in a carbon cycle model intercomparison initiative of the CARBOEUROPE-IP project that also investigates the effect of using different land cover products for parameterizations. Preliminary analysis of the model results shows that SYNMAP-based calculations for all carbon budget variables plot in between those for MODIS and GLC2000, which gives us further confidence in the quality of SYNMAP (Jung et al., in preparation).

8. The way forward from a user's perspective

The common problem is that a single product will never be perfectly suited for all applications, whether in terms of spatial coverage, accuracy, or legend. Therefore, it is important that the land cover community generates more and increasingly accurate products. But what would be really desirable, and would constitute a great contribution to Earth system modelling, is an online tool where users can design their own legends for the land cover product they need. This is very challenging but may be possible in the future. Major steps that need to be taken include:

(1) fostering interoperability of land cover products and developing a common land cover language such as LCCS that links product legends in a quantitative or semi-quantitative way;

(2) studying the accuracy and individual strengths and weaknesses of existing products thoroughly, and identifying problem areas and classes that need to be remapped;

(3) developing a sophisticated data fusion method that makes use of (1) and (2);

(4) compiling an extensive database of reference data and providing operational validation;

(5) setting up storage and computing facilities, collecting all data, and providing a web-based interface.

As an increasing array of land cover classifications and continuous fields data is accumulating, an operational method that merges these data into a desired classification legend and provides a validation against reference data straight away would result in more accurate products that ultimately best satisfy the users. The ‘difficult areas’ in particular, i.e. the transitional and heterogeneous zones, would be better mapped when various sources are used with their individual strengths and weaknesses taken into account, and when users themselves can decide what interests them most by designing the target legend in a hierarchical way (e.g. according to LCCS). The work of Herold et al. (in press), Fritz and See (2005), and See and Fritz (in press) is promising and already goes in that direction.

9. Summary and conclusions

Initiatives of global land cover mapping have used diverse approaches and data from different satellite sensors with varying degrees of raw data corrections and manual manipulation during the classification process. It is not surprising that they produced different results, and it is currently not possible to judge which map is more suitable for a specific purpose.

Based on the individual validation efforts and intermap comparison of the GLCC, MODIS, and GLC2000 land cover products, we identified problematic areas. Trees (woodlands), snow-covered areas, and bare areas seem reliably mapped, while discrepancies exist within forest classes, such as confusion between deciduous broadleaf and mixed forest. Croplands are well represented, as long as they are not grouped with natural vegetation. The ‘cropland/natural vegetation mosaic’ class is the least reliable land cover category. In addition, significant uncertainties are associated with grass- and shrubland classes.

Spatially, regions of disagreement between data sets are primarily related to transitional zones with mixed classes; land cover change between acquisition periods is found to be of second order significance. Problem areas and problem classes are connected and mainly related to two issues: class separability and spatial heterogeneity, as well as mapping a continuum transition with a discrete classification scheme. The first issue may be regarded as a general problem of optical remote sensing in discriminating certain categories, such as shrub- and grasslands or wetlands, whose multitemporal spectral signatures have large intraclass variance and overlap with those of other categories. The second issue is one of cartographic standards. When different maps give various estimates for areas with mixed classes, all may be right and wrong to some extent, since maps are forced to fit the real world into categories and are very sensitive to classification algorithms and the representation of mixed cartographic units.

Classification schemes and class definitions are problematic in several ways. More or less arbitrary thresholds are applied to distinguish between classes such as open and closed forest. Mixed classes especially lack clear definitions, partly because clear definitions are not possible given the limitations of global land cover mapping. Different initiatives used different classification schemes, which makes intermap comparison challenging and only possible on an aggregate class level. Especially problematic for users of land cover products is that classification schemes may not be flexible enough for their application because important information, e.g. the specification of leaf attributes for the ‘savanna’ type for carbon cycle modelling parameterization, is missing.

Motivated by the disagreement among existing products and by classification legends that are unsuitable for terrestrial carbon cycle modelling, we developed a method that generates a new global land cover map (SYNMAP) dedicated to terrestrial carbon cycle modelling with biogeochemistry and dynamic vegetation models. SYNMAP has already been successfully applied for parameterization of the BIOME-BGC, LPJ, and ORCHIDEE models. The data fusion process blends different global land cover products based on fuzzy agreement and allows the definition of a desired target legend.

Pixel-based intercomparison of SYNMAP, GLCC, GLC2000, and MODIS land cover products at an aggregated class level reveals highest overall and class-specific consistency for SYNMAP and therefore indicates the successful exploration of synergies between products. Although we believe that our data set is more accurate than existing land cover products, a thorough test of SYNMAP is only possible using a comprehensive database of ground truth information. The current developments for more operational and harmonized land cover observations, both in situ and satellite-based, will provide consistent and continuous representations of the Earth's land cover: in more spatial detail, with more flexible legends, and with robust and comparable accuracy statements (Herold et al., in press).

The general problem of using land cover data sets in carbon cycle modelling is the subject of ongoing research. Categorical maps such as SYNMAP are straightforward to use in many global models but a limited number of discrete classes is problematic for two main reasons. Firstly, it depends on how land cover is perceived (i.e. definition of classes) and there is likely to be a mismatch between the product and the model assumptions even if the class is called the same. Secondly, actual land cover is a continuum and usually a composition of different vegetation types is present in a grid cell (usually about 0.5 degree for large scale carbon cycle modelling). Vegetation continuous fields products are an attempt to tackle this problem and for some models it is sensible to implement such data sets. Beer (2005) applied MODIS vegetation continuous fields data for the boreal region in the LPJ model and found significant improvements in comparison to discretely modelled natural vegetation coverage. Another alternative is to go beyond a complete physiognomic land cover description and focus on measurable traits and biophysical variables, which requires other types of carbon cycle models. For instance, the Turgor-Mesic-Sclerophyll scheme is a framework that links canopy leaf properties with vegetation structure and resource availability (Berry & Roderick, 2002). This scheme describes properties of leaves, not vegetation types or species. It has already been successfully applied to the Australian vegetation. It is highly desirable to conduct more studies on alternative methods for carbon cycle modelling.

Acknowledgements

We would like to thank Martin Heimann and reviewers from RSE for constructive input on an earlier draft of the manuscript, and Sabine Zinn for some help with writing Eq. (1). This paper is a contribution to the International Max Planck Research School on Earth System Modelling (IMPRS-ESM), Hamburg, where the first author is enrolled in a doctoral programme.

Appendix A. Affinity scores for life forms, leaf types, and leaf longevities

| IGBP-DISCover legend (GLCC and MODIS) | Water | Trees | Trees and shrubs | Trees and grasses | Trees and crops | Shrubs | Shrubs and grasses | Shrubs and crops | Shrubs and barren | Grasses | Grasses and crops | Grasses and barren | Crops | Barren | Urban | Snow |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Water | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Evergreen needleleaf forest | 0 | 4 | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Evergreen broadleaf forest | 0 | 4 | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Deciduous needleleaf forest | 0 | 4 | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Deciduous broadleaf forest | 0 | 4 | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mixed forest | 0 | 4 | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Closed shrublands | 0 | 0 | 2 | 0 | 0 | 4 | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Open shrublands | 0 | 0 | 1 | 0 | 0 | 2 | 4 | 2 | 4 | 2 | 1 | 1 | 0 | 2 | 0 | 0 |
| Woody savannas | 0 | 2 | 4 | 3 | 2 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| Savannas | 0 | 1 | 3 | 4 | 2 | 2 | 2 | 1 | 1 | 2 | 1 | 0 | 0 | 0 | 0 | 0 |

Page 16: Exploiting synergies of global land cover products for ... · Exploiting synergies of global land cover products for carbon cycle modeling Martin Jung a,⁎, Kathrin Henkel a, Martin

Water Trees Treesandshrubs

Treesandgrasses

Treesandcrops

Shrubs Shrubsandgrasses

Shrubsandcrops

Shrubsandbarren

Grasses Grassesandcrops

Grassesandbarren

Crops Barren Urban Snow

IGBP-DISCover legend (GLCC and MODIS)Grasslands 0 0 0 2 0 0 2 0 0 4 2 2 0 0 0 0Permanent

wetlands0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Croplands 0 0 0 0 2 0 0 2 0 0 2 0 4 0 0 0Urban and built-up 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0Cropland/natural

vegetation mosaic0 2 2 2 4 2 2 4 1 2 4 1 2 0 0 0

Snow and ice 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4Barren or sparsely

vegetated0 0 0 0 0 1 2 0 3 1 0 3 0 4 0 0

PFT legend (MODIS)Water 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0Needleleaf

evergreen tree0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Broadleafevergreen tree

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Needleleafdeciduous tree

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Broadleafdeciduous tree

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Shrub 0 0 2 0 0 4 2 2 2 0 0 0 0 0 0 0Grass 0 0 0 2 0 0 2 0 0 4 2 2 0 0 0 0Cereal crop 0 0 0 0 2 0 0 2 0 0 2 0 4 0 0 0Broadleaf crop 0 0 0 0 2 0 0 2 0 0 2 0 4 0 0 0Urban and built-up 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0Snow and ice 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4Barren or sparsely

vegetated0 0 0 0 0 1 2 0 3 1 0 3 0 4 0 0

USGS legend (GLCC)Urban and built-up

land0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0

Dryland croplandand pasture

0 0 0 0 2 0 0 2 0 0 2 0 4 0 0 0

Irrigated croplandand pasture

0 0 0 0 2 0 0 2 0 0 2 0 4 0 0 0

Mixed dryland/irrigated croplandand pasture

0 0 0 0 2 0 0 2 0 0 2 0 4 0 0 0

Cropland/grasslandmosaic

0 0 0 0 0 2 1 2 0 2 4 1 2 0 0 0

Cropland/woodlandmosaic

0 2 1 1 4 0 0 1 0 0 1 0 2 0 0 0

Grassland 0 0 0 2 0 0 2 0 0 4 2 2 0 0 0 0Shrubland 0 0 2 0 0 4 2 2 2 0 0 0 0 0 0 0Mixed shrubland/

grassland0 0 1 1 0 2 4 1 1 2 1 1 0 0 0 0

Savanna 0 2 4 4 2 2 2 1 1 2 1 1 0 0 0 0Deciduous broadleaf

forest0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Deciduous needleleafforest

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Evergreen broadleafforest

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Evergreen needleleaveforest

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Mixed forest 0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0Water bodies 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0Herbaceous wetland 0 0 0 2 0 0 2 0 0 4 2 2 0 0 0 0Wooded wetland 0 4 3 2 1 1 1 0 1 1 0 0 0 0 0 0

(continued on next page)

Appendix A (continued)

549M. Jung et al. / Remote Sensing of Environment 101 (2006) 534–553

Page 17: Exploiting synergies of global land cover products for ... · Exploiting synergies of global land cover products for carbon cycle modeling Martin Jung a,⁎, Kathrin Henkel a, Martin

(continued)

Water Trees Treesandshrubs

Treesandgrasses

Treesandcrops

Shrubs Shrubsandgrasses

Shrubsandcrops

Shrubsandbarren

Grasses Grassesandcrops

Grassesandbarren

Crops Barren Urban Snow

USGS legend (GLCC)Barren or sparselyvegetated

0 0 0 0 0 1 2 0 3 1 0 3 0 4 0 0

Herbaceous tundra 0 0 0 2 0 0 2 0 0 4 2 2 0 0 0 0Wooded tundra 0 2 4 3 1 2 2 1 1 2 1 1 0 0 0 0Mixed tundra 0 2 3 4 1 2 3 1 1 2 1 1 0 0 0 0Bare ground tundra 0 0 0 0 0 0 0 0 2 0 0 2 0 4 0 0Snow or ice 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4

LCCS legend (GLC2000)Tree cover, broadleaved,evergreen

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Tree cover, broadleaved,deciduous, closed

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Tree cover, broadleaved,deciduous, open

0 4 3 3 2 1 1 0 0 1 0 0 0 0 0 0

Tree cover, needleleaved,evergreen

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Tree cover, needleleaved,deciduous

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Tree cover, mixed leaf type 0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0Tree cover, regularlyflooded, fresh water

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Tree cover, regularlyflooded, saline water

0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0

Mosaic: tree cover/othernatural vegetation

0 2 4 4 2 2 2 1 1 2 1 1 0 0 0 0

Tree cover, burnt 0 4 2 2 2 0 0 0 0 0 0 0 0 0 0 0Shrub cover, closed-open,evergreen

0 0 2 0 0 4 3 2 3 0 0 0 0 0 0 0

Shrub cover, closed-open,deciduous

0 0 2 0 0 4 3 2 3 0 0 0 0 0 0 0

Herbaceous cover,closed-open

0 0 0 2 0 0 2 0 0 4 2 2 0 0 0 0

Sparse herbaceous orsparse shrub cover

0 0 1 1 0 2 2 1 4 2 1 4 0 3 0 0

Regularly flooded shruband/or herbaceous cover

0 0 2 2 0 4 4 2 2 4 2 0 0 0 0 0

Cultivated and managedareas

0 0 0 0 2 0 0 2 0 0 2 0 4 0 0 0

Mosaic: cropland/treecover/other naturalvegetation

0 2 1 1 4 1 1 2 1 1 2 1 2 0 0 0

Mosaic: cropland/shruband/or grass cover

0 0 1 1 1 2 2 4 1 2 4 1 2 0 0 0

Bare areas 0 0 0 0 0 0 0 0 2 0 0 2 0 4 0 0Water bodies 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0Snow and ice 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4Artificial surfaces andassociated areas

0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0

Affinity scores for leaf types and leaf longevities

| IGBP-DISCover legend (GLCC and MODIS) | Needleleaved | Broadleaved | Mixed leaf type | Evergreen | Deciduous | Mixed leaf longevity |
|---|---|---|---|---|---|---|
| Water | 0 | 0 | 0 | 0 | 0 | 0 |
| Evergreen needleleaf forest | 4 | 0 | 2 | 4 | 0 | 2 |
| Evergreen broadleaf forest | 0 | 4 | 2 | 4 | 0 | 2 |
| Deciduous needleleaf forest | 4 | 0 | 2 | 0 | 4 | 2 |
| Deciduous broadleaf forest | 0 | 4 | 2 | 0 | 4 | 2 |
| Mixed forest | 2 | 2 | 4 | 2 | 2 | 4 |
| Closed shrublands | 0 | 0 | 0 | 0 | 0 | 0 |
| Open shrublands | 0 | 0 | 0 | 0 | 0 | 0 |
| Woody savannas | 0 | 0 | 0 | 0 | 0 | 0 |
| Savannas | 0 | 0 | 0 | 0 | 0 | 0 |
| Grasslands | 0 | 0 | 0 | 0 | 0 | 0 |
| Permanent wetlands | 0 | 0 | 0 | 0 | 0 | 0 |
| Croplands | 0 | 0 | 0 | 0 | 0 | 0 |
| Urban and built-up | 0 | 0 | 0 | 0 | 0 | 0 |
| Cropland/natural vegetation mosaic | 0 | 0 | 0 | 0 | 0 | 0 |
| Snow and ice | 0 | 0 | 0 | 0 | 0 | 0 |
| Barren or sparsely vegetated | 0 | 0 | 0 | 0 | 0 | 0 |

| PFT legend (MODIS) | Needleleaved | Broadleaved | Mixed leaf type | Evergreen | Deciduous | Mixed leaf longevity |
|---|---|---|---|---|---|---|
| Water | 0 | 0 | 0 | 0 | 0 | 0 |
| Needleleaf evergreen tree | 4 | 0 | 2 | 4 | 0 | 2 |
| Broadleaf evergreen tree | 0 | 4 | 2 | 4 | 0 | 2 |
| Needleleaf deciduous tree | 2 | 0 | 2 | 0 | 4 | 2 |
| Broadleaf deciduous tree | 0 | 4 | 2 | 0 | 4 | 2 |
| Shrub | 0 | 0 | 0 | 0 | 0 | 0 |
| Grass | 0 | 0 | 0 | 0 | 0 | 0 |
| Cereal crop | 0 | 0 | 0 | 0 | 0 | 0 |
| Broadleaf crop | 0 | 0 | 0 | 0 | 0 | 0 |
| Urban and built-up | 0 | 0 | 0 | 0 | 0 | 0 |
| Snow and ice | 0 | 0 | 0 | 0 | 0 | 0 |
| Barren or sparsely vegetated | 0 | 0 | 0 | 0 | 0 | 0 |

| USGS legend (GLCC) | Needleleaved | Broadleaved | Mixed leaf type | Evergreen | Deciduous | Mixed leaf longevity |
|---|---|---|---|---|---|---|
| Urban and built-up land | 0 | 0 | 0 | 0 | 0 | 0 |
| Dryland cropland and pasture | 0 | 0 | 0 | 0 | 0 | 0 |
| Irrigated cropland and pasture | 0 | 0 | 0 | 0 | 0 | 0 |
| Mixed dryland/irrigated cropland and pasture | 0 | 0 | 0 | 0 | 0 | 0 |
| Cropland/grassland mosaic | 0 | 0 | 0 | 0 | 0 | 0 |
| Cropland/woodland mosaic | 0 | 0 | 0 | 0 | 0 | 0 |
| Grassland | 0 | 0 | 0 | 0 | 0 | 0 |
| Shrubland | 0 | 0 | 0 | 0 | 0 | 0 |
| Mixed shrubland/grassland | 0 | 0 | 0 | 0 | 0 | 0 |
| Savanna | 0 | 0 | 0 | 0 | 0 | 0 |
| Deciduous broadleaf forest | 0 | 4 | 2 | 0 | 4 | 2 |
| Deciduous needleleaf forest | 4 | 0 | 2 | 0 | 4 | 2 |
| Evergreen broadleaf forest | 0 | 4 | 2 | 4 | 0 | 2 |
| Evergreen needleleaf forest | 4 | 0 | 2 | 4 | 0 | 2 |
| Mixed forest | 2 | 2 | 4 | 2 | 2 | 4 |
| Water bodies | 0 | 0 | 0 | 0 | 0 | 0 |
| Herbaceous wetland | 0 | 0 | 0 | 0 | 0 | 0 |
| Wooded wetland | 0 | 0 | 0 | 0 | 0 | 0 |
| Barren or sparsely vegetated | 0 | 0 | 0 | 0 | 0 | 0 |
| Herbaceous tundra | 0 | 0 | 0 | 0 | 0 | 0 |
| Wooded tundra | 0 | 0 | 0 | 0 | 0 | 0 |
| Mixed tundra | 0 | 0 | 0 | 0 | 0 | 0 |
| Bare ground tundra | 0 | 0 | 0 | 0 | 0 | 0 |
| Snow or ice | 0 | 0 | 0 | 0 | 0 | 0 |

| LCCS legend (GLC2000) | Needleleaved | Broadleaved | Mixed leaf type | Evergreen | Deciduous | Mixed leaf longevity |
|---|---|---|---|---|---|---|
| Tree cover, broadleaved, evergreen | 0 | 4 | 2 | 4 | 0 | 2 |
| Tree cover, broadleaved, deciduous, closed | 0 | 4 | 2 | 0 | 4 | 2 |
| Tree cover, broadleaved, deciduous, open | 0 | 4 | 2 | 0 | 4 | 2 |
| Tree cover, needleleaved, evergreen | 4 | 0 | 2 | 4 | 0 | 2 |
| Tree cover, needleleaved, deciduous | 4 | 0 | 2 | 0 | 4 | 2 |
| Tree cover, mixed leaf type | 2 | 2 | 4 | 2 | 2 | 4 |
| Tree cover, regularly flooded, fresh water | 0 | 0 | 0 | 0 | 0 | 0 |
| Tree cover, regularly flooded, saline water | 0 | 0 | 0 | 0 | 0 | 0 |
| Mosaic: tree cover/other natural vegetation | 0 | 0 | 0 | 0 | 0 | 0 |
| Tree cover, burnt | 0 | 0 | 0 | 0 | 0 | 0 |
| Shrub cover, closed-open, evergreen | 0 | 0 | 0 | 2 | 0 | 1 |
| Shrub cover, closed-open, deciduous | 0 | 0 | 0 | 0 | 2 | 1 |
| Herbaceous cover, closed-open | 0 | 0 | 0 | 0 | 0 | 0 |
| Sparse herbaceous or sparse shrub cover | 0 | 0 | 0 | 0 | 0 | 0 |
| Regularly flooded shrub and/or herbaceous cover | 0 | 0 | 0 | 0 | 0 | 0 |
| Cultivated and managed areas | 0 | 0 | 0 | 0 | 0 | 0 |
| Mosaic: cropland/tree cover/other natural vegetation | 0 | 0 | 0 | 0 | 0 | 0 |
| Mosaic: cropland/shrub and/or grass cover | 0 | 0 | 0 | 0 | 0 | 0 |
| Bare areas | 0 | 0 | 0 | 0 | 0 | 0 |
| Water bodies | 0 | 0 | 0 | 0 | 0 | 0 |
| Snow and ice | 0 | 0 | 0 | 0 | 0 | 0 |
| Artificial surfaces and associated areas | 0 | 0 | 0 | 0 | 0 | 0 |

| Leaf type map from AVHRR CFTC | Needleleaved | Broadleaved | Mixed leaf type | Evergreen | Deciduous | Mixed leaf longevity |
|---|---|---|---|---|---|---|
| Needleleaved (>66%) | 4 | 0 | 2 | 0 | 0 | 0 |
| Broadleaved (>66%) | 0 | 4 | 2 | 0 | 0 | 0 |
| Mixed leaf type (33–66%) | 2 | 2 | 4 | 0 | 0 | 0 |

| Leaf longevity map from AVHRR CFTC | Needleleaved | Broadleaved | Mixed leaf type | Evergreen | Deciduous | Mixed leaf longevity |
|---|---|---|---|---|---|---|
| Evergreen (>66%) | 0 | 0 | 0 | 4 | 0 | 2 |
| Deciduous (>66%) | 0 | 0 | 0 | 0 | 4 | 2 |
| Mixed leaf longevity (33–66%) | 0 | 0 | 0 | 2 | 2 | 4 |
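The thresholds attached to the AVHRR CFTC leaf type classes in Appendix A can be read as a simple decision rule. The sketch below is an illustration under stated assumptions: it assumes the percentages refer to the share of tree cover with the given leaf type, it assigns boundary values (exactly 66%) to the mixed class, and the names `LEAF_TYPE_AFFINITY`, `leaf_type_class`, and `leaf_type_scores` are hypothetical.

```python
# Affinity scores (needleleaved, broadleaved, mixed leaf type) as listed in
# Appendix A for the leaf type map from AVHRR CFTC.
LEAF_TYPE_AFFINITY = {
    "needleleaved": (4, 0, 2),
    "broadleaved": (0, 4, 2),
    "mixed": (2, 2, 4),
}


def leaf_type_class(needle_pct, broad_pct):
    """Classify a pixel from its needleleaved and broadleaved shares (%).

    ASSUMPTION: a share above 66% defines a pure class; everything else,
    including the boundary value itself, is treated as mixed.
    """
    if needle_pct > 66:
        return "needleleaved"
    if broad_pct > 66:
        return "broadleaved"
    return "mixed"


def leaf_type_scores(needle_pct, broad_pct):
    """Return the affinity scores contributed by the leaf type map."""
    return LEAF_TYPE_AFFINITY[leaf_type_class(needle_pct, broad_pct)]


print(leaf_type_class(80, 20))   # needleleaved
print(leaf_type_scores(50, 50))  # (2, 2, 4)
```

The leaf longevity map can be handled with the same rule by swapping the needleleaved and broadleaved shares for evergreen and deciduous shares.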

References

Ahlqvist, O., Keukelaar, J., & Oukbir, K. (2003). Rough and fuzzy geographical data integration. International Journal of Geographical Information Science, 17, 223–234.

Bartalev, S. A., Belward, A. S., Erchov, D. V., & Isaev, A. S. (2003). A new SPOT4-VEGETATION derived land cover map of northern Eurasia. International Journal of Remote Sensing, 24, 1977–1982.

Bartholomé, E., & Belward, A. S. (2005). GLC2000: A new approach to global land cover mapping from Earth observation data. International Journal of Remote Sensing, 26(9), 1959–1977.

Beer, C. (2005). Klimatische Ursachen der Entwicklung der Biomasse der Wälder Russlands von 1981 bis 1999: Eine Analyse mittels eines an die boreale Zone angepassten und durch Satellitenprodukte beschränkten dynamischen globalen Vegetationsmodells. University of Jena, PhD dissertation.

Berry, S. L., & Roderick, M. L. (2002). Estimating mixtures of leaf functional types using continental-scale satellite and climatic data. Global Ecology and Biogeography, 11, 23–40.

CEOS. (in press). Global land cover validation: Recommendations for evaluation and accuracy assessment of global land cover maps.

Churkina, G., Tenhunen, J., Thornton, P. E., Elbers, J. A., Erhard, M., Falge, E., et al. (2003). Analyzing the ecosystem carbon dynamics of four European coniferous forests using a biogeochemistry model. Ecosystems, 6, 168–184.

Cihlar, J. (2000). Land cover mapping of large areas from satellites: Status and research priorities. International Journal of Remote Sensing, 21, 1093–1114.

Danko, D. M. (1992). The digital chart of the world project. Photogrammetric Engineering and Remote Sensing, 58, 1125–1128.

DeFries, R. S., Townshend, J. R. G., & Hansen, M. C. (1999). Continuous fields of vegetation characteristics at the global scale at 1-km resolution. Journal of Geophysical Research, [Atmospheres], 104, 16911–16923.

Di Gregorio, A., & Jansen, L. J. M. (2000). Land Cover Classification System (LCCS): Classification concepts and user manual. FAO.

Eva, H. D., Belward, A. S., Miranda, E. E., Bella, C. M., Gond, V., Huber, O., et al. (2004). A land cover map of South America. Global Change Biology, 10, 731–744.


Foody, G. M. (2002). Status of land cover classification accuracy assessment. Remote Sensing of Environment, 80, 185–201.

Friedl, M. A., McIver, D. K., Hodges, J. C. F., Zhang, X. Y., Muchoney, D., Strahler, A. H., et al. (2002). Global land cover mapping from MODIS: Algorithms and early results. Remote Sensing of Environment, 83, 287–302.

Fritz, S., Bartholomé, E., Belward, A., Hartley, A., Stibig, H.-J., Eva, H., et al. (2003). Harmonisation, mosaicing and production of the Global Land Cover 2000 database (Beta Version). Ispra: European Commission, Joint Research Centre (JRC).

Fritz, S., & See, L. (2005). Comparison of land cover maps using fuzzy agreement. International Journal of Geographical Information Science, 19, 787–807.

GCOS. (2004). Implementation plan for the global observing system for climate in support of the UNFCCC (p. 135). Geneva: WMO.

GEOSS. (2005). The Global Earth Observation System of Systems (GEOSS) 10-Year Implementation Plan and Reference Document.

Giri, C., Zhu, Z., & Reed, B. (2005). A comparative analysis of the Global Land Cover 2000 and MODIS land cover data sets. Remote Sensing of Environment, 94, 123–132.

Hansen, M. C., Defries, R. S., Townshend, J. R. G., & Sohlberg, R. (2000). Global land cover classification at 1 km spatial resolution using a classification tree approach. International Journal of Remote Sensing, 21, 1331–1364.

Hansen, M. C., & Reed, B. (2000). A comparison of the IGBP DISCover and University of Maryland 1 km global land cover products. International Journal of Remote Sensing, 21, 1365–1373.

Herold, M., Woodcock, C., Di Gregorio, A., Mayaux, P., A., B., Latham, J., et al. (in press). A joint initiative for harmonization and validation of land cover datasets. IEEE Transactions on Geoscience and Remote Sensing.

JRC. (2003). Global Land Cover 2000 database. European Commission, Joint Research Centre.

Krinner, G., Viovy, N., de Noblet-Ducoudré, N., Ogée, J., Polcher, J., Friedlingstein, P., et al. (2005). A dynamic global vegetation model for studies of the coupled atmosphere–biosphere system. Global Biogeochemical Cycles, 19(1).

Latifovic, R., & Olthof, I. (2004). Accuracy assessment using sub-pixel fractional error matrices of global land cover products derived from satellite data. Remote Sensing of Environment, 90, 153–165.

Loveland, T. R., Reed, B. C., Brown, J. F., Ohlen, D. O., Zhu, Z., Yang, L., et al. (2000). Development of a global land cover characteristics database and IGBP DISCover from 1 km AVHRR data. International Journal of Remote Sensing, 21, 1303–1330.

Mayaux, P., Bartholomé, E., Fritz, S., & Belward, A. (2004). A new land-cover map of Africa for the year 2000. Journal of Biogeography, 31, 861–877.

Mayaux, P., Strahler, A., Eva, H., Herold, M., Shefali, A., Naumov, S., et al. (in press). Validation of the Global Land Cover 2000 Map. IEEE Transactions on Geoscience and Remote Sensing.

MODIS land cover team. (2003). Validation of the consistent year 2003 V003 MODIS land cover product. Available at: http://geography.bu.edu/landcover/userguidelc/consistent.htm

Muchoney, D., Strahler, A., Hodges, J., & LoCastro, J. (1999). The IGBP DISCover confidence sites and the system for terrestrial ecosystem parameterization: Tools for validating global land-cover data. Photogrammetric Engineering and Remote Sensing, 65, 1061–1067.

Running, S. W., & Hunt, E. R. J. (1993). Generalization of a forest ecosystem process model for other biomes, BIOME-BGC, and an application for the global scale. In J. R. Ehleringer, & C. B. Field (Eds.), Scaling physiological processes: Leaf to globe (pp. 141–158). San Diego: Academic Press.

Running, S. W., Loveland, T. R., Pierce, L. L., Nemani, R., & Hunt, E. R. (1995). A remote-sensing based vegetation classification logic for global land-cover analysis. Remote Sensing of Environment, 51(1), 39–48.

Scepan, J. (1999). Thematic validation of high-resolution global land-cover data sets. Photogrammetric Engineering and Remote Sensing, 65, 1051–1060.

See, L., & Fritz, S. (in press). Towards a global hybrid land cover map for the year 2000.

Sitch, S., Smith, B., Prentice, I. C., Arneth, A., Bondeau, A., Cramer, W., et al. (2003). Evaluation of ecosystem dynamics, plant geography and terrestrial carbon cycling in the LPJ dynamic global vegetation model. Global Change Biology, 9, 161–185.

Smith, J. H., Wickham, J. D., Stehman, S. V., & Yang, L. (2002). Impacts of patch size and land-cover heterogeneity on thematic image classification accuracy. Photogrammetric Engineering and Remote Sensing, 68, 65–70.

Strahler, A. H., Muchoney, D., Borak, J., Friedl, M., Gopal, S., Lambin, E. (1999). MODIS Land Cover Product: Algorithm Theoretical Basis Document (ATBD) Version 5.0, Boston.

Thornton, P. E. (1998). Regional ecosystem simulation: Combining surface- and satellite-based observations to study linkages between terrestrial energy and mass budgets. University of Montana, Missoula. PhD dissertation.