Spatial Dependence and Multivariate Stratification for Improving Soil Carbon Estimates in the Piedmont of Georgia

by Luke Worsham, M.S. Candidate

Major Professor: Daniel Markewitz

Committee: Nate Nibbelink, Larry T. West

The Global Carbon Cycle

http://earthobservatory.nasa.gov/Library/CarbonCycle/carbon_cycle4.html

Carbon Cycle

• Soils represent the largest terrestrial pool of C: 1500 Pg C (twice the atmospheric pool; 300 times the annual atmospheric release)

• ~2 Pg C unaccounted for: the missing sink (β-factor)

• = 2 Pg C annual oceanic uptake

• Current rate of atmospheric increase is determined by the relative rate of ocean uptake
  - Uptake rate is increasing more slowly than the rate of emissions

Why estimate soil carbon?

• Soil carbon sequestration helps offset excess greenhouse gases in the atmosphere

• Carbon registries:
  - California Climate Action Registry (2001)
  - Chicago Climate Exchange (2002)
  - Georgia Carbon Sequestration Registry (2004)
  - Climate Registry (2007)
  - Regional Greenhouse Gas Initiative (2009)

What influences soil carbon?

• Input
  - Leaf litter deposition and decomposition
  - Belowground biomass

• Soil attributes
  - Clay content
  - Mineralogy

• Environmental covariates
  - Temperature
  - Moisture
  - Microorganism nutrient cycling

How do we currently estimate soil carbon content and change?

• Look-up tables
  - USFS developed a regional set of look-up tables
  - 40 biomes: compositions, age, productivity, history

• Models
  - FORCARB (Century; Roth-C) estimates
  - Carbon budget: land-use change and harvest predictions

• Direct sampling
  - Registries require small-area estimates, often for heterogeneous landscapes, which calls for direct sampling

Objectives

• How can we improve the efficiency, accuracy, and precision of direct soil sampling?

– Does the spatial dependence and structure of soil C content depend on landcover? (Part I)

– Does incorporating ancillary landscape attributes affect our ability to estimate soil C content? (Part II)

Part I: Using Spatial Dependence to Estimate Soil Carbon Contents Under Three Different Landcover Types in the Piedmont of Georgia

Does the spatial dependence and structure of soil C content depend on landcover?

Hypothesis: Pasture would demonstrate largest range and spatial structure.

What is spatial dependence and structure?

Tobler’s First Law of Geography: "Everything is related to everything else, but near things are more related than distant things."

- W. Tobler (1970)

• Spatial autocorrelation: quantified by a semivariogram

• Separated into bins by lag distances (h)
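As a minimal sketch of how a semivariogram is binned by lag distance, the Python below computes the classical (Matheron) estimator, γ(h) = Σ(z_i - z_j)² / (2·N(h)), where N(h) is the number of point pairs falling in the lag bin. The function name, the `lag_width` and `max_lag` parameters, and the toy transect are illustrative assumptions; this is not the ArcGIS or SAS procedure used in the study.

```python
import math
from collections import defaultdict


def empirical_semivariogram(coords, values, lag_width, max_lag):
    """Return (bin midpoint, semivariance, pair count) for each lag bin."""
    sq_diff_sums = defaultdict(float)
    pair_counts = defaultdict(int)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])  # separation distance of the pair
            if h >= max_lag:
                continue
            b = int(h // lag_width)  # index of the lag bin containing h
            sq_diff_sums[b] += (values[i] - values[j]) ** 2
            pair_counts[b] += 1
    return [
        ((b + 0.5) * lag_width, sq_diff_sums[b] / (2 * pair_counts[b]), pair_counts[b])
        for b in sorted(pair_counts)
    ]


# Toy transect (hypothetical values): three points spaced 1 unit apart
toy = empirical_semivariogram([(0, 0), (1, 0), (2, 0)], [0.0, 1.0, 3.0],
                              lag_width=1.0, max_lag=3.0)
```

Each tuple in `toy` is one point on the semivariogram plot: the x-value is the bin midpoint and the y-value is the semivariance for that bin.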

Methods

• Samples collected in BF Grant Memorial Forest during summer 2006

• 6 plots total; 2 per landcover, grouped into 2 blocks
  - Hardwood: > 70 years; oak-hickory type
  - Managed pine plantation: ~25 years old
  - Grazed pasture: > 70 years

• All plots occurred on Davidson loam or clay loam Ultisols (Kaolinitic Kandiudults)

• ...with the exception of the pine plot in block 2: Vance (Typic Hapludult) and Wilkes sandy loam (Typic Hapludalf)

Methods

(Cecil Series) (Wilkes Series)

Field Methods

• 64 sampling locations per plot in a cyclical pattern (Burrows et al., 2002) to facilitate semivariograms

• At each location:
  - 1 bulk density core (7.5 cm depth)
  - 3 samples augered (7.5 cm depth), composited
  - Leaf litter collected from each of 4 locations, composited

Lab Methods

• Bulk densities dried before weighing; corrected for roots, rocks

• Soil composites dried, sieved (2 mm), ground, and dry combusted to determine C & N contents

• Same for leaf litter

Analysis

• C & N concentrations (%) multiplied by bulk density (g cm-3) and sampling depth yield content to 7.5 cm

• These data, along with leaf litter, were added to the GIS attribute table for samples

• Generated semivariograms for each soil property at each plot using ArcGIS Geostatistical Analyst and SAS (PROC VARIOGRAM)

• Semivariograms were fit using spherical models to express spatial dependence and structure

• Averaged plot data and range/nugget:sill ratio were analyzed with one-way ANOVA (n = 2)
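The conversion from concentration and bulk density to content can be sketched as follows. This is a worked unit conversion under the assumption that content is expressed in kg ha-1 over the 7.5 cm sampling depth; the function name and the input values are hypothetical, not measurements from the study.

```python
def carbon_content_kg_ha(c_percent, bulk_density_g_cm3, depth_cm=7.5):
    """Convert C concentration (%) and bulk density (g cm-3) to C content (kg ha-1).

    fraction C * bulk density * depth gives g C per cm2 of ground surface.
    1 ha = 1e8 cm2 and 1 kg = 1e3 g, so the combined factor is 1e5.
    """
    return (c_percent / 100.0) * bulk_density_g_cm3 * depth_cm * 1e5


# Hypothetical sample: 2.0 % C, bulk density 1.0 g cm-3, 7.5 cm depth
example_content = carbon_content_kg_ha(2.0, 1.0)
```

Because the calculation is applied per sampling location, the plot mean of the products will generally differ slightly from the product of the plot means.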

[Figure: Distribution of Pairwise Distances. x-axis: Midpoint of Intervals (bin midpoints from 2.1 to 141.4); y-axis: Frequency Count (0 to 300)]

64 samples: number of pairs = 64! / (2!(64 – 2)!) = 2016
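The pair count above, and the kind of distance histogram it feeds, can be reproduced with a short Python sketch. The helper names and the four-point example are illustrative assumptions; the study's actual sampling coordinates are not reproduced here.

```python
import math
from collections import Counter


def pair_count(n):
    """Number of unordered pairs among n samples: n! / (2!(n - 2)!)."""
    return math.comb(n, 2)


def distance_histogram(coords, bin_width):
    """Count point pairs per distance bin; keys are bin indices (distance // bin_width)."""
    counts = Counter()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            counts[int(math.dist(coords[i], coords[j]) // bin_width)] += 1
    return dict(sorted(counts.items()))


n_pairs_64 = pair_count(64)

# Hypothetical layout: corners of a unit square (4 sides of length 1, 2 diagonals)
square_hist = distance_histogram([(0, 0), (1, 0), (0, 1), (1, 1)], bin_width=1.0)
```

The histogram counts always sum to the total number of pairs, which is a quick check that no pair was dropped during binning.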

What does a semivariogram tell us?

• Correlation threshold: major range

• Overall variability: sill

• Degree of measurement error or micro-scale variation: nugget effect

• Strength, or amount, of spatial dependence: nugget:sill ratio
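To make these four parameters concrete, here is a minimal sketch of a spherical semivariogram model in Python. It assumes the sill is the total sill (nugget plus structured variance), which matches nugget:sill ratios that never exceed 1; the function and parameter names are illustrative. The numbers in the example (nugget 0.0058, sill 0.0083, range 79.5 m) are one bulk density row from the results.

```python
def spherical_semivariance(h, nugget, sill, major_range):
    """Spherical model: rises from the nugget and levels off at the sill at the major range."""
    if h <= 0:
        return 0.0  # semivariance is zero at zero separation by definition
    if h >= major_range:
        return sill  # beyond the range, pairs are no longer spatially correlated
    r = h / major_range
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def nugget_sill_ratio(nugget, sill):
    """Share of total variability that is unstructured; lower means stronger spatial dependence."""
    return nugget / sill


ratio = round(nugget_sill_ratio(0.0058, 0.0083), 2)
at_40_m = spherical_semivariance(40.0, 0.0058, 0.0083, 79.5)
beyond_range = spherical_semivariance(100.0, 0.0058, 0.0083, 79.5)
```

A ratio near 1 means almost all variability is nugget, so a kriged surface gains little from neighboring samples.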

Results

Bulk Density (g cm-3)

• Landcover* (p < 0.01)

• Block* (p < 0.03)

• Highest for pasture across blocks

• Major range and spatial structure were not affected by landcover or block

Block  Landcover  Mean (g cm-3)  St. Error  Range (m)  Nugget  Sill    Nugget:Sill
1      Hardwood   1.09           0.02       39.4       0.0144  0.0179  0.81
       Pine       1.19           0.02       28.8       0.0111  0.0176  0.63
       Pasture    1.52*          0.01       79.5       0.0058  0.0083  0.70
2      Hardwood   0.98           0.02       98.8       0.0136  0.0160  0.85
       Pine       1.11           0.02       98.8       0.0089  0.0180  0.49
       Pasture    1.37*          0.02       77.9       0.0106  0.0191  0.56

C concentration (%)

• Landcover* (p < 0.03)

• Block (p < 0.92)

• Highest for hardwood across blocks; lowest for pasture

• Major range and spatial structure were not affected by landcover or block

Block  Landcover  Mean (%)  St. Error  Range (m)  Nugget  Sill   Nugget:Sill
1      Hardwood   3.67      0.09       98.8       0.332   0.552  0.60
       Pine       2.34      0.08       98.8       0.354   0.422  0.84
       Pasture    1.95      0.04       42.9       0.066   0.119  0.56
2      Hardwood   4.10      0.09       98.8       0.204   0.600  0.34
       Pine       2.31      0.10       98.8       0.251   1.215  0.29
       Pasture    1.62      0.04       98.8       0.042   0.257  0.28

Total C Content (kg ha-1)

• Landcover (p < 0.28)

• Block* (p < 0.06)

• Highest for hardwood across blocks (p < 0.04)

• Major range and spatial dependence were not affected by landcover or block

Block  Landcover  Mean (kg ha-1)  St. Error  Range (m)  Nugget       Sill         Nugget:Sill
1      Hardwood   29884           896        64.0       3.95 × 10^7  5.41 × 10^7  0.73
       Pine       20694           609        28.8       2.18 × 10^7  2.28 × 10^7  0.96
       Pasture    22216           611        79.5       9.62 × 10^6  1.73 × 10^7  0.56
2      Hardwood   29865           774        98.8       2.53 × 10^7  4.19 × 10^7  0.60
       Pine       18932           839        98.8       2.86 × 10^7  4.61 × 10^7  0.62
       Pasture    16558           428        77.9       6.27 × 10^6  1.32 × 10^7  0.47

Leaf Litter C Concentration (%)

• Landcover (p < 0.13)

• Block (p < 0.76)

• Major range and spatial dependence were not affected by landcover or block

Block  Landcover  Mean (kg ha-1)  St. Error  Range (m)  Nugget  Sill    Nugget:Sill
1      Hardwood   29.91           0.72       94.6       33.3    33.3    1.00
       Pine       33.96           1.19       80.4       59.8    65.6    0.91
       Pasture    -               -          -          -       -       -
2      Hardwood   30.33           0.95       98.8       22.2    75.1    0.30
       Pine       33.00           1.15       39.7       0.0259  0.0307  0.84
       Pasture    -               -          -          -       -       -

Summary

• Landcover
  - Only significant for bulk density and soil C concentration, but not soil C content

• Major range (for soil C content)
  - Within the scale of the plots only for pasture in both blocks (medium structure)
  - Forested plots were inconsistent; 98.8 m in block 2 for C content (maximum lag with weak structure)
    • Suggests dependence below or above the scale of the plot (limited by 10 – 100 m point separation)
    • Supported by high nugget effect
    • Other studies have shown variation in C at scales < 10 m (Schöning et al., 2006; Liski, 1995)

• Overall, inconsistencies of spatial structure and dependence between landcovers suggest the influence of other variables, such as topography (Moore et al., 1993; Gessler et al., 2000; Thompson & Kolka, 2005)

Applications: How do we incorporate spatial dependence for C content estimates?

• Kriged surfaces incorporate spatial dependence in their estimates and are continuous

• Block kriging averages the kriged surface over the plot to create a single estimate

Applications

Block Krige Estimate

Block  Landcover  Total C Content (kg ha-1)  St. Error (kg ha-1)
1      Hardwood   29608                      6902
       Pine       20770                      4840
       Pasture    22369                      3822
2      Hardwood   30252                      5564
       Pine       19538                      5400
       Pasture    16785                      2885

• However, the quality of kriged estimates is related to spatial structure
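Block kriging as described here can be sketched in plain Python. This is a minimal ordinary block kriging example under stated assumptions: a spherical model with a total sill, a block discretized into a handful of nodes, and a small hand-written linear solver so the snippet needs no external libraries. All names, coordinates, and values are hypothetical, and this is not the ArcGIS Geostatistical Analyst routine used in the study.

```python
import math


def spherical(h, nugget, sill, major_range):
    """Spherical semivariogram; `sill` is taken to be the total sill (assumption)."""
    if h <= 0:
        return 0.0
    if h >= major_range:
        return sill
    r = h / major_range
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def block_krige(points, values, block_nodes, gamma):
    """Ordinary block kriging: returns (block estimate, sample weights)."""
    n = len(points)
    # Left-hand side: semivariances between samples, bordered by the unbiasedness constraint
    a = [[gamma(math.dist(points[i], points[j])) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Right-hand side: average semivariance between each sample and the block nodes
    b = [sum(gamma(math.dist(p, q)) for q in block_nodes) / len(block_nodes) for p in points]
    weights = solve(a, b + [1.0])[:n]  # drop the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, values)), weights


# Hypothetical samples (coordinates in m, C content in kg ha-1)
points = [(0, 0), (50, 0), (0, 50), (50, 50), (25, 25)]
values = [20000.0, 22000.0, 21000.0, 25000.0, 23000.0]
nodes = [(x, y) for x in (10, 20, 30, 40) for y in (10, 20, 30, 40)]
gamma = lambda h: spherical(h, nugget=0.0, sill=1.0, major_range=80.0)
estimate, weights = block_krige(points, values, nodes, gamma)
```

The weights sum to 1, so the estimate is a weighted average of the samples. With a nugget close to the sill, the weights drift toward equal values and the block estimate approaches the simple plot mean.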

Conclusions

• Spatial dependence was not well defined by landcover
  - Factors other than landcover (e.g., topography) most likely play a significant role in determining the spatial structure of soil C content

• Many soil properties demonstrated ranges equal to the maximum lag
  - Suggests dependencies above or below the scale of the plot

• Higher standard errors in forested plots for soil C concentration and content suggest the need for more intensive sampling due to local heterogeneities

• Further information about spatial structure and dependence would be necessary in these landcovers for kriging estimates to be useful

• Kriging is more useful for pasture estimates
  - Stronger spatial structure; consistent ranges

Part II: A Comparison of Four Landscape Sampling Methods to Estimate Soil Carbon

Questions:

Does the manner in which samples are located affect our ability to estimate soil C content?

Does incorporating ancillary landscape attributes affect our ability to estimate soil C content?

Introduction

• Builds on study by Minasny & McBratney (2006)

• Tested relative ability of random, stratified, and cLHS sampling to approximate ancillary data distributions

• But what if ancillary data are covariates for a different variable of interest, such as soil C content...
  - Will extra stratification in the presence of data afford better estimates?

Introduction

A comparison of 4 different landscape sampling methods using 5 different sampling sizes:

Sampling Methods

• Random Sampling
• Systematic Random Sampling
• Stratified Random Sampling
• Conditioned Latin Hypercube Sampling

Sampling Sizes

- 10 samples (1%)
- 40 samples (5%)
- 100 samples (12%)
- 300 samples (35%)
- 500 samples (58%)
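The first three designs can be sketched on a 21 by 43 grid of 903 locations. This is a minimal illustration under several assumptions: the systematic design is a one-dimensional fixed-interval selection over the row-major ordering with a random start, the stratified design uses proportional allocation, and the landcover strata are hypothetical column bands rather than the study's mapped landcover.

```python
import random

ROWS, COLS = 21, 43  # 21 by 43 grid = 903 candidate locations
locations = [(r, c) for r in range(ROWS) for c in range(COLS)]


def simple_random(locs, n, rng):
    """Draw n locations without replacement, ignoring all landscape variables."""
    return rng.sample(locs, n)


def systematic(locs, n, rng):
    """Random start, then a constant spacing through the row-major ordering."""
    step = len(locs) / n
    start = rng.random() * step
    return [locs[min(int(start + k * step), len(locs) - 1)] for k in range(n)]


def stratified_random(locs, strata, n, rng):
    """Random sample within each stratum, allocated in proportion to stratum size."""
    groups = {}
    for loc, label in zip(locs, strata):
        groups.setdefault(label, []).append(loc)
    sample = []
    for members in groups.values():
        k = max(1, round(n * len(members) / len(locs)))  # rounding may shift the total slightly
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


rng = random.Random(42)
# Hypothetical strata: three landcover bands assigned by grid column
strata = ["pine" if c < 15 else "hardwood" if c < 30 else "pasture" for (_, c) in locations]
srs = simple_random(locations, 40, rng)
sys_sample = systematic(locations, 40, rng)
strat_sample = stratified_random(locations, strata, 40, rng)
```

Proportional allocation keeps each stratum's share of the sample close to its share of the landscape, which is the property the stratified design relies on.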

Methods

• Samples collected in BF Grant Memorial Forest during summer 2007

• Single plot with 903 sampling locations on 10x10 grid

• Same sampling scheme: 1 bulk density, 3 soil composites (1, 2.5, 5 m; 120°)

• Same lab prep: dry combustion to determine C & N concentrations

• Combined with bulk densities for total C content

Methods

Landcover     Sample Space  BF Grant
Managed Pine  33%           24%
Natural Pine  27%           28%
Hardwood      32%           26%
Pasture       8%            14%
Other         0%            7%

21 by 43 plots

Landscape Variables

Landcover

Soil Series

Planiform Curvature

Slope (%)

What’s a Latin hypercube?

Latin square

Hypercube: > 3 dimensions

Minasny, B., McBratney, A.B., 2006. A conditioned Latin hypercube method for sampling in the presence of ancillary information. Computers & Geosciences 32, 1378-1388.
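The core idea of Latin hypercube sampling with ancillary data can be sketched as follows: split each covariate into n quantile strata and look for n locations that put one sample in every stratum of every covariate. The code below is a simplified greedy swap search over that objective only. It is not the published cLHS algorithm, which uses simulated annealing and additional terms for categorical variables and correlation. The covariates (a slope-like and a curvature-like variable) are randomly generated stand-ins.

```python
import bisect
import random


def strata_edges(column, n):
    """n - 1 interior quantile cut points dividing a covariate into n strata."""
    s = sorted(column)
    return [s[(k * len(s)) // n] for k in range(1, n)]


def objective(sample_idx, data, edges):
    """Sum over covariates and strata of |samples in stratum - 1|; 0 is a perfect Latin hypercube."""
    n = len(sample_idx)
    total = 0
    for v, e in enumerate(edges):
        counts = [0] * n
        for i in sample_idx:
            counts[bisect.bisect_right(e, data[i][v])] += 1
        total += sum(abs(c - 1) for c in counts)
    return total


def greedy_lhs(data, n, iters=2000, seed=0):
    """Swap one sampled location at a time, keeping swaps that do not worsen the objective."""
    rng = random.Random(seed)
    edges = [strata_edges([row[v] for row in data], n) for v in range(len(data[0]))]
    sample = rng.sample(range(len(data)), n)
    start_score = best = objective(sample, data, edges)
    for _ in range(iters):
        pos, cand = rng.randrange(n), rng.randrange(len(data))
        if cand in sample:
            continue
        trial = sample[:]
        trial[pos] = cand
        score = objective(trial, data, edges)
        if score <= best:
            sample, best = trial, score
    return sample, start_score, best


gen = random.Random(1)
# Hypothetical covariates for 903 locations: (slope-like, curvature-like)
covariates = [(gen.uniform(0, 30), gen.gauss(0, 1)) for _ in range(903)]
chosen, start_score, final_score = greedy_lhs(covariates, 40)
```

A lower final score means the 40 chosen locations cover the range of each covariate more evenly than the random starting sample did.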

Selecting Sampling Locations: Simple Random

• No variable consideration

n = 10 n = 40 n = 100 n = 300 n = 500

Selecting Sampling Locations: Systematic Sampling

• Random start; consistent spacing by row

n = 10 n = 40 n = 100 n = 300 n = 500

Selecting Sampling Locations

Stratified Random cLHS

n = 10 n = 40 n = 10 n = 100

Results

• Random and systematic overestimate the mean at low sample size

• Stratified and cLHS underestimate the mean at low sample size

• No estimate excludes the population mean from its 95% confidence interval

• Systematic sampling yielded the largest confidence intervals

• All methods converge at larger sample sizes

• Lowest sampling size (n = 10) is a poor approximation for any method

• cLHS remains most consistent at small sample sizes (n = 40, 100)

• cLHS provides a better approximation of distribution tails

Results

[Figure legend: Population, N = 10, N = 40, N = 100, N = 300, N = 500]

Results

[Figure panels: Stratified Random, cLHS; legend: Population, N = 40, N = 100, N = 300, N = 500]

Results

• cLHS, stratified, and random provide close estimates of the mean at small sample sizes

• Systematic consistently overestimates the mean; least accurate for mean and std. dev.

• cLHS approximates the median well at small sample sizes, along with stratified at larger n

Total C Content

Sample size  Statistic  Population  cLHS   Stratified  Systematic  Random
N = 10       Mean       23997       22719  21619       42191       25064
             St. Dev.   10346       5349   6445        34892       5176
             Median     22759       22148  21658       26356       25296
N = 40       Mean       23997       25070  23587       25987       24756
             St. Dev.   10346       8912   6487        16411       9697
             Median     22759       23204  23222       21728       23777
N = 100      Mean       23997       23545  24414       24220       24649
             St. Dev.   10346       9133   11335       12107       12145
             Median     22759       22344  22700       22876       22927
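The check that a sample's 95% confidence interval contains the population mean can be sketched as below. The sketch uses a normal approximation (z ≈ 1.96) rather than a t-distribution, which is an assumption that understates the interval at n = 10; the function name is illustrative. The inputs are the cLHS values at N = 100 as read from the Total C Content table (mean 23545, st. dev. 9133) and the population mean 23997.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(mean, st_dev, n, level=0.95):
    """Normal-approximation confidence interval for a sample mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * st_dev / math.sqrt(n)     # z times the standard error of the mean
    return mean - half_width, mean + half_width


population_mean = 23997
lower, upper = mean_confidence_interval(23545, 9133, 100)
contains_population_mean = lower <= population_mean <= upper
```

Because the standard error shrinks with the square root of n, a design with a smaller standard deviation at the same n gives a proportionally narrower interval.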

Sample Size Considerations

•Sample sizes n = 300 & n = 500 are unrealistically large; estimates have converged → utility of stratification techniques at much smaller sizes

•Often commercial scale sampling conducted on 1-ha grids (Kravchenko, 2003; Mueller et al., 2001), but often this scale is regarded as marginal or insufficient to estimate soil properties (Mueller et al., 2001; Hammond, 1993; Wollenhaupt, 1994)

•Therefore, n = 40 and n = 100, representing 5% and 12% sampled area, represent most realistic proportions

Conclusions

Stratified design showed the narrowest confidence interval / best mean approximation when n = 40

cLHS demonstrated the most accurate mean, standard deviation, population distribution approximation, and narrowest confidence interval when n = 100

Might expect a threshold below n = 100 at which cLHS affords better estimation than stratified due to extra variables

Or, additional continuous variables may offer less predictive ability than discrete variables

Extra effort of systematic sampling appears inadequate at all sizes → even simple random is preferable

At mid-sample sizes (realistic sizes), some degree of stratification affords better estimates, and is therefore recommended

Ancillary data can possibly improve efficiency of sampling and accuracy of estimates for minimal additional effort

Take-Home Message

Part I: Landcover alone does not completely describe spatial attributes of soil C content; variation also exists outside the scale of our plots

Part II: Stratified sampling methods (stratified random, cLHS) with consideration of ancillary variables may provide more accurate estimates of soil C content

The End

Acknowledgments: Daniel Markewitz, Nate Nibbelink, Larry West, Emily Blizzard, Danny Figueroa, Erin Moore, Scott Devine, Marco Galang, Sami Rifai, Patrick Bussell, Budiman Minasny, Dustin Thompson, Jay Brown, my family, my friends, and many others...

Questions?
