Spatial Dependence and Multivariate Stratification for Improving Soil
Carbon Estimates in the Piedmont of Georgia
by Luke Worsham, M.S. Candidate
Major Professor: Daniel Markewitz
Committee: Nate Nibbelink, Larry T. West
The Global Carbon Cycle
http://earthobservatory.nasa.gov/Library/CarbonCycle/carbon_cycle4.html
Carbon Cycle
• Soils represent the largest terrestrial pool of C: 1500 Pg C (twice the atmospheric pool; 300 times the annual atmospheric release)
• ~2 Pg C unaccounted for: missing sink (β-factor)
• = 2 Pg C annual oceanic uptake
• Current rate of atmospheric increase determined by relative rate of ocean uptake
- Uptake rate increasing more slowly than rate of emissions
Why estimate soil carbon?
• Soil carbon sequestration helps offset excess greenhouse gases in the atmosphere
• Carbon registries: Chicago Climate Exchange (2002); California Climate Action Registry (2001); Georgia Carbon Sequestration Registry (2004); Climate Registry (2007); Regional Greenhouse Gas Initiative (2009)
What influences soil carbon?
• Input
- Leaf litter deposition and decomposition
- Belowground biomass
• Soil attributes
- Clay content
- Mineralogy
• Environmental covariates
- Temperature
- Moisture
- Microorganism nutrient cycling
How do we currently estimate soil carbon content and change?
• Look-up tables
• USFS developed regional set of look-up tables
– 40 biomes: compositions, age, productivity, history
• Models
• FORCARB (Century; Roth-C) estimates
– Carbon budget: land-use change and harvest predictions
• Direct sampling
• Registries require small-area estimates, often of heterogeneous landscapes → direct sampling
Objectives
• How can we improve the efficiency, accuracy, and precision of direct soil sampling?
– Does the spatial dependence and structure of soil C content depend on landcover? (Part I)
– Does incorporating ancillary landscape attributes affect our ability to estimate soil C content? (Part II)
Part I:
Using Spatial Dependence to Estimate Soil Carbon Contents
Under Three Different Landcover Types in the Piedmont of Georgia
Does the spatial dependence and structure of soil C content depend on landcover?
Hypothesis: Pasture would demonstrate largest range and spatial structure.
What is spatial dependence and structure?
Tobler's First Law of Geography: "Everything is related to everything else, but near things are more related than distant things." - W. Tobler (1970)
• Spatial autocorrelation: quantified by a semivariogram
• Sample pairs are separated into bins by lag distance (h)
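As a minimal sketch of the binning idea above (not the ArcGIS or SAS procedure used in the study), the empirical semivariance for each lag bin can be computed as half the mean squared difference between paired values. The function name, the bin width, and the four-point transect are illustrative assumptions, not study data.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, values, bin_width, max_lag):
    """Return {bin midpoint: (semivariance, pair count)}.

    For each lag bin h, gamma(h) = sum((z_i - z_j)**2) / (2 * N(h)),
    where N(h) is the number of point pairs whose separation falls in the bin.
    """
    sq_diff_sums = defaultdict(float)
    pair_counts = defaultdict(int)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            if h >= max_lag:
                continue  # ignore pairs beyond the maximum lag
            k = int(h // bin_width)  # index of the lag bin
            sq_diff_sums[k] += (values[i] - values[j]) ** 2
            pair_counts[k] += 1
    return {
        (k + 0.5) * bin_width: (sq_diff_sums[k] / (2 * pair_counts[k]), pair_counts[k])
        for k in sorted(pair_counts)
    }


# Made-up transect: four points 4 m apart (hypothetical values, not study data).
pts = [(0.0, 0.0), (4.0, 0.0), (8.0, 0.0), (12.0, 0.0)]
z = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_semivariogram(pts, z, bin_width=5.0, max_lag=15.0)

# 64 sampling locations per plot give C(64, 2) unique pairs.
n_pairs = math.comb(64, 2)
```

Semivariance that rises with lag, as in this toy transect, is the signature of spatial dependence: nearby samples are more alike than distant ones.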
Methods
• Samples collected in BF Grant Memorial Forest during summer 2006
• 6 plots total; 2 per landcover, grouped into 2 blocks
– Hardwood: > 70 years; oak-hickory type
– Managed pine plantation: ~25 years old
– Grazed pasture: > 70 years
• All plots occurred on Davidson loam or clay loam ultisols (Kaolinitic Kandiudults)
• Exception: the pine plot in block 2, on Vance (Typic Hapludult) and Wilkes sandy loam (Typic Hapludalf)
Methods
(Cecil Series) (Wilkes Series)
• 64 Sampling Locations per plot in cyclical pattern (Burrows et al., 2002) to facilitate semivariograms
Field Methods
• At each location:
– 1 bulk density core (7.5 cm depth)
– 3 samples augered (7.5 cm depth), composited
– Leaf litter collected from each of 4 locations, composited
Field Methods
• Bulk densities dried before weighing; corrected for roots, rocks
• Soil composites dried, sieved (2mm), ground, and dry combusted to determine C & N contents
• Same for leaf litter
Lab Methods
• C & N concentrations (%) multiplied by bulk density (g cm-3) and sampling depth yield content to 7.5 cm
• These data, along with leaf litter, were added to GIS attribute table for samples
• Generated semivariograms for each soil property at each plot using ArcGIS Geostatistical Analyst and SAS (Proc VARIOGRAM)
• Semivariograms were fit using spherical models to express spatial dependence and structure
• Averaged plot data and range/nugget:sill ratio were analyzed with One-way ANOVA (n=2)
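The concentration-to-content conversion above can be sketched as a short unit calculation. The function name and the input values are illustrative assumptions (not study data); only the 7.5 cm depth comes from the slides.

```python
def c_content_kg_per_ha(c_percent, bulk_density_g_cm3, depth_cm=7.5):
    """Convert C concentration (%) and bulk density (g cm-3) to C content (kg ha-1)."""
    # Mass of C under one square centimetre of surface, down to depth_cm.
    g_c_per_cm2 = (c_percent / 100.0) * bulk_density_g_cm3 * depth_cm
    # 1 ha = 1e8 cm2 and 1 kg = 1e3 g, so g cm-2 -> kg ha-1 is a factor of 1e5.
    return g_c_per_cm2 * 1e5


# Illustrative inputs: 2.0 % C and a bulk density of 1.2 g cm-3.
example_content = c_content_kg_per_ha(c_percent=2.0, bulk_density_g_cm3=1.2)
```

With these inputs the result is about 18,000 kg ha-1, the same order of magnitude as the plot means reported in the results.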
Analysis
[Figure: Distribution of Pairwise Distances. Frequency count (0 to 300) by midpoint of lag interval (2.1 to 141.4).]
64 samples: number of pairs = 64! / (2!(64 – 2)!) = 2016
What does a semivariogram tell us?
• Correlation threshold: major range
• Overall variability: sill
• Degree of measurement error or micro-scale variation: nugget effect
• Strength, or amount of spatial dependence: nugget:sill ratio
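To make the roles of range, sill, and nugget concrete, here is a minimal sketch of the spherical model named in the analysis slide. It assumes `sill` means the total sill (nugget included), which is consistent with the tabulated nugget:sill ratios being nugget divided by sill; the parameter values are illustrative, not fitted values from the study.

```python
def spherical(h, nugget, sill, major_range):
    """Spherical semivariogram; `sill` is assumed to be the total sill."""
    if h <= 0:
        return 0.0  # semivariance at zero separation is zero by definition
    if h >= major_range:
        return sill  # beyond the range, samples are no longer correlated
    r = h / major_range
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def nugget_to_sill(nugget, sill):
    """Share of total variability not spatially structured at the sampled scale."""
    return nugget / sill


# Illustrative parameters (hypothetical, not from the study).
params = dict(nugget=0.2, sill=1.0, major_range=80.0)
half_range_value = spherical(40.0, **params)
ratio = nugget_to_sill(params["nugget"], params["sill"])
```

A ratio near 0 indicates strong spatial structure; a ratio near 1 means almost all variability behaves like nugget, so neighboring samples carry little information about each other.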
Results
Bulk Density (g cm-3)
• Landcover* (p < 0.01)
• Block* (p < 0.03)
• Highest for pasture across blocks
• Major range and spatial structure were not affected by landcover or block

Block  Landcover  Mean (g cm-3)  St. Error  Range (m)  Nugget  Sill    Nugget:Sill
1      Hardwood   1.09           0.02       39.4       0.0144  0.0179  0.81
1      Pine       1.19           0.02       28.8       0.0111  0.0176  0.63
1      Pasture    1.52*          0.01       79.5       0.0058  0.0083  0.70
2      Hardwood   0.98           0.02       98.8       0.0136  0.0160  0.85
2      Pine       1.11           0.02       98.8       0.0089  0.0180  0.49
2      Pasture    1.37*          0.02       77.9       0.0106  0.0191  0.56
C concentration (%)
• Landcover* (p < 0.03)
• Block (p < 0.92)
• Highest for hardwood across blocks; lowest for pasture
• Major range and spatial structure were not affected by landcover or block

Block  Landcover  Mean (%)  St. Error  Range (m)  Nugget  Sill   Nugget:Sill
1      Hardwood   3.67      0.09       98.8       0.332   0.552  0.60
1      Pine       2.34      0.08       98.8       0.354   0.422  0.84
1      Pasture    1.95      0.04       42.9       0.066   0.119  0.56
2      Hardwood   4.10      0.09       98.8       0.204   0.600  0.34
2      Pine       2.31      0.10       98.8       0.251   1.215  0.29
2      Pasture    1.62      0.04       98.8       0.042   0.257  0.28
Total C Content (kg ha-1)
• Landcover (p < 0.28)
• Block* (p < 0.06)
• Highest for hardwood across blocks (p < 0.04)
• Major range and spatial dependence were not affected by landcover or block

Block  Landcover  Mean (kg ha-1)  St. Error  Range (m)  Nugget       Sill         Nugget:Sill
1      Hardwood   29884           896        64.0       3.95 × 10^7  5.41 × 10^7  0.73
1      Pine       20694           609        28.8       2.18 × 10^7  2.28 × 10^7  0.96
1      Pasture    22216           611        79.5       9.62 × 10^6  1.73 × 10^7  0.56
2      Hardwood   29865           774        98.8       2.53 × 10^7  4.19 × 10^7  0.60
2      Pine       18932           839        98.8       2.86 × 10^7  4.61 × 10^7  0.62
2      Pasture    16558           428        77.9       6.27 × 10^6  1.32 × 10^7  0.47
Leaf Litter C Concentration (%)
• Landcover (p < 0.13)
• Block (p < 0.76)
• Major range and spatial dependence were not affected by landcover or block

Block  Landcover  Mean (kg ha-1)  St. Error  Range (m)  Nugget  Sill    Nugget:Sill
1      Hardwood   29.91           0.72       94.6       33.3    33.3    1.00
1      Pine       33.96           1.19       80.4       59.8    65.6    0.91
1      Pasture    -               -          -          -       -       -
2      Hardwood   30.33           0.95       98.8       22.2    75.1    0.30
2      Pine       33.00           1.15       39.7       0.0259  0.0307  0.84
2      Pasture    -               -          -          -       -       -
Summary
• Landcover
• Significant only for bulk density & soil C concentration, not for soil C content
• Major range (for soil C content)
• Within the scale of the plots only for pasture in both blocks (medium structure)
• Forested plots were inconsistent: 98.8 m in block 2 for C content (maximum lag with weak structure)
- Suggests dependence below or above the scale of the plot (limited by 10 – 100 m point separation)
- Supported by high nugget effect
- Other studies have shown variation in C at scales < 10 m (Schöning et al., 2006; Liski, 1995)
• Overall, inconsistencies in spatial structure and dependence between landcovers suggest the influence of other variables, such as topography (Moore et al., 1993; Gessler et al., 2000; Thompson & Kolka, 2005)
Applications: How do we incorporate spatial dependence for C content estimates?
• Kriged surfaces incorporate spatial dependence in their estimates and are continuous
• Block kriging averages the kriged surface over the plot to create a single estimate
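A rough sketch of how a plot-level kriged estimate can be formed, assuming ordinary kriging with a spherical model and approximating the block mean by averaging point predictions over cells inside the plot. This is not the ArcGIS Geostatistical Analyst workflow used in the study; the function names, model parameters, and data are hypothetical, and the sketch does not compute the block kriging standard error.

```python
import math


def spherical(h, nugget, sill, major_range):
    """Spherical semivariogram; `sill` is assumed to be the total sill."""
    if h <= 0:
        return 0.0
    if h >= major_range:
        return sill
    r = h / major_range
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ordinary_kriging(points, values, target, model):
    """Return (estimate, weights) at `target`; the weights sum to 1."""
    n = len(points)
    # Semivariogram form of the kriging system, with a Lagrange multiplier row.
    a = [[model(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [model(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * v for w, v in zip(weights, values)), weights


def block_mean(points, values, block_cells, model):
    """Approximate the block estimate as the mean of point predictions."""
    preds = [ordinary_kriging(points, values, c, model)[0] for c in block_cells]
    return sum(preds) / len(preds)


def model(h):
    # Hypothetical fitted model: no nugget, unit sill, 50 m range.
    return spherical(h, nugget=0.0, sill=1.0, major_range=50.0)


# Made-up C contents (kg ha-1) at the corners of a 10 m square.
pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
c_content = [20000.0, 22000.0, 24000.0, 26000.0]
center_estimate, weights = ordinary_kriging(pts, c_content, (5.0, 5.0), model)
cells = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]
plot_mean = block_mean(pts, c_content, cells, model)
```

Because the weights depend on the fitted semivariogram, a poorly defined model (high nugget, range at the maximum lag) carries over directly into the kriged estimate.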
Block Krige Estimate
Block  Landcover  Total C Content (kg ha-1)  St. Error (kg ha-1)
1      Hardwood   29608                      6902
1      Pine       20770                      4840
1      Pasture    22369                      3822
2      Hardwood   30252                      5564
2      Pine       19538                      5400
2      Pasture    16785                      2885
• However, the quality of kriged estimates is related to spatial structure
Applications
• Spatial dependence was not well defined by landcover
- Factors other than landcover (e.g., topography) most likely play a significant role in determining the spatial structure of soil C content
• Many soil properties demonstrated ranges = maximum lag
- Suggests dependencies > or < scale of plot
• Higher standard errors in forested plots for soil C concentration and content suggest necessity for more intensive sampling due to local heterogeneities
• Further information about spatial structure and dependence would be necessary in these landcovers for kriging estimates to be useful
• Kriging more useful in pasture estimates
- Stronger spatial structure; consistent ranges
Conclusions
Part II:
A Comparison of Four Landscape Sampling Methods to Estimate Soil
Carbon
Does the manner in which samples are located affect our ability to estimate soil C content?
Does incorporating ancillary landscape attributes affect our ability to estimate soil C content?
Questions:
Introduction
• Builds on study by Minasny & McBratney (2006)
•Tested relative ability of random, stratified, and cLHS sampling to approximate ancillary data distributions
•But what if ancillary data are covariates for a different variable of interest, such as soil C content...
-Will extra stratification in the presence of data afford better estimates?
Introduction
A comparison of 4 different landscape sampling methods using 5 different sampling sizes:
Sampling methods:
• Random sampling
• Systematic random sampling
• Stratified random sampling
• Conditioned Latin hypercube sampling (cLHS)
Sampling sizes:
• 10 samples (1%)
• 40 samples (5%)
• 100 samples (12%)
• 300 samples (35%)
• 500 samples (58%)
• Samples collected in BF Grant Memorial Forest during summer 2007
• Single plot with 903 sampling locations on 10x10 grid
• Same sampling scheme: 1 Bulk Density, 3 Soil Composites (1, 2.5, 5m; 120°)
• Same lab prep: dry combustion to determine C & N concentrations
• Combined with bulk densities for total C content
Methods
Landcover      Sample Space   BF Grant
Managed Pine   33%            24%
Natural Pine   27%            28%
Hardwood       32%            26%
Pasture        8%             14%
Other          0%             7%
21 by 43 plots
Methods
Landscape Variables
Landcover
Soil Series
Planiform Curvature
Slope (%)
What's a Latin hypercube?
[Figure: Latin square → hypercube (> 3 dimensions)]
Minasny, B., McBratney, A.B., 2006. A conditioned Latin hypercube method for sampling in the presence of ancillary information. Computers & Geosciences 32, 1378-1388.
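As a minimal illustration of the Latin hypercube idea (not the conditioned variant of Minasny & McBratney, which selects from existing locations using an optimization search), each variable is divided into n equal strata and every stratum is sampled exactly once. The function name and the unit-cube scaling are assumptions made for the sketch.

```python
import random


def latin_hypercube(n, d, rng):
    """Return n points in [0, 1)^d with one point per stratum in every dimension."""
    columns = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)  # pair the strata randomly across dimensions
        # Place one value uniformly inside each of the n equal-width strata.
        columns.append([(s + rng.random()) / n for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]


rng = random.Random(0)  # arbitrary seed for reproducibility
sample = latin_hypercube(n=5, d=3, rng=rng)
```

In two dimensions this reduces to a Latin square: each row and each column of the n by n grid holds exactly one sample.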
Selecting Sampling Locations: Simple Random
• No variable consideration
[Figure panels: n = 10, n = 40, n = 100, n = 300, n = 500]
Selecting Sampling Locations: Systematic Sampling
• Random start; consistent spacing by row
[Figure panels: n = 10, n = 40, n = 100, n = 300, n = 500]
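The two selection rules above can be sketched on the 21 by 43 grid as follows. The fractional step in `systematic` is one plausible reading of "random start; consistent spacing by row", not necessarily the study's exact procedure; the function names and seed are illustrative.

```python
import random

ROWS, COLS = 21, 43  # plot grid from the slides: 21 by 43 = 903 locations
cells = [(r, c) for r in range(ROWS) for c in range(COLS)]  # row-major order


def simple_random(n, rng):
    """Draw n distinct cells with equal probability; no ancillary variables used."""
    return rng.sample(cells, n)


def systematic(n, rng):
    """Random start, then a constant (possibly fractional) step along the rows."""
    step = len(cells) / n
    start = rng.random() * step
    indices = [min(int(start + i * step), len(cells) - 1) for i in range(n)]
    return [cells[i] for i in indices]


rng = random.Random(2007)  # arbitrary seed for reproducibility
random_40 = simple_random(40, rng)
systematic_40 = systematic(40, rng)
```

Only the random start varies between systematic draws, so a single unusual stretch of the grid can dominate a small systematic sample.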
Selecting Sampling Locations: Stratified Random and cLHS
[Figure panels: n = 10, n = 40, n = 10, n = 100]
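A minimal sketch of stratified random selection, assuming proportional allocation across landcover strata. The slides do not state the allocation rule or the exact strata used, so both are assumptions here; the stratum sizes are rounded from the sample-space percentages, and the assignment of cells to strata is hypothetical.

```python
import random
from collections import defaultdict


def stratified_random(units, strata, n, rng):
    """Sample n units, allocating to each stratum in proportion to its size."""
    groups = defaultdict(list)
    for unit, stratum in zip(units, strata):
        groups[stratum].append(unit)
    # Proportional quotas, rounded by largest remainder so they sum to n.
    quotas = {s: n * len(g) / len(units) for s, g in groups.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    sample = []
    for s, g in groups.items():
        sample.extend(rng.sample(g, alloc[s]))
    return sample, alloc


# Hypothetical stratum sizes: 33%, 27%, 32%, 8% of 903 locations, rounded.
sizes = {"Managed Pine": 298, "Natural Pine": 244, "Hardwood": 289, "Pasture": 72}
strata = [name for name, count in sizes.items() for _ in range(count)]
units = list(range(len(strata)))

rng = random.Random(40)  # arbitrary seed for reproducibility
sample_40, allocation = stratified_random(units, strata, n=40, rng=rng)
```

Proportional allocation guarantees that a small stratum such as pasture is represented even at n = 40, which simple random sampling does not.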
Results
• Random and Systematic overestimate mean at low sample size
• Stratified and cLHS underestimate mean at low sample size
• No estimate excludes population mean from 95% confidence interval
• Systematic sampling yielded largest confidence intervals
• All methods converge at larger sample sizes
• Lowest sampling size (n = 10) poor approximation for any method
• cLHS remains most consistent at small sample size (n = 40, 100)
• cLHS provides better approximation of distribution tails
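The confidence-interval comparison above can be sketched with a standard t-based interval for the mean. The slides do not say how the intervals were computed, so this formula is an assumption, and the ten sample values are made up; only the population mean of 23997 comes from the results table.

```python
import math
import statistics


def mean_ci95(sample, t_crit):
    """Return (lower, mean, upper) for a t-based 95% confidence interval."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error
    return m - t_crit * se, m, m + t_crit * se


# Made-up total C contents (kg ha-1) for a sample of n = 10.
sample = [18000, 21000, 22500, 24000, 25500, 19500, 27000, 23000, 26000, 20500]

# Two-sided 95% t critical value for n - 1 = 9 degrees of freedom.
lower, mean, upper = mean_ci95(sample, t_crit=2.262)

population_mean = 23997  # reported population mean
covers_population = lower < population_mean < upper
```

The interval narrows roughly with the square root of n, which is one reason the four methods become hard to distinguish at n = 300 and n = 500.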
Results
[Figure legend: Population, N = 10, N = 40, N = 100, N = 300, N = 500]
Results
[Figure panels: Stratified Random, cLHS; legend: Population, N = 40, N = 100, N = 300, N = 500]
Results
• cLHS, stratified, and random provide close estimates of mean at small sample sizes
• Systematic consistently overestimates mean; least accurate for mean and std. dev.
• cLHS approximates median well at small sample sizes, along with stratified at larger n
Total C Content
Sample size  Statistic  Population  cLHS   Stratified  Systematic  Random
N = 10       Mean       23997       22719  21619       42191       25064
N = 10       St. Dev.   10346       5349   6445        34892       5176
N = 10       Median     22759       22148  21658       26356       25296
N = 40       Mean       23997       25070  23587       25987       24756
N = 40       St. Dev.   10346       8912   6487        16411       9697
N = 40       Median     22759       23204  23222       21728       23777
N = 100      Mean       23997       23545  24414       24220       24649
N = 100      St. Dev.   10346       9133   11335       12107       12145
N = 100      Median     22759       22344  22700       22876       22927
Sample Size Considerations
• Sample sizes n = 300 & n = 500 are unrealistically large and their estimates have converged → the utility of stratification techniques lies at much smaller sizes
• Commercial-scale sampling is often conducted on 1-ha grids (Kravchenko, 2003; Mueller et al., 2001), but this scale is often regarded as marginal or insufficient to estimate soil properties (Mueller et al., 2001; Hammond, 1993; Wollenhaupt, 1994)
•Therefore, n = 40 and n = 100, representing 5% and 12% sampled area, represent most realistic proportions
Conclusions
Stratified design showed the narrowest confidence interval / best mean approximation when n = 40
cLHS demonstrated the most accurate mean, standard deviation, population distribution approximation, and narrowest confidence interval when n = 100
Might expect a threshold below n = 100 at which cLHS affords better estimation than stratified due to extra variables
Or, additional continuous variables may offer less predictive ability than discrete variables
Extra effort of systematic sampling appears inadequate at all sizes → even simple random is preferable
At mid-sample sizes (realistic sizes), some degree of stratification affords better estimates, and is therefore recommended
Ancillary data can possibly improve efficiency of sampling and accuracy of estimates for minimal additional effort
Take-Home Message
Part I: Landcover alone does not completely describe spatial attributes of soil C content; variation also exists outside the scale of our plots
Part II: Stratified sampling methods (stratified random, cLHS) with consideration of ancillary variables may provide more accurate estimates of soil C content
The End
Acknowledgments: Daniel Markewitz, Nate Nibbelink, Larry West, Emily Blizzard, Danny Figueroa, Erin Moore, Scott Devine, Marco Galang, Sami Rifai, Patrick Bussell, Budiman Minasny, Dustin Thompson, Jay Brown, my family, my friends, and many others...
Questions?