Semiparametric Mixed Models for Increment-Averaged Data with Application to Carbon Sequestration in Agricultural Soils




<ul><li><p>Semiparametric Mixed Models for Increment-Averaged Data with Application to Carbon Sequestration in Agricultural Soils</p><p>Author(s): F. Jay Breidt, Nan-Jung Hsu and Stephen Ogle</p><p>Source: Journal of the American Statistical Association, Vol. 102, No. 479 (Sep., 2007), pp. 803-812</p><p>Published by: American Statistical Association</p><p>This content downloaded on Sat, 14 Jun 2014 07:44:10 AM. All use subject to JSTOR Terms and Conditions.</p></li><li><p>Semiparametric Mixed Models for Increment-Averaged Data With Application to Carbon Sequestration in Agricultural Soils</p><p>F. Jay Breidt, Nan-Jung Hsu, and Stephen Ogle</p><p>Adoption of conservation tillage practice in agriculture offers the potential to mitigate greenhouse gas emissions. Studies comparing conservation tillage methods to traditional tillage pair fields under the two management systems and obtain soil core samples from each treatment. Cores are divided into multiple increments, and matching increments from one or more cores are aggregated and analyzed for carbon stock. These data represent not the actual value at a specific depth, but rather the total or average over a depth increment. A semiparametric mixed model is developed for such increment-averaged data. 
The model uses parametric fixed effects to represent covariate effects, random effects to capture correlation within studies, and an integrated smooth function to describe effects of depth. The depth function is specified as an additive model, estimated with penalized splines using standard mixed model software. Smoothing parameters are automatically selected using restricted maximum likelihood. The methodology is applied to the problem of estimating a change in carbon stock due to a change in tillage practice.</p><p>KEY WORDS: Core sample; Greenhouse gas; Nonparametric regression; Ornstein-Uhlenbeck process; Penalized spline; Restricted maximum likelihood; Varying-coefficient model.</p><p>1. INTRODUCTION</p><p>Traditional agricultural management uses tillage to turn over the soil and bury postharvest crop residues, often several times before planting. Recently, "no-till" production systems that do not use tillage have become economically feasible due to new techniques and equipment. No-till, in which crop residues are left on the soil surface, reduces soil losses due to wind and water erosion (Lindstrom, Schumacher, Cogo, and Blecha 1998). This in turn reduces the flow of sediments, nutrients, and pesticides into surface waters. In addition, no-till enhances soil organic matter due to reduced soil disturbance (Six, Elliot, Paustian, and Doran 1998) and over time may improve soil fertility. Furthermore, no-till may result in lower production costs, due to fewer management steps and lower machinery costs. (Conventional tillage requires more expensive, higher horsepower tractors.) "Reduced-till" systems limit tillage and other soil-disturbing activities and leave substantial residue on the soil surface, but to a lesser extent than no-till. Reduced-till systems offer many of the same advantages as no-till. 
Together, these systems are known as "conservation tillage" methods (Kern and Johnson 1993; U.S. Department of Agriculture 1994).</p><p>Recent interest in conservation tillage has focused on its potential for reducing greenhouse gas (GHG) emissions, because of reduced soil disturbance that leads to more carbon storage in the profile, particularly in no-till systems (Kern and Johnson 1993; Paustian et al. 1997; Lal, Kimble, Follett, and Cole 1998; Smith, Powlson, Smith, Falloon, and Coleman 2000). The amount of carbon sequestered due to a change in tillage system is economically as well as environmentally important; for example, the Chicago Climate Exchange lists agricultural soil sequestration as a means of obtaining tradable carbon credits.</p><p>F. Jay Breidt is Professor, Department of Statistics, Colorado State University, Fort Collins, CO 80523. Nan-Jung Hsu is Associate Professor, Institute of Statistics, National Tsing-Hua University, Hsin-Chu, Taiwan 30043. Stephen Ogle is Research Scientist, Natural Resource Ecology Laboratory, Colorado State University, Fort Collins, CO 80523. The work reported here was developed under STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This report has not been formally reviewed by the EPA, and the EPA does not endorse any products or commercial services mentioned in this report.</p><p>Note that there are three major biogenic GHGs (CO<sub>2</sub>, N<sub>2</sub>O, and CH<sub>4</sub>) that determine the overall net change in radiative forcing to the atmosphere. Few studies have considered the effect of all three GHGs (e.g., Robertson, Paul, and Harwood 2000), an essential question because the GHGs differ greatly in their global warming potential (computed by converting kg per hectare of each gas to CO<sub>2</sub> equivalents). 
In particular, N<sub>2</sub>O has about 300 times the global warming potential of CO<sub>2</sub>.</p><p>In this article, however, we study the effect of tillage practice on emissions of CO<sub>2</sub>, with the cautionary note from the foregoing discussion that this is only part of the GHG story on agricultural soils. We consider all available studies reporting differences in soil-mediated carbon fluxes between traditional tillage and conservation tillage systems. These studies pair fields managed with traditional tillage with fields managed with conservation tillage (or in some cases plots within fields) and track carbon storage over time. From these data, we select those studies with complete information on conservation tillage type (no-till or reduced till), soil type (aquic or nonaquic), climate (wet or dry), years since management change, carbon stock under traditional tillage, and carbon stock under conservation tillage. Because the methods described in this article are similar for either no-till or reduced till comparisons, from this point on we focus on no-till exclusively. The basic measures of interest are then changes in carbon stock after 1 or more years since management change from traditional tillage to no-till, with positive values indicating more carbon sequestered under no-till.</p><p>A special challenge of these data is that they are collected from studies in which one or more soil cores are divided into depth increments, with matching increments across cores aggregated for carbon stock analysis. There are 63 paired studies in these data, with a total of 211 increments. The increment-averaged data are displayed in Figure 1. The upper and lower endpoints of these increments vary from study to study. For example, one study may report $Y_{11}$ = total change in carbon stock over the increment 0-15 cm and $Y_{12}$ = total change in carbon stock over the increment 15-30 cm. A second study may report only $Y_{21}$ = total change over the increment 0-50 cm. Soil scientists have used a variety of ad hoc methods to deal with this challenge. They might drop studies with nonmatching increments, or even "adjust" the $Y$ values to make the increments match. In the foregoing example, they might form the new variables $Y_1^* = Y_{11} + Y_{12}$ and $Y_2^* = (30/50)Y_{21}$, each representing the increment 0-30 cm. Clearly, these ad hoc methods run the risk of losing information or relying heavily on implicit assumptions.</p><p>© 2007 American Statistical Association. Journal of the American Statistical Association, September 2007, Vol. 102, No. 479, Applications and Case Studies. DOI 10.1198/016214506000001167</p></li><li><p>Figure 1. Increment-averaged carbon differences between no-till and traditional tillage versus depth. Fitted curves for wet and dry climates at 20 years since management change are superimposed. Axis label: Depth (cm).</p><p>Another technique that might seem quite natural would be to ignore the nonmatching problem by assigning $Y$ values to the midpoints of the increments. Such midpoint assignment can lead to substantial bias, as the following numerical experiment illustrates. In what follows, we convert totals to averages by dividing by the increment width. We estimated a simple parametric model for the tillage data, accounting for increment averaging but ignoring other complexities, yielding the fitted model
$$Y = \frac{1}{d_2 - d_1}\int_{d_1}^{d_2}\left(\alpha_0 + \frac{\alpha_1}{1+t}\right)dt + \epsilon, \qquad \{\epsilon\}\ \text{iid}\ N(0, \sigma^2),$$
for increment $[d_1, d_2)$, where $\alpha_0 = -.17$, $\alpha_1 = 3.32$, and $\sigma^2 = .17$. 
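</p><p>To make the distinction between increment averaging and midpoint assignment concrete, the following is a minimal Python sketch (not the authors' code) that evaluates the mean of the fitted hyperbolic model both ways. It relies on the closed form $\frac{1}{d_2-d_1}\int_{d_1}^{d_2}\left(\alpha_0 + \frac{\alpha_1}{1+t}\right)dt = \alpha_0 + \alpha_1\,\frac{\log\{(1+d_2)/(1+d_1)\}}{d_2-d_1}$. The function names and the example increments (taken from the 0-15, 15-30, and 0-50 cm illustration above) are choices made for this sketch only.</p>

```python
import math

# Fitted values quoted in the text: alpha_0 = -.17, alpha_1 = 3.32.
ALPHA0 = -0.17
ALPHA1 = 3.32


def point_value(t, a0=ALPHA0, a1=ALPHA1):
    """Instantaneous hyperbolic mean a0 + a1 / (1 + t) at depth t (cm)."""
    return a0 + a1 / (1.0 + t)


def increment_average(d1, d2, a0=ALPHA0, a1=ALPHA1):
    """Mean of the hyperbolic model averaged over [d1, d2), in closed form."""
    return a0 + a1 * math.log((1.0 + d2) / (1.0 + d1)) / (d2 - d1)


def midpoint_value(d1, d2, a0=ALPHA0, a1=ALPHA1):
    """Mean obtained by (incorrectly) assigning the datum to the midpoint."""
    return point_value(0.5 * (d1 + d2), a0, a1)


if __name__ == "__main__":
    # Hypothetical increments, borrowed from the illustration in the text.
    for d1, d2 in [(0.0, 15.0), (15.0, 30.0), (0.0, 50.0)]:
        avg = increment_average(d1, d2)
        mid = midpoint_value(d1, d2)
        print(f"[{d1:g}, {d2:g}): average={avg:.4f}  midpoint={mid:.4f}")
```

<p>Because $1/(1+t)$ is convex in $t$ and $\alpha_1 &gt; 0$, the increment average is never smaller than the midpoint value, and the gap widens for wide, shallow increments. This suggests why treating averages as midpoint observations can distort the fitted depth curve.</p><p>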
Then, using the fitted model as the true model, we simulated 10,000 realizations of the tillage data, using the actual increments appearing in the dataset. For each realization, we fitted the hyperbolic model by ordinary least squares, using midpoint assignment instead of increment averaging, yielding $E[\hat\alpha_0] = -.17$ and $E[\hat\alpha_1] = 4.21$. That is, the intercept estimator is unbiased, but the slope estimator under midpoint assignment has &gt;25% relative bias in this illustration. Midpoint assignment is analogous to covariate measurement error, which typically leads to biased and inconsistent estimators (see, e.g., sec. 1.1.1 of Fuller 1987).</p><p>Our alternative approach to analyzing such increment data starts by recognizing that the recorded value represents not the instantaneous value at that depth, but rather the total or average over the increment at that depth. In Section 2 we develop a novel semiparametric mixed model for the increment-averaged data. The model allows for parametric fixed effects to represent covariate effects, random effects to capture within-core correlation, and an integrated, smooth function to describe effects of depth. The random effects may be further modeled as integrated realizations of stochastic processes; we use a low-rank approximation for the increment-averaged stochastic processes. The model is formulated so that the instantaneous depth function is an additive, smooth function with components estimated using penalized splines. Variance components and the smoothing parameters for the spline components are estimated using restricted maximum likelihood (REML). We give details of the estimation methods in Section 3. We then apply this new methodology in Section 4 to estimate the effects of tillage practice on carbon sequestration in agricultural soils. 
We follow with a brief discussion in Section 5. The methods developed here are widely applicable to soil or sediment core sample data and may have applications in other contexts (such as estimated vertical profiles of stratospheric ozone, as suggested by the associate editor).</p></li><li><p>2. SEMIPARAMETRIC MIXED MODEL FOR INCREMENT-AVERAGED DATA</p><p>2.1 Model Specification</p><p>Assume that the sample consists of $m$ independent paired studies and that the $i$th study has $n_i$ increments $\{[d_{i,j-1}, d_{ij})\}_{j=1}^{n_i}$, where $d_{i,j-1}$ and $d_{ij}$ indicate the top and bottom bounds of the $ij$th datum. Let $Y_{ij}$ denote the increment average in the $j$th increment from the $i$th study, and assume that it satisfies the following model:
$$Y_{ij} = x_{ij}^T\beta + \frac{1}{d_{ij} - d_{i,j-1}}\int_{d_{i,j-1}}^{d_{ij}} g(t; w_i)\,dt + b_{ij}^T u_i + \epsilon_{ij}, \tag{1}$$
where $\beta$ is a vector of unknown regression coefficients; $x_{ij}$ and $w_i$ are known covariate vectors; $g(t; w_i)$ is a smooth function of depth, $t$; $u_i$ are iid $N(0, \sigma_u^2 I_{l \times l})$ vectors of random effects; and $\epsilon_{ij}$ are iid $N(0, \sigma^2)$ errors, independent of $u_i$. The vector $b_{ij}$ is fixed but may depend on unknown covariance parameters, as described in Section 2.3. As the notation suggests, $w_i$ is a characteristic of the study that is not increment-specific (e.g., soil type, climate factors, and number of years since management change in our example). Increment-specific effects can be incorporated without loss of generality in $x_{ij}^T\beta$, rather than in $g$, where they might violate the assumed smoothness.</p><p>The model in (1) is a semiparametric mixed model, similar in spirit to that of Zhang, Lin, Raz, and Sowers (1998); see also the references given by Ruppert, Wand, and Carroll (2003, sec. 9.4). 
Due to the increment averaging, linear functionals of $g$, not $g$ itself, underlie the observations. Engle, Granger, Rice, and Weiss (1986) considered a similar problem using cross-validated smoothing splines (see also Wahba 1990). Our approach is based on penalized splines, with penalties selected automatically using REML. It is easily implemented using standard software.</p><p>2.2 Integrated Nonparametric Function Specification</p><p>Let $w_i = (w_{i1}, \ldots, w_{iq})^T$. We assume that $g(t; w_i)$ is an additive varying-coefficient model (Hastie and Tibshirani 1993),
$$g(t; w_i) = \sum_{\ell=1}^{q} \alpha_\ell(t)\, w_{i\ell},$$
where the component smooth functions are allowed to be splines, although polynomials with lower degrees of freedom are special cases. For these splines, we use the truncated power basis with a common set of fixed knots, $\kappa_1 &lt; \cdots &lt; \kappa_K$ (although other choices of basis or knots could be easily incorporated),
$$\alpha_\ell(t) = \alpha_{0\ell} + \alpha_{1\ell}\,t + \cdots + \alpha_{p\ell}\,t^p + \sum_{k=1}^{K} a_{k\ell}\,(t - \kappa_k)_+^p, \tag{2}$$
where $(t)_+ = t$ if $t &gt; 0$ and 0 otherwise and $p$ is the degree of the spline. Here the $\alpha$'s are fixed, unknown coefficients, whereas the $a_{k\ell}$ are iid $N(0, \sigma_{a\ell}^2)$ random effects. If the number of knots $K$ is sufficiently large, then the class of functions $\alpha_\ell(t)$ is very large and can approximate most smooth functions with a high degree of accuracy. For $\sigma_{a\ell}^2 = \infty$, the splines would be piecewise polynomials and would require fitting of a large number of parameters. We rule out this case by considering $\sigma_{a\ell}^2 &lt; \infty$, in which the spline coefficients $a_{k\ell}$ are shrunken toward 0, resulting in a smooth yet parsimonious fit. T...</p></li></ul>
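<p>Since model (1) involves increment averages of $g$ rather than $g$ itself, each truncated power basis function in (2) enters the model through its average over an increment. These averages have closed forms: the average of $t^r$ over $[d_1, d_2)$ is $(d_2^{r+1} - d_1^{r+1})/\{(r+1)(d_2 - d_1)\}$, and the average of $(t - \kappa)_+^p$ is $\{(d_2 - \kappa)_+^{p+1} - (d_1 - \kappa)_+^{p+1}\}/\{(p+1)(d_2 - d_1)\}$. The Python sketch below computes them for a single increment. It is an illustration under these assumptions, not the authors' implementation; the function names and the example knots are hypothetical.</p>

```python
def avg_monomial(d1, d2, r):
    """Average of t**r over the increment [d1, d2)."""
    return (d2 ** (r + 1) - d1 ** (r + 1)) / ((r + 1) * (d2 - d1))


def avg_truncated_power(d1, d2, knot, p):
    """Average of (t - knot)_+ ** p over the increment [d1, d2)."""
    upper = max(d2 - knot, 0.0) ** (p + 1)
    lower = max(d1 - knot, 0.0) ** (p + 1)
    return (upper - lower) / ((p + 1) * (d2 - d1))


def design_row(d1, d2, knots, p):
    """Increment-averaged basis for one increment.

    Returns two lists: the averaged polynomial terms 1, t, ..., t**p
    (multiplying the fixed coefficients) and the averaged truncated
    power terms, one per knot (multiplying the random coefficients).
    """
    fixed = [avg_monomial(d1, d2, r) for r in range(p + 1)]
    random = [avg_truncated_power(d1, d2, k, p) for k in knots]
    return fixed, random


if __name__ == "__main__":
    # Hypothetical increment 0-20 cm, linear spline, knots at 10 and 30 cm.
    fixed, random = design_row(0.0, 20.0, knots=[10.0, 30.0], p=1)
    print("fixed part:", fixed)
    print("random part:", random)
```

<p>Stacking such rows over all increments, with each row multiplied by the study covariates $w_{i\ell}$, would give fixed- and random-effect design matrices of the kind a standard mixed model routine expects. A knot lying below the increment contributes nothing, which is why the second random term is zero in the example.</p>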

