back transformation



Plotting Johnson’s SB Distribution using a new parameterization

Keith Rennolls (1) and Mingliang Wang (1,2)
(1) CMS, University of Greenwich, London, UK. (2) Chinese Academy of Forestry, Beijing, PRC.

[ k.rennolls, m.wang]@gre.ac.uk

Overview

The SB distribution is widely used in forestry to represent the empirical distributions of forest tree variables such as diameter, height and volume. The parametric form of the SB model that has invariably been used is the form originally put forward by Johnson, in the 1949 paper in which he introduced the SB distribution. A more “natural” parameterization of SB is suggested in a CJFR paper (in press). This contribution simply uses this parameterization to produce plots of the SB density by varying the model parameters. This is done in Excel, and the spreadsheet, available from www.forestmodelarchive.info, can be used pedagogically to develop a feel for the SB as a distribution model. The theory will appear in a forthcoming CJFR paper, and further discussion of the properties of the SB in the new parameterization will appear in FBMIS, shortly.

1 Introduction

Hafley and Schreuder (1977) first introduced the four-parameter Johnson’s SB distribution into forest literature, and since then it has been widely used in forest diameter (and height) distribution modelling (Hafley and Buford 1985, Knoebel and Burkhart 1991, Zhou and McTague 1996, Kamziah et al. 1999, Li et al. 2002, Scolforo et al. 2003, Zhang et al. 2003).

2. An Alternative Parameterization of the Johnson SB distribution

Essentially, the SB distribution is transformed to normality by the logit transformation and, by analogy with the log-normal distribution (the distribution transformed to normality by the log transformation), might well have been named the logit-normal distribution.

2.2 An Inverse definition and a new parameterization.

(i)′ Let z ~ N(0, 1). Scale z to u by:

$$ u = -\frac{\gamma}{\delta} + \frac{1}{\delta} z ; \quad \delta > 0 \tag{4} $$

So u ~ N(−γ/δ, 1/δ²). It is the parameterization of this scaling transformation (corresponding to the affine transformation of (iii) above) that seems rather “unnatural” to us.

(ii)′ Apply a standard logistic transformation to u to give y, in the (0, 1) range:


$$ y = \frac{1}{1 + \exp(-u)} \tag{5} $$

(iii)′ Scale y to x, with range λ and minimum ξ:

$$ x = \xi + \lambda y ; \quad \lambda > 0 \tag{6} $$

Though the affine transformation given in (3) is a natural choice in mathematics, we see, when it is re-expressed as a scaling transformation in (4), that it is not the form of transformation that is statistically ‘natural’. The natural scaling transformation would be:

$$ u = \gamma' + \delta' z \tag{7} $$

so that u ~ N(γ′, δ′²) (≡ N(μ, σ²)), where

$$ \delta' \; (\equiv \sigma) = \frac{1}{\delta} \tag{8} $$

$$ \gamma' \; (\equiv \mu) = -\frac{\gamma}{\delta} \tag{9} $$
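The mapping between the two parameterizations can be illustrated with a minimal Python sketch. The function names and the numeric values are illustrative assumptions, not taken from the paper or the spreadsheet. The round-trip check uses Johnson's original form, z = γ + δ ln((x − ξ)/(ξ + λ − x)), which is the standard definition of SB and is assumed here to match the equations earlier in the paper.

```python
import math

def johnson_to_natural(gamma, delta):
    """Map Johnson's (gamma, delta) to (mu, sigma) using equations (8) and (9)."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    sigma = 1.0 / delta      # equation (8)
    mu = -gamma / delta      # equation (9)
    return mu, sigma

def natural_to_johnson(mu, sigma):
    """Inverse map: recover (gamma, delta) from (mu, sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return -mu / sigma, 1.0 / sigma

def sb_from_z(z, mu, sigma, xi, lam):
    """Steps (i)'-(iii)': scale z to u, apply the logistic, scale to (xi, xi + lam)."""
    u = mu + sigma * z                  # equation (7)
    y = 1.0 / (1.0 + math.exp(-u))      # equation (5)
    return xi + lam * y                 # equation (6)

# Illustrative values only (hypothetical, not from the paper).
gamma, delta, xi, lam = 0.8, 1.6, 5.0, 30.0
mu, sigma = johnson_to_natural(gamma, delta)

z = 0.37
x = sb_from_z(z, mu, sigma, xi, lam)
# Johnson's original form should return the z we started from.
z_back = gamma + delta * math.log((x - xi) / (xi + lam - x))
```

If the two parameterizations are equivalent, `z_back` agrees with `z` up to floating-point error, and `natural_to_johnson(mu, sigma)` returns the original `(gamma, delta)`.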

2.3 SB as a General Logistic Transformation of N(0,1)

We may look upon the SB distribution as being obtained in a number of alternative but equivalent ways. First, combining these re-parameterized transformations we obtain:

$$ x = \xi + \frac{\lambda}{1 + \exp(-(\mu + \sigma z))} \tag{10} $$

a four-parameter logistic transformation of the standard normal z, which reveals the transformational simplicity of the SB distribution. A similar model is used in Item Response Theory of psychological testing as the 4-parameter Rasch model. The only parameter constraint in (10) is that σ, as a standard-deviation parameter, should be positive. If λ is also positive then x is monotonic increasing with z. If λ is negative then x is monotonic decreasing with z, and ξ becomes the upper boundary parameter. The signs of the parameters μ and λ determine the sign of the skew of SB, with sign(skew(SB)) = −sign(λ)sign(μ), so that sign(λ) and sign(μ) both positive yields a negative-skew SB, and so on. We note also that (i)′, the scaling up from N(0,1), could be dropped if we just started the construction with a N(μ, σ²). See (11) in Section 3.
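The skew-sign rule stated above can be checked informally by simulation. The following is a rough sketch, not a proof: it draws standard normal values, pushes them through equation (10), and computes a moment-based sample skewness. The helper names, the sample size, the seed and the parameter values are all illustrative assumptions.

```python
import math
import random

def sb_from_z(z, mu, sigma, xi, lam):
    """Equation (10): four-parameter logistic transformation of a standard normal z."""
    return xi + lam / (1.0 + math.exp(-(mu + sigma * z)))

def sample_skewness(values):
    """Moment-based sample skewness, m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def simulated_skew(mu, sigma, xi, lam, n=20000, seed=0):
    """Skewness of n simulated SB values for the given parameters."""
    rng = random.Random(seed)
    xs = [sb_from_z(rng.gauss(0.0, 1.0), mu, sigma, xi, lam) for _ in range(n)]
    return sample_skewness(xs)

# Hypothetical parameter sets covering three sign combinations of (mu, lambda).
skew_pos_pos = simulated_skew(mu=1.0, sigma=1.0, xi=0.0, lam=2.0)   # rule predicts negative skew
skew_neg_pos = simulated_skew(mu=-1.0, sigma=1.0, xi=0.0, lam=2.0)  # rule predicts positive skew
skew_pos_neg = simulated_skew(mu=1.0, sigma=1.0, xi=0.0, lam=-2.0)  # rule predicts positive skew
```

With λ negative the simulated values fall in (ξ + λ, ξ), which matches the remark that ξ then acts as the upper bound.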


Alternatively, and equivalently, we could retain the start of the construction with N(0,1), drop the scaling up to N(μ, σ²), but apply a simple-linear-logistic regression model,

$$ x = \frac{1}{1 + \exp(-(\mu + \sigma z))} , $$

to N(0,1), finally scaling up to the range (ξ, ξ+λ).

Figure 1. Construction of SB from a 3-parameter logistic transformation on N(μ, σ²).

Figure 1 illustrates the construction of SB by transformation: from a N(0,1) on the real x-axis, through N(μ, σ²), followed by the transformation y = ξ + λ/(1 + exp(−x)) (in blue), to the SB on the y-axis (in red).

The constructed SB is also plotted (red-dashed) on the x-axis for comparison purposes. With such a diagram it is easy to see that SB approaches the Log-Normal (with positive skew) as μ/σ → −∞ (since the lower tail of the logistic is asymptotically exponential), while the Log-Normal with negative skew is obtained from μ/σ → ∞.
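The two Log-Normal limits can be made explicit with a short worked approximation. This is a sketch added for illustration rather than part of the original derivation; it assumes λ > 0 and takes x to be the N(μ, σ²) variable that is passed through the logistic transformation. When the bulk of the normal lies far down the lower tail of the logistic:

$$ \frac{1}{1+e^{-x}} = \frac{e^{x}}{1+e^{x}} \approx e^{x} \quad (x \to -\infty) \;\;\Rightarrow\;\; y - \xi \approx \lambda e^{x}, \qquad \ln(y - \xi) \approx \ln\lambda + x \sim N(\ln\lambda + \mu,\; \sigma^{2}) $$

so the distance of y above the lower bound is approximately Log-Normal, with positive skew. At the other extreme:

$$ \frac{1}{1+e^{-x}} \approx 1 - e^{-x} \quad (x \to +\infty) \;\;\Rightarrow\;\; (\xi + \lambda) - y \approx \lambda e^{-x}, \qquad \ln(\xi + \lambda - y) \approx \ln\lambda - x \sim N(\ln\lambda - \mu,\; \sigma^{2}) $$

so the distance of y below the upper bound is approximately Log-Normal, which gives the reflected, negatively skewed form.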

Figure 2 illustrates some fits to empirical data sets using maximum likelihood estimation. It is seen that there is a range of shapes, including both positive and negative skew. See Rennolls and Wang (2005, in press, CJFR) for details of fitting methods.


[Figure 2: two histogram panels, Plot 308 and Plot 319. x-axis: DBH (cm); y-axis: N; series: Observed, Fitted.]

Figure 2. Histograms of diameter data for two Changbai larch sample plots and the fitted Johnson’s SB frequency curves. The mid-class diameters are given.

Johnson has demonstrated theoretically that SB can approach the lower-limit line asymptotically, and hence can take bimodal shapes. It is fairly difficult to see this intuitively. However, use of the parameters (μ, σ, ξ, λ) = (1, 3, 1, 2) provides the illustration of Figure 3.

Figure 3. SB demonstrating a bimodal shape, near the lower-limit line in (skew, kurtosis) shape-space (i.e. Figure 4).
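The bimodality for these parameter values can also be checked numerically. The sketch below is illustrative only: it evaluates the log of the SB density along a grid of the underlying normal variable u and counts interior local maxima. Because y = ξ + λ/(1 + exp(−u)) is strictly increasing in u when λ > 0 (assumed here), local maxima of the density along u correspond to modes in y. The function names and the grid limits are assumptions of this sketch, not part of the spreadsheet.

```python
import math

def sb_log_density_at_u(u, mu, sigma, lam):
    """Log SB density evaluated at y(u) = xi + lam / (1 + exp(-u)), with u ~ N(mu, sigma^2)."""
    log_normal_pdf = (-0.5 * ((u - mu) / sigma) ** 2
                      - math.log(sigma * math.sqrt(2.0 * math.pi)))
    # dy/du = lam * p * (1 - p), where p = 1 / (1 + exp(-u)).
    # log(p * (1 - p)) = -(|u| + 2 * log(1 + exp(-|u|))), a numerically stable form.
    a = abs(u)
    log_p_one_minus_p = -(a + 2.0 * math.log1p(math.exp(-a)))
    return log_normal_pdf - math.log(lam) - log_p_one_minus_p

def count_modes(mu, sigma, lam, u_min=-20.0, u_max=20.0, step=0.01):
    """Count strict interior local maxima of the log density on a regular grid in u."""
    n = int(round((u_max - u_min) / step)) + 1
    logf = [sb_log_density_at_u(u_min + i * step, mu, sigma, lam) for i in range(n)]
    return sum(1 for i in range(1, n - 1) if logf[i - 1] < logf[i] > logf[i + 1])

# Parameters quoted in the text: (mu, sigma, xi, lambda) = (1, 3, 1, 2).
# xi only shifts the density and does not affect the number of modes.
modes_bimodal = count_modes(mu=1.0, sigma=3.0, lam=2.0)
# A narrower sigma (hypothetical comparison value) for contrast.
modes_unimodal = count_modes(mu=1.0, sigma=0.5, lam=2.0)
```

For σ = 3 the two maxima sit at roughly u ≈ −8 and u ≈ 10, that is, very close to the two bounds in y, which may be why the shape is hard to anticipate intuitively.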


3 Generating and Plotting the SB pdf

The Excel spreadsheet SB_Graph.xls provides a simple and easy means of displaying the pdf of the Johnson’s SB distribution. The calculation formulae are shown in Table 1.

Table 1. Spreadsheet calculations for the SB.

The parameters mu and sigma correspond to the mean μ and the standard deviation σ of the initial scaling transformation from N(0,1). Hence the initial scaling is to N(1,0.25) in Table 1. The parameters zeta and lambda are the lower bound, ξ, and the range parameter, λ, of the final scaling of the SB to its bounded range. These parameters are 1 and 2 respectively in Table 1, and hence the SB distribution is defined to lie between 1 and 3. In this demo software, the range of the original scale (in the new parameterization framework, i.e. the scale including the source N(0,1)) has been set from -6 to +6, in steps of 0.01. y1 is an evaluation of the standard normal density, and this is shown plotted and labelled in Figure 1. y2 holds the calculated values of the scaled N(0,1) distribution, i.e. N(mu, sigma²), also shown in Figure 1. “logisticy” is the transformation function from N(mu, sigma²) to SB. This is shown plotted as a logistic curve with lower asymptote zeta and upper asymptote (zeta + lambda). As pointed out above, the SB density f(y) is obtained by application of the transformation:

$$ y = \xi + \frac{\lambda}{1 + e^{-x}} \tag{11} $$

to f(x) = N(μ, σ²), the Normal density. Then the density f(y) is given by:

$$ f(y) = f(x) \cdot \left( \frac{dy}{dx} \right)^{-1} \tag{12} $$
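The spreadsheet columns described above can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not a transcription of SB_Graph.xls: the column names follow the text, sigma = 0.25 is read as a standard deviation (the notation N(1,0.25) could instead denote a variance), and dy/dx is the closed-form derivative of (11), λe^(−x)/(1 + e^(−x))².

```python
import math

def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

# Parameter values quoted for Table 1 (sigma taken as a standard deviation: an assumption).
mu, sigma = 1.0, 0.25
zeta, lam = 1.0, 2.0

# Source axis from -6 to +6 in steps of 0.01, as in the text.
xs = [-6.0 + 0.01 * i for i in range(1201)]

y1 = [normal_pdf(x) for x in xs]                             # standard normal density
y2 = [normal_pdf(x, mu, sigma) for x in xs]                  # N(mu, sigma^2) density
ylogistic = [zeta + lam / (1.0 + math.exp(-x)) for x in xs]  # equation (11)
dydx = [lam * math.exp(-x) / (1.0 + math.exp(-x)) ** 2 for x in xs]
sbdens = [f / d for f, d in zip(y2, dydx)]                   # equation (12)

# Sanity check: trapezoidal area under (ylogistic, sbdens) should be close to 1.
area = sum(0.5 * (sbdens[i] + sbdens[i + 1]) * (ylogistic[i + 1] - ylogistic[i])
           for i in range(len(xs) - 1))
```

Plotting `sbdens` against `ylogistic` gives the SB density on its bounded range (zeta, zeta + lambda); the area check is a quick way to catch a mistake in the Jacobian term.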


From (11) we obtain

$$ \frac{dy}{dx} = \frac{\lambda e^{-x}}{(1 + e^{-x})^{2}} \tag{13} $$

which gives the formula for sbdens in Table 1. To obtain the SB distribution as a transformation distribution, plotted on the y-axis, we plot ylogistic values as the “y-values”, and sbdens values as the “x-values”. Interestingly (for Excel plotting enthusiasts), as the source x ranges from -6 to +6, the sbdens values trace out the two sides of the SB density on the y-axis (full red line in Figure 1, shown on the LHS of the y-axis for display convenience). Finally, the SB density is plotted on the original source x-axis (shown red dotted) by transposition of the (ylogistic, sbdens) values.

4. References

Assmann, E. 1970. The principles of forest yield study. Pergamon, Oxford.

Cox, D.R., and Hinkley, D.V. 1974. Theoretical statistics. Chapman and Hall, London.

Hafley, W.L., and Buford, M.A. 1985. A bivariate model for growth and yield prediction. For. Sci. 31:237-247.

Hafley, W.L., and Schreuder, H.T. 1977. Statistical distributions for fitting diameter and height data in even-aged stands. Can. J. For. Res. 7:481-487.

Johnson, N.L. 1949a. Systems of frequency curves generated by methods of translation. Biometrika 36:149-176.

Johnson, N.L. 1949b. Bivariate distributions based on simple translation systems. Biometrika 36:297-304.

Johnson, N.L., and Kotz, S. 1970. Continuous univariate distributions (2 volumes). Houghton Mifflin, New York.

Kamziah, A.K., Ahmad, M.I., and Lapongan, J. 1999. Nonlinear regression approach to estimating Johnson SB parameters for diameter data. Can. J. For. Res. 29:310-314.

Knoebel, B.R., and Burkhart, H.E. 1991. A bivariate distribution approach to modelling forest diameter distributions at two points of time. Biometrics 47:241-253.

Li, F., Zhang, L., and Davis, C.J. 2002. Modeling the joint distribution of tree diameters and heights by bivariate generalized beta distribution. For. Sci. 48(1):47-58.

Mathsoft. 1999. S-plus 2000 guide to statistics, Vol. 1. Data analysis products division, Mathsoft, Inc., Seattle, Washington.

Rennolls, K., and Wang, M. 2005. Extreme Value Regression for Estimation of the lower bound of a Diameter Distribution.

Ross, G.J.S. 1990. Nonlinear estimation. Springer-Verlag, New York.

Schreuder, H.T., Bhattacharya, H.T., and McClure, J.P. 1982a. Towards a unified distribution theory for stand variables using SBBB distribution. Biometrics 38:137-142.

Schreuder, H.T., Bhattacharya, H.T., and McClure, J.P. 1982b. The SBBB distribution: a potentially useful trivariate distribution. Can. J. For. Res. 12:641-645.

Schreuder, H.T., and Hafley, W.L. 1977. A useful bivariate distribution for describing stand structure of tree heights and diameters. Biometrics 33:471-478.

Scolforo, J.R.S., Tabai, F.C.V., Macedo, R.L.G., Acerbi, F.W., and Assis, A.L. 2003. SB distribution's accuracy to represent the diameter distribution of Pinus taeda, through five fitting methods. For. Ecol. Manage. 175:489-496.

Tewari, V.P., and Gadow, K.V. 1997. Fitting a bivariate distribution to diameter–height data of forest trees. Indian Forester 123:815-820.

Tewari, V.P., and Gadow, K.V. 1999. Modelling the relationship between tree diameters and heights using SBB distribution. For. Ecol. Manage. 119:171-176.

Wang, M., and Rennolls, K. 2004b. Truncated distribution modelling with tree diameter data.

Wang, M., and Rennolls, K. 2004c. Bivariate Distribution Modelling with Tree Diameter and Height Data.

Wang, M., and Rennolls, K. 2004d. Diameter Distribution Modelling with the Logit-Logistic Distribution.

Zhang, L., Packard, K.C., and Liu, C. 2003. A comparison of estimation methods for fitting Weibull and Johnson’s SB distributions to mixed spruce-fir stands in northeastern North America. Can. J. For. Res. 33:1340-1347.

Zhou, B., and McTague, J.P. 1996. Comparison and evaluation of five methods of estimation of the Johnson system parameters. Can. J. For. Res. 26:928-935.