2.1 introductiontothetreelistgenerationdatabase............ 21 2.1.1...

TREE LIST GENERATION

DATABASE

USER’S GUIDE AND

REFERENCE MANUALKEVIN R. GEHRINGER

ERIC C. TURNBLOM

STAND MANAGEMENT COOPERATIVE

SMC WORKING PAPER NUMBER 3

AUGUST 2001

COLLEGE OF FOREST RESOURCES

UNIVERSITY OF WASHINGTON

BOX 352100

SEATTLE, WASHINGTON 98115-2100

TABLE OF CONTENTS

List of Figures vi

List of Tables ix

Disclaimer xii

Chapter 1: Introduction 1

1.1 A conceptual overview of the tree list generation process . . . . . . . 3

1.1.1 Weibull parameter recovery stand generation requirements . . 4

1.1.2 Tree list generation database stand generation requirements . 6

1.2 Conventions and notations . . . . . . . . . . . . . . . . . . . . . . . . 10

1.3 System requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.3.1 System requirements for standard distribution . . . . . . . . . 12

1.3.2 System requirements for developer distribution . . . . . . . . . 12

1.3.3 Contents of standard distribution . . . . . . . . . . . . . . . . 12

1.3.4 Contents of developer distribution . . . . . . . . . . . . . . . . 12

1.4 Limitations of the tree list generation database software . . . . . . . . 13

1.5 Installation instructions . . . . . . . . . . . . . . . . . . . . . . . . . 15

1.5.1 Installation of the standard distribution . . . . . . . . . . . . 15

1.5.2 Installation of the developer distribution . . . . . . . . . . . . 17

1.6 Software bug reporting and contact addresses . . . . . . . . . . . . . 19

Chapter 2: Tree List Generation Database User’s Manual 21

i

2.1 Introduction to the tree list generation database . . . . . . . . . . . . 21

2.1.1 Tree list generation database naming conventions . . . . . . . 25

2.1.2 Tree list generation database files and directory structure . . . 27

2.1.3 Tree list generation database units conversions . . . . . . . . . 27

2.2 Creating a tree list generation database . . . . . . . . . . . . . . . . . 30

2.2.1 The TGNEW program . . . . . . . . . . . . . . . . . . . . . . . . 30

2.2.2 TGNEW example: Creating a tree list generation database . . . 33

2.3 Adding measurement files to a tree list generation database . . . . . . 33

2.3.1 The TGADD program . . . . . . . . . . . . . . . . . . . . . . . . 34

2.3.2 TGADD examples: Adding data to a tree list generation database 38

2.4 Creating simulated tree lists using a tree list generation database . . 39

2.4.1 The TGRAND program . . . . . . . . . . . . . . . . . . . . . . . 42

2.4.2 TGRAND examples: Generating simulated stands or tree lists . . 47

2.5 Summarizing a tree list generation database . . . . . . . . . . . . . . 48

2.5.1 The TGSUMRY program . . . . . . . . . . . . . . . . . . . . . . 53

2.5.2 TGSUMRY example: Summarizing a tree list generation database 55

2.6 Locking and unlocking a tree list generation database . . . . . . . . . 56

2.6.1 The TGLOCK program . . . . . . . . . . . . . . . . . . . . . . . 56

2.6.2 TGLOCK example: Locking a tree list generation database . . . 59

2.6.3 The TGUNLOCK program . . . . . . . . . . . . . . . . . . . . . . 59

2.6.4 TGUNLOCK example: Unlocking a tree list generation database . 62

2.7 Exploding a tree list generation database . . . . . . . . . . . . . . . . 62

2.7.1 The TGXPLODE program . . . . . . . . . . . . . . . . . . . . . . 63

2.7.2 TGXPLODE example: Exploding a tree list generation database . 67

2.8 Creating a scatterplot file from a tree list generation database . . . . 67

2.8.1 The TGSCATRP program . . . . . . . . . . . . . . . . . . . . . . 68

ii

2.8.2 TGSCATRP examples: Creating scatterplot files . . . . . . . . . 71

2.9 Tree list generation database software error messages . . . . . . . . . 72

2.10 Data coverage in the tree list generation database: tgdb1r00 . . . . . 74

2.10.1 Data coverage summary for untreated stands . . . . . . . . . . 77

2.10.2 Data coverage summary for thinned stands . . . . . . . . . . . 79

2.11 File types, file formats, and keyword definitions . . . . . . . . . . . . 79

2.11.1 Stand measurement file . . . . . . . . . . . . . . . . . . . . . . 82

2.11.2 Stand description file . . . . . . . . . . . . . . . . . . . . . . . 87

2.11.3 Random stand measurement file . . . . . . . . . . . . . . . . . 95

2.11.4 Species mapping file . . . . . . . . . . . . . . . . . . . . . . . 97

2.11.5 Schema file . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

2.11.6 Index paramter keyword definitions . . . . . . . . . . . . . . . 107

Chapter 3: Methods, Data, Testing and Validation 119

3.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

3.1.1 Stand classification procedure . . . . . . . . . . . . . . . . . . 123

3.1.2 Nearest neighbor stand selection procedure . . . . . . . . . . . 125

3.1.3 Stand representation . . . . . . . . . . . . . . . . . . . . . . . 127

3.1.4 Tree list generation procedure . . . . . . . . . . . . . . . . . . 129

3.2 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

3.3 Testing and validation . . . . . . . . . . . . . . . . . . . . . . . . . . 138

3.3.1 Goodness of fit testing methods and criteria . . . . . . . . . . 140

3.3.2 The piae-value as a test statistic . . . . . . . . . . . . . . . . . 144

3.3.3 Goodness of fit testing results for untreated stands . . . . . . 161

3.3.4 Goodness of fit testing results for thinned stands . . . . . . . . 172

3.3.5 Software testing procedures . . . . . . . . . . . . . . . . . . . 183

3.4 Conclusions, caveats and other considerations . . . . . . . . . . . . . 185

iii

3.4.1 How conservative are the total error rates? . . . . . . . . . . . 186

3.4.2 Simulating forest stands using a tree list generation database . 192

3.4.3 Limitations of the tree list generation database . . . . . . . . 195

Chapter 4: Tree List Generation Database Programmer’s Reference197

4.1 Tree list generation database subroutine library . . . . . . . . . . . . 197

4.1.1 Subroutines for reading and writing measurement files . . . . . 198

4.1.2 Subroutines for reading and writing description files . . . . . . 198

4.1.3 Subroutines for reading and writing schema files . . . . . . . . 199

4.1.4 Binary search tree subroutines . . . . . . . . . . . . . . . . . . 199

4.1.5 Bucket data mapping subroutines . . . . . . . . . . . . . . . . 201

4.1.6 Tree measurement data subroutines . . . . . . . . . . . . . . . 202

4.1.7 Database information subroutines . . . . . . . . . . . . . . . . 203

4.1.8 Unit conversion subroutines . . . . . . . . . . . . . . . . . . . 204

4.1.9 Index parameter keyword subroutines . . . . . . . . . . . . . . 205

4.1.10 Species mapping subroutines . . . . . . . . . . . . . . . . . . . 206

4.1.11 Stand type subroutines . . . . . . . . . . . . . . . . . . . . . . 207

4.1.12 Index parameter subroutines . . . . . . . . . . . . . . . . . . . 208

4.1.13 Stand origin subroutines . . . . . . . . . . . . . . . . . . . . . 208

4.1.14 Treatment type subroutines . . . . . . . . . . . . . . . . . . . 209

4.1.15 Miscellaneous supporting subroutines . . . . . . . . . . . . . . 209

4.2 Future extensions and enhancements . . . . . . . . . . . . . . . . . . 212

Glossary 218

Bibliography 222

Appendix A: TGSUMRY output for tgdb1r00 226

iv

Appendix B: Tree Species Common and Latin Names 229

Appendix C: A Sample Tree List Generation Database Species Map-

ping File 231

Appendix D: A Sample Tree List Generation Database Schema File 234

Appendix E: A Tree List Generation Database Schema Template File240

Appendix F: The SPICE System - A Brief Overview 247

v

LIST OF FIGURES

2.1 Tree list generation database file and directory structure . . . . . . . 28

2.2 TGNEW command line syntax . . . . . . . . . . . . . . . . . . . . . . . 31

2.3 TGNEW example: Creating a tree list generation database . . . . . . . 33

2.4 TGADD command line syntax: Single stand measurement file . . . . . . 35

2.5 TGADD command line syntax: Multiple stand measurement files . . . . 35

2.6 TGADD example 1: Adding a single stand measurement file . . . . . . . 39

2.7 TGADD example 2: Adding multiple stand measurement files . . . . . . 39

2.8 Stand measurement file listing file for TGADD example 2 . . . . . . . . 40

2.9 TGRAND command line syntax: Single stand description file . . . . . . 43

2.10 TGRAND command line syntax: Multiple stand description files . . . . 44

2.11 TGRAND example 1: Simulating an untreated stand . . . . . . . . . . . 48

2.12 Stand description file for TGRAND example 1 . . . . . . . . . . . . . . . 49

2.13 Untreated stand height vs. diameter plot for TGRAND example 1 . . . . 50

2.14 TGRAND example 2: Simulating a thinned stand . . . . . . . . . . . . . 50

2.15 Stand description file for TGRAND example 2 . . . . . . . . . . . . . . . 51

2.16 Thinned stand height vs. diameter plot for TGRAND example 2 . . . . . 52

2.17 TGSUMRY command line syntax . . . . . . . . . . . . . . . . . . . . . . 53

2.18 TGSUMRY example: Summarizing a tree list generation database . . . . 56

2.19 TGLOCK command line syntax . . . . . . . . . . . . . . . . . . . . . . 57

2.20 TGLOCK example: Locking a tree list generation database . . . . . . . 59

2.21 TGUNLOCK command line syntax . . . . . . . . . . . . . . . . . . . . . 60

2.22 TGUNLOCK example: Unlocking a tree list generation database . . . . . 62

vi

2.23 TGXPLODE command line syntax . . . . . . . . . . . . . . . . . . . . . 64

2.24 TGXPLODE example: Exploding a tree list generation database . . . . . 67

2.25 TGSCATRP command line syntax . . . . . . . . . . . . . . . . . . . . . 69

2.26 TGSCATRP example 1: An untreated stand scatterplot file . . . . . . . 71

2.27 TGSCATRP example 2: A thinned stand scatterplot file . . . . . . . . . 72

2.28 Error message format . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

2.29 Example error message from TGRAND . . . . . . . . . . . . . . . . . . 74

2.30 Stand measurement file template for untreated stands . . . . . . . . . 84

2.31 Stand measurement file template for thinned stands . . . . . . . . . . 85

2.32 Required index parameters for untreated stands . . . . . . . . . . . . 88

2.33 Required index parameters for thinned stands . . . . . . . . . . . . . 89

2.34 Example stand description file for untreated stands . . . . . . . . . . 90

2.35 Example stand description file for thinned stands . . . . . . . . . . . 91

2.36 Random stand measurement file example . . . . . . . . . . . . . . . . 96

2.37 Template for schema file treatment types . . . . . . . . . . . . . . . . 103

2.38 Template for schema file index parameters for a treatment . . . . . . 104

2.39 Template for schema file index parameter attributes . . . . . . . . . . 104

3.1 Site location map for the stand measurement data. . . . . . . . . . . 135

3.2 True t-test total error and piae-value for mean difference of 0 units. . . 148



3.5 True t-test total error and piae-value v.s difference in means. . . . . . 151

3.6 Histogram and empirical distribution function for piae-values . . . . . 154

3.7 Histogram and empirical distribution function for t-test p-values . . . 155

3.8 Nonparametric density function estimates for a good piae-value . . . . 157

3.9 Nonparametric density function estimates for a bad piae-value . . . . 158

vii

3.10 Histogram of DBH piae-values for untreated stands . . . . . . . . . . 164

3.11 Histogram of height piae-values for untreated stands . . . . . . . . . . 165

3.12 Histogram of species composition piae-values for untreated stands . . 166

3.13 Plot of simulated vs. actual QMD for untreated stands . . . . . . . . 167

3.14 Plot of simulated vs. actual mean height for untreated stands . . . . . 169

3.15 Overall QMD distributions for actual and simulated stands . . . . . . 170

3.16 Overall mean height distributions for actual and simulated stands . . 171

3.17 Histogram of DBH piae-values for thinned stands . . . . . . . . . . . . 174

3.18 Histogram of height piae-values for thinned stands . . . . . . . . . . . 175

3.19 Histogram of species composition piae-values for thinned stands . . . . 176

3.20 Plot of simulated vs. actual QMD for thinned stands . . . . . . . . . 178

3.21 Plot of simulated vs. actual mean height for thinned stands . . . . . . 179

3.22 Overall QMD distributions for actual and simulated stands . . . . . . 181

3.23 Overall mean height distributions for actual and simulated stands . . 182

3.24 Histogram of DBH piae-values for actual stands . . . . . . . . . . . . 189

3.25 Plot of DBH piae-values vs. breast height age for actual stands . . . . 190

3.26 Plot of density function estimates for 10 simulations of the same stand 193

viii

LIST OF TABLES

2.1 Tree list generation database file descriptions . . . . . . . . . . . . . . 28

2.2 Measurement unit equivalents for Imperial and metric units. . . . . . 29

2.3 Tree list generation modes for TGRAND . . . . . . . . . . . . . . . . . . 41

2.4 Stand type summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

2.5 Stand origin summary for untreated stands . . . . . . . . . . . . . . . 76

2.6 Stand origin summary for thinned stands . . . . . . . . . . . . . . . . 76

2.7 Untreated stand coverage summary for stand type PURE DF . . . . . . 77

2.8 Untreated stand coverage summary for stand type PURE WH . . . . . . 77

2.9 Untreated stand coverage summary for stand type DF DOMINANT . . . 77

2.10 Untreated stand coverage summary for stand type WH DOMINANT . . . 78

2.11 Untreated stand coverage summary for stand type MIXTURE . . . . . . 78

2.12 Thinned stand coverage summary for stand type PURE DF . . . . . . . 79

2.13 Thinned stand coverage summary for stand type PURE WH . . . . . . . 79

2.14 Thinned stand coverage summary for stand type DF DOMINANT . . . . 80

2.15 Thinned stand coverage summary for stand type WH DOMINANT . . . . 80

2.16 Thinned stand coverage summary for stand type MIXTURE . . . . . . . 81

2.17 Tree list generation database species mapping . . . . . . . . . . . . . 101

3.1 List of data sources for the tree list generation database . . . . . . . . 135

3.2 Stand type summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 136

3.3 True t-test total error and piae-value for mean differences from 0 to 7 152

3.4 Tree list generation testing scenarios . . . . . . . . . . . . . . . . . . 160

ix

3.5 Misclassification rates for untreated stands by scenario . . . . . . . . 163

3.6 Linear regression coefficients for simulated vs. actual QMD for un-

treated stands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

3.7 Linear regression coefficients for simulated vs. actual mean height for

untreated stands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

3.8 Overall QMD and mean height distribution piae-values for actual and

simulated stands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

3.9 Misclassification rates for thinned stands by scenario . . . . . . . . . 175

3.10 Linear regression coefficients for simulated vs. actual QMD for thinned

stands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

3.11 Linear regression coefficients for simulated vs. actual mean height for

thinned stands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

3.12 Overall QMD and mean height distribution piae-values for actual and

simulated stands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180

3.13 Total error rates for different cutoff piae-values. . . . . . . . . . . . . . 191

3.14 piae-values for DBH, height, and species for 10 simulations of a partic-

ular stand. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

B.1 Tree list generation database species codes and names . . . . . . . . . 229

x

ACKNOWLEDGMENTS

The authors wish to thank the Stand Management Cooperative at the University

of Washington for providing the funding and data for this project. Specifically, we

thank Randol Collier and John Haukaas of the SMC for a wide variety of services

throughout the duration of this project. We would also like to thank Dave Marshall,

U.S. Forest Service, Pacific Northwest Research Station, Portland, for providing us

with much of the initial data that were used to create the tree list generation database,

tgdb1r00.

We would also like to thank the reviewers of the tree list generation database

documentation: Edith C. Sonne, University of Washington, Jim Flewelling, himself,

and Temesgen Hailemariam, University of British Columbia. Their comments were

most helpful. Any errors remaining in the documentation are attributable only to the

authors, after all, we wrote it.

We would also like to thank the Navigation Ancillary Information Facility (NAIF)

at NASA’s Jet Propulsion Laboratory for their impeccable work in developing and

maintaining the SPICE system, an extensive, robust, and portable Fortran subroutine

library, which was integral to the development of the tree list generation database

software. In particular, we would like to thank Nathaniel Bachman, William Taber,

and Charles H. Acton for their support of, and interest in, a nontraditional, i.e.,

nonaerospace, application of their work.

xi

DISCLAIMER

THERE IS NO WARRANTY FOR THE SOFTWARE OR PROGRAMS PRO-

VIDED, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT

WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS OR

OTHER PARTIES PROVIDE THE PROGRAM ”AS IS” WITHOUT WARRANTY

OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT

LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND

FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE

QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD

THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NEC-

ESSARY SERVICING, REPAIR OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO

IN WRITING WILL ANY COPYRIGHT HOLDER BE LIABLE TO YOU FOR

DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CON-

SEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO

USE THE SOFTWARE OR PROGRAMS (INCLUDING BUT NOT LIMITED TO

LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUS-

TAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM

TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER

OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH

DAMAGES.

xii

1

Chapter 1

INTRODUCTION

Many modern forest stand growth models and forest stand simulation systems use

some form of tree list, nominally a set of compatible diameter at breast height (DBH)

and height measurements with an indication of tree species, to describe the stand

being modeled [11, 20, 8, 35, 36]. An appropriate initial tree list is required as input

for individual tree based stand simulation systems in order to accurately predict the

stand dynamics and the growth and yield. Tree lists obtained from measurements

of existing plots or stands, and for which complete individual tree measurements are

available to define the initial condition of a stand, are highly advantageous. With

actual tree list data for a stand, forest growth simulations may begin with the ground

truth. Unfortunately, in many situations the individual tree measurement data which

compose a tree list, or from which a tree list may be constructed, are not available

or appropriate, e.g., when only stand attributes are available or when actual stand

measurement data are not available for the particular management regime under

consideration. In order to support the use of tree list based growth and yield or stand

simulation models for a broad range of modeling and silvicultural situations, the

ability to generate a reasonable tree list from a minimal number of stand attributes,

e.g., site index, age, number of stems per unit area, etc., is highly desirable.

The use of tree lists as a primary input to forest stand simulators and growth

and yield models is likely to persist. Ongoing improvements in growth and yield

models and stand simulation systems are increasing the detail or resolution in models

2

and simulators in order to provide more realistic growth predictions and simulations

of stand dynamics. The increases in the detail represented in of the growth and

simulation models and in the precision of their outputs are important for economic

analyses of forest productivity on a shrinking land base with shorter rotations than

in the past, as well as for more ecologically oriented analyses, such as the restoration

or creation of habitat structures for wildlife via silvicultural manipulations of forest

stands. The models in current use are generally limited by the detail of the initial

stand descriptions. Thus, the ability to generate a simulated tree list which mimics the

aggregate characteristics of an actual stand is necessary to meet the tree list oriented

input requirements of the current and future generations of forest stand growth and

yield and simulation models.

A general procedure for creating simulated tree lists will now be briefly described.

The procedure is based on combining the representation of a forest stand as a mix-

ture distribution, composed of components representing the tree sizes and the species

composition of a stand, and a straightforward database design. The tree list gener-

ation procedure and its supporting software and data files will be referred to as a

tree list generation database. The version of the tree list generation database that is

described has been designed for managed, mixed-species stands of Douglas-fir (Pseu-

dotsuga menziesii) and western hemlock (Tsuga heterophylla) as dominant or codom-

inant species, and possibly with an appropriate mixture of other tree species found

throughout Oregon, Washington, and southern British Columbia, west of the Cascade

Mountains. Two treatment regimes for managed stands are supported by the tree list

generation database: untreated stands and thinned stands.

A conceptual overview of the tree list generation database is provided in the follow-

ing section, with complete details in Chapter 3. To demonstrate the effectiveness of

the proposed tree list generation method, a version of the tree list generation database

has been implemented in Fortran 95 and tested with good success. Comprehensive

results of the tree list generation database validation tests are presented in Chapter 3.

3

1.1 A conceptual overview of the tree list generation process

A tree list is a compatible set of diameter at breast height (DBH) and height mea-

surements for each tree in a stand with an indication of tree species. To generate a

tree list there are five requirements. First, a method for classifying forest stands so

that a simulated tree list may be customized , in some sense, to match a specified set

of stand level criteria or attributes, e.g., site index, quadratic mean diameter (QMD)

and trees per hectare or acre. Second, a method for selecting an appropriate stand

class, or set of stand classes, to be used for generating a simulated stand. Third,

an appropriate method for representing a forest stand which may be mixed species,

multi-aged, etc., in order to generate a tree list. Fourth, a method for generating a

tree list from a stand representation; this requirement is necessarily directly related

to the third requirement for an appropriate stand representation. Fifth, a wide va-

riety of stand measurement data sets, e.g., measurements of DBH and height with

species, that may be used to specify a tree list generation procedure once the first

four requirements are met.

These five requirements for generating simulated stands or tree lists may be

thought of as defining a database of forest stands that is indexed by a small set

of stand attributes. The formulation of the problem concisely defines a framework for

its solution, while simultaneously permitting a great deal of flexibility. Additionally,

the tree list generation problem and the problem of pseudorandom number genera-

tion are quite similar, so the framework of pseudorandom number generation may,

and should, be used to guide the development of a tree list generation procedure,

particularly since the issues involved in pseudorandom number generation are well

understood. In the context of pseudorandom number generation, the tree list genera-

tion problem may be thought of as the problem of generating a random stand of trees

using a small set of stand attributes as the seed of a random stand generation proce-

dure. The combination of these two ideas, a database of stand descriptions indexed

4

by a set of stand attributes and the framework of pseudorandom number generation,

forms the basis for the solution to the tree list generation problem that has been

implemented.

There are many possible ways of defining and implementing the first four require-

ments for generating simulated stands or tree lists, and the subsequent use of the stand

measurement data, of requirement five, to calibrate a particular stand generation pro-

cedure. The specific methods selected for the first four requirements of generating

simulated stands in the tree list generation database are now briefly introduced. For

comparison, the equivalent concepts or methods from the classical Weibull parame-

ter recovery method, generally used to represent and simulate diameter distributions

[5, 27, 3, 2], are first presented within this context.

1.1.1 Weibull parameter recovery stand generation requirements

The stand classification and stand selection procedures in the Weibull parameter re-

covery method for diameter distributions are both done implicitly via the multiple

regression procedures used to define the Weibull parameter prediction equations. The

stand classes and the stands selected for tree list generation are determined by the

estimated coefficient values of the Weibull parameter prediction equations. If a broad

geographic region is involved in the Weibull parameter recovery method, an explicit

classification may first be performed on the stands to reduce the typically large vari-

ances that are involved. Weibull parameter recovery procedures are then be applied to

each set of explicitly classified stands, producing a set of Weibull parameter prediction

equations for each separate class.

The stand representation in the Weibull parameter recovery method for diameter

distributions is an explicit, parametric representation that is based on the Weibull

probability density function, hence the name. The Weibull probability density func-

tion is defined by

5

f(x) =c

b

(x − a

b

)c−1

exp[−

(x − a

b

)c], a ≤ x < ∞

where a is the location parameter, b is the scale parameter, and c is the shape pa-

rameter [5]. In the Weibull parameter recovery method, the Weibull parameters are

estimated for the measured tree diameters to characterize their distribution for each

stand measurement data set. The scale and shape parameters are generally the only

Weibull parameters estimated, with the location parameter assumed to be zero. Once

the Weibull parameters have been estimated for each stand, multiple regression tech-

niques are applied to determine prediction equations for the Weibull parameters based

on a small number of stand attributes. The Weibull density function may take on

a wide variety of unimodal shapes, which is why it is commonly chosen to represent

diameter distributions. Unfortunately, the Weibull distribution is of rather limited

utility for representing forestry data. The Weibull distribution is strongly unimodal,

and therefore cannot adequately represent multimodal diameter distributions which

may occur, nor can it represent relatively flat diameter distributions. In addition,

tree species must be accounted for separately, as the Weibull distribution may only

be used to represent continuous distributions.

The tree list generation method used in the Weibull parameter recovery method

contains three steps. First, the attributes of the desired stand are used with the

Weibull parameter prediction equations to obtain a set of Weibull parameters. Second,

a Weibull random number generator is used with the parameters obtained in the first

step to generate a set of diameter measurements. Three, generate the tree species, if

desired, using some appropriate method.

As outlined above, the Weibull parameter recovery approach to generating tree

lists has two major drawbacks. First, if new data are to be added to update an

existing set of Weibull parameter prediction equations, the entire multiple regression

procedure must be repeated with both the original and new data. Second, there is

6

no clear way to extend the Weibull parameter recovery approach to directly include

tree heights and tree species. The tree list generation database has been developed

to overcome these difficulties, among others.

1.1.2 Tree list generation database stand generation requirements

The stand classification procedure used for the tree list generation database is an

explicit classification based on a multidimensional histogram. The dimensions of

the histogram are a set of stand attributes, e.g., site index, QMD, stand type, etc.,

and the histogram bins define the stand classes. Stands with similar attributes are

identified as belonging to the same multidimensional histogram bin or stand class. The

multidimensional histogram was chosen as the stand classification scheme primarily

for its statistical properties, its ease of implementation, and its overall simplicity

and flexibility. In particular, the histogram is a consistent estimator of a probability

density function for any data set and for arbitrary dimension; the histogram allows

a refinement of its classification if there are sufficient data to warrant the higher

resolution; the histogram does not require the reevaluation or reestimation of the

classification rules when new data are added.

The stand selection procedure used by the tree list generation database is an

explicit procedure which determines the stands most similar to a desired stand based

on a specified set of stand attributes, i.e., the nearest neighbor stands as indicated

by the set of stand attributes. A similarity function is used to compute similarity

scores between a specified set of desired stand attributes and the collection of stand

attributes defined by a tree list generation databse stand classification. The similarity

function has a value of zero if two sets of stand attributes are identical and increases

in value as the sets of stand attributes become more dissimilar. Stands similar to

the specified stand will have small similarity scores, scores near zero. The k smallest

similarity scores are used to select the collection of stands, or stand classes defined by

the multidimensional histogram bins. The k stands or stand classes selected from a

7

tree list generation database will then be used to form a canonical stand from which

a simulated tree list may be generated using the stand representations.

The stand representation chosen for the tree list generation database is an implicit,

nonparametric representation based on mixture distributions or densities. A forest

stand may be composed of many tree species, with one or more cohorts, or age classes,

within each species, and a variety of tree sizes within each species or cohort grouping.

This variety in forest stands is nicely captured by the notion of a finite mixture density

[33, 2, 26, 3]. A finite mixture density is a probability density function of the form

p(x | (α, φ)) =m∑

i=1

αipi(x, φi))

where each αi ≥ 0, i = 1, 2, . . . , m,∑m

i=1 αi = 1, and where each function pi is itself

a probability density function parameterized by a scalar or vector φi of parameters

[33, 26, 2]. The function p would represent the structure of a forest stand, with each

function pi representing, possibly, a different species, cohort, or size class and the

αi representing the proportion of the stand associated with that particular species,

cohort, or size class. The individual probability density functions pi may be the same

e.g., the Weibull density function [2, 3], or the normal density function, but this is not

necessary. Each pi could be specifically selected to suit a particular stand attribute.

Additionally, the pi need not be parametric density functions, like those mentioned,

they may be nonparametric density function estimates [28, 9, 10, 26, 24, 25] created

for each stand component.

Given this generic form for representing a forest stand, the mixture density may

be represented explicitly or implicitly. If the representation is explicit, parametric

or nonparametric probability density functions must be chosen for all of the compo-

nent densities. This involves determining the necessary mixing parameters, αi, and

determining values for the parameters of each parametric density function or non-

parametric density function used in the mixture density. The general problem of

estimating the parameters of a finite mixture distribution is well understood [26], so

8

an explicit representation of a stand as a mixture distribution is possible, though it

is not necessarily desirable. For example, the explicit mixture distribution approach

would severely limit the ability to add new data to a tree list generation database:

all of the parameters in the explicit representation would need to be reestimated each

time new data were added. Given this difficulty with the explicit representation,

among others [28, 33], an implicit representation of the mixture density was chosen

for the tree list generation database.

The implicit representation of a forest stand used by the tree list generation

database is a finite mixture distribution determined by the actual stand measure-

ment data, DBH and height measurements with an species for each tree. This type of

implicit representation of a forest stand is a nonparametric mixture distribution which

allows the stand measurement data to speak for itself ; the actual stand measurement

data provide the best, though not necessarily the most concise, representation of a

forest stand. Using the actual stand measurement data to represent a forest stand

automatically provides the potential for the generation of univariate stands, e.g., tree

diameters only, or multivariate stands, e.g., diameter, height, and species, as well as

capturing any multimodal characteristics of the forest stands. This approach also

provides the potential to generate truly commensurate diameters and heights, if both

are desired, with an appropriate species distribution, again if desired, when an ap-

propriate stand generation procedure is used with the data.

The tree list generation method used in the tree list generation database is based

upon a method for generating random vectors [31] to augment or appropriately fill

in sparse data sets. The random vector generation approach works directly from

actual data to generate simulated data points having the same, or nearly the same,

covariance structure as the original data. This method is incredibly well suited to the

task of generating simulated forest stands, the only requirement being an appropriate

list of representative tree measurements for a particular set of stand attributes. The

representative stand measurement data may be obtained, as described above, through

9

a combination of a stand classification procedure (requirement 1) and a stand selection

procedure (requirement 2). This method automatically guarantees that individual

tree diameters and heights will be compatible, and that the species composition for

a stand will be reasonable; they are derived from actual data.

In addition to the five requirements for the generation of simulated stands or tree

lists, there are two fundamental assumptions and a precise definition of a forest stand

that are necessary and will be used throughout this document.

Forest Stand A spatially contiguous group of trees and associated vegetation having

similar structures and growing under similar soil and climate conditions [22], or

a relatively homogeneous forested area.

Fundamental Assumption 1 (FA1) Forest stands, or sample plots, having similar

attributes, e.g., site index, QMD, species composition, etc., are similar.

Fundamental Assumption 2 (FA2) Attributes of forest inventory data collected

from a representative set of sample plots within a larger stand can be extended

to the whole stand.

The definition of a forest stand and both of the fundamental assumptions are in

widespread use within the science of forestry. The first assumption is a basic tenet

of forestry modeling, including growth and yield modeling, though it is frequently

not stated explicitly. The second assumption is commonly used in forest inventory

development; data from cruised sample plots are scaled up to represent the actual

inventory on larger homogeneous forested areas. The second assumption is also gen-

erally not stated explicitly, but is presumed by the statistical methods used for the

inventory sampling and analysis. These two fundamental assumptions and the defi-

nition of a forest stand are presented for clarity and completeness, since they provide

the motivation for the tree list generation procedures used in the tree list generation

database.

10

In developing the tree list generation procedures, the first fundamental assump-

tion, FA1, is used in its traditional sense, but the second fundamental assumption,

FA2, is not. FA2 is traditionally used to scale aggregate sample plot data up to

a larger area for inventory purposes. This assumption, however, must also be valid

when considering the individual trees on the inventory sample plots, and it is in this

sense that the assumption is used for the tree list generation database. Under this

interpretation, the plot data are scaled to a larger area by generating sufficient trees

to be representative of the larger area, that is, by populating the larger area with indi-

vidual trees. This interpretation of FA2 provides the basis for the tree list generation

procedures developed for the tree list generation database.

The concept of a forest stand , or stand of trees, is used in four distinct, but

compatible, ways within this documentation. First, forest stand may refer to an actual

stand of trees, as in the definition of a forest stand just given. Second, forest stand

may refer to the trees on a temporary or permanent sample plot that is assumed to be

representative of a larger area. Third, forest stand may refer to a tree list consisting of

DBH and height measurement with species, whether actually measured on a sample

plot or simulated, that is assumed to be representative of some actual stand of trees.

Fourth, forest stand may refer to a canonical stand, or multidimensional histogram

bin, from the stand classification in a tree list generation database. The usage in each

specific situation is consistent with the typical meaning and usage, and so should

cause no confusion. Reference to the specific context in which the concept of a forest

stand is used will be made if the particular cantext is important or to clarify any

potential ambiguity.

1.2 Conventions and notations

The following typographic conventions will be followed throughout the remainder of

this document.

11

1. Data file contents and source code when presented in the text will appear in a

teletype or fixed pitch font, e.g., SUBROUTINE XYZ(A,B,C).

2. Species common names, e.g., Douglas-fir, will appear in the standard font, and

species latin names, e.g., Pseudotsuga menziesii will appear in an italic font.

3. References to file names or program names will be in a teletype or fixed pitch

font, e.g., input.dat or TGNEW.

4. References to interest addresses , email addresses, and uniform resource locators

(URLs) will be in a teletype or fixed pitch font http://www.website.com or

emailhere.there.com.

5. In examples, input that a user would type on a command line will be in a teletype

or fixed pitch font preceded by a prompt indicator, e.g., PROMPT> dir *.dat.

Output from commands that would be displayed on the computer screen or shell

will appear in an italic teletype or fixed pitch font, e.g., File successfully

created .

6. A forward slash , ‘/’, is used as the path element separator throughout this

document for consistency. When using the tree list generation database soft-

ware, either a forward slash, ‘/’, or a backslash, ‘\’, may be used in filenames or

directory paths to separate the individual path elements. The software will sub-

stitute the correct path separation character for the computing platform being

used.

1.3 System requirements

Developed for windows 95/98/NT or later systems using pentium processors. The

executables may work on i486DX based systems, but these are considered to be an

12

obsolete system configuration. The programs have been compiled with pentium op-

timizations, so performance on i486DX based systems, if the programs execute, may

be degraded. The tree list generation database software uses dynamic memory allo-

cation, so there is no upper limit to the data base size, in theory, that may be created

and used. The recommended amount of memory is 32 megabytes, but more is always

better.

1.3.1 System requirements for standard distribution

1. Approximately 15 MB in a .zip file.

2. Approximately 30 MB uncompressed.

1.3.2 System requirements for developer distribution

1. Approximately 45 MB in a .zip file.

2. Approximately 90 MB uncompressed.

1.3.3 Contents of standard distribution

1. Tree list generation database executables.

2. The initial tree list generation database which supports untreated and thinned

stands of Douglas-fir Pseudotsuga menziesii , western hemlock Tsuga hetero-

phylla, or combinations of these two species.

3. Tree list generation database documentation.

1.3.4 Contents of developer distribution

1. Tree list generation database executables.

13

2. The initial tree list generation database which supports untreated and thinned

stands of Douglas-fir Pseudotsuga menziesii , western hemlock Tsuga hetero-

phylla, or combinations of these two species.

3. Tree list generation database documentation.

4. Tree list generation database source code and libraries.

To make full use of the tree list generation database developer distribution a

Fortran 90/95 compiler is required. The Lahey/Fujitsu Fortran 95 compiler, LF95

version 5.5, was used as the development compiler [30]. If a different Fortran 90/95

compiler is used some code modifications will likely be necessary to build the tree list

generation database libraries and executable programs. In addition, a make utility

and other UNIX based text processing tools would likely prove useful.

1.4 Limitations of the tree list generation database software

1. Douglas-fir and western hemlock are currently the only tree species that may be

used to determine stand type, e.g., pure (species basal area ≥ 80%) or dominant

(species basal area ≥ 50%). It is a straightforward, though not trivial, exten-

sion of the tree list generation database subroutine library to support arbitrary

dominant and codominant tree species for the stand type determination.

2. File format conversion utilities to convert the tree list files produced by the tree

list generation database software into formats appropriate for direct input to

growth and yield models are not provided. Fortran 90/95 subroutines which

read and write each file format are provided to facilitate the development of

conversion utilities by those best able to develop them, e.g., the organizations

developing growth and yield models or other forestry related software.

14

3. The methodology developed to generate a tree list currently requires that a user

of the system completely specify a standard stand description via a set of input

index parameters. This is, in general, not a major constraint because the overall

methodology is sufficiently general that an additional software module could be

developed and added to an appropriate place in the tree list generation process

to estimate any possibly missing parameters.

4. The tree list generation database software performs Imperial to metric or metric

to Imperial units conversions. All of the conversions used are accurate to at least

10−6 units. This issue is important because all of the data stored in a tree list

generation databse are in metric units. For the typical use the accuracy of the

conversions poses no concerns, but it is possible to generate tree lists for very

large areas, and inaccuracis could appear in this setting.

5. There were several tree species code conflicts in the data, i.e., two different codes

being used to represent the same tree species. To resolve these conflicts, the

code used by the SMC was selected for use in the tree list generation database.

See Section 2.11.4 and Table 2.17 for more information.

• The tree species code OH for other hardwood was changed to be HD, the

code used by the SMC. Further use of the tree species code OH is not

recommended.

• The tree species code PM for pacific madrone (Arbutus menziesii) was

changed to be MD, the code used by the SMC. Further use of the tree

species code PM is not recommended.

• The tree species code O3 appeared in some of the data. A description of

this species code could not be found, so it was changed to OT, the code

used by the SMC for unrecognized species or mixtures of species. Further

use of the tree species code O3 is not recommended.

15

6. Program command line and input/output limitations.

• Maximum command line length: 4096 characters.

• Maximum filename length: 1024 characters.

• Maximum line length for reading or writing files: 16384 characters.

• Tree list generation database size is limited only by available disk space.

• The size of a simulated tree list is limited only by the available memory

and disk space.

1.5 Installation instructions

In the following sections basic installation instructions are provided for the standard

and developer tree list generation database distributions. The installation instructions

are provided as a basic list of steps that should be followed to install the software and

database. Installation details for any particular operating system are not discussed;

thus a basic familiarity and facility with the operating system used is presumed for

the installation procedures. If the instructions are followed, no warning or error

messages appear. If such a message does appear, please refer back to the installation

instructions for clarification.

1.5.1 Installation of the standard distribution

1. Decide where the tree list generation database executable files, documentation,

and data files should reside on the hard drive, creating any directories if neces-

sary.

2. Create a temporary directory, tlgtemp,

PROMPT> mkdir tlgtemp

16

where the tree list generation database .zip file archive will be placed and its

files extracted. This is done to make cleanup easier in the event of an error of

some sort.

3. Download the program unzip.exe from the download page of the tree list gen-

eration database web site, placing it into the the temporary directory tlgtemp.

4. Download the standard distribution of the tree list generation database from

download page of the tree list generation database web site, the filename is

tlgv1r0s.zip, also placing it into the the temporary directory tlgtemp. A

logon ID and password are required for this download operation.

5. Extract the files and directories for the standard distribution tlgv1r0s.zip

using the program unzip.exe,

PROMPT> unzip tlgv1r0s

which will automatically preserve the directory structure of the distribution.

This creates a directory tree in the current directory with the top level direc-

tory name tlgv1r0s, which contains subdirectories for the documentation, the

executable files, and a tree list generation database. The directory tlgv1r0s

should contain the following subdirectories.

• data This directory contains the various data files used to create the tree

list generation database tgdb1r00 and the tree list generation database

itself.

• doc This directory contains the documentation for the tree list generation

database software and files.

• exe This directory contains the executable programs which comprise the

interface to a tree list generation database.

17

6. Place the executable files from the directory tlgv1r0s/exe into a directory

which is on the executable file search path or add the path tlgv1r0s/exe to

the executable file search path.

7. Place the documentation files from the directory tlgv1r0s/doc into the direc-

tory for the tree list generation database documentation.

8. Place the tree list generation database directory, tlgv1r0s/data/tgdb1r00,

and all of its subdirectories, into the location on the hard drive where the tree

list generation database should reside, creating a directory if necessary.

1.5.2 Installation of the developer distribution

1. Decide where the tree list generation database executable files, documentation,

source code, and data files should reside on the hard drive, creating any direc-

tories if necessary.

2. Create a temporary directory, tlgtemp,

PROMPT> mkdir tlgtemp

where the tree list generation database .zip file archive will be placed and its

files extracted. This is done to make cleanup easier in the event of an error of

some sort.

3. Download the program unzip.exe from the download page of the tree list gen-

eration database web site, placing it into the the temporary directory tlgtemp.

4. Download the developer distribution of the tree list generation database from

download page of the tree list generation database web site, the filename is

tlgv1r0d.zip, also placing it into the the temporary directory tlgtemp. A

logon ID and password are required for this download operation.

18

5. Extract the files and directories for the developer distribution tlgv1r0d.zip

using the program unzip.exe,

PROMPT> unzip tlgv1r0d

which will automatically preserve the directory structure of the distribution.

This creates a directory tree in the current directory with the top level direc-

tory name tlgv1r0d, which contains subdirectories for the documentation, the

executable files, the source code and libraries, and a basic tree list generation

database. The directory tlgv1r0d should contain the following subdirectories.

• data This directory contains the various data files used to create a tree list

generation database tgdb1r00 and the tree list generation database itself.

• doc This directory contains the documentation for the tree list generation

database software and files.

• exe This directory contains the executable programs which comprise the

interface to a tree list generation database.

• libs This directory contains the object libraries used to build the exe-

cutable programs.

• src This directory contains the Fortran 95 source code for the tree list

generation database executables, library, and supporting subroutines.

6. Place the executable files from the directory tlgv1r0d/exe into a directory

which is on the executable file search path or add the path tlgv1r0d/exe to

the executable file search path.

7. Place the documentation files from the directory tlgv1r0d/doc into the direc-

tory for the tree list generation database documentation.

19

8. Place the tree list generation database source code directory, tlgv1r0d/src, and

all of its subdirectories, into the location on the hard drive where the tree list

generation database source code should reside, creating a directory if necessary.

9. Place the tree list generation database directory, tlgv1r0d/data/tgdb1r00,

and all of its subdirectories, into the location on the hard drive where the tree

list generation database should reside, creating a directory if necessary.

1.6 Software bug reporting and contact addresses

In the event that a software bug is discovered, send an email or FAX to the Stand

Management Cooperative at the University of Washington. Before sending the prob-

lem report to the SMC, please use the following short check-list when formulating

your problem report. This will enable a more rapid resolution of the problem.

1. Be sure that the problem to be reported is real and not simply an erroneous

input value or some other simple problem that is not a bug or defect.

2. Attempt to fully, and concisely, document the circumstances under which the

problem occurred. Be sure to include the following items to include in any

message reporting a possible bug that is sent to the SMC.

• Your name and contact information, email address and phone number in

particular.

• The name and version of the program that was used.

• The version of the tree list generation database subroutine library that was

used.

• All input or output files that were used when the problem occurred.

• Error messages, if any, reported by the tree list generation database soft-

ware or the computer operating system when the problem occurred.

20

3. The subject line of the email or FAX should clearly indicate that the message

is a bug report or problem report for the SMC tree list generation database

software.

Contact the Stand Management Cooperative at the address below to report problems

with, or possible bugs in, the tree list generation database software.

Stand Management Cooperative

University of Washington

College of Forest Resources

Room 164 Bloedel Hall

BOX 352100

Seattle, WA 98195-2100

email: [email protected]

FAX: (206) 685-3091

SMC Home Page:

http://www.cfr.washington.edu/smc/

SMC Tree list generation database Home Page:

http://depts.washington.edu/silvproj/tlghome/

21

Chapter 2

TREE LIST GENERATION DATABASE

USER’S MANUAL

This chapter describes the structure of a tree list generation database and the

executable programs and files necessary to effectively use the tree list generation

database system. The chapter begins with a brief introduction to the tree list genera-

tion database. Next, the software used to access and manipulate a tree list generation

database is described, and examples are given for each executable program. The tree

list generation database input and output file descriptions and formats follow the

program descriptions. The chapter ends with a brief summary of the data contained

in the initial tree list generation database tgdb1r00.

2.1 Introduction to the tree list generation database

The tree list generation database consists of a set of eight executable programs and

a customized database of tree measurement data, compatible DBH and height mea-

surements with tree species, that may be used to construct simulated tree lists. The

tree list generation database permits the construction of simulated forest stands or

tree lists using aggregate or per unit area descriptions of forest stands. Stand descrip-

tions are provided, for example, as site index, stand total age, QMD stand type, and

trees per hectare (acre). This chapter describes a particular implementation of a tree

list generation database, including a basic description of the database structure, the

database access and utility programs, and the data files that are used to generate tree

lists.

22

The eight tree list generation database programs are command line oriented, and

have their own online help that may be accessed by using the appropriate command

line option, -h or -help with no other command line options. The tree list genera-

tion database programs provide the primary means of access to a tree list generation

database, and provide a basic set of database oriented functions. The tree list genera-

tion database programs and their functions are briefly described below, with complete

descriptions in Section 2.2 to Section 2.8. The majority of this chapter is devoted to

describing the individual tree list generation database programs, their use, and their

input and output files.

TGNEW Create a new tree list generation database. Section 2.2.

TGADD Add a stand measurement file or a set of stand measurement files to a tree list

generation database. Section 2.3.

TGRAND Generate a simulated stand or random stand measurement file from a tree

list generation database stand description file, or generate a set of simulated

stands from a set of stand description files. Section 2.4.

TGSUMRY Summarize a tree list generation database. Section 2.5.

TGLOCK Lock a tree list generation database to prevent further modifications, e.g.,

the addition of new stand measurement data. Section 2.6.

TGUNLOCK Unlock a tree list generation database to allow further modifications, e.g.,

the addition of new stand measurement data. Section 2.6.

TGXPLODE Break a tree list generation database into its component parts. This is

necessary if the index parameters in an existing database are to be changed, or

if the stand classification interval widths are to be modified. Section 2.7.

23

TGSCATRP Create a file of scatter plot data for a treatment in a tree list generation

database from the multidimensional histogram based stand classification. Sec-

tion 2.8.

The eight tree list generation database programs each use one or more of the

following tree list generation database input or output files. The tree list generation

database input and output files have been designed to be both easily scanned by

software and easily understood by human beings. The files generally follow a keyword

equals value format and may be viewed or edited using any text editor.

stand measurement file A tree list generation database stand measurement file

defines the attributes of a measured stand of trees, including DBH and height

measurements and an indication of species for each tree. Section 2.11.1.

stand description file A tree list generation database stand description file defines

the attributes of a measured stand that are desired for a simulated or random

stand that is produced by TGRAND. Section 2.11.2.

random measurement file A tree list generation database random measurement

file is identical in format to the stand measurement file except that it is pro-

duced by TGRAND to simulate a stand specified in a stand description file. Sec-

tion 2.11.3.

species mapping file A tree list generation database tree species mapping file de-

fines a mapping from short character tree species identifiers to integer codes

used internally by the tree list generation database software. This file is only

used when creating a new tree list generation database. Section 2.11.4.

schema file A tree list generation database schema file defines the index parameters

that are to be used in a particular tree list generation database. This file is only

used when creating a new tree list generation database. Section 2.11.5.

24

In addition to the tree list generation database files just described, there are two

output file types, a summary file and a scatterplot file, and one input file type, a

list file, that are used by the tree list generation database software. The tree list

generation database summary and scatterplot fils are used to represent summaries of

the stand classification structure, the treatment types, and the individual tree data

contained in a tree list generation database. The list file is used to provide a list of

stand measurement or stand description file names to be processed in a batch to either

TGADD or TGRAND, respectively. The files specified in the list file for each program must

be of the appropriate type: stand measurement files for TGADD and stand description

files for TGRAND.

summary file A tree list generation database summary file is a text file that is

produced by TGSUMRY. The text file contains a concise summary of the data

contained in a tree list generation database.

scatter plot file A tree list generation database scatterplot file is a comma delim-

ited text file created by TGSCATRP. The scatterplot file allows the individual data

dimensions of the multidimensional histogram bin centers from the stand clas-

sification to be plotted against each other for each treatment. This file may be

thought of as providing a detailed summary of a tree list generation database,

and it may be used to identify gaps in the data coverage.

list file A text file that contains a list of file names, one per line, specifying a set of

stand measurement files or stand description files that are to be processed by

TGADD or TGRAND, respectively. The file may contain comments, indicated by a

percent symbol (%) as the first nonblank character of a line, or blank lines.

25

2.1.1 Tree list generation database naming conventions

A tree list generation database consists of two components: a software component

consisting of source code and executable programs and a data component which stores

the tree data used to generate tree lists. In determining suitable naming conventions

for a tree list generation database distribution both of these components must be taken

into account, as well as the type of tree list generation database distribution: standard

or developer. The software version and the database format must be compatible, and

changes to the contents of the tree measurement data in a tree list generation database

should also be taken into account. An appropriate naming convention takes both of

these issues into consideration.

A two part naming convention composed of a version number and a revision num-

ber has been selected as the naming convention for the tree list generation database

software and data components. A third part is added to the name to indicate the type

of tree list generation database distribution. Version numbers must agree to guaran-

tee that the software and the data for a particular tree list generation database are

compatible. The version number is only changed if there is a major modification to

the tree list generation database software, i.e., a modification which affects the file

formats or which introduces a potential backward incompatibility. Revision numbers

indicate the addition of new stand measurement data to an existing tree list genera-

tion database or a backward compatible update, whether maintenance or bug fixes,

to the tree list generation database software.

A tree list generation database name in the standard or developer distribution fol-

lows the convention tgdbXrYY, where X is replaced with a single digit version number

for the software, and data formats and YY is replaced with the revision number, e.g.,

tgdb1r00. After each addition of data to a tree list generation database the revision

number should be incremented by one (1) to indicate that the tree measurement data

has changed from a previous tree list generation database distribution.

26

The tree list generation database distribution naming convention is tlgvXrYD,

where X is replaced with a version number for the software and data formats, Y is

replaced with the revision number, and D is replaced with the letter d or the letter s

for the developer or standard tree list generation database distributions, respectively,

e.g., tlgv1r0s. The revision number in the tree list generation database distribution

name should only be incremented if the tree list generation database software changes.

The version number should onlybe incremented if there is a major modification to

the tree list generation database software.

An existing tree list generation database may be renamed, for example to increase

its revision number to reflect the addition of new data. The renaming consistes of two

steps. First, change the basename of the tree list generation database information

file to the new tree list generation database name. Second, change the tree list

generation database directory name to the new tree list generation database name.

As an example, suppose new data have just been added to the tree list generation

database tgdb1r00, and its revision number is to be incremented from 00 to 01 to

reflect this modification. Assuming the tree list generation database to be renamed

is in the current directory, it is renamed by the following two steps.

1. Change the name of the tree list generation database information file from

tgdb1r00/tgdb1r00.dat to tgdb1r00/tgdb1r01.dat

2. Change the tree list generation database directory name from tgdb1r00 to

tgdb1r01.

See Section 2.1.2 for details of the tree list generation database file and directory

structure.

27

2.1.2 Tree list generation database files and directory structure

A tree list generation database consists of a main database directory which contains a

species mapping file, a database information file with a basename that is the same as

the main database directory name, and a subdirectory for each available treatment.

The species mapping file defines which tree species are recognized by a tree list gen-

eration database. The database information file contains data relating to the current

state of a tree list generation database, e.g., the number and type of treatments,

the number of trees in each treatment, the number of histogram bins used to classify

stands for each treatment, the index parameters used for the stand classifications, and

the creation and last modification dates. Each treatment subdirectory contains four

files: a tree data file which contains the DBH, height, and species for all of the trees

associated with the treatment; a stand catalog file storing the stand description data

for each stand measurement file associated with the treatment, an index file based on

the selected index parameters for the treatment, and a file linking the index to the

tree data.

The directory structure for the tree list generation database tgdb1r00 which con-

tains untreated stands and thinned stands as its treatments is given by Figure 2.1.

For this tree list generation database, the data for the untreated treatment is stored

in the subdirectory u and the data for the thinned treatment is stored in the subdi-

rectory t. The files in the example which correspond to the description of a tree list

generation database given above are given in Table 2.1.

2.1.3 Tree list generation database units conversions

The tree list generation database software provides seamless measurement units con-

versions between the Imperial and metric systems of measurement for a standard set

of forestry measurement data types. The conversion factors used provide a conver-

sion accuracy of at least 1.0× 10−6 for a value of one (1) unit. The units conversions

28

tgdb1r00/spmap.dattgdb1r00/tgdb1r00.dattgdb1r00/u/pbktmap.dattgdb1r00/u/pindex.dattgdb1r00/u/stnddesc.dattgdb1r00/u/treedata.dattgdb1r00/t/pbktmap.dattgdb1r00/t/pindex.dattgdb1r00/t/stnddesc.dattgdb1r00/t/treedata.dat

Figure 2.1: Example of the tree list generation database file and directory structurefor the tree list generation database tgdb1r00 containing both untreated and thinnedtreatments.

Table 2.1: File descriptions for the tree list generation database file and directorystructure for the example in Figure 2.1. The file associations for the thinned treatmenttype are identical to those of the untreated treatment type, and so are not listed. Theyare obtained by replacing the u with a t.

File name Tree list generation database file description

tgdb1r00/spmap.dat Species mapping file

tgdb1r00/tgdb1r00.dat Database information file

tgdb1r00/u/pbktmap.dat Index to tree data mapping for untreated stands

tgdb1r00/u/pindex.dat Index file

tgdb1r00/u/stnddesc.dat Stand catalog data file for untreated stands

tgdb1r00/u/treedata.dat Individual tree data for untreated stands

29

performed by the tree list generation database software are listed in Table 2.2.

Table 2.2: The measurement unit equivalents for the Imperial and metric measure-ment systems for each of the tree or stand measurements recognized by the tree listgeneration database software.

Measurement Type Imperial Unit Metric Unit

diameter inches (in) centimeters (cm)

height feet (ft) meters (m)

area acres (ac) hectares (ha)

basal area square feet/acre (ft2ac−1) square meters/hectare (m2ha−1)

density trees per acre (tpa) trees per hectare (tph)

30

2.2 Creating a tree list generation database

The tree list generation database program TGNEW is used to create a new tree list

generation database. Required inputs are a tree list generation database schema file,

which defines the index parameters that are valid for the new database, a tree list

generation database species mapping file, which defines the tree species that will be

recognized by the tree list generation database programs, and the name of the new

tree list generation database. Descriptions of the schema file and the species mapping

file may be found in Section 2.11.5 and Section 2.11.4, respectively.

2.2.1 The TGNEW program

TGNEW is a program that creates a new tree list generation database. Required inputs

for this program are the tree list generation database name, a tree list generation

database schema file, and the name of a tree list generation database species mapping

file. An optional input to the program is the name of a directory where the tree list

generation database is to be located. Path elements are limited to eight (8) characters

with filenames limited to an eight-dot-three format with eight (8) characters forming

the base filename, followed by a dot or period, followed by a three (3) character

filename extension. Filenames and path elements may not contain any blanks or

other nonprinting characters. These common restrictions are imposed for the easy

portability of filenames and directory structures.

The TGNEW program is a command line program. The syntax for the TGNEW com-

mand line is given in Figure 2.2. The square brackets, “[” and “]”, indicate optional

command line options and any associated values. Angle brackets, “<” and “>”

indicate required values for the options when specified and required command line

arguments. The ellipses, ”. . .”, are used to indicate that the line break is to be ig-

nored, i.e., that the text on the second line should be typed on the same command

line as the rest of the TGNEW command. The vertical bar, “|”, indicates that there are

31

TGNEW [-h|-help] ...[-ver|-version] ...[-path <path>] ...[-silent] ...<tgdb name> ...<schema file> ...<species mapping file>

Figure 2.2: TGNEW command line syntax with the command line options and argu-ments.

synonyms for an option, either of which is valid.

All command line options must precede any command line arguments. The order

of the command line options is not important, but options which have associated val-

ues must immediately precede their values. All command line arguments are required,

and the ordering of the command line arguments is important. The three command

line arguments for this program are the tree list generation database name, the tree

list generation database schema filename, and the tree list generation database species

mapping filename.

Options for the TGNEW program are indicated by a dash, or minus sign, immediately

preceding the option name with no intervening spaces. Options may appear in any

order but all options must appear before any command line arguments. Some options

have associated values which must immediately follow the option separated from the

option name by one or more blanks or spaces. The case of the command line option

names is not important. The case of filenames, paths, etc., that are specified on the

command line may be important depending on the operating system in use.

The complete list of the TGNEW program options are:

-h Display the online help message on the standard output, typically a terminal

screen, and exit.

-help Display the online help message on the standard output, typically a terminal

32

screen, and exit.

-silent This option suppresses normal output to the standard output device. Error

messages, if an error occurs while the program is running, are however, reported.

-ver Display the program version on the standard output, typically a terminal screen,

and exit.

-version Display the program version on the standard output, typically a terminal

screen, and exit.

-path <path> Define a tree list generation database path to be <path>. The tree

list generation database path specifies the directory where a tree list generation

database is located. If specified, the length of <path> may be limited, depending

on the operating system being used. Each path element may be at most eight

(8) characters in length and all of the path elements specified must exist. This

option is not required.

Arguments for the TGNEW program are all required. The TGNEW command line

arguments are:

<tgdb name> The name of the tree list generation database subdirectory that is to

be created. This name is combined with the tree list generation database path,

if specified, to generate a full path to the tree list generation database and its

files and subdirectories. If the -path command line option is not specified, the

tree list generation database is created in the current directory.

<schema file> The tree list generation database schema file. This file defines which

treatments are to be included in a tree list generation database and which

index parameters are to be used to index the tree measurement data. See

Section 2.11.5 for details about the schema file.

33

PROMPT> tgnew sampledb schema.dat spmap.dat

Creating the tree list generation database: sampledb

in the current directory.

The tree list generation database was created successsfully.

Figure 2.3: Example of creating a new tree list generation database using TGNEW.

<species map> The tree list generation database species code to integer ID mapping

file. This file defines unique two (2) and four to six (4-6) character species codes

and unique integer identifiers for each tree species. See Section 2.11.4 for details

about the species mapping file.

2.2.2 TGNEW example: Creating a tree list generation database

In this example, the TGNEW program is used with the tree list generation database

species mapping file spmap.dat and the tree list generation database schema file

schema.dat to create a tree list generation database sampledb in the current direc-

tory. Figure 2.3 presents the command used and the resulting output.

2.3 Adding measurement files to a tree list generation database

The tree list generation database program TGADD is used to add a stand measurement

file, or a set of stand measurement files, to a tree list generation database. The

required inputs for TGADD are the name of a stand measurement file and a tree list

generation database name, when adding a single stand measurement file, or a stand

measurement file list file name and a tree list generation database name, when adding

a set of stand measurement files. The stand measurement file list file contains the

name of each stand measurement file in the set of files to be added to a tree list

generation database, one file name per line. The two ways in which TGADD may be

used are indicated by the -mf and -lf command line options; the former indicating

34

that a single measurement file is to be added, and the latter indicating that a set of

stand measurement files specified in a stand measurement file list file are to be added.

Be sure to create a backup of a tree list generation database before adding any

stand measurement files. Stand measurement files are added tt a tree list generation

database live, i.e., the new data are not staged before they are added to the database.

This means that if an error occurs while adding stand measurement data with TGADD,

the tree list generation database being modified will most likely become corrupted

unusable.

2.3.1 The TGADD program

TGADD is a program that adds stand measurement data to a tree list generation

database. Required inputs for this program are the tree list generation database

name and a stand measurement file or a stand measurement file list file, as indicated

by one of the option flags -mf <file> or -lf <file>, respectively. Only one of these

options may be specified at a time. An optional input to the program is the name of

a directory, or path, where the tree list generation database is located. Path elements

are limited to eight (8) characters with filenames limited to an eight-dot-three format

with eight (8) characters forming the base filename, followed by a dot or period, fol-

lowed by a three (3) character filename extension. Filenames and path elements may

not contain any blanks or other nonprinting characters. These common restrictions

are imposed for the easy portability of filenames and directory structures.

The TGADD program is a command line program. The syntax for the TGADD com-

mand line when adding a single stand measurement file is given in Figure 2.4, and

the syntax for the TGADD command line when adding a set of stand measurement files

specified in a list file is given in Figure 2.5. The square brackets, “[” and “]”, indicate

optional command line options and their associated values. Angle brackets, “<” and

“>” indicate required values for the options when specified and required command

line arguments. Ellipses, ”. . .”, indicate that the new line is to be ignored, i.e., that

35

TGADD [-h|-help] ...[-ver|-version] ...[-path <path>] ...[-silent] ...[-max tree <n>] ...-mf <measurement file> ...<tgdb name>

Figure 2.4: TGADD command line syntax with the command line options and argumentsfor adding a single stand measurement file to a tree list generation database.

TGADD [-h|-help] ...[-ver|-version] ...[-path <path>] ...[-silent] ...[-max tree <n>] ...-lf <listing file> ...<tgdb name>

Figure 2.5: TGADD command line syntax with the command line options and argumentsfor adding a set of stand measurement files specified in a list file to a tree list generationdatabase.

36

the text on the second line should be typed on the same command line as the rest of

the TGADD command. The vertical bar, “|”, indicates that there are synonyms for an

option, either of which is valid. Before running TGADD to modify a tree list generation

database, be sure to make a backup of the database. If an error occurs while adding

data to a tree list generation database, the database could become corrupted. Always

make a backup of a tree list generation database before adding new data.


of the command line options is not important, but options which have associated

values must immediately precede their values. The only command line argument for

this program is the tree list generation database name, and it is required.

Options for the TGADD program are indicated by a dash, or minus sign, immediately

preceding the option name with no intervening spaces. Options may appear in any

order but all options must appear before any command line arguments. Some options

have associated values which must immediately follow the option separated from the

option name by one or more blanks or spaces. The case of the command line option

names is not important. The case of filenames, paths, etc., that are specified on the

command line may be important depending on the operating system in use.

The list of command options for the TGADD program are:


screen, and exit.


screen, and exit.




and exit.

37


screen, and exit.







-mf <file> Command line option indicating that the file named by the option value

is a stand measurement file. A tree list generation database stand measurement

file defines the stand treatment type, among other relevant stand attributes,

and provides a list of tree measurements to be added to a tree list generation

database. Tree measurements consist of a diameter at breast height measure-

ment, a height measurement, and a two or four to six letter species code.

-lf <file> Command line option indicating that the file named by the option value

is a listing file containing the names of stand measurement files whose data are

to be added to a tree list generation database. The listing file provides an easy

and efficient mechanism for adding multiple stand measurement files to a tree

list generation database.

-max tree <n> This option indicates that the program should allow a maximum of

<n> trees when reading a stand measurement file. The number <n> must be a

positive integer. The default value for the number of trees is 1024, which should

be sufficient for most purposes. In the event that there are more than 1024 trees

in a stand measurement file this option may be used to modify the maximum

number of trees allowed so that the measurement file may be processed. This

38


Arguments for the TGADD program are all required. The TGADD command line

argument is:

<tgdb name> The name of the tree list generation database to which tree measure-

ment data from the stand measurement file, or files, are be added. This name

is combined with the tree list generation database path, if specified, to generate

a full path to a tree list generation database, and its files and directories. If the

-path command line option is not specified, the tree list generation database

specified is assumed to be in the current directory. It is important to create a

backup of a tree list generation database before adding data; if any errors occur

during the execution of the TGADD program the tree list generation database be-

ing modified may be corrupted. Always make a backup of the tree list generation

database before adding new data.

2.3.2 TGADD examples: Adding data to a tree list generation database

In these examples the TGADD program is used to add stand measurement files to the

tree list generation database sampledb created in the example for TGNEW in Section 2.2.

The tree list generation database sampledb, the stand measurement files, and the

stand measurement file list file are all assumed to be in the current directory. For

these examples, there are eleven (11) stand measurement files that are to be added

to the tree list generation database. The first example adds the stand measurement

file file0001.dat. The second example uses the list file filelist.dat to add the

remaining ten (10) files, file0002.dat through file0011.dat. The list file used for

the second example may be found in Figure 2.8.

Figure 2.6 presents the command used and resulting output for the example

which adds a single stand measurement file, file0001.dat, to the tree list generation

database sampledb using the TGADD option -mf. Figure 2.7 presents the command

39

PROMPT> tgadd -mf file0001.dat sampledb

Processing measurement file: file0001.dat

Figure 2.6: Example of adding a single stand measurement file to a tree list generationdatabase using TGADD and the -mf option.

PROMPT> tgadd -lf filelist.dat sampledb











Figure 2.7: Example of adding a set of stand measurement files to a tree list generationdatabase using a list file and the TGADD -lf option.

used and resulting output for the example which adds multiple stand measurement

files to the tree list generation database sampledb using a stand measurement file list

file, filelist.dat and the TGADD option -lf.

2.4 Creating simulated tree lists using a tree list generation database

The tree list generation database program TGRAND is used to generate simulated tree

lists, or random stand measurement files. TGRAND is the primary workhorse of the

tree list generation database system. TGRANDrequires a stand description file, or a

set of stand description files specified in a list file, which describe the simulated tree

list(s), or random stand measurement file(s), that are to be generated and a tree list

generation database that is compatible with stand index parameter keywords used in

40

file0002.datfile0003.datfile0004.datfile0005.datfile0006.datfile0007.datfile0008.datfile0009.datfile0010.datfile0011.dat

Figure 2.8: Stand measurement file listing file filelist.dat for TGADD example two,adding multiple stand measurement files to a tree list generation database using TGADD

and the -lf option.

the stand description file or files. The stand description files used with a particular

tree list generation database must contain appropriate values for the complete set of

index parameters used by the tree list generation database for the desired treatment.

See Section 2.11.2 for more information on stand description files and Sections 2.11.5

and 2.11.6 for more information on the specification and use of index parameter

keywords in a tree list generation database.

TGRAND has four modes for generating simulated tree lists or random measure-

ment files. The TGRAND stand generation mode is determined by the values of the

RANDOM MF PURE and RANDOM MF JITTER keywords in a particular stand description

file. Values for these keywords may be YES, indicating that the mode should be used,

or NO, indicating that the mode should not be used. These two stand description file

keywords and their values define the four TGRAND stand generation modes. Mnemonic

acronyms for the four modes with the appropriate keyword values are given in table

Table 2.3. The default mode for TGRAND is the RMF mode, which does not force 100%

pure stands and does not jitter the simulated tree height values.

TGRAND uses a nearest neighbor selection procedure to determine which multidi-

mensional histogram bins in a particular tree list generation database are similar to

41

Table 2.3: The four simulated tree list generation modes for the tree list generationdatabase program TGRAND. Jittering tree heights adds or subtracts a small randomamount to or from each height value. The 100% pure stand mode forces generatedstands to be 100% of the trees to be the dominant species if they have a stand typeof PURE DF or PURE WH. For details on how to set these properties for a tree list, seeSection 2.11.2.

TGRAND mode 100% pure stands Jitter heights

RMF No No

RMFJ No Yes

RMFP Yes No

RMFJP Yes Yes

a desired stand that is specified by a stand description file. If there are no data rep-

resented in a tree list generation database that are close enough to the desired stand,

TGRAND will halt and give an error message indicating that there were no similar stands

available in the database. As more data are added to a tree list generation database

this issue will become less of a problem. There are two TGRAND options which control

the maximum number of nearest neighbor multidimensional histogram bins that may

be selected and used to generate a tree list and the maximum similarity score to be

used in the nearest neighbor selection procedure. These are the -nn max stands op-

tion and the -nn max score option, respectively. The default value for the number of

multidimensional histogram bins is 5 and the default value for the maximum nearest

neighbor score is 5.0.

If TGRAND halts and indicates there were no multidimensional histogram bins near

the desired stand description and the nearest neighbor score reported is greater than

or equal to 10000, then there is no stand sufficiently near the desired stand. The

reason for this is that the desired stand description differs in one of its categorical,

or FIXED, type index parameters, e.g., STAND TYPE, from all of the stands in a tree

list generation database. There is nothing that may be done in this event to coerce

TGRAND to generate a simulated tree list. The only way to resolve this problem is to

add actual stand measurement data for the desired stand description to the tree list

42

generation database.

If TGRAND halts, indicating that there were no multidimensional histogram bins

near the desired stand description and the nearest neighbor score reported is relatively

small, e.g., less than 100.0, then a simulated tree list may be generated. The command

line option -nn max score is used to increase the maximum nearest neighbor score

until at least one multidimensional histogram bin is selected. A simulated tree list

generated in this manner may or may not be representative of the desired stand

description, and this should be verified before the tree list is used. This condition

indicates that the data space spanned by the stands in a tree list generation database

has small gaps, and these should be identified precisely so that data may be collected

from temporary or permanent sample plots to fill the gaps. The TGSCATRP program

and the scatterplot files it creates may prove useful for identifying gaps in a tree list

generation database.

The simulated stands or tree lists produced by TGRAND and stored in a random

stand measurement file do not contain tree multipliers to scale the number of trees

to a hectare (acre) basis commonly used by growth and yield or stand simulation

models [11, 20, 8, 35, 36]. Each tree in a simulated stand represents a single tree for

the specified output plot size. Thus, if a hectare is the desired output plot size and the

stand density has 1000 trees per hectare, the random measurement file will contain

1000 trees. If tree multipliers are necessary for compatibility with a particular growth

and yield model, they may be obtained via the value of the PLOT SIZE keyword in a

random stand measurement file.

2.4.1 The TGRAND program

TGRAND is a program that generates a simulated tree list using a stand description file

and a tree list generation database. Required inputs for this program are the tree list

generation database name and a stand description file or a description file listing file,

as indicated by one of the option flag -df <file> or -lf <file> respectively. Only

43

TGRAND [-h|-help] ...[-ver|-version] ...[-silent] ...[-random_seed <seed>] ...[-nn_max_score <score>] ...[-nn_max_stands <n>] ...[-path <path>] ...-df <description file> ...<tgdb name>

Figure 2.9: TGRAND command line syntax with the command line options and argu-ments for generating a single random stand measurement file from a stand descriptionfile using a tree list generation database.

one of these options may be specified at a time. An optional input to the program

is the name of a directory where the tree list generation database is to be located.

Path elements are limited to eight (8) characters with filenames limited to an eight-

dot-three format with eight (8) characters forming the base filename, followed by a

dot or period, followed by a three (3) character filename extension. Filenames and

path elements may not contain any blanks or other nonprinting characters. These

common restrictions are imposed for the easy portability of filenames and directory

structures.

The TGRAND program is a command line program. The syntax for the TGRAND

command line when generating a single random stand measurement file is given in

Figure 2.9, and the syntax for the TGRAND command line when generating a set of

random stand measurement files specified in a list file is given in Figure 2.10. The

square brackets, “[” and “]”, indicate optional command line options and their asso-

ciated values. Angle brackets, “<” and “>” indicate required values for the options

when specified and required command line arguments. Ellipses, ”. . .”, indicate that

the new line is to be ignored, i.e., that the text on the second line should be typed on

the same command line as the rest of the TGRAND command. The vertical bar, “|”,

indicates that there are synonyms for an option, either of which is valid.

44

TGRAND [-h|-help] ...[-ver|-version] ...[-silent] ...[-random_seed <seed>] ...[-nn_max_score <score>] ...[-nn_max_stands <n>] ...[-path <path>] ...-lf <listing file> ...<tgdb name>

Figure 2.10: TGRAND command line syntax with the command line options and ar-guments for generating a set of random stand measurement files from a set of standdescription files specified in a list file using a tree list generation database.





Options for the TGRAND program are indicated by a dash, or minus sign, imme-

diately preceding the option name with no intervening spaces. Options may appear

in any order but all options must appear before any command line arguments. Some

options have associated values which must immediately follow the option separated

from the option name by one or more blanks or spaces. The case of the command line

option names is not important. The case of filenames, paths, etc., that are specified

on the command line may be important depending on the operating system in use.

The list of options for the TGRAND program are:


screen, and exit.


screen, and exit.

45


messages, if an error occurs while the program is running, are however, still

reported.


and exit.


screen, and exit.







-df <file> Command line option indicating that the file named by the option value

is a stand description file. A tree list generation stand description file defines

the stand treatment type and other relevant stand defining attributes, allowing

the generation of a random list of tree measurements appropriate for the stand

description using a tree list generation database. The tree measurements consist

of a diameter at breast height measurement, a height measurement, and a two

or four to six letter species code. When using a single stand description file to

generate a random stand measurement file, if an output file name is not specified

using the RANDOM MF NAME keyword in the description file, the default filename

of output.rmf will be used.

-lf <file> Command line option indicating that the file named by the option value

is a listing file containing the names of stand description files that are to be used

46

with a tree list generation database to generate a random tree list. The listing

file provides an easy and efficient mechanism for generating multiple random

stand measurement files from a tree list generation database. When using a

listing file, the names for all output random stand measurement files that are to

be generated must be defined in the description files using the RANDOM MF NAME

keyword.

-random seed <seed> Command line option to specify a seed value for the uniform

random number generator used to generate the simulated stands of trees. The

seed should be a large, at least six (6) digits, relatively prime integer, but large

and odd will generally suffice. This option allows the generation of multiple

stands having similar characteristics, but with different simulated trees for dif-

ferent seed values when using the same description file. Alternatively, multiple

description files for the same stand could be created and supplied to TGRAND via

a listing file, using multiple copies of the same description file with different out-

put filenames, to create similar stands with different sets simulated trees. This

latter method is the preferred method for generating multiple stands having

similar characteristics since a complete record of the stands generated and their

descriptions are contained in the random measurement files and the description

files. This option is not required.

-nn max score <score> Command line option which sets the maximum value for the

nearest neighbor similarity score to be used when selecting the tree measurement

data from the tree list generation database that are used to generate a simulated

or random stand of trees. The score values are somewhat arbitrary, but are

based on a similarity index. A score of zero would indicate an exact match

between the attributes of a desired stand and the attributes for a stand whose

data are contained in the tree list generation database. A default maximum

similarity score of 5.0 is used. By default, only the best 5 multidimensional

47

histogram bins, and their associated tree data, as determined by their scores,

are used to generate a simulated stand. This default may be overridden by using

the -nn max stands option. This option is not required.

-nn max stands <n> Command line option which sets the maximum value for the

number of neighboring stands, multidimensional histogram bins, from the tree

list generation database that are to be used when generating a simulated stand

of trees. This option may be used in conjunction with the -nn max score option

to control the amount of data pooled to create a canonical stand from which a

simulated tree list is generated. The default value for this option is 5 stands.

This option is not required.

The single argument for the TGRAND program is required. The TGRAND command

line argument is:

<tgdb name> The name of the tree list generation database to which tree measure-

ment data from the stand measurement file, or files, are be added. This name

is combined with the tree list generation database path, if specified, to generate

a full path to a tree list generation database, and its files and directories. If the

-path command line option is not specified, the tree list generation database

specified is assumed to be in the current directory.

2.4.2 TGRAND examples: Generating simulated stands or tree lists

In these examples the TGRAND program is used to generate random stand measurement

files from the tree list generation database tgdb1r00. The tree list generation database

tgdb1r00 and the stand description files used are all assumed to be in the current

directory. TGRAND is used to create a random stand measurement file for an untreated

stand and a random stand measurement file for a thinned stand. An example using

TGRAND with a stand description file list file is not presented. The -lf option of

48

PROMPT> tgrand -df untreatd.df tgdb1r00

Processing description file: untreatd.df

Creating random measurement file: untreatd.rmf

Figure 2.11: Example of generating a single random stand measurement file for anuntreated stand specified by the file untreatd.df, Figure 2.12, from a tree list gen-eration database using TGRAND.

TGRAND and the list file format are exactly the same as the -lf option and list file

format of TGADD.


which creates a single random stand measurement file for an untreated stand using

the description file untreatd.df and the tree list generation database tgdb1r00.

Figure 2.12 presents the stand description file used, and Figure 2.13 presents height vs.

diameter plots for actual and simulated tree lists for the four TGRAND stand generation

modes for the untreated stand example.


which creates a single random stand measurement file for a thinned stand using the

description file thinned.df and the tree list generation database tgdb1r00. Fig-

ure 2.15 presents the stand description file used, and Figure 2.16 presents height vs.

diameter plots for actual and simulated tree lists for the four TGRAND stand generation

modes for the thinned stand example.

2.5 Summarizing a tree list generation database

The tree list generation database program TGSUMRY is used to create a brief sum-

mary of the contents and status of a tree list generation database. The summary

information includes the number of treatments in a tree list generation database, the

index parameters used for each treatment, whether the tree list generation database

49

%% Tree list generator stand description file.%% File created: 11/08/1999 at 14:19:50.14%MEASUREMENT_UNITS = METRICTREATMENT_TYPE = UNTREATEDSITE_INDEX_50 = 35.966STAND_TOTAL_AGE = 27.000STAND_ORIGIN = NATURALQMD = 14.277STAND_DENSITY = 3951.000STAND_TYPE = WH_DOMINANTRANDOM_MF_UNITS = METRICRANDOM_MF_PLOT_SIZE = 0.041RANDOM_MF_PURE = NORANDOM_MF_JITTER = NORANDOM_MF_NAME = untreatd.rmf

Figure 2.12: Stand description file untreatd.df for TGRAND example one, generatinga simulated tree list for an untreated stand from a tree list generation database usingTGRAND and the -df option. 162 trees are generated for this stand description file.

50

0 10 20 30 405

10

15

20

25A) Scenario RMF

DBH (cm)

Hei

ght (

m)

Actual SimulatedActual Simulated

0 10 20 30 405

10

15

20

25B) Scenario RMFJ

DBH (cm)

Hei

ght (

m)


0 10 20 30 405

10

15

20

25C) Scenario RMFP

DBH (cm)

Hei

ght (

m)

Actual Simulated

0 10 20 30 405

10

15

20

25D) Scenario RMFJP

DBH (cm)

Hei

ght (

m)

Actual Simulated

Figure 2.13: Height vs. diameter plots for an untreated stand. The plots show data forthe actual stand represented by the stand description file untreatd.df and simulatedtree lists generated by TGRAND for each of the four stand generation modes. Only halfthe trees sampled for this plot were plotted in the figures.

PROMPT> tgrand -df thinned.df tgdb1r00

Processing description file: thinned.df

Creating random measurement file: thinned.rmf

Figure 2.14: Example of generating a single random stand measurement file for athinned stand specified by the file untreatd.df, Figure 2.15, from a tree list genera-tion database using TGRAND.

51

%% Tree list generator stand description file.%% File created: 11/03/1999 at 16:01:34.24%MEASUREMENT_UNITS = IMPERIALTREATMENT_TYPE = THINNEDSITE_INDEX_50 = 129.900STAND_TOTAL_AGE = 76.000STAND_ORIGIN = NATURALNUMBER_OF_THINNINGS = 1.000PRE_THIN_DENSITY = 437.000PRE_THIN_BASAL_AREA = 242.200PCT_BASAL_AREA_REMOVED = 34.600YEARS_SINCE_TREATMENT = 2.900QMD = 11.425STAND_DENSITY = 243.000STAND_TYPE = PURE_DFRANDOM_MF_UNITS = IMPERIALRANDOM_MF_PLOT_SIZE = 0.124RANDOM_MF_PURE = NORANDOM_MF_JITTER = NORANDOM_MF_NAME = thinned.rmf

Figure 2.15: Stand description file thinned.df for TGRAND example two, generatinga simulated tree list for a thinned stand from a tree list generation database usingTGRAND and the -df option. 29 trees are generated for this stand description file.

52

0 20 40 6010

15

20

25

30

35

40A) Scenario RMF

DBH (cm)

Hei

ght (

m)


0 20 40 6010

15

20

25

30

35

40B) Scenario RMFJ

DBH (cm)H

eigh

t (m

)


0 20 40 6010

15

20

25

30

35

40C) Scenario RMFP

DBH (cm)

Hei

ght (

m)

Actual Simulated

0 20 40 6010

15

20

25

30

35

40D) Scenario RMFJP

DBH (cm)

Hei

ght (

m)

Actual Simulated

Figure 2.16: Height vs. diameter plots for a thinned stand. The plots show data forthe actual stand represented by the stand description file thinned.df and simulatedtree lists generated by TGRAND for each of the four stand generation modes.

53

TGSUMRY [-h|-help] ...[-ver|-version] ...[-silent] ...[-path <path>] ...[-file <file>] ...<tgdb name>

Figure 2.17: TGSUMRY command line syntax with the command line options and ar-guments.

is locked or unlocked, and a variety of other potentially useful items. TGSUMRY will

save the summary information to an output file, specified using the -file option, or

display the summary information on the standard output device, generally a termi-

nal screen or shell window. The output from the TGSUMRY program for the tree list

generation database tgdb1r00 is provided in Appendix A.

2.5.1 The TGSUMRY program

TGSUMRY is a program that displays a brief summary of a tree list generation database.

The only required input for this program is the tree list generation database name.

An optional input to the program is the name of a directory, or path, where a tree

list generation database is located. Path elements are limited to eight (8) characters






The TGSUMRY program is a command line program. The syntax for the TGSUMRY

command line is given in Figure 2.17. The square brackets, “[” and “]”, indicate



line arguments. The ellipses, ”. . .”, are used to indicate that the line break is to be

54

ignored, i.e., that the text on the second line should be typed on the same line as

the rest of the TGSUMRY command. The vertical bar, “|”, indicates that there are






Options for the TGSUMRY program are indicated by a dash, or minus sign, imme-







The list of options for the TGSUMRY program are:


screen, and exit.


screen, and exit.




and exit.


screen, and exit.

55

-file <file> This option specifies a file to which the tree list generation summary

information is to be written. The filename specified must be the name of a

new file. This option is not required. If this option is not specified, the sum-

mary information will be displayed on the standard output, typically a terminal

screen.







The argument for the TGSUMRY program is required. The TGSUMRY command line

argument is:

<tgdb name> The name of the tree list generation database that is to be summarized

by TGSUMRY. This name is combined with the tree list generation database path,

if specified, to generate a full path to a tree list generation database, and its

files and directories. If the -path command line option is not specified, the tree

list generation database specified is assumed to be in the current directory.

2.5.2 TGSUMRY example: Summarizing a tree list generation database

In this example, the TGSUMRY program is used to generate a summary for the tree

list generation database tgdb1r00, which is in the current directory. The TGSUMRY

option -file is used to place the tree list generation database summary into the

file summary.txt, also in the current directory. The tree list generation database

summary identifies which treatments are in the database, which index parameters are

used for each treatment, and some other useful information, such as the time and date

56

PROMPT> tgsumry -file summary.txt tgdb1r00

Writing information to file: summary.txt

Figure 2.18: Example summarizing a tree list generation database using TGSUMRY. SeeAppendix A for the contents of the file summary.txt.

a tree list generation database was last modified. Figure 2.18 presents the command

used and resulting output. See Appendix A for the contents of the file summary.txt.

2.6 Locking and unlocking a tree list generation database

The tree list generation database programs TGLOCK and TGUNLOCK are used to pre-

vent or permit the addition of new stand measurement data to a tree list generation

database. The name of the tree list generation database that is to be locked or

unlocked is the only required input to the programs. The TGLOCK and TGUNLOCK

programs may be used to restrict the ability to add data to a tree list generation

database to a selected group within an organization, e.g., the group which collects

and conditions the stand measurement data before it is distributed to the users of the

data. The average user of a tree list generation database should not necessarily be

adding data.

2.6.1 The TGLOCK program

TGLOCK is a program that locks a tree list generation database preventing the addition

of tree measurement data or other modification of the database. Once locked, a tree

list generation database may be unlocked using the program TGUNLOCK, permitting

the addition of new tree measurement data.




57

TGLOCK [-h|-help] ...[-ver|-version] ...[-path <path>] ...[-silent] ...<tgdb name>

Figure 2.19: TGLOCK command line syntax with the command line options and argu-ments.






The TGLOCK program is a command line program. The syntax for the TGLOCK




line arguments. Ellipses, ”. . .”, indicate that the new line is to be ignored, i.e., that

the text on the second line should be typed on the same line as the rest of the TGLOCK

command. The vertical bar, “|”, indicates that there are synonyms for an option,

either of which is valid.





Options for the TGLOCK program are indicated by a dash, or minus sign, imme-




58




The list of options for the TGLOCK program are:


screen, and exit.


screen, and exit.




and exit.


screen, and exit.







The argument for the TGLOCK program is required. The TGLOCK command line

argument is:

59

PROMPT> tglock sampledb

The tree list generation database is now locked.

Figure 2.20: Example of locking a tree list generation database using TGLOCK.

<tgdb name> The name of the tree list generation database that is to be locked, pre-

venting further modification until the tree list generation database is unlocked

using TGUNLOCK. This name is combined with the tree list generation database

path, if specified, to generate a full path to a tree list generation database, and

its files and directories. If the -path command line option is not specified, the

tree list generation database specified is assumed to be in the current directory.

2.6.2 TGLOCK example: Locking a tree list generation database

In this example, the TGLOCK program is used to lock the tree list generation database

sampledb in the current directory. This prevents further modification of the tree list

generation database, and does not allow the addition of new stand measurement files.

Figure 2.20 presents the command used and resulting output.

2.6.3 The TGUNLOCK program

TGUNLOCK is a program that unlocks a tree list generation database allowing the ad-

dition of tree measurement data or other modification of a database. Once unlocked,

a tree list generation database may be locked using the program TGLOCK, preventing

the addition of tree measurement or other modification.





60

TGUNLOCK [-h|-help] ...[-ver|-version] ...[-path <path>] ...[-silent] ...<tgdb name>

Figure 2.21: TGUNLOCK command line syntax with the command line options andarguments.





The TGUNLOCK program is a command line program. The syntax for the TGUNLOCK




line arguments. Ellipses, ”...”, indicate that the new line is to be ignored, i.e., that

the text on the second line should be typed on the same line as the rest of the

TGUNLOCKcommand. The vertical bar, “|”, indicates that there are synonyms for an

option, either of which is valid.





Options for the TGUNLOCKprogram are indicated by a dash, or minus sign, imme-





61



The list of options for the TGUNLOCKprogram are:


screen, and exit.


screen, and exit.




and exit.


screen, and exit.







The argument for the TGUNLOCKprogram is required. The TGUNLOCKcommand line

argument is:

<tgdb name> The name of the tree list generation database that is to be unlocked,

permitting modifications until the tree list generation database is locked using

62

PROMPT> tgunlock sampledb

The tree list generation database is now unlocked.

Figure 2.22: Example of unlocking a tree list generation database using TGUNLOCK.

TGLOCK. This name is combined with the tree list generation database path, if

specified, to generate a full path to a tree list generation database, and its files

and directories. If the ”-path” command line option is not specified, the tree

list generation database specified is assumed to be in the current directory.

2.6.4 TGUNLOCK example: Unlocking a tree list generation database

In this example, the TGUNLOCK program is used to unlock the locked tree list generation

database sampledb in the current directory. This allows modification of the tree list

generation database and the addition of new stand measurement files. Figure 2.22

presents the command used and resulting output.

2.7 Exploding a tree list generation database

The tree list generation database program TGXPLODE is used to convert a specified tree

list generation database into its component files, the schema file, the species mapping

file, and the individual stand measurement files for each treatment. TGXPLODE re-

quires the name of the tree list generation database that is to be converted into its

components and the name of a destination directory for the files. The file and direc-

tory structure for the exploded tree list generation database is identical to that of a

normal tree list generation database, except the individual stand measurement files

will replace the tree list generation database files for each treatment. The schema

file, the species mapping file, and the treatment subdirectories are placed in the des-

tination directory. Descriptions of the schema file, the species mapping file, and the

63

treatment measurement files may be found in Section 2.11.5 and Section 2.11.4, and

Section 2.11.1, respectively. For a description of the tree list generation database file

and directory structure, see Section 2.1.2.

TGXPLODE may be used in conjunction with TGNEW and TGADD to reconfigure the

treatments in a tree list generation database, and the stand index parameters and

their multidimensional histogram based classification of the stand measurement data

in a tree list generation database. This may done in three steps. First, the tree list

generation database schema files is modified to reflect the desired changes in index

parameters of histogram bin widths. Second, TGNEW is used to create a new tree list

generation database. Third, TGADD is used to add all of the stand measurement files

into the reconfigured tree list generation database.

Two scenarios for the application of this procedure readily present themselves.

First, if a tree list generation database has a very sparse classification, caused by

using too many index parameters or having histogram bin widths that are too narrow,

the number of index parameters used and the histogram bin widths may be modified

to resolve this problem. Second, the multidimensional histogram based classification

procedure used to implement the tree list generation database permits the refinement,

or other modification, of the histogram bin widths as more data are added to a

database.

2.7.1 The TGXPLODE program

TGXPLODE is a program that extracts the individual source stand measurement files,

schema file, and species mapping file from a tree list generation database. TGXPLODE

extracts this information from a specified tree list generation database and places

the individual files into a specified destination directory and treatment subdirecto-

ries of the destination directory. This program allows the data buckets in a tree list

generation database to be refined, i.e., it allows the creation of a new tree list gen-

eration database with smaller, or different interval widths or index parameter sets

64

TGXPLODE [-h|-help] ...[-ver|-version] ...[-silent] ...[-path <path>] ...<tgdb name> ...<destination dir>

Figure 2.23: TGXPLODE command line syntax with the command line options andarguments.

from an existing tree list generation database. For example, the original tree list

generation database could eventually contain enough data to add more index param-

eters or smaller interval widths, both providing a refinement of the database, creating

more data buckets, and allowing the generation of random stands which more closely

approximate the desired stands.

The required input arguments for this program are the tree list generation database

name and the destination directory or path. An optional input to the program is the

name of a directory, or path, where a tree list generation database is located. Path

elements are limited to eight (8) characters with filenames limited to an eight-dot-

three format with eight (8) characters forming the base filename, followed by a dot

or period, followed by a three (3) character filename extension. Filenames and path

elements may not contain any blanks or other nonprinting characters. These common

restrictions are imposed for the easy portability of filenames and directory structures.

The TGXPLODE program is a command line program. The syntax for the TGXPLODE








65


of the command line options is not important, but options which have associated val-

ues must immediately precede their values. All command line arguments are required,

and the ordering of the command line arguments when several arguments are present

is important. The two command line arguments for this program are the tree list

generation database name and the destination directory for the database component

files.

Options for the TGXPLODE program are indicated by a dash, or minus sign, imme-







The list of options for the TGXPLODE program are:


screen, and exit.


screen, and exit.




and exit.


screen, and exit.

66







The arguments for the TGXPLODE program are required. The TGXPLODE command

line arguments are:

<tgdb name> The name of the tree list generation database that is to be converted

into its component parts by TGXPLODE. This name is combined with the tree

list generation database path, if specified, to generate a full path to a tree list

generation database, and its files and directories. If the -path command line

option is not specified, the tree list generation database specified is assumed to

be in the current directory.

<destination dir> The destination directory for the component files of the specified

tree list generation database <tgdb name>. This directory must exist. The

destination directory “.” may be used to indicate that the current directory is

the destination. The schema file and the species mapping file for the specified

tree list generation database, schema.dat and spmap.dat, respectively, are both

written to the destination directory. Additionally, for each treatment defined in

the specified tree list generation database, a treatment directory is created in

the destination directory and stand measurement files containing the treatment

data placed into the tree list generation database created in the appropriate

treatment subdirectory of the destination directory.

67

PROMPT> tgxplode tgdb1r00 kaboom

Exploding the tree list generation database: tgdb1r00

Located in current directory

To the destination directory: kaboom

PROMPT> dir kaboom

schema.dat

spmap.dat

t

u

Figure 2.24: Example of exploding a tree list generation database into its compo-nent files using TGXPLODE. The component files are a schema file, a species mappingfile, and individual stand measurement files for each treatment in their appropriatesubdirectory.

2.7.2 TGXPLODE example: Exploding a tree list generation database

In this example the TGXPLODE program is used to convert the tree the tree list gener-

ation database tgdb1r00 into its component parts: a schema file, a species mapping

file, and the measurement files for each treatment. The tree list generation database

is assumed to be in the current directory and the destination directory is named

kaboom. Figure 2.24 presents the command used and resulting output. Also shown

in the figure is a directory listing of the destination directory kaboom. The treatment

subdirectories kaboom/u and kaboom/t contain the individual stand measurement

files for the two treatments.

2.8 Creating a scatterplot file from a tree list generation database

The tree list generation database program TGSCATRP is used to create a scatterplot

file for a specified tree list generation database and treatment. The scatterplot file is

a comma delimited text file with the index parameters for the specified treatment as

68

the data column headings, and the centers of the multidimensional histogram bins,

or data buckets, as the data. The scatterplot file may be used with other software

to visually assess the data coverage in a tree list generation database to determine

treatment combinations or other stand conditions that are not represented. The

output of TGSCATRP when combined with the output of TGSUMRY gives a more or less

complete description of the contents of a tree list generation database.

2.8.1 The TGSCATRP program

TGSCATRP is a program that extracts the centers of the multidimensional histogram

bins, or data buckets, from the index of tree list generation database. TGSCATRP

extracts the index parameter names and data from a specified tree list generation

database and creates comma delimited output that is displayed on the standard out-

put device, typically a terminal screen, or written to a specified output file. The

scatter plot information may be used to determine whether there are gaps in the data

contained in a tree list generation database by plotting one index data column against

another. For example, by plotting site index against stand age one could determine

whether there are particular combinations of these two index parameters that are

not in the database, indicating protocols for stands that ought to be sampled in the

future.

The required inputs for this program are the tree list generation database name

and the name of the treatment whose scatterplot data are desired. An optional input

to the program is the name of a directory, or path, where a tree list generation

database is located and an output file to which the scatterplot data are to be written.

Path elements are limited to eight (8) characters with filenames limited to an eight-

dot-three format with eight (8) characters forming the base filename, followed by a

dot or period, followed by a three (3) character filename extension. Filenames and

path elements may not contain any blanks or other nonprinting characters. These

common restrictions are imposed for the easy portability of file names and directory

69

TGSCATRP [-h|-help] ...[-ver|-version] ...[-silent] ...[-path <path>] ...[-file <file>] ...-treatment <treatment> ...<tgdb name>

Figure 2.25: TGSCATRP command line syntax with the command line options andarguments.

structures.

The TGSCATRP program is a command line program. The syntax for the TGSCATRP












Options for the TGSCATRP program are indicated by a dash, or minus sign, imme-







70

The list of options for the TGSCATRP program are:


screen, and exit.


screen, and exit.




and exit.


screen, and exit.







-file <file> This option specifies a file to which the tree list generation scatter plot

information is to be written. The filename specified must be the name of a new

file. This option is not required. If not specified, the scatter plot information

will be written to the standard output device, typically a terminal screen.

-treatment <treatment> This option specifies the name of the treatment whose

scatterplot data are desired. It must be one of the treatments contained in the

71

PROMPT> tgscatrp -file untreatd.dat -treatment untreated tgdb1r00

Writing scatter plot data to file: untreatd.dat

Figure 2.26: Example of creating a tree list generation database scatterplot file forthe untreated stand data using TGSCATRP.

named tree list generation database. This option is required.

The argument for the TGSCATRP program is required. The TGSCATRP command

line argument is:

<tgdb name> The name of the tree list generation database for which a comma de-

limited treatment scatter plot file by TGSCATRP. This name is combined with the

tree list generation database path, if specified, to generate full path to a tree list

generation database, and its files and directories. If the -path command line

option is not specified, the tree list generation database specified is assumed to

be in the current directory.

2.8.2 TGSCATRP examples: Creating scatterplot files

In these examples the TGSCATRP program is used to create scatterplot data files for the

untreated stand data and the thinned stand data in the tree list generation database

tgdb1r00. The tree list generation database is assumed to be in the current directory.

Figure 2.26 and Figure 2.27 present the command used and resulting output. The

contents of the untreated and thinned stand scatterplot files is not presented directly

due to its large volume, but Section 2.10 presents a set of figures derived from the

scatterplot data indicating the coverage of the data in the tree list generation database

tgdb1r00.

72

PROMPT> tgscatrp -file thinned.dat -treatment thinned tgdb1r00

Writing scatter plot data to file: thinned.dat

Figure 2.27: Example of creating a tree list generation database scatterplot file forthe thinned stand data using TGSCATRP.

2.9 Tree list generation database software error messages

The tree list generation database software has three levels of error messages that

may occur. All errors that are detected by the tree list generation database software,

regardless of level, are considered to be fatal errors, that is, they halt the execution of

the program that was running when the error occurred. The three levels correspond

to the point at which an error of some sort is first detected and reported. The tree

list generation database software error messages report problems that occur at the

level of the program, e.g., TGRAND, which is the highest level, at the level of the tree

list generation database subroutine library, which is the middle level, and at the level

of the subroutine library used to build the tree list generation database subroutine

library, the lowest level.

When an error occurs, an error message is displayed on the standard output device

and program execution halts. The error message contains a variety of information

intended to be helpful in resolving the problem which caused the error occur. First,

an error message contains a toolkit version, which refers to the version of the SPICE

system and its subroutine library, developed and maintained by the Navigation and

Ancillary Information Facility (NAIF) of NASA’s Jet Propulsion Laboratory, which

were used to implement the tree list generation database subroutine library. See

Appendix F for a brief description of the SPICE system and NAIF. Second, an error

message contains a short error indicator that is intended to identify the source of the

error and give a mnemonic for the type of error. This part of the error message is

73

intended for processing by a computer. Third, an error message contains a description

of the error that occurred and the relevant context, e.g., missing values, invalid values,

etc. This part of the error message is intended to help determine exactly what caused

the error to occur. Finally, an error message contains traceback information giving the

list of modules that were called before the error was detected. This last information

is likely to be of use only to software developers working with the tree list generation

database subroutine library, and may be usually ignored by the more typical users of

the tree list generation database system.

=====================================================================

Toolkit version: N0048

<level>(<error indicator>)

<error description>

A traceback follows. The name of the highest level module is first.<program> --> <subroutine 1> --> ... <subroutine N>

=====================================================================

Figure 2.28: Tree list generation database error message format.

Figure 2.28 provides a template for the format of the tree list generation database

error messages. The toolkit version, Toolkit version: N0048 indicates the version

of the SPICE toolkit used to develop the tree list generation database software. This

will remain the same until, or unless, a more recent version of the SPICE toolkit is

obtained and incorporated into the tree list generation database system. The <level>

tag indicates the level of the software where the error occurred. In an actual error

message the <level> tag will be replaced with the program name, e.g., TGADD, if

the error occurs within a program, or with TGLIB if the error occurs in the tree

list generation database subroutine library, or with SPICE if the error occurs in the

SPICE subroutine library. The <error indicator> tag gives a short mnemonic for

74

an error that occurs, e.g., USAGEERROR or NOSUCHFILE. The <error description>

tag provides a text description of the type of error that occurred and possibly some

additional context information such as the valid range for a value. This is the part

of the error message that should provide the most help in determining the cause of

an error. Finally, there is the module traceback which gives the list of module names

in the order that they were called before the error occurred. Figure 2.29 provides an

example of a program level error message using TGRAND.

PROMPT> TGRAND

=====================================================================

Toolkit version: N0048

TGRAND(USAGEERROR)

Empty command line. Type ’TGRAND -h’ for help.

A traceback follows. The name of the highest level module is first.

TGRAND

=====================================================================

Figure 2.29: Example error message from TGRAND. In this example, TGRAND was in-voked without any command line arguments, resulting in the error message displayed.

2.10 Data coverage in the tree list generation database: tgdb1r00

This section presents tables of summary statistics for the data coverage in the tree

list generation database tgdb1r00 for untreated and thinned stands separated down

by stand type. The statistics in the summary tables are derived from the scatterplot

files generated by TGSCATRP for the tree list generation database tgdb1r00 and its

untreated and thinned stand treatments.

The information in the summary tables describes the multidimensional histogram

75

bin centers defining the stand classification in the tree list generation database, and

may be thought of as defining the range of stand conditions represented in the tree

list generation database. Each multidimensional histogram bin represents a canonical

or typical stand with the attributes of its bin center.

The tables of summary statistics can only provide a guide to the data that are

actually contained within the tree list generation database tgdb1r00 due to the high

dimensionality of the classification, six (6) nontreatment index parameters for un-

treated stands and eleven (11) nontreatment index parameters for thinned stands.

The relatively large dimension of these index parameter sets causes a combinatorial

explosion in terms of presenting any sort of detailed printed data summary. Summa-

rizing the data coverage in a tree list generation database is further complicated by

the fact that it will contain gaps, and there is also no convenient manner in which to

briefly summarize this information.

The summary information presented in the tables are the mean, standard devia-

tion, minimum, median, maximum, and count for each index parameter of INTERVAL

type for each treatment, broken down by stand type. Table 2.4 presents the stand type

summary for the untreated and thinned stands in the tree list generation database

tgdb1r00. The table summarizes the individual stand measurement files that com-

prise the tree list generation database, not the canonical stands represented by the

multidimensional histogram bins, so the number of stands in this table need not agree

with the canonical stand counts in the summary tables. A summary of the categorical

index parameter STAND ORIGIN does not appear in the general data summaries, but is

provided by stand type in Table 2.5 and Table 2.6 for untreated and thinned stands,

respectively. All of the data coverage summaries are in metric units. See Table 2.2

for the standard units for each type of data.

76

Table 2.4: Stand type summary based on the individual stand measurement files forthe tree list generation database tgdb1r00. Percentages are rounded to the nearest0.1%, so the row and column percent totals may not agree.

Untreated stands Thinned stands Total

Stand type Count Percent Count Percent Count Percent

PURE DF 3404 35.3 3431 35.6 6835 70.8

PURE WH 800 8.3 480 5.0 1280 13.3

DF DOMINANT 691 7.2 417 4.3 1108 11.5

WH DOMINANT 178 1.8 86 0.9 264 2.7

MIXTURE 136 1.4 26 0.3 162 1.7

Total 5209 54.0 4440 46.0 9649 100.0

Table 2.5: Stand origin summary for untreated stands, based on the individual standmeasurement files for the tree list generation database tgdb1r00. Percentages arerounded to the nearest 0.1%, so the row and column percent totals may not agree.

Natural Planted Total


PURE DF 2603 76.5 801 23.5 3404 65.3

PURE WH 550 68.8 250 31.3 800 15.4

DF DOMINANT 479 69.3 212 30.7 691 13.3

WH DOMINANT 112 62.9 66 37.1 178 3.4

MIXTURE 118 86.8 18 13.2 136 2.6

Total 3862 74.1 1347 25.9 5209 100.0

Table 2.6: Stand origin summary for thinned stands, based on the individual standmeasurement files for the tree list generation database tgdb1r00. Percentages arerounded to the nearest 0.1%, so the row and column percent totals may not agree.

Natural Planted Total


PURE DF 2186 63.7 1245 36.3 3431 77.3

PURE WH 449 93.5 31 6.5 480 10.8

DF DOMINANT 372 89.2 45 10.8 417 9.4

WH DOMINANT 86 100.0 0 0.0 86 1.9

MIXTURE 23 88.5 3 11.5 26 0.6

Total 3116 70.1 1324 29.8 4440 100.0

77

2.10.1 Data coverage summary for untreated stands

Table 2.7: Stand coverage summary for untreated stands and the stand type PURE DF.There are 1921 canonical stands or multidimensional histogram bins in the standclassification for this stand type. Metric units are used for this summary.

Index parameter Mean SD Minimum Median Maximum

SITE INDEX 50 32.613 7.126 16.500 31.500 49.500

STAND TOTAL AGE 49.704 18.009 6.000 46.000 126.000

QMD 21.598 10.981 2.000 18.000 74.000

STAND DENSITY 1650.963 1151.281 100.000 1300.000 9700.000

Table 2.8: Stand coverage summary for untreated stands and the stand type PURE WH.There are 493 canonical stands or multidimensional histogram bins in the stand clas-sification for this stand type. Metric units are used for this summary.


SITE INDEX 50 36.946 9.898 19.500 34.500 58.500

STAND TOTAL AGE 46.876 16.252 18.000 46.000 94.000

QMD 17.578 7.870 6.000 14.000 42.000

STAND DENSITY 3692.292 2852.759 700.000 2700.000 14100.000

Table 2.9: Stand coverage summary for untreated stands and the stand typeDF DOMINANT. There are 491 canonical stands or multidimensional histogram binsin the stand classification for this stand type. Metric units are used for this summary.


SITE INDEX 50 33.577 6.723 16.500 34.500 43.500

STAND TOTAL AGE 48.998 16.554 14.000 46.000 106.000

QMD 18.774 8.034 6.000 18.000 50.000

STAND DENSITY 2210.387 1397.065 100.000 1900.000 9500.000

78

Table 2.10: Stand coverage summary for untreated stands and the stand typeWH DOMINANT. There are 138 canonical stands or multidimensional histogram binsin the stand classification for this stand type. Metric units are used for this summary.


SITE INDEX 50 39.609 8.019 25.500 37.500 55.500

STAND TOTAL AGE 48.435 15.118 18.000 46.000 86.000

QMD 21.275 6.595 10.000 22.000 42.000

STAND DENSITY 2056.522 1107.569 700.000 1700.000 5500.000

Table 2.11: Stand coverage summary for untreated stands and the stand typeMIXTURE. There are 96 canonical stands or multidimensional histogram bins in thestand classification for this stand type. Metric units are used for this summary.


SITE INDEX 50 38.344 6.011 28.500 40.500 46.500

STAND TOTAL AGE 53.750 11.920 22.000 54.000 74.000

QMD 22.333 6.583 10.000 22.000 38.000

STAND DENSITY 1535.417 1182.680 300.000 1000.000 4700.000

79

2.10.2 Data coverage summary for thinned stands

Table 2.12: Stand coverage summary for thinned stands and the stand type PURE DF.There are 2820 canonical stands or multidimensional histogram bins in the standclassification for this stand type. Metric units are used for this summary.


SITE INDEX 50 35.769 6.435 19.500 37.500 49.500

STAND TOTAL AGE 55.079 16.545 22.000 54.000 126.000

NUMBER OF THINNINGS 2.640 1.721 1.000 2.000 7.000

PRE THIN DENSITY 1139.149 1130.681 100.000 700.000 6100.000

PRE THIN BASAL AREA 56.938 24.407 2.250 56.250 167.250

PCT BASAL AREA REMOVED 32.414 13.805 0.750 32.250 95.250

YEARS SINCE TREATMENT 5.645 6.052 2.000 2.000 58.000

QMD 29.169 13.781 2.000 26.000 82.000

STAND DENSITY 622.837 482.459 100.000 500.000 4100.000

Table 2.13: Stand coverage summary for thinned stands and the stand type PURE WH.There are 424 canonical stands or multidimensional histogram bins in the stand clas-sification for this stand type. Metric units are used for this summary.


SITE INDEX 50 31.804 2.956 25.500 31.500 40.500

STAND TOTAL AGE 55.972 14.735 26.000 54.000 94.000


PRE THIN DENSITY 2983.019 2475.658 300.000 2100.000 12700.000




QMD 22.811 8.100 6.000 22.000 50.000

STAND DENSITY 1187.736 785.230 300.000 900.000 3900.000

2.11 File types, file formats, and keyword definitions

All of the tree list generation database input and output files are plain text files. The

tree list generation database input and output file formats have been designed to be

80

Table 2.14: Stand coverage summary for thinned stands and the stand typeDF DOMINANT. There are 370 canonical stands or multidimensional histogram binsin the stand classification for this stand type. Metric units are used for this summary.


SITE INDEX 50 36.219 5.575 19.500 37.500 40.500

STAND TOTAL AGE 53.276 14.509 26.000 50.000 94.000


PRE THIN DENSITY 1454.595 1248.041 100.000 1000.000 5900.000




QMD 25.459 10.798 6.000 22.000 54.000

STAND DENSITY 812.973 508.885 100.000 700.000 2700.000

Table 2.15: Stand coverage summary for thinned stands and the stand typeWH DOMINANT. There are 77 canonical stands or multidimensional histogram bins inthe stand classification for this stand type. Metric units are used for this summary.


SITE INDEX 50 36.019 2.315 31.500 37.500 37.500

STAND TOTAL AGE 55.766 17.466 34.000 50.000 94.000


PRE THIN DENSITY 1048.052 772.148 300.000 900.000 3500.000




QMD 28.338 10.164 10.000 26.000 50.000

STAND DENSITY 575.325 253.479 300.000 500.000 1100.000

81

Table 2.16: Stand coverage summary for thinned stands and the stand type MIXTURE.There are 25 canonical stands or multidimensional histogram bins in the stand clas-sification for this stand type. Metric units are used for this summary.


SITE INDEX 50 36.780 3.900 28.500 37.500 40.500

STAND TOTAL AGE 55.600 14.832 34.000 50.000 86.000


PRE THIN DENSITY 940.000 336.650 500.000 900.000 1500.000




QMD 26.960 7.684 14.000 26.000 38.000

STAND DENSITY 684.000 190.788 300.000 700.000 1100.000

easily understood by human readers as well as easily parsed by software. The file

formats are line oriented, meaning that the fundamental unit in a file is a line of text.

The files also generally follow a keyword equals value format, where a keyword and

its value must appear on the same line separated by an equal sign (=). The keywords

may appear in any order within a particular tree list generation database file. Blank

lines and comment lines, indicated by a percent symbol (%) as the first nonblank

character of a line, are ignored.

The file formats are straightforward being easy to modify and understand. The

ability to place comments into the files without affecting the data provides a mecha-

nism for annotating files used for a particular task. This feature, if used, could provide

invaluable information about particular data sets at a later time. The data files also

provide an exact record of the input and output for a particular analysis, and may

be placed in an archive, allowing the reconstruction of the analysis at a later time, if

this were to become necessary.

82

2.11.1 Stand measurement file

The tree list generation database stand measurement file, or measurement file, is a

text file containing a description of a forest stand and an actual tree list. The tree

list consists of DBH and height measurements with an indication of species for each

tree in a measurement data set obtained from a sample plot. The stand measurement

file uses a keyword equals value format and consists of two parts: a set of stand

description keywords and values and the actual measured tree list data. The stand

description keywords and their associated values describe the physical attributes of

a measured stand and basic sample plot characteristics, e.g., site index, stand age,

stand origin, sample plot size, treatment history, and the measurement units of the

data. The measured tree list data are comprised of a DBH measurement, a height

measurement, and a short two character species identification code, e.g., DF, or a

longer four to six character species identification code, e.g., PSME, for each measured

tree. All three items are required for each tree. Either the short or long species

identification codes may be used in a stand measurement file, or they may be freely

intermixed within a stand measurement file.

Each stand measurement file must contain a complete set of the stand description

keywords for its treatment type, and each keyword must have an appropriate value.

The stand description keywords may appear in any order within a stand measure-

ment file. The stand description keywords are not case sensitive, so the keyword

TREATMENT TYPE is equivalent to the keyword Treatment Type. Each stand descrip-

tion keyword must appear with its value on a single line in a stand measurement

file.

The actual tree list data are the only exception to the keyword equals value for-

mat. The beginning of the tree list measurement data is indicated by the marker

BEGIN TREE DATA, and the end of the tree list measurement data is indicated by the

marker END TREE DATA. As with the stand description keywords , the case of the begin

83

and end markers is not important.

The individual tree data consists of DBH and height measurements and a species

identifier, and the order of these data on the lines in the tree list data section of a

stand measurement file must be specified. The first nonblank, noncomment line fol-

lowing the BEGIN TREE DATA marker must contain the tree list data column definitions

indicating the order of the tree measurement data columns. The tree list data column

definitions are specified by some blank delimited combination of the three identifiers

DBH, HEIGHT, and SPECIES, which specify the order of the DBH, height, and species

data for each tree in the tree list.

The END TREE DATA marker must be present; there is no implied end to the tree

list data when a new keyword or the end of file is encountered. Once the actual tree

list data have been identified by the BEGIN TREE DATA marker, only tree list data, one

set of blank delimited DBH, height, and species values per line for each tree, in the

order specified by the data column definitions, may appear, until the END TREE DATA

marker is encountered. See Figure 2.30 and Figure 2.31 for stand measurement file

templates for untreated and thinned stands, respectively.

In the stand measurement file templates for untreated and thinned stands, Fig-

ure 2.30 and Figure 2.31, respectively, a vertical bar | indicates that one of the two or

more possible values is to be selected, and a numeric value must replace the marker

<number>. The N sets of tree DBH, height and species markers must be replaced by

appropriate numeric DBH and height measurements and species codes. The individ-

ual tree DBH and height measurement data values in a stand measurement file must

be appropriate for the measurment units defined for the file.

Each stand measurement file must contain a MEASUREMENT UNITS keyword. This

keyword indicates the units of measure for the numerical data contained within a stand

measurement file. The measurement units keyword may have a value of IMPERIAL

or METRIC, depending on the measurement system used to represent the data. See

Table 2.2 for the standard measurement units for each measured item. The tree

84

MEASUREMENT_UNITS = IMPERIAL | METRICTREATMENT_TYPE = THINNEDSITE_INDEX_50 = <number>STAND_TOTAL_AGE = <number>STAND_ORIGIN = PLANTED | NATURALPLOT_SIZE = <number>BEGIN_TREE_DATA

DBH HEIGHT SPECIESDBH_1 HEIGHT_1 SPECIES_1DBH_2 HEIGHT_2 SPECIES_2DBH_3 HEIGHT_3 SPECIES_3

.

.

.DBH_N HEIGHT_N SPECIES_N

END_TREE_DATA

Figure 2.30: Stand measurement file template for untreated stands. A vertical bar |separates possible values and indicates that one of the two or more possible values isto be selected, and a numeric value must replace the marker <number>.

list generation database software will automatically perform any unit conversions

necessary when reading a stand measurement file. Internally the tree list generation

database uses metric units to represent all of the tree and stand measurement data.

All stand measurement files. regardless of treatment, must contain the following

stand description keywords.

TREATMENT TYPE A text valued keyword that defines the type of treatment, or treat-

ments, applied to a stand. Valid treatment types are UNTREATED and THINNED.

SITE INDEX 50 A numeric valued keyword for the site index of Douglas-fir, western

hemlock, or mixed Douglas-fir and western hemlock stands at a reference stand

age of 50 years. If a stand is not pure, i.e., the basal area for a dominant species

is < 80%, the site index value is for the species with the largeest percentage

of basal area. The value for this keyword must be greater than zero; no other

verification of the value is performed. The value specified should be appropriate

for the measurement units.

85

MEASUREMENT_UNITS = IMPERIAL | METRICTREATMENT_TYPE = THINNEDSITE_INDEX_50 = <number>STAND_TOTAL_AGE = <number>STAND_ORIGIN = PLANTED | NATURALNUMBER_OF_THINNINGS = <number>PRE_THIN_DENSITY = <number>PRE_THIN_BASAL_AREA = <number>PCT_STEMS_REMOVED = <number>PCT_BASAL_AREA_REMOVED = <number>YEARS_SINCE_TREATMENT = <number>PLOT_SIZE = <number>BEGIN_TREE_DATA

DBH HEIGHT SPECIESDBH_1 HEIGHT_1 SPECIES_1DBH_2 HEIGHT_2 SPECIES_2DBH_3 HEIGHT_3 SPECIES_3

.

.

.DBH_N HEIGHT_N SPECIES_N

END_TREE_DATA

Figure 2.31: Stand measurement file template for thinned stands. A vertical bar |separates possible values and indicates that one of the two or more possible values isto be selected, and a numeric value must replace the marker <number>.

86

STAND AGE A numeric valued keyword for the age in years of the stand of trees. This

value must be greater than zero.

STAND ORIGIN A text valued keyword that defines the origin of a stand. Valid stand

origins are NATURAL and PLANTED. If a stand was seeded it is assumed to have a

natural origin.

PLOT SIZE A numeric valued keyword that specifies the sample plot size for the tree

measurement data in the tree list. This is necessary to permit the scaling of

certain values to a per unit area, hectare or acre, basis. The value specified

should be appropriate for the measurement units. The value for this keyword

must be greater than zero and less than 10.

Stand measurement files for treated stands must also contain stand description key-

words which define the treatment history of the stand. The following stand description

keywords describe the thinning history of a stand and are required for measurement

files from thinned stands.

NUMBER OF THINNINGS A numeric valued keyword that defines the number of thin-

nings that have been performed on a stand. This value must be greater than

zero, otherwise no thinnings have been performed.

PRE THIN DENSITY A numeric valued keyword defining the stand density as trees

per unit area, acre or hectare, depending on the measurement units before the

thinning. The value specified should be appropriate for the measurement units.

The value of this keyword must be greater than zero. No other input consistency

checks are performed at this time.

PRE THIN BASAL AREA A numeric valued keyword defining the stand basal area per

unit area, hectare or acre, depending on the measurement units before the

87

thinning. The value specified should be appropriate for the measurement units.

The value of this keyword must be greater than zero. No other input consistency

checks are performed at this time.

PCT STEMS REMOVED A numeric valued keyword defining the percentage of stems re-

moved during the thinning. The value of this keyword must be greater than

zero and less than 100. The upper bound of 100 is not currently enforced by

the tree list generation database software. No other input consistency checks

are performed at this time.

PCT BASAL AREA REMOVED A numeric valued keyword defining the percentage of basal

area removed during the thinning. The value of this keyword must be greater

than zero and less than 100. The upper bound of 100 is not currently enforced

by the tree list generation database software. No other input consistency checks

are performed at this time.

YEARS SINCE TREATMENT A numeric valued keyword defining the number of years

since the last thinning. The value of this keyword must be greater than zero.

2.11.2 Stand description file

The tree list generation database stand description file, or description file, is a text

file defining the attributes of a forest stand that is to be generated using a tree list

generation database. The stand description file follows a strict keyword equals value

format using the stand index parameter keywords and their associated values to define

the gross, aggregate characteristics of a forest stand. Each index parameter keyword

must appear on a line with an appropriate value, delimited by an equal sign (=).

In addition to the index parameter keywords, there are several optional keywords,

specific to the stand description file, which provide additional information to the

program TGRAND when it is used to generate a simulated stand or tree list. As is true

88

TREATMENT_TYPESITE_INDEX_50STAND_TOTAL_AGESTAND_ORIGINQMDSTAND_DENSITYSTAND_TYPE

Figure 2.32: Stand description file index parameters required for the UNTREATED treat-ment type in the tree list generation database tgdb1r00.

with all of the tree list generation database input files, blank lines and comment lines,

indicated by a line whose first nonblank character is a percent sign (%), are ignored

when the files are processed.

Each stand description file must contain a complete set of index parameter key-

words for its treatment type and the tree list generation database in use when gener-

ating simulated stands or tree lists. Each index parameter keyword must also have an

appropriate value. The stand description keywords may appear in any order within a

stand description file. The stand description keywords are not case sensitive, so the

keyword TREATMENT TYPE is equivalent to the keyword Treatment Type. Each stand

description keyword must appear with its value on a single line in a stand description

file. For the tree list generation database tgdb1r00, the required index parameters are

listed in Figure 2.32 and Figure 2.33, for untreated and thinned stands, respectively.

Index parameters keywords not used in a particular tree list generation database may

also appear in a stand description file, providing a more complete stand description,

possibly for later use. In this situation the extra index parameter keywords are ig-

nored.

Figure 2.34 and Figure 2.35 provide examples of stand description files for un-

treated and thinned stands, respectively. In these two examples, the stand descrip-

tion files contain all of the index parameter keywords, even though only those listed

in Figure 2.32 and Figure 2.33 were required for each treatment. The complete set

89

TREATMENT_TYPESITE_INDEX_50STAND_TOTAL_AGESTAND_ORIGINQMDSTAND_DENSITYSTAND_TYPENUMBER_OF_THINNINGSPRE_THIN_DENSITYPRE_THIN_BASAL_AREAPCT_BASAL_AREA_REMOVEDYEARS_SINCE_TREATMENT

Figure 2.33: Stand description file index parameters required for the THINNED treat-ment type in the tree list generation database tgdb1r00.

of optional keywords is also used in these examples. The two stand description files

presented in the figures were used to generate simulated stands for the testing and

validation of the tree list generation database system using the tree list generation

database tgdb1r00. The keywords beginning with “RANDOM MF ” are the optional key-

words, and provide information defining the processing modes of TGRAND and other

items that affect the process of generating a simulated stand or tree list.

The MEASUREMENT UNITS keyword must be contained in each tree list generation

database stand description file. This keyword indicates the units of measure for the

numerical data contained within a stand description file. The measurement units

keyword may have a value of IMPERIAL or METRIC, depending on the measurement

system used to represent the data. See Table 2.2 for the standard measurement units

for each measured item. The tree list generation database software will automatically

perform any unit conversions necessary when reading a schema file. Internally the

tree list generation database uses metric units to represent all of the tree and stand

measurement data.

The optional keywords for a stand description file are defined below. All of the

option keywords begin with the string “RANDOM MF ” to indicate that they are used

90

%% Tree list generator stand description file.%% File created: 11/03/1999 at 16:22:54.52%MEASUREMENT_UNITS = IMPERIALTREATMENT_TYPE = UNTREATEDSITE_INDEX_50 = 134.100STAND_TOTAL_AGE = 60.000STAND_ORIGIN = NATURALMEAN_DBH = 10.835MEAN_HEIGHT = 82.765QMD = 12.442TOP_HEIGHT = 124.125STAND_DENSITY = 240.000STAND_BASAL_AREA = 202.627PCT_DF_STEMS = 62.500PCT_WH_STEMS = 4.167PCT_OT_STEMS = 33.333PCT_DF_BASAL_AREA = 93.962PCT_WH_BASAL_AREA = 0.233PCT_OT_BASAL_AREA = 5.805STAND_TYPE = PURE_DFNUMBER_OF_SPECIES = 3.000RANDOM_MF_UNITS = IMPERIALRANDOM_MF_PLOT_SIZE = 0.200RANDOM_MF_PURE = NORANDOM_MF_JITTER = NORANDOM_MF_NAME = ptb00001.rmf

Figure 2.34: Example stand description file for untreated stands. This stand de-scription file contains all of the possible index parameters with their values for theUNTREATED treatment type. Index parameters not used in a particular tree list gen-eration database but appearing in a description file are ignored.

91

%% Tree list generator stand description file.%% File created: 11/03/1999 at 16:21:09.60%MEASUREMENT_UNITS = IMPERIALTREATMENT_TYPE = THINNEDSITE_INDEX_50 = 134.100STAND_TOTAL_AGE = 60.000STAND_ORIGIN = NATURALNUMBER_OF_THINNINGS = 1.000PRE_THIN_DENSITY = 160.000PRE_THIN_BASAL_AREA = 178.900PCT_STEMS_REMOVED = 18.750PCT_BASAL_AREA_REMOVED = 14.925YEARS_SINCE_TREATMENT = 0.000MEAN_DBH = 13.954MEAN_HEIGHT = 102.292QMD = 14.653TOP_HEIGHT = 118.650STAND_DENSITY = 130.000STAND_BASAL_AREA = 152.233PCT_DF_STEMS = 92.308PCT_WH_STEMS = 0.000PCT_OT_STEMS = 7.692PCT_DF_BASAL_AREA = 98.744PCT_WH_BASAL_AREA = 0.000PCT_OT_BASAL_AREA = 1.256STAND_TYPE = PURE_DFNUMBER_OF_SPECIES = 2.000RANDOM_MF_UNITS = IMPERIALRANDOM_MF_PLOT_SIZE = 0.200RANDOM_MF_PURE = NORANDOM_MF_JITTER = NORANDOM_MF_NAME = ptb00001.rmf

Figure 2.35: Example stand description file for thinned stands. This stand de-scription file contains all of the possible index parameters with their values for theTHINNED treatment type. Index parameters not used in a particular tree list genera-tion database but appearing in a description file are ignored.

92

to affect the output random stand measurement file that is produced by TGRAND. The

complete set of index parameter keywords, other than the description file specific

keywords presented here, are defined in Section 2.11.6.

RANDOM MF NAME A text valued keyword that defines the name of the output random

stand measurement file used by TGRAND to store the simulated tree list. The

file name specified must be the name of a new file. If a path is specified as a

part of the file name, all of the path elements must exist. If this keyword is

not defined, a default output filename of output.rmf is used, provided TGRAND

is being used in its single file processing mode, otherwise it is an error to not

define this keyword.

RANDOM MF UNITS A text valued keyword that defines the output units to be used

in the random stand measurement file created by TGRAND. Valid values for this

keyword are IMPERIAL or METRIC; any other value will result in an error. If this

keyword is not defined, the measurement units in the random measurement file

will be the same as the measurement units in the stand description file used.

RANDOM MF PURE A text valued keyword that defines whether 100% pure stands are to

be generated for pure Douglas-fir or pure western hemlock stands. Valid values

for this keyword are YES or NO; any other value will result in an error. 100%

pure stands of Douglas-fir will be generated if this keyword has a value of YES

and the stand type is PURE DF, and 100% pure stands of western hemlock will

be generated if this keyword has a value of YES and the stand type is PURE WH.

Stands with a stand type other than PURE DF and PURE WH are not affected by

the value of this keyword. If this keyword is not defined, the default value is

to not generate 100% pure stands. See Section 3.1.4 for the procedure used to

generate 100% pure stands.

93

RANDOM MF JITTER A text valued keyword that defines whether individual tree heights

are to be jittered. Valid values for this keyword are YES or NO; any other value

will result in an error. The jittering of tree heights could provide a better tree

height distribution, since most of the individual tree heights stored in a tree

list generation database will be estimated from height-diameter curves. If this

keyword is not defined, the default is to not jitter heights. See Section 3.1.4 for

the procedure used to generate trees with jittered heights.

RANDOM MF PLOT SIZE A numeric valued keyword that specifies the output plot size

for the simulated tree list. This is used to generate the appropriate number of

trees per unit area, hectare or acre, for the simulated stand. The value specified

should be appropriate for stand description file measurement units, not the

units specified by the RANDOM MF UNITS keyword. The value for this keyword

must be greater than zero and less than 10. If this keyword is not defined, the

default value used is 0.1 unit, hectare or acre.

The RANDOM MF NAME keyword is required if TGRAND is being used in its batch

processing mode with a listing file, indicated by the -lf command line option. If

this keyword is not defined and TGRAND is being used in batch mode, an error will

occur and batch processing will halt. If the RANDOM MF NAME keyword is not defined

and TGRAND is being used in its single file mode, indicated by the -df command line

option, a filename of output.rmf is used by default.

If the RANDOM MF PLOT SIZE keyword is used to define the size of the area repre-

sented by the simulated tree list, it is critical that its numeric value be in the appro-

priate units. The units for all of the numeric valued keywords in a description file are

defined by the value of the MEASUREMENT UNITS keyword for the description file, not

by the units specified by the RANDOM MF UNITS keyword. The RANDOM MF UNITS key-

word is used to specify the measurement units for the output random measurement

file, and may be used to automatically perform a units conversion if the measure-

94

ment units of the description file are different than the measurement units desired

for the random measurement file. As an example, suppose that the description file

has a value of METRIC for the MEASUREMENT UNITS keyword, and a value of IMPERIAL

for the RANDOM MF UNITS keyword. A value of 0.1 (ha) for the RANDOM MF PLOT SIZE

keyword would produce an output plot size of 0.247 (ac).

When a stand description file is used to generate a simulated stand using TGRAND,

values of some of the index parameter keywords in the description file are simply

copied, possibly with a change of units, to the random stand measurement file that is

created. The index parameter keyword values that are copied for all treatment types

are:

TREATMENT_TYPE

SITE_INDEX_50

STAND_TOTAL_AGE

STAND_ORIGIN,

and the for the thinned treatment type, the additional index parameter keyword

values that are copied to the random measurement file are:

NUMBER_OF_THINNINGS

PRE_THIN_DENSITY

PRE_THIN_BASAL_AREA

PCT_STEMS_REMOVED

PCT_BASAL_AREA_REMOVED

YEARS_SINCE_TREATMENT.

These index parameter values are copied from a stand description file to a ran-

dom stand measurement file for two reasons. First, the values specified in the stand

description file are the known or desired values for those particular stand attributes

in the simulated stand, so copying them makes sense. Second, the multidimensional

histogram based stand classification procedure combines data from similar stands and

refers to them by the center of the histogram bin. Thus exact values for these index

parameters are not easily available. Further, having exact values for these index pa-

rameters would not necessarily be particularly useful in this context. For example,

what stand age should be given to a simulated stand if the exact stand ages were

95

available for all of the stands in a multidimensional histogram bin? The average age

of all stands? The closest age to the desired age? The simplest, and most appropriate,

resolution of this issue is to simply copy the index parameter values from the stand

description file to the output random stand measurement file, as is done, using the

desired stand attribute values.

2.11.3 Random stand measurement file

The tree list generation database random stand measurement file, or random mea-

surement file, is a text file that contains a set of stand description keywords and

a simulated tree list obtained from a tree list generation database. The format of

the random measurement file is identical to that of the stand measurement file, Sec-

tion 2.11.1; both file types contain the same kind of data, so the same file format

is used. A random stand measurement file obtained from a stand description file,

Section 2.11.2, using the tree list generation database program TGRAND, Section 2.4,

will contain a comment indicating that it is a random stand measurement file, see

Figure 2.36. At this time this is the only method for distinguishing the contents of

a measurement file as being simulated data. Thus, some care must be taken to not

confuse simulated data in random measurement files with actual stand measurement

data.

When a random stand measurement file is created using TGRAND and a stand

description file, values of some of the index parameter keywords in the description file

are simply copied, possibly with a change of units, to the random stand measurement

file. The index parameter keyword values that are copied for all treatment types are:

TREATMENT_TYPE

SITE_INDEX_50

STAND_TOTAL_AGE

STAND_ORIGIN,

and the for the thinned treatment type, the additional index parameter keyword

values that are copied to the random measurement file are:

96

%% Tree list generator random stand measurement file.%% File created: 12/07/1999 at 12:24:11.63%MEASUREMENT_UNITS = METRICTREATMENT_TYPE = UNTREATEDSITE_INDEX_50 = 24.07900STAND_TOTAL_AGE = 54.00000STAND_ORIGIN = NATURALPLOT_SIZE = 0.06100BEGIN_TREE_DATA

DBH HEIGHT SPECIES44.291 27.373 DF27.908 23.847 DF

.

.

.END_TREE_DATA

Figure 2.36: Random stand measurement file example with the file header comments.These comments are currently the only way of distinguishing a random measurementfile containing simulated data from a stand measurement file containing actual data, sosome care must be taken that files containing simulated data do not become confusedwith files containing actual data.

97

NUMBER_OF_THINNINGS

PRE_THIN_DENSITY

PRE_THIN_BASAL_AREA

PCT_STEMS_REMOVED


YEARS_SINCE_TREATMENT.

See Section 2.11.2 for an explanation of why these values are simply copied from the

description file to the random measurement file.

2.11.4 Species mapping file

The tree list generation database species mapping file defines an association between

unique alphabbetic,or character based, codes for the identification of tree species and

an unique integer identification code for each tree species that is recognized by the

tree list generation database system. The alphabetic tree species codes are two (2) or

four (4) to six (6) character codes in common use for identifying tree species, and the

integer identification codes are used internally by the tree list generation database

software. In addition to the species identification codes, the species mapping file also

contains the common and Latin names for each recognized tree species. The tree list

generation database species mapping file is used when creating a tree list generation

database, and it defines the tree species that will be recognizable for a particular tree

list generation database. Tree species codes that are not contained in the species

mapping file used to create a tree list generation database will not be recognized, and

will cause an error to be signalled by the programs requiring the species information

for their use.

The tree list generation database species mapping file should not be modified by

any users of the tree list generation database executable programs. Additions or

modifications made to the species mapping file must be done with extreme care. If

modifications to this file are made incorrectly, its use with the tree list generation

database software may produce unpredictable results. In particular, it is important

that all integer and character based species codes are unique. Two situations where

98

modifications to the tree list generation database species mapping file are warranted

will be discussed shortly.

The tree list generation database species mapping file contains five comma delim-

ited columns:

1. The unique integer tree species ID code.

2. The unique two (2) letter tree species code.

3. The unique four (4) to six (6) letter tree species code.

4. The common name for the species.

5. The Latin name for the species.

The commas in this file are necessary . Comment lines and blank lines are ignored.

Comment lines are lines whose first nonblank character is a percent sign (%). Only the

first three columns are currently used by the tree list generation database software.

The common and Latin species names have been included for completeness. See

Table 2.17 for the species codes and common names. The Latin names may be found

in Appendix C.

There were several tree species code conflicts in the data, i.e., two different codes

being used to represent the same tree species. To resolve these conflicts the species

codes used by the SMC were selected for use in the tree list generation database.

• The tree species code OH for other hardwood was changed to be HD, the code

used by the SMC. Further use of the tree species code OH is not recommended.

• The tree species code PM for pacific madrone Arbutus menziesii was changed to

be MD, the code used by the SMC. Further use of the tree species code PM is not

recommended.

99

• The tree species code O3 appeared in some of the data. A description of this

species code could not be found, so it was changed to OT, the code used by the

SMC for unrecognized or other species. Further use of the tree species code O3

is not recommended.

These decisions were made because it is in the best, long term interests of the SMC

cooperators to standardize the various tree species codes used, making the transfer of

data among the cooperators easier and more reliable. If the data used to create the

tree list generation database are indicative of the data held by the SMC cooperators,

there is greater than 99% agreement already, so these recommendations should have

a minimal impact.

The two situations when modifying a tree list generation database species mapping

file are warranted occur when a new tree species must be recognized by the tree list

generation database software, and when the integer species codes must be changed to

agree with the integer tree species codes used in other software. In the first situation,

a new set of species codes may be added to the end of the file by incrementing the

last integer species code by one (1) and completing the rest of the comma delimited

fields for the definition of the new species. When adding a new tree species, the two

(2) and four (4) to six (6) character species codes added must be unique. In the

second situation, the integer species codes in the file may be changed provided that

the integer codes added are unique. In both instances, a new tree list generation

database must be created using the modified species mapping file. The use of the

modified integer ID codes with an existing tree list generation database will cause

incorrect identification of the tree species codes.

The executable program TGXPLODE may be used first to break an existing tree

list generation database into its component files. The executable programs TGNEW

and TGADD may then be used with a new species mapping file and the component files

extracted by TGXPLODE to build a new tree list generation database that will recognize

100

all of the species.

2.11.5 Schema file

The tree list generation database schema file is a text file defining the structure of a

tree list generation database that is to be created. The tree list generation database

schema file is analogous to a database schema that is used to define the number

and types of data columns in a database. The tree list generation database schema

file defines which treatment types and stand classification variables, or stand index

parameters, are used to create a particular tree list generation database. The schema

file also defines the attributes of each stand index parameter for a particular treatment

type: the index parameter type and the histogram bin width that is used for the stand

classification and indexing. Index parameters may have one of two types: FIXED for

stand attributes that are represented by categorical variables, such as stand origin

or stand type, or INTERVAL for stand attributes that are represented by continuous

variables, such as age, mean DBH, top height, or basal area per acre. The histogram

bin width attribute is only used for index parameters which represent continuous

variables and have an index parameter type of INTERVAL.

Each tree list generation database schema file is composed of three nested levels,

each with its own scope. The first level is the schema file itself, which defines the

treatment types that will be contained in a particular tree list generation database.

The second level is the treatment definition level, which defines the index parameters

that are to be used for a particular treatment in a particular tree list generation

database. The third level is the index parameter definition level, and this is the level

at which the attributes of each index parameter are defined for a treatment in a tree

list generation database. The index parameter attributes indicate whether a particular

index parameter is to be used for a treatment, the type of the index parameter, e.g.,

FIXED or INTERVAL, and the histogram bin width for any index parameter with an

INTERVAL type.

101

Table 2.17: Tree list generation database species mapping table. The table presentsthe internal integer ID codes used by the software, a short two (2) character treespecies code, a long four (4) to six (6) character tree species code, and the commonname for a tree species. See Table B.1 for the latin names of the tree species. Seealso Appendix C for an example of a species mapping file.

ID Code Short Code Long Code Common Name1 BC POTR Black Cottonwood2 BR BEPA Birch3 CA RHPU Cascara4 CH PREM Bitter Cherry5 DF PSME Douglas-fir6 DM ACGL Douglas Maple7 DW CONU Western Flowering Dogwood8 ES PIEN Englemann Spruce9 GC CACH Golden Chinkapin10 GF ABGR Grand Fir11 HD HD Other hardwoods12 HW CRDO Hawthorn13 HZ COCO2 Hazelnut14 IC LIDE Incense Cedar15 KP PIAT Knobcone Pine16 LP PICO Lodgepole Pine17 MA ACMA Bigleaf Maple18 MD ARME Madrone19 MY UMCA Oregon Myrtle20 NF ABPR Noble Fir21 OA FRLA Oregon Ash22 OO QUGA Garry Oak23 OT OT Other species or mixtures24 PC CHLA Port Orford Cedar25 PP PIPO Ponderosa Pine26 RA ALRU Red Alder27 RC THPL Western Red Cedar28 RF ABMA California Red Fir29 SA ALCR2 Sitka Alder30 SB AMELA Servicberry31 SF ABAM Pacific Silver Fir32 SP PILA Sugar Pine33 SS PISI Sitka Spruce34 SX SALIX Willow

102

Table 2.17: (continued)35 TO LIDE2 Tan Oak36 VM ACCI Vine Maple37 WF ABCO White Fir38 WH TSHE Western Hemlock39 WL LAOC Western Larch40 WM MYCA Waxmyrtle41 WP PIMO Western White Pine42 WY TABR Western Yew43 YC CHNO Alaska Yellow Cedar44 AF ABLA Alpine fir45 BO QUKE Black Oak

The tree list generation database schema file uses a slightly modified keyword

equals value format. Treatment type definitions and index parameter attribute def-

initions each have an end marker, END TREATMENT or END INDEX PARAMETER, respec-

tively, to denote the end of the particular nesting level or scope. The end of the a

schema file denotes the end of the treatment type definition level. Index parameter

attributes are defined in a strict keyword equals value manner, with the keyword and

value both appearing on the same line in the file delimited by an equal sign (=). The

treatment type definition level and index parameter definition level of a schema file

begin with a keyword equals value notation and the keywords TREATMENT TYPE and

INDEX PARAMETER, respectively. As is true with all of the tree list generation database

input files, blank lines and comment lines, indicated by a line whose first nonblank

character is a percent sign (%), are ignored when the files are read.

Templates for the three nested levels in a tree list generation database schema

file are presented in Figure 2.37, Figure 2.38, and Figure 2.39. The figures present

increasing levels of detail, or nesting level. In the figures, a vertical bar “|” separates

possible values for an index parameter attribute keyword, and indicates that one, and

only one of the two or more possible values is to be selected. The marker <number>

should be replaced by an appropriate nonzero numerical value. The square brackets

103

MEASUREMENT_UNITS = IMPERIAL | METRIC

TREATMENT_TYPE = <treatment_1><index parameter definitions for treatment_1>

END_TREATMENT

TREATMENT_TYPE = <treatment_1><index parameter definitions for treatment_2>

END_TREATMENT...

TREATMENT_TYPE = <treatment_n><index parameter definitions for treatment_n>

END_TREATMENT

Figure 2.37: Schema file template for the treatment types to be used in a tree listgeneration database. See Figure 2.38 for a the template for the index parameterdefinition level within each treatment.

enclosing the INTERVAL WIDTH parameter attribute, indicate that it isonly required if

the PARAMETER TYPE has a value of INTERVAL, and should not be present for an index

parameter with a PARAMETER TYPE value of FIXED. Specifying the INTERVAL WIDTH

parameter attribute for an index parameter with a FIXED type will produce an error.

Section 2.11.6 provides complete details on the available index parameters for each

treatment.

Each tree list generation database schema file must contain a MEASUREMENT UNITS

keyword. This keyword indicates the units of measure for the numerical data con-

tained within a schema file. The measurement units keyword may have a value of

IMPERIAL or METRIC, depending on the measurement system used to represent the

data. See Table 2.2 for the standard measurement units for each measured item. The

tree list generation database software will automatically perform any unit conversions

necessary when reading a schema file. Internally the tree list generation database uses

metric units to represent all of the tree and stand measurement data.

The index parameters used with a particular treatment in a tree list generation

database are selected via the schema file by setting the value of the USE PARAMETER

104

TREATMENT_TYPE = <treatment_i>INDEX_PARAMETER = <index_parameter_1>

<attribute definitions for index_parameter_1>END_INDEX_PARAMETER

INDEX_PARAMETER = <index_parameter_2><attribute definitions for index_parameter_2>

END_INDEX_PARAMETER...

INDEX_PARAMETER = <index_parameter_m><attribute definitions for index_parameter_m>

END_INDEX_PARAMETEREND_TREATMENT

Figure 2.38: Schema file template for the index parameter definition level within<treatment i>. There are m index parameters defined for this treatment. See Fig-ure 2.39 the template for the index parameter attribute definition level within eachindex parameter.

.

.

.TREATMENT_TYPE = <treatment_i>

.

.

.INDEX_PARAMETER = <index_parameter_j>

USE_PARAMETER = YES | NOPARAMETER_TYPE = FIXED | INTERVAL[INTERVAL_WIDTH = <number>]

END_INDEX_PARAMETER...

END_TREATMENT...

Figure 2.39: Schema file template for the index parameter attribute definition levelfor <index parameter j> within <treatment i>. The INTERVAL WIDTH parameter isonly used if the index parameter type is INTERVAL, otherwise it is not present.

105

attribute. A value of YES indicating that the index parameter should be used and a

value of NO indicating that the index parameter should not be used. It is not necessary

to specify all of the possible index parameters for a treatment in a schema file, only

the index parameters that will actually be used in a tree list generation database for

a treatment need to be defined. In this case, all of the index parameters will have a

value of YES for their USE PARAMETER attributes. Thus, index parameters that are not

used for a treatment need not be included in a schema file, making the file smaller.

The order of treatment type definitions in a tree list generation database schema

file, if there are multiple treatments, is not important, but each treatment specification

must be complete, ending with the END TREATMENT marker before a new treatment

may begin. Likewise, the order of index parameter definitions within a treatment

is not important, but each index parameter specification must be complete, ending

with the END INDEX PARAMETER marker before a new index parameter definition may

begin. Finally, the index parameter attributes may appear in any order within an

index parameter definition for a treatment. Each index parameter attribute must

have an appropriate value, and if the PARAMETER TYPE has a value of INTERVAL, the

INTERVAL WIDTH attribute must be defined and have a nonzero, positive value.

The STAND TYPE index parameter is required for all treatments. This index pa-

rameter is used to distinguish among the different categories of stands, e.g., pure

Douglas-fir, pure western hemlock, douglas-fir and western hemlock mixtures, and

stand with arbitrary species composition. If this index parameter were not required,

stands of different species would be indistinguishable from each other; only their nu-

meric attributes would be used to distinguish them from one another. Thus, not

requiring the STAND TYPE index parameter would defeat the purpose of distinguishing

among different stand types for generating treee lists. This index parameter should

have a PARAMETER TYPE of FIXED, and its USE PARAMETER value must always be YES.

If the STAND TYPE index parameter is not present within a treatment definition, an

error will be signalled when reading the schema file.

106

When defining the index parameter attributes some care must be taken to avoid

defining a categorical classification variable, indicated by a PARAMETER TYPE value of

FIXED, as a continuous classification variable, indicated by a PARAMETER TYPE value

of INTERVAL, or vice versa. The consequences of defining a categorical classification

variable as a continuous classification variable would be to consider a range of cate-

gorical variables as being equivalent when classifying stands, which would obviously

be disastrous for an index parameter such as STAND TYPE which is based on species

composition. The consequences of defining a continuous classification variable as a

categorical classification variable would be to be to consider each value as unique,

creating a separate category for every value, which would become quite cumbersome

for an index parameter such as QMD or top height. The tree list generation database

software does not currently impose particular types on the index parameters, but com-

mon sense and standard interpretations of the index parameter should be a sufficient

guide when determining index parameter types.

When creating a tree list generation database which contains multiple treatments,

all treatments should contain a uniform subset of index parameters with the same

interval widths for the INTERVAL index parameters. This enforces an internal consis-

tency across the treatments. For example, if a tree list generation database contains

data for untreated stands and thinned stands, the index parameters and interval

widths used for the untreated stands should be used for the thinned stands as well,

with one or more additional index parameters specific to the thinned treatment. If

this is not done, the treatments in a tree list generation database will not be classified

consistently across treatments. There may, however, be instances where this latter

situation is desired, so the tree list generation database software does not impose any

type of consistency constraint across treatment types, permitting this type of use.

See Section 2.11.6 for the index parameter keywords, their definitions, and rec-

ommended types and values. The schema file used to create the tree list generation

database tgdb1r00 may be found in Appendix D. An annotated schema file template

107

may be found in Appendix E.

2.11.6 Index paramter keyword definitions

The index parameter keywords used to define a multidimensional histogram stand

classification for a tree list generation database, and to specify a stand description for a

simulated stand, are defined in this section. Each index parameter definition contains

the recommended index parameter type, the index parameter interval width for index

parameters of type INTERVAL, a description of the set of valid values for the index

parameter, the treatments that the index parameter may be applied to, an indication

of whether the index parameter is used in the tree list generation database tgdb1r00,

and a brief description of the index parameter keyword. A value of “N/A” indicates

that there is no applicable value for that component of an index parameter definition.

A rationale for the selection of the interval widths used for the index parameters of

INTERVAL type is provided in the description for each such index parameter definition.

TREATMENT TYPE

Index parameter type: N/A

Index parameter interval width: N/A

Valid index parameter values: One of THINNED or UNTREATED

Applicable Treatments: All treatment types

Used in tgdb1r00: Yes

Description: The treatment type. This index parameter keyword must always

be present in a schema file or stand description file.

SITE INDEX 50

Index parameter type: INTERVAL

108

Index parameter interval width: 3.0 m (8.4 ft)

Valid index parameter values: Any numeric value ≥ 0



Description: The site index at 50 years reference age for Douglas-fir or western

hemlock stands. The interval width of 3.0 m (8.4 ft) is used because it is

approximately one third of the width of a site class for Douglas-fir [15, 4].

STAND TOTAL AGE


Index parameter interval width: 4 years




Description: The total age of a forest stand. The interval width of four (4)

years is used to avoid possible aliasing problems caused by an interval width

that is too small, given the remeasurement intervals which ranged from two (2)

to eight (8) years. This interval width also takes into account the fact that

there is some uncertainty in the stand total ages for much of the data that are

available.

STAND ORIGIN

Index parameter type: FIXED


Valid index parameter values: One of NATURAL or PLANTED

109



Description: The origin of a forest stand. Two stand origin types are provided,

NATURAL for seeded and naturally regenerated stands, and PLANTED for planted

stands.

MEAN DBH


Index parameter interval width: 4 cm (1.57 in)



Used in tgdb1r00: No

Description: The mean DBH of a forest stand. This index parameter is given

a value that represents the arithmetic mean diameter at breast height of all

trees in a stand. The interval width of 4 cm (1.57 in) is used because it is a

commonly used value for determining diameter classes.

MEAN HEIGHT


Index parameter interval width: 3 m (8.4 ft)




110

Description: The mean tree height of a forest stand. This index parameter is

given a value that represents the arithmetic mean tree height of all trees in a

stand. The interval width of 3.0 m (8.4 ft) is used because it is approximately

one third of the width of a site class for Douglas-fir [15, 4].

QMD


Index parameter interval width: 4 cm (1.57 in)




Description: The quadratic mean diameter, QMD, of a forest stand. This

index parameter is given a value that represents the diameter of a tree which

has the mean tree basal area over all the trees in a stand. The interval width

of 4 cm (1.57 in) is used because it is a commonly used value for determining

diameter classes.

TOP HEIGHT


Index parameter interval width: 3 m (8.4 ft)




Description: The mean height of the 100 (40) largest diameter trees per

hectare (acre) in a forest stand. The interval width of 3.0 m (8.4 ft) is used

111

because it is approximately one third of the width of a site class for Douglas-fir

[15, 4].

STAND DENSITY


Index parameter interval width: 200 tph (80 tpa)




Description: The stand density in trees per hectare (acre) in a forest stand.

The interval width of 200 tph (80 tpa) is used because it is approximately ±4

measured trees per plot with an average plot size of 0.0405 ha (0.10 ac) or ±10

trees per plot with an average plot size of 0.10 ha (0.25 ac).

STAND BASAL AREA


Index parameter interval width: 1.5 m2ha−1 (16.15 ft2ac−1)




Description: The basal area in per hectare (acre) for a forest stand. The

interval width of 1.5 m2ha−1 (16.15 ft2ac−1) is used because it approximates the

annual basal area increment.

PCT DF STEMS

112


Index parameter interval width: 20%

Valid index parameter values: Any numeric value between zero (0) and 100



Description: The percentage of the trees or stems per hectare (acre) that

are Douglas-fir trees. The interval width value of 20% is used because this is

consistent with the definition of a pure stand as having at least 80% of the stand

be the dominant species. If this index parameter is used the index parameters

PCT WH STEMS and PCT OT STEMS must also be used.

PCT WH STEMS






Description: The percentage of the trees or stems per hectare (acre) that are

western hemlock trees. The interval width value of 20% is used because this is

consistent with the definition of a pure stand as having at least 80% of the stand

be the dominant species. If this index parameter is used the index parameters:

PCT DF STEMS and PCT OT STEMS must also be used.

PCT OT STEMS


113





Description: The percentage of the trees or stems per hectare (acre) that are

not Douglas-fir or western hemlock trees. The interval width value of 20% is

used because this is consistent with the definition of a pure stand as having at

least 80% of the stand be the dominant species. If this index parameter is used

the index parameters PCT DF STEMS and PCT WH STEMS must also be used.

PCT DF BASAL AREA






Description: The percentage of the basal area per hectare (acre) that is com-

posed of Douglas-fir trees. The interval width value of 20% is used because this

is consistent with the definition of a pure stand as having at least 80% of the

stand basal area being the dominant species. If this index parameter is used

the index parameters PCT WH BASAL AREA and PCT OT BASAL AREA must also be

used.

PCT WH BASAL AREA


114





Description: The percentage of the basal area per hectare (acre) that is com-

posed of western hemlock trees. The interval width value of 20% is used because

this is consistent with the definition of a pure stand as having at least 80% of

the stand basal area being the dominant species. If this index parameter is used

the index parameters PCT DF BASAL AREA and PCT OT BASAL AREA must also be

used

PCT OT BASAL AREA






Description: The percentage of the basal area per hectare (acre) that is not

composed of Douglas-fir or western hemlock trees. The interval width value

of 20% is used because this is consistent with the definition of a pure stand

as having at least 80% of the stand basal area being the dominant species.

If this index parameter is used the index parameters PCT DF BASAL AREA and

PCT WH BASAL AREA must also be used

STAND TYPE

115



Valid index parameter values: One of PURE DF, PURE WH, DF DOMINANT,

WH DOMINANT, or MIXTURE



Description: The type of the stand based on species composition. The value

of PURE DF indicates that a stand is at least 80% Douglas-fir, PURE WH indicates

that a stand is at least 80% western hemlock, DF DOMINANT indicates that a

stand is at least 50% Douglas-fir, WH DOMINANT indicates that a stand is at

least 50% western hemlock, and MIXTURE indicates that a stand does not have

a Douglas-fir or western hemlock percentage above 50%, and is therefore an

arbrtrary mixture. The index parameter is required.

NUMBER OF SPECIES


Index parameter interval width: 5




Description: The number of species in a stand. The interval width of five

(5) is used because it is common to have ingrowth from several species other

than the desired dominant species, and the type of species is not necessarily

important.

116

NUMBER OF THINNINGS



Valid index parameter values: Any integer numeric value ≥ 0

Applicable Treatments: THINNED treatments


Description: The number of thinnings that have been performed on a stand

of trees.

PRE THIN DENSITY


Index parameter interval width: 200 tph (80 tpa)




Description: The number of trees per hectare (acre) on a forest stand before

a thinning operation was performed. The interval width of 200 tph (80 tpa) is

used because it is approximately ±4 measured trees per plot with an average

plot size of 0.0405 ha (0.10 ac) or ±10 trees per plot with an average plot size

of 0.10 ha (0.25 ac).

PRE THIN BASAL AREA


Index parameter interval width: 1.5 m2ha−1 (16.15 ft2ac−1)

117




Description: The basal area per hectare (acre) on a forest stand before a

thinning operation is performed. The basal area in per hectare (acre) for a

forest stand. The interval width of 1.5 m2ha−1 (16.15 ft2ac−1) is used because

it approximates the annual basal area increment.

PCT STEMS REMOVED






Description: The percentage of trees or stems removed from a forest stand

during a thinning operation. The interval width of 10% is used because the

percentage of trees removed during a particular thinning event is expected to

be less than 50% of the standing trees, and 10% interval widths should provide

a reasonable resolution.

PCT BASAL AREA REMOVED





118


Description: The percentage of the basal area per hectare (acre) removed from

a forest stand during a thinning operation. The interval width of 10% is used

because the percentage of basal area removed during a particular thinning event

is expected to be less than 50% of the standing basal area, and 10% interval

widths should provide a reasonable resolution.

YEARS SINCE TREATMENT




Applicable Treatments: All treatment types except the UNTREATED type


Description: The number of years since a treatment was performed. The

interval width of four (4) years is used to avoid possible aliasing problems caused

by an interval width that is too small, given the remeasurement intervals which

ranged from two (2) to eight (8) years. This interval width also takes into

account the fact that there is some variation in the early effects of treatments.

119

Chapter 3

METHODS, DATA, TESTING AND VALIDATION

This chapter describes the theoretical and practical motivation for the tree list

generation methodology proposed in Chapter 1 and implemented in the tree list gen-

eration database system described in Chapter 2. Following the description of the tree

list generation database methodology is a brief description of a set of stand measure-

ment data from Oregon, Washington, and southern British Columbia that were used

to populate a tree list generation database, tgdb1r00, created using the methods

described and the tree list generation database software. Next, the effectiveness of

the tree list generation methodology is demonstrated by using the tree list genera-

tion database tgdb1r00 and the tree list generation database software to generate

simulated stand measurements or tree lists which mimic each actual stand measure-

ment. The actual and simulated stand measurements are then compared using several

criteria. The actual stand vs. simulated stand comparisons are the primary testing

and validation procedures for the tree list generation database methodology, yielding

empirical misclassification, or error, rates as an indication of overall performance.

Finally, the chapter concludes with a summary of the tree list generation database

validation results, some caveats for its use, and several other considerations.

3.1 Methods

In the abstract, there are five requirements for generating simulated stands or tree

lists. First, is a stand classification procedure which partitions forest stands into

clusters or groups based on their similarity, as determined by some set of stand at-

120

tributes. Second, is a stand selection procedure which identifies a group of stands, or

several groups of stands, from a stand classification that have similar attributes to a

specified stand for which a tree list is desired. Third is a method for representing a

forest stand encompasses the wide variety of size, age, and species combinations that

are found in actual stands. Fourth is a method for generating individual trees from

the forest stand representation. Finally, there must be a wide variety of measurement

data available to calibrate a tree list generation procedure. The fifth requirement is

taken for granted, and the first four must be specified to produce a tree list generation

methodology.

Given the broad scope of the abstract tree list generation problem, there are

many possible alternatives for specifying the first four requirements in a solution. To

narrow the scope of the simulated stand or tree list generation problem, several design

criteria have been imposed on the stand classification and stand selection procedures

and the stand representation and tree generation procedure. The design criteria

are intended to address shortcomings of the classical approach to the problem, the

Weibull parameter recovery approach, or to make the most effective use of the stand

measurement data that are available. The following design criteria were used to help

guide the development of the tree list generation methodology.

• The utility of the stand measurement data used to calibrate the tree list gener-

ation procedure should be maximized. Tree data are expensive to collect, and

should therefore be put to the best possible use.

• Individual trees in a simulated stand or tree list should be generated as mul-

tidimensional objects directly. A tree is generally represented by a number of

attributes, e.g., height, DBH, and species, and it should be possible to use more

than just diameter, for example, when generating a tree list.

• The stand representation should be flexible enough to represent multimodal

121

and relatively flat size distributions, as well as unimodal size distributions, and

arbitrary species compositions. Forest stands, including plantations, may obtain

a wide variety of size distributions and species compositions.

• To the extent possible, simulated trees should be physically realizeable, that

is, it should be possible to find an actual stand where a simulated tree could

have occurred. Thus the tree list generation methodology should only generate

realistic trees.

• Multiple treatments must be supported in a consistent manner. The self–

consistency of a tree list generation database and a tree list generation procedure

across treatments must be guaranteed.

• The addition of new stand measurement data should be easy to perform. Stand

measurement data are collected on a regular basis, and it should be possible to

easily update a tree list generation database as new data are collected.

• The addition of new stand measurement data should not modify the existing

stand classification structure, except possibly by the creation of a new stand

class. A particular tree list generation database and a tree list generation pro-

cedure must remain consistent, in terms of the simulated stands produced, as

new data are added.

• The simulated stand or tree list generation framework should be as similar

as possible to the standard framework of psuedorandom number generation.

The simulated stand generation procedure should be placed firmly in a well

understood computational and conceptual setting.

The stand classification and stand selection procedures, and the stand representa-

tion and individual tree generation procedure, selected to solve the tree list generation

122

problem follow a nonparametric statistical theme. The stand classification procedure

is based on a multidimensional histogram, which is a consistent probability density

function estimator. The stand selection procedure is based on a nearest neighbor

approach, selecting the multidimensional histogram bins nearest to a desired stand

description. The stand representation is an implicitly defined, i.e., data based, mix-

ture distribution. The tree generation procedure is also based on a nearest neighbor

approach using actual tree data. The nonparametric methods used allow the stand

measurement data to speak for itself in determining a stand classification and in gen-

erating individual trees, as well as providing a great deal of flexibility to meet the

design criteria.

There are three additional reasons for the nonparametric theme. First, traditional

statistical approaches to these types of problems generally require a wide variety of

assumptions involving the point estimates of parameters which characterize a gen-

erally one dimensional, unimodal distribution assumed to underlie a phenomenon or

process, e.g., the Weibull or normal distribution. In addition, they require some sort

of regression procedure and all of its requisite assumptions, e.g., the normality of

the distribution of residuals. Second, nonparametric statistical techniques for solving

these types of problems require far fewer assumptions than their classical counter-

parts. Third, the tremendous increases in computer processing power and storage

capacities, coupled with significant reductions in cost and the ubiquitous presence of

desktop computers, have made the storage and processing of large amounts of data

possible in real time, or nearly real time, i.e., seconds or minutes rather than minutes

to hours or hours to days.

The nonparametric approach to the tree list generation problem and the ready

availability of high speed, large capacity computers provides the freedom to focus

attention on the essential nature of the problem with a minimum number of necessary

statistical or modeling assumptions. The nonparametric approach has the added

benefit that it will work equally well for managed, planted forest stands and natural

123

forest stands. With this in mind, the remainder of this section provides details on

the stand classification and stand selection procedures and the stand representation

and tree generation procedure that have been used to develop the tree list generation

database methodology and its supporting software, described in Chapter 2. When

reading the tree list generation database component descriptions, keep in mind the

two fundamental assumptions FA1 and FA2: forest stands with similar attributes

are similar, and sampled forest inventory data may be extended to a larger forested

area as individual trees, respectively. The fundamental assumptions may be found at

the end of Section 1.1.

3.1.1 Stand classification procedure

The tree list generation database stand classification procedure is based on a mul-

tidimensional histogram whose dimensions are defined by a particular set of stand

attributes. The multidimensional histogram bin boundaries define the set of stand

classification rules. Individual stands are classified as belonging to a particular mul-

tidimensional histogram bin, with other similar stands being assigned to the same

bin. This binning of similar stands plays the same role as the individual parameter

regression models, and their requisite assumptions, in the classical parameter recovery

approach, without the problems associated with regression in many dimensions, e.g.,

the flatness of regression surfaces and the large data requirements, commonly referred

to as the curse of dimensionality.

The multidimensional histogram classification or partitioning of the stand mea-

surement data is performed using an a priori set of selected stand attributes, called

stand index parameters, and their computed values for each set of stand measure-

ment data. This type stand classification procedure provides a basic aggregation of

the data, grouping stands which fall into the same multidimensional histogram bin to-

gether. Aggregating the data in this manner produces an average or canonical stand,

evening out any eccentricities of the individual stands sampled for a particular stand

124

condition. The aggregation of the individual stand measurements in this way has the

added benefit of reducing the size of the search index when performing a search to

select stands similar to a specified stand description.

The stand index parameters may be continuous variables, e.g., QMD or mean tree

height, or they may be categorical variables , e.g., stand type or stand origin. The

histogram bins for the continuous stand index parameters are defined by specifying

an appropriate bin width for that dimension of the multidimensional histogram, and

they are called interval index parameters. The histogram bins for the categorical

index parameters consist of a small set of distinct, fixed values, and they are called

fixed index parameters. The fixed stand index parameters do not actually form bins,

being single points, but they still partition the stand measurement data into separate

classes, and they will be referred to as forming histogram bins.

The multidimensional histogram was chosen as the stand classification procedure

for four reasons. First is its simplicity, both conceptually and computationally. Sec-

ond, the statistical properties of the histogram as a probability density function es-

timator are well understood in a variety of different contexts [28, 32, 25, 10, 9, 24],

including the consistency of the histogram and its asymptotic properties as the sample

size approaches infinity. Third, the limited range of stand attribute values possible

for forest stands, whether natural or managed, lends itself to a sequence of simple

classification rules in each variable for descriminating between similarand dissimilar

stand types. The histogram provides such a sequence of classification rules via its

bin boundaries. Fourth, new stand measurement data may easily be added using this

type of classification method, and only one histogram bin is affected by the addition

of the new data: an existing bin, if the new stand attributes are similar to at least

one stand that is already classified, or a new bin, if the new stand attributes are not

similar to any previously classified stands.

The grouping of similar stands into the same multidimensional histogram bin is

not necessary for the proposed tree list generation algorithm. An alternative would

125

be to simply store and index each stand individually by its specific attributes. The

stand selection procedure defined in Section 3.1.2 is unchanged, except that stands

are selected based on their individual attributes rather than selecting stands based on

the centers of the multidimensional histogram bins which contain them. Additionally,

the similarity function may be changed to a pure Euclidean distance function, rather

than the weighted Euclidean distance function described in the next section. If there

is a single stand in each histogram bin and the stand selection procedure below is

used with an Euclidean distance function to choose the single closest stand for the

desired stand attributes, the most similar neighbor method for selecting simulated

stands results if the selected stand is simply copied [29].

3.1.2 Nearest neighbor stand selection procedure

The tree list generation database stand selection procedure is based on the idea of

finding the multidimensional histogram bins that are similar to a requested stand

description. The similarity is based on the set of stand index parameters in use and

a similarity function. The similarity function is the bin width weighted Euclidean

distance from the desired stand description, defined by the values of its stand index

parameters, to the centers of the histogram bins in a tree list generation database

stand classification. Let x be the vector of n stand index parameter values for the

desired stand. The similarity score for determining the nearest neighbors is computed

using the formula

Sj(x) =n∑

i=1

wi(xi − cij)2, j = 1, 2, . . . , N,

with

wi =

1hi

if xi is an interval valued index parameter

10000 if xi is a fixed valued index parameter, e.g., species

where hi is the multidimensional histogram bin width for an interval index parameter,

the values for the cij are the histogram bin centers, and N is the number of multidi-

126

mensional histogram bins in a tree list generation database for a specified treatment.

The scores Sj(x) are ordered from smallest to largest. The five (5) or fewer multidi-

mensional histogram bins having similarity scores less than five (5.0), Sj(x) <= 5.0,

are selected to be used as the basis for generating a simulated stand or tree list using

the procedure defined in Section 3.1.4. The multidimensional histogram bins used are

those most similar to the desired stand description.

Once appropriate stands, multidimensional histogram bins, have been selected

based upon their similarity scores, their nearness to the desired set of stand attributes,

the individual tree data for each selected stand, histogram bin, are combined to form

a lookup table of tree DBH and height measurements and tree species. This table is

then treated as representing a canonical stand that is representative of the desired

stand, and is used to randomly generate the individual trees for a simulated stand.

A similarity score in excess of 10000 indicates there were no matching stands in the

tree list generation database, and no tree list can be generated. This can only occur

if a value for a fixed index parameter is supplied and there are no corresponding

values for that index parameter in a tree list generation database. This condition

will only occurs if there are no data for the desired stand condition. The values of

the maximum acceptable similarity score and the maximum number of qualifying

stands are somewhat arbitrary, and any reasonable values may be used for the the

stand selection criteria. The values of five (5.0) for the maximum similarity score and

five (5) for the maximum number of similar stands were determined to be reasonable

through trial and error with a visual assessment of the quality of the simulated stands.

Values in the range of five(5) to ten (10) are generally considered to be reasonable

for the number of nearest neighbors to include for smoothing [31]. These values

may be modified when using the TGRAND command line options -nn max score and

-nn max stands, respectively.

If the maximum number of qualifying stands is set to one, using the command

line option -nn max stands, or there is only a single multidimensional histogram

127

bin with a similarity score less than the default value of five (5.0), or some other

maximum similarity score defined by the command line option -nn max score, this

stand selection procedure is like the most similar neighbor method [29]. The proposed

stand selection procedure is identical to the most similar neighbor method if there

is a single stand in each multidimensional histogram bin and an Euclidean distance

similarity function is used.

3.1.3 Stand representation

A forest stand may be composed of many tree species, with one or more cohorts, or

age classes, within each species, and a variety of tree sizes for each species or cohort

grouping. The inherent variability in a forest stand is nicely captured by the notion

of a finite mixture density or mixture distribution [33, 2, 26, 3], and this is the stand

representation used for the tree list generation database. A finite mixture density is

a probability density function of the form

p(x | (α, φ)) =m∑

i=1

αipi(x, φi))

where each αi > 0, i = 1, 2, . . . , m,∑m

i=1 αi = 1, and where each function pi is

itself a probability density function parameterized by a scalar or vector φi [33, 26, 2].

The function p would represent the overall structure of a forest stand, with each

function pi representing, possibly, a different species, cohort, or size class and the

αi representing the proportion of the stand associated with that particular species,

cohort, or size class. The individual probability density functions pi may be the same

e.g., the Weibull density function [2, 3], or the normal density function, but this is not

necessary. Each pi could be specifically selected to suit a particular stand attribute.

Additionally, the pi need not be parametric density functions, like those mentioned,

they may be nonparametric density function estimates [28, 9, 10, 26, 24, 25] created

for a particular stand component. The mixture distribution is also well suited for

representing forest stands since the general problem of estimating its parameters

128

is well understood [26], as is the problem of generating random numbers from a

particular mixture distribution.

Given a mixture density as the representation of a forest stand, it may be speci-

fied in two ways: explicitly or implicitly. If the mixture distribution for the stand p

is specified explicitly, there are again two approaches: a parametric approach and a

nonparametric approach. For the parametric approach individual probability density

functions pi must be chosen, and their individual parameters φi and the mixing co-

efficients αi estimated [26]. Once all of the parameters for the mixture distribution

have been estimated, it may be used to generate a tree list. The traditional Weibull

parameter recovery method used for diameter distribution modeling is a form of the

parametric approach, though with additional assumptions and data processing steps

[2, 3]. For the explicit nonparametric approach, the data for a stand are used to

estimate the parameters for a particular type of nonparametric probability density

function estimator [28, 32, 25, 10, 9, 24]. The nonparametric density function is then

used to generate a tree list for a particular stand description. A combined approach

using both parametric and nonparametric probability density functions is also possi-

ble.

The implicit approach for specifying a mixture density to represent a forest stand is

also a nonparametric approach. The individual tree data, DBH, height, and species

for a stand are stored and then used with an appropriate procedure to generate

random trees having the proper statistical characteristics for the stand. This is done

either by forming an explicit, local representation for the mixture density p, using

local nonparametric density estimation methods, or by using the data directly to

generate simulated trees. The latter approach is the preferred method, making the

most effective use of the tree measurement data, provided a procedure for generating

the individual trees is available. Section 3.1.4 describes one such method.

The implicit mixture density representation of a stand was selected for use in

the tree list generation database. The implicit mixture density approach has the

129

advantage that no parameter estimation must be performed to determine a stand

description; the data implicitly define a stand. Given a data based tree generation

procedure there is no actual effort that must be expended to use the implicit stand

representation, it simply provides a convenient and powerful conceptual model for

describing a forest stand. The drawback to the approach is that all of the individual

tree data must be available in order to generate simulated trees. This is not, however,

a major constraint given the inexpensive, large capacity storage available for desktop

computers.

So, given the implicit mixture density based stand representation, simulated trees

may be generated by first selecting an appropriate set of individual trees, and then

using a data based tree generation procedure. An appropriate set of individual trees,

defined as compatible DBH and height measurements with tree species, are selected,

e.g., using the procedures from Section 3.1.2. The tree generation procedure described

in Section 3.1.4 may then be used with the selected trees to directly generate simulated

trees.

3.1.4 Tree list generation procedure

The tree list generation database tree generation procedure is based on a method

of generating simulated data vectors having the characteristics of a particular data

set [31]. The tree generation procedure is first presented in its generality, and then

the modifications necessary to specialize the procedure for the tree list generation

problem are presented. The description of the algorithm is summarized here, with

an emphasis on its practical and computational aspects. For a complete discussion of

the procedure, the reader is referred to the original paper [31].

Let Xj ∈ �k, j = 1, 2, . . . , n be a random sample of size n from an unknown

multidimensional probability density function. Perform a variance normalizing trans-

formation on the data vectors, X tΣ−1, where X t = {X tj} is the n × k data matrix,

130

and

Σ−1ii = 1

si, i = 1, 2, . . . , k

Σ−1ij = 0, i, j = 1, 2, . . . , k, i �= j,

where si is the standard deviation of data dimension i, to account for any differ-

ences in variance among the k coordinate dimensions. Next select at random one

of the n data points, say Xr, and determine its m nearest neighbors. The near-

est neighbors are determined by computing the Euclidean distance from Xr to Xj,

Drj =√

(Xr − Xj)t(Xr − Xj), for j = 1, 2, . . . , n and then selecting the m smallest

distances. The initially selected data vector Xr is always selected as one of the nearest

neighbors; it is the closest vector ot itself with an Euclidean distance of zero. For

convenience, relabel the m nearest neighbors Yj , j = 1, 2, . . . , m.

The m nearest neighbor data vectors, Yj, are transformed by subtracting their

sample mean, Y = 1m

∑mj=1 Yj, yielding the data vectors Y ′

j = Yj − Y , j = 1, 2, . . . , m.

Next a random sample u1, u2, . . . , um is generated from the uniform distribution

U

1

m−

[3(m − 1)

m2

] 12

,1

m+

[3(m − 1)

m2

] 12

,

selected to ensure appropriate properties for the first and second moments, the mean

and covariance, of the simulated data vectors. The uniform random numbers uj are

combined with the transformed vectors Y ′j to get a simulated data vector

Y ′ =m∑

j=1

ujY′j

which is transformed into its original coordinates

Y = Σ(Y ′ + Y

)

to obtain a simulated data vector that is representative of the original multivariate

distribution of the data Xj, j = 1, 2, . . . , n. This procedure is repeated for each de-

sired simulated data vector, selecting a new initial data vector Xr with replacement.

131

The random vectors Y generated by this procedure are uncorrelated and they approxi-

mately reproduce the covariance structure of the original data Xj ∈ �k, j = 1, 2, . . . , n

[31].

A theoretical, asymptotic dependence between m, the smoothing or averaging

parameter, and the sample size n exists, as for the sample size and the bin width

or number of bins for histograms and a variety of other nonparametric probability

density estimators [32, 28, 25, 10, 9, 24]. For practical purposes a fixed value of m may

usually be chosen without incurring a significant penalty. Empirical results support

this conclusion, and a value of m = 10 has been shown to give good results [31].

The value for m = 10 has therefore been chosen for use in the tree list generation

algorithm of the tree list generation database.

A tree list is a three dimensional vector, with two continuous dimensions, tree

DBH and height, and one discrete or categorical dimension, tree species. To generate

a simulated tree list, an actual tree list, i.e., a list of actual tree DBH and height mea-

surements with species, that is presumed to be representative of the desired simulated

tree list is necessary. The DBH and height components of a tree list are considered

as a two (2) dimensional vector and correspond to the Xj in the random vector gen-

erating algorithm. The species of a simulated tree is determined by the species of the

first randomly selected tree, corresponding to the Xr in the random vector generating

algorithm, for each simulated tree.

The actual DBH height, and species values may be obtained by the procedure

described in Section 3.1.2 or some other procedure. Let this actual tree list contain

n trees. The tree list generation procedure is then given by the following algorithm.

Step 1 Generate a random integer r between 1 and n. This selects one of the n trees

from the canonical stand or actual tree list.

Step 2 Assign the species of the tree selected in Step 1 to the tree being simulated.

132

Step 3 Use the random vector generation procedure with the actual DBH and height

measurement data to generate a simulated pair of DBH and height measure-

ments.

Step 4 If either of the simulated DBH or height measurements is not positive, repeat

Step 3 until both of the simulated values are positive.

Step 5 Repeat Step 1 through Step 4 until all desired trees have been generated.

The species of a simulated tree cannot be generated using the random vector

generation procedure since the procedure may only be applied to continuous variables.

This necessitates the two part procedure outlined by Step 1 and Step 2. Step 4 simply

guarantees that simulated DBH and height measurements will always be positive.

Two modifications of the tree list generation algorithm defined by Step 1-Step 5

will prove useful. These modifications address the issue of generating a tree list that

contains only one species, that is, a tree list that is 100% pure, and the issue of using

estimated tree heights to generate simulated trees. The need for the first modification

should be readily apparent. The need for the second modification may need a bit of

motivation.

The majority of individual tree heights in a sample are not measured, but are

estimated from height-diameter relationships determined by a small set of trees with

measured heights. These estimated tree heights typically follow a smooth curve,

giving the estimated tree heights less variability relative to the actual variability of

tree heights observable for a particular DBH. Given this reduction in variability, a

method for adding variability or jittering the simulated tree heights is desirable to

more accurately represent the possible natural variability of the tree heights for a

given DBH.

The first modification to the tree list generation algorithm allows the generation

of a 100% pure stand, and is controlled by the value of the RANDOM MF PURE keyword

133

in a stand description file supplied to TGRAND. See Section 2.11.2 for details. For a

stand with a type of PURE DF or PURE WH, all tree records for species other than the

dominant tree species are removed from the actual list of trees before continuing with

the algorithm. Stand types other than PURE DF and PURE WH are unaffected by this

procedure. This introduces a Step 0 to the algorithm.

Step 0 If a 100% pure stand of trees is to be generated, all of the tree records for

tree species other than the dominant, desired tree species are removed from the

actual tree list. Proceed to Step 1.

The second modification to the tree list generation algorithm adds a small amount

of random jitter to each simulated tree height in an attempt to more adequately rep-

resent the natural variation in tree heights in a simulated stand. The addition of jitter

to the tree heights is controlled by the value of the RANDOM MF JITTER keyword in a

stand description file supplied to TGRAND. See Section 2.11.2 for details. The ability to

jitter the simulated tree heights is important because most tree heights are estimated

from height-diameter models, and hence follow a smooth curve, giving tree heights

very little variability relative to the actual variability in tree heights for a particular

diameter. The jitter is computed using the specified stand age, from a stand descrip-

tion file, and a simulated tree height to compute a mean annual height increment

(MAI) for each tree. A uniform random number u is then generated from the interval

(−2 hMAI, 2 hMAI), and added to the simulated tree height. This introduces a Step 3a

to the algorithm.

Step 3a If a simulated tree height h is to be jittered, then

3a-1 Compute the mean annual height increment using the simulated tree height

h from Step 3 and the stand total age a, hMAI = ha.

3a-2 Generate a uniform random number u from the interval (−2 hMAI, 2 hMAI).

134

3a-3 Let h = h + u.

3a-4 Proceed to Step 4.

A nonparametric procedure for generating simulated tree lists has been proposed.

The underlying representation of a forest stand is that of a mixture distribution. The

mixture distribution is represented as a multidimensional histogram and used as a

stand classifier. A nearest neighbor stand selection procedure selects stands that are

most similar to a specified set of stand attributes. These stands are then used to gen-

erate a simulated stand having attributes similar to those desired. Two modifications

to the basic procedure are also provided. These permit the generation pure or single

species simulated stands, and the generation of simulated stands with jittered tree

heights. The tree list generation procedures outlined have been implemented in For-

tran 90/95 and will be applied to a large data set that is representative of Douglas-fir

and western hemlock plantations in the Pacific Northwest.

3.2 Data

The tree list generation database methodology is intended to provide a means for

generating simulated stands of trees, or tree lists, that are representative of the Pa-

cific Northwest region west of the Cascade Mountains, and extending from southern

Oregon to southern British Columbia. Figure 3.1 presents a map of the appropriate

region, showing the locations from which data were obtained. To achieve the goal of

regional tree list generation, stand measurement data were provided by the organiza-

tions listed in Table 3.1.The data consisted of stand measurements from pure Douglas-

fir stands, pure western hemlock stands, and stands with mixtures of Douglas-fir and

western hemlock.

The ability to generate simulated stands or tree lists for both untreated stands

and thinned stands was of primary interest for the tree list generation database.

Individual tree data, consisting of compatible DBH and height measurements with an

135

−130 −128 −126 −124 −122 −120 −11838

40

42

44

46

48

50

52

Site locations for the stand measurement data

Degrees West Longitude

Deg

rees

Nor

th L

atitu

de

Figure 3.1: Tree measurement data locations for the untreated and thinned standdata used to create the tree list generation database tgdb1r00.

Table 3.1: List of data sources for the tree list generation database.

Data Sources

British Columbia Ministry of Forests

Washington State Department of Natural Resources

Canadian Forest Service

Oregon State University, Corvalis, Oregon

USFS Pacific Northwest Research Station

Port Blakely Tree Farms

Regional Forest Nutrition Research Project (RFNRP), SMC

Stand Management Cooperative (SMC), University of Washington

Weyerhaeuser Company

136

indication of tree species, for these two treatments were extracted from the data sets

provided and were used to create the tree list generation database tgdb1r00. Only

live trees were used to create the tree list generation database. Table 3.2 contains the

breakdown of stands by stand type and treatment. For a more detailed summary of

the data used to create the tree list generation database tgdb1r00 see Section 2.10.

Table 3.2: Stand type summary for all stands used to create the basic tree list gener-ation database. Percentages are rounded to the nearest 0.1%, so the row and columnpercent totals may not agree exactly.

Untreated stands Thinned stands Total


PURE DF 3404 35.3 3431 35.6 6835 70.8

PURE WH 800 8.3 480 5.0 1280 13.3

DF DOMINANT 691 7.2 417 4.3 1108 11.5

WH DOMINANT 178 1.8 86 0.9 264 2.7

MIXTURE 136 1.4 26 0.3 162 1.7

Total 5209 54.0 4440 46.0 9649 100.0

Given the wide variety of data sources, a myriad of data collection and sampling

strategies were likely employed, with differing assumptions, and a variety of unknown

height-diameter relationships would have been used to estimate most of the individ-

ual tree heights. If the data collection histories, data sampling schemes, and other

statistical procedures were known, it would be nearly impossible to reconcile all of the

discrepancies that are undoubtedly present, and it would be impossible without that

information, which is the case with the majority of this data. Therefore, no attempt

was made to address the statistical compatibility of the sampling strategies or other

statistical procedures that were employed historically for these data sets, and the data

were assumed to be compatible and used as is . To help ensure the reliability of the

tree list generation database, only trees from sample plots that were at least 0.0405

ha (0.10 ac) were used.

The stand measurement data supplied contained breast height ages, but stand

137

total ages were desired for the tree list generation database. Stand total ages for

Douglas-fir were computed from stand breast height ages using Bruce’s equations [4].

Stand total ages for western hemlock stands were computed in the same manner, for

lack of a breast height age to total age conversion formula for western hemlock. The

breast height age to stand total age conversion equation used was:

AT = ABH + 13.25 − SI

20,

where AT is stand total age, ABH is breast height age, and SI is site index at age 50.

For pure Douglas-fir or pure western hemlock stands, a species specific 50 year site

index value was usually provided. For mixed Douglas-fir and western hemlock stands,

if two site index values were provided, the site index value used was the value for the

species with the greater percentage of stand basal area. If a single species specific site

index value was provided for a stand, and that species had the lower percentage of

stand basal area a site index conversion equation was used to convert the given site

index value to a site index value for the more dominant species [21]. The equation

SIWH = 3.2808(0.432 + 0.899

SIDF

3.2808

)

was used to convert a Douglas-fir site index value to a western hemlock site index

value, and the equation

SIDF = 3.2808(0.480 + 1.11

SIWH

3.2808

)

was used to convert a western hemlock site index value to a Douglas-fir site index

value. In both equations the site index values SIDF and SIWH are assumed to be in

feet and for a reference age of 50 years. If the site index values are in meters, replace

the value 3.2808 in the equations by the value 1. If no site index value was provided

for a stand, that stand was removed from the data set and not considered further.

An indication of stand origin was provided for each stand. Stand origins were

one of natural regeneration, seeded, planted, or unknown. The tree list generation

138

database provides two stand origin types: natural and planted. Stand origins other

than these two values must, therefore, be appropriately converted before the data for

a stand may be used in a tree list generation database. Stand origins from the data

that were either unknown or seeded were assumed to be naturally regenerated. Any

stand having an unrecognized stand origin was removed from the data set and not

considered further.

For each thinned stand, pre-thinning density, pre-thinning basal area, and the

years since thinning were provided in addition to the post-thinning list of tree mea-

surements and species. No pre-thinning list of tree measurements and species was

provided. No attempt was made to simulate a pre-thinning stand from the infor-

mation provided; only the post-thinning tree information was used for the tree list

generation database. The lack of a complete pre-thinning list of tree measurements

and species had the effect of limiting the years since treatment for the tree list gen-

eration database to a value of zero for many of the thinned stands, regardless of the

number of prior thinnings. Both pre-thinning and post-thinning lists of tree measure-

ments and species should be kept in the future to alleviate this type difficulty.

3.3 Testing and validation

The testing and validation of the tree list generation database has two components.

First is verifying that the tree list generation database software, the executable pro-

grams and through them the subroutine library, work as expected. Second is verifying

that a tree list generated from a tree list generation database is a reasonable approx-

imation of the stand that it was generated to mimic. The primary emphasis for

the testing and validation of the tree list generation database is the second compo-

nent, since this is of greatest importance for determining whether the overall tree list

generation methodology is valid. Testing of the executable program interfaces and

the subroutine library was accomplished by using the tree list generation database

139

software to generate simulated tree lists for the validation of the tree list generation

methodology. For more information on the software testing procedures and philoso-

phy see Section 3.3.5.

To determine whether a tree list generated using a tree list generation database

is a reasonable approximation to an actual stand, as described by the stand charac-

teristics specified in a stand description file, the notion of reasonable must be defined

quantitatively. A standard approach for this type of problem is to place it in the

context of determining the goodness of fit between an actual data set, or collection

of data sets, and simulated data generated to mimic the actual data set, or data sets.

This is the general approach taken here [3, 27, 1].

Specifically, for each actual stand measurement, a simulated stand measurement

was generated to mimic its attributes using a tree list generation database. The

actual and simulated stand measurements were then compared using one or more

goodness of fit criteria. An error rate, or misclassification rate, for the comparison of

actual and simulated stands is determined from the goodness of fit testing methods

and criteria applied. This provided an indication of the robustness of the tree list

generation database methodology at the stand level. The overall robustness of the

tree list generation methodology, or the regional accuracy, was assessed by performing

a simple linear regression analysis comparing the actual and simulated QMD and mean

height values for each stand, and by comparing the overall distributions of these two

stand attributes for the actual and simulated stands.

The stand attributes or characteristics used to generate the simulated tree lists

are the stand index parameters defined for a particular tree list generation database.

The goodness of fit testing methods and criteria used are described in Section 3.3.1.

The results of the goodness of fit testing analyses for untreated stands and thinned

stands follow in Section 3.3.3 and Section 3.3.4, respectively. The goodness of fit

testing results presented are for the tree list generation database tgdb1r00. See the

specific results sections for the list of stand attributes used as the tree list generation

140

database stand index parameters.

3.3.1 Goodness of fit testing methods and criteria

The classical goodness of fit testing scenario is based on determining whether a set

of observed data values are in agreement with a set of expected data values. The

degree of agreement may be defined in terms of point estimates of the parameters of

a distribution or density function derived from the data values in each data set, such

as a mean and variance, or in terms of the actual distributions of the data values in

each data set. In the first case, commonly called parametric testing, knowledge of

an underlying probability density function is assumed, or known, a priori, and the

goodness of fit testing is reduced to a comparison of estimated parameters for the

underlying probability distribution that are derived from the observed data values

and the expected data values. An example of the parametric approach is the t-test

[1]. In the second case, knowledge of an underlying probability density function is

not necessarily assumed, or known, and the goodness of fit testing is based on a

comparison of the distributions of the data values. Examples of this approach are

Pearson’s chi-square test for categorical data and the nonparametric Kolmogorov-

Smirnov test for continuous or discrete ordinal data [1].

The goodness of fit testing procedures used to compare the actual stands and the

simulated stands generated using the tree list generation database tgdb1r00 must

emphasize the distributions of the tree DBH and height measurements, both contin-

uous variables, and the species composition or distribution, a categorical variable.

The comparison of the distributions, rather than point estimates of parameters, is

essential for determining whether the tree list generation methodology of the tree list

generation database methodology is appropriate. The tree size distributions and the

species composition within a forest stand are its fundamental measured components,

and are used for the goodness of fit comparisons. Any other type of comparison must

make less effective use of the data due to information lost by data aggregation, e.g.,

141

binning continuous data or reducing a data set to a few parameters such as a mean

and variance.

The classical methods for performing goodness of fit tests, whether producing

point estimates or comparing distributions are, unfortunately, not applicable to, or

are of rather limited value for, the tree list generation database goodness of fit testing.

Point estimates such as the mean and variance, as used in a t-test, for DBH or height

distributions generally prove to be inadequate when describing forest stands. The

distributions of both DBH and height are frequently not unimodal and may be mul-

timodal or highly skewed. In addition, the classical goodness of fit testing procedures

may be inappropriate in situations where sample sizes are large, large sometimes be-

ing as small as 50 or 100 sample points, spuriously indicating that two data sets are

different when they are in fact obtained from the indistinguishable underlying distri-

bution [1, 6]. The nonparametric Kolmogorov-Smirnov goodness of fit test statistic

comes closest to meeting the requirements for comparing the continuous DBH and

height distributions, but it only takes into account the maximum difference between

the two cumulative distribution functions [1], and not the differences across the entire

range of the data sets being compared.

The Kolmogorov-Smirnov test statistic does, however, provide the motivation for

another nonparametric test statistic that will overcome the spurious rejection prob-

lems and sample size limitations of the classical goodness of fit tests, such as the t-test

and the chi-square test [1, 6], as well as emphasizing the complete distribution of the

data in a particular data set. The test statistic is the integrated absolute error, or

difference, between two functions f and g, which is defined by the integral

iae(f, g) =∫ ∞

−∞|f(x) − g(x)| dx.

The integrated absolute error is obviously related to the Kolmogorov-Smirnov test

statistic, and has been used as an index for comparing diameter distributions [3, 27].

If the functions f and g are probability density functions, the integrated absolute

142

error computes the proportion of probability mass that the two functions do not

have in common, or the percent difference in the area under the curves of the two

probability density functions. Values of the integrated absolute error range from a

minimum value of zero, indicating that the functions f and g are identical, to a

maximum value of two (2), indicating that the functions f and g are completely

different, i.e., have no overlap. To avoid any potential confusion in the interpretation

of this test statistic, a transformation of the integrated absolute error is performed to

obtain a statistic the same interpretation as the classical p-value, and the formula for

the piae-value

piae = 1 − 1

2

∫ ∞

−∞|f(x) − g(x)| dx,

is obtained.

The piae-value has a range from zero, indicating that two distributions are com-

pletely different, to one, indicating that the two distributions are identical. The

piae-value may be interpreted as the proportion of probability mass that two proba-

bility density functions f and g have in common, or the percent similarity in the area

under the curves of the two probability density functions. This interpretation of the

piae-value is consistent with that of the classical p-value. Unlike most classical test

statistics, the piae-value automatically works for multimodal distributions as well as

unimodal distributions, both of which are common in forestry data. In addition, this

type of test statistic is quite flexible, and may be used in a mixed form containing

both continuous and discrete probability density functions without introducing any

mathematical, statistical, or interpretive difficulties.

The piae-value formula has three primary variants which, span the range of data

types from continuous data to categorical data, or continuous probability density

functions to discrete probability density functions. The first piae-value variant is for

continuous probability density functions f and g defined over all, or some subset, of

143

the real line, and is given by

piae = 1 − 1

2

∫ ∞

−∞|f(x) − g(x)| dx. (3.1)

This form of the piae-value statistic is appropriate for continuous data, such as tree

DBH and height measurements.

The second piae-value variant is for discrete probability density functions f and g

defined over a set of distinct values or sampling universe, X = {x1, x2, x3, . . . , xn} for

a finite set of values or X = {x1, x2, x3, . . .} for an infinite set of values, and is given

by

piae = 1 − 1

2

∑all xi

|f(xi) − g(xi)| . (3.2)

This form of the piae-value statistic is appropriate for categorical data, such as tree

species.

The third piae-value variant is based on the histogram approximation to probability

density functions, and is generally applicable to data from continuous distributions

that have been binned to obtain bin frequencies. This variant of the piae-value formula

has a strong resemblance to the formula for computing the chi-square statistic. Let a

and b be the lower and upper limits for the range of the histograms for two probability

density functions f and g, each having m bins, with bin width h = b−am

, and bin

boundaries defined by xi = a + ih, i = 0, 1, 2, . . . , m. Let fi and gi be the bin

frequencies for the ith bin of the histogram approximations to f and g, respectively.

The frequency values may be obtained as approximate areas under the probability

density curves, or as the bin frequencies from two data sets with nf and ng data

points, respectively, fi =nfi

nand gi =

ngi

n, where and nfi and ngi are the bin counts

for the ith histogram bin for each function. The piae-value is then given by

piae = 1 − 1

2

m∑i=1

|fi − gi| . (3.3)

The determination of a piae-value from two sets of continuous data or discrete

data, one data set from f and one from g, using the piae formula for discrete or

144

categorical data, Equation 3.2, or the histogram based formula, Equation 3.3, is a

straightforward application of the appropriate formula. Obtaining a piae-value from

two sets of continuous data using the integral based piae formula, Equation 3.1, is

only slightly more involved, consisting of two steps. The first step involves estimating

a continuous probability density function for each of the two data set [28, 32, 24, 25,

10, 9]. The second step performs a numerical integration to obtain the piae-value.

The probability density function estimator used in the first step is a nonparametric

probability density estimation technique based on cubic B-splines [24, 25, 10, 9],

though any appropriate nonparametric probability density estimation technique could

be used in its place. The numerical integration performed in the second step is

a straightforward midpoint based method, which may be obtained from any basic

calculus text, for determining the area under a curve. Note that the piae-value formula

given by Equation 3.3 is a special case of the piae-value formula in Equation 3.1, since

the histogram is a nonparametric probability density function estimator, with the

summation taking the place of the integration.

3.3.2 The piae-value as a test statistic

The piae-value is the test statistic that will be used to determine the empirical mis-

classification rates from the testing and validation of the tree list generation database

methodology. The misclassification rates are derived from the individual tree DBH,

height, and species goodness of fit comparisons between actual stands and simulated

stands. The piae-value will be used most effectively as a test statistic if an appropriate

piae cutoff value, or critical piae-value, can be determined. The critical piae-value will

perform the same role as the critical value, or p-value, in a classical goodness of fit or

hypothesis test.

The selection of a critical piae-value is performed in two stages. First, a compari-

son of the piae-value with the classical t-test is performed to provide insight into what

the piae-value actually measures in a well understood statistical setting, and to show

145

that it has the correct properties to be used as a test statistic. This demonstrates

the applicability of the piae-value as a test statistic. Second, a simulation of an actual

sampling scenario is performed using the standard normal distribution. The simula-

tion is used to characterize the distribution of the piae-value under a null hypothesis

of identical distributions, and to guide the selection of an appropriate critical piae-

value. A two sided t-test is also applied to the simulated data generating p-values

to provide a comparison of the two statistics. This demonstrates the feasibility of

using the piae-value as a test statistic. The two stages use the piae-value formula in

Equation 3.1 and the nonparametric probability density function estimation and nu-

merical integration procedures described. The behavior and properties of the other

two piae-value formulae are similar to the one being tested.

To set the stage for the comparison of the piae-value and the t-test, a brief review of

the classical hypothesis testing framework will prove useful, defining the terminology

used in the comparison. In the classical hypothesis testing framework, an observation

X is made, where X is assumed to be distributed according to Pθ, a distribution

parameterized by some θ ∈ Θ, where Θ is the parameter space. Further, there are

two possible states of nature described by the relations Θ0 ⊂ Θ and Θa ⊂ Θ, where

Θ0 and Θa are disjoint subsets of Θ, i.e., Θ0 ∩ Θa is empty. The sets Θ0 and Θa

may be single points, intervals, or arbitrary disjoint subsets of Θ. The objective of

testing an hypothesis is to determine whether the distribution of the sample X, Pθ, is

described by θ ∈ Θ0, called the null hypothesis H0, or by θ ∈ Θa, called the alternative

hypothesis Ha. Only one of the two states of nature θ ∈ Θ0 or θ ∈ Θa may be the

true state of nature from which X was observed. The classical hypothesis testing

framework is concerned with determining decision rules and test statistics which can

distinguish between the two states of nature, selecting the most likely state of nature

for the observation X.

Given a decision rule, a test statistic, and an observation X, a decision about

which of θ ∈ Θ0 or by θ ∈ Θa is the true state of nature has four possible outcomes,

146

two correct outcomes and two incorrect outcomes, only one of which may occur at a

time. The two correct outcomes are: θ ∈ Θ0 and the true state of nature is described

by Θ0 and θ ∈ Θa and the true state of nature is described by Θa. The incorrect

outcomes indicate an error and are: θ ∈ Θa and the true state of nature is described by

Θ0, a false rejection of H0, and θ ∈ Θ0 and the true state of nature is described by Θa,

a false acceptance of H0. The first incorrect outcome is called a Type I error and the

second incorrect outcome is called a Type II error. The probability of a Type I error,

P (Type I error), is commonly denoted by α, and is called the Type I error rate. The

probability of a Type II error, P (Type II error), is commonly denoted by β, and is

called the Type II error rate. The total error rate for a particular decision rule and test

statistic, or hypothesis test, is given by P (Type I error) + P (Type II error) = α + β.

The Type I and Type II error rates, and hence the total error rate, are usually tightly

linked to each other, to the underlying distribution Pθ, and to the distribution of the

test statistic under the null hypothesis H0.

When selecting a test, a test statistic and decision rule, to discriminate between

the two states of nature, θ ∈ Θ0 or θ ∈ Θa, it seems natural to choose a test which

would minimize the total probability of error. Unfortunately, it is impossible, in

general, to simultaneously have both P (Type I error) = 0 and P (Type II error) = 0;

if P (Type I error) = 0 then P (Type II error) = 1 and if P (Type II error) = 0 then

P (Type I error) = 1 [1]. So, for all practical purposes a statistical test must have

some nonzero total error rate.

The composition of the total error rate for a test, in terms of the Type I and

Type II error rates, may be controlled, to some degree, by placing the emphasis on

one or the other type of error when using a particular test statistic. The standard

approach in scientific studies is to emphasize the Type I error rate, selecting an a priori

tolerance for false rejection of the null hypothesis H0, commonly called the α-level

of the test. The null hypothesis H0 is rejected in favor of the alternative hypothesis

Haif the p-value of the test is less than the chosen α-level. For the classical statistical

147

tests, this approach allows some control over the Type II error rate as well: choosing

a larger value for α reduces the value of β, and choosing a smaller value of α increases

the value of β, demonstrating the coupling of the Type I and Type II errors.

Having completed the brief review of the classical hypothesis testing framework,

the applicability of the piae-value as a test statistic will be demonstrated through a

comparison with the classical t-test applied to two normal distributions with equal

variances. In this setting, the t-test is the optimal statistical test for random samples

drawn from the two normal distributions [1], and the exact probability of committing

a Type I or Type II error may be computed. The piae-value is clearly related to the t-

test error rates in this setting: it has the same theoretical value as the minimum total

probability of error, or the total t-test error rate, P (Type I error)+P (Type II error),

of the t-test.

The comparison the piae-value with the total error rate of the classical t-test was

performed using eight tests. Each test compared two normal distributions with unit

variance and different separations between their means. The first normal distribution

in each comparison was a standard normal distribution, N(µ1, 1) = N(0, 1), and the

second was the normal distribution N(µ2, 1), where µ2 takes on the values 0, 1, 2, . . . ,

7. The first normal distribution is assumed to be the true, or reference, distribution

for a t-test null hypothesis H0 and the second distribution is the alternative distri-

bution for a t-test alternative hypothesis Ha. Each test was performed by drawing a

pair of random samples, one from each of the two normal distributions N(µ1, 1) and

N(µ2, 1) for each value of µ2, and computing the piae-value for that pair of samples.

In each case, the random draw used a sample size of 1000 points, to ensure good

nonparametric probability density estimates for each normal distribution, and was

repeated three times to obtain a mean and standard deviation for each piae-value.

Figure 3.2 through Figure 3.4 present graphically the nonparametric probability

density estimates for the testing procedure and µ2 = 0, 3, and 7, along with the true

t-test total error rate and the mean and standard deviation for the each piae-value.

148

−6 −4 −2 0 2 4 6 8 10 12 140

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

Densities, piae

−values, and total error for mean difference of 0 units

total error = 1.0000

piae

avg = 0.9785

piae

std = 0.0154

N(µ1,1)

N(µ2,1)

Figure 3.2: Nonparametric probability density estimates with the true t-test totalerror rate and the nonparametric piae-value (mean of three trials) from two normaldistributions N(µ1, 1) and N(µ2, 1) where µ1 = µ2 = 0. The nonparametric probabil-ity density function estimates were generated from samples of 1000 points from eachnormal distribution.

Figure 3.5 and Table 3.3 present the true t-test total error rates and the piae-values

for each of the eight tests. These figures and the table clearly indicate that the

piae-value has the appropriate properties to be used as a test statistic, decreasing

rapidly from a value near one for no difference in means to a value near zero for a

difference in means of seven (7) standard deviations, and closely following the t-test

total error rate. The small standard deviations for the computed mean piae-values in

each comparison indicate that this type of test statistic is also quite robust, at least

for the large sample sizes used.

Having shown that the large sample, or asymptotic properties, of the piae-value

149

−6 −4 −2 0 2 4 6 8 10 12 140

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

Densities, piae



piae

avg = 0.2141

piae

std = 0.0117

N(µ1,1)

N(µ2,1)

Figure 3.3: Nonparametric probability density estimates with the true t-test totalerror rate and the nonparametric piae-value (mean of three trials) from two normaldistributions N(µ1, 1) and N(µ2, 1) where µ1 = 0 and µ2 = 3. The nonparametricprobability density function estimates were generated from samples of 1000 pointsfrom each normal distribution.

150

−6 −4 −2 0 2 4 6 8 10 12 140

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

Densities, piae



piae

avg = 0.0029

piae

std = 0.0008

N(µ1,1)

N(µ2,1)

Figure 3.4: Nonparametric probability density estimates with the true t-test totalerror rate and the nonparametric piae-value (mean of three trials) from two normaldistributions N(µ1, 1) and N(µ2, 1) where µ1 = 0 and µ2 = 7. The nonparametricprobability density function estimates were generated from samples of 1000 pointsfrom each normal distribution.

151

−1 0 1 2 3 4 5 6 7 80

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Total error and piae

−value vs. difference in means, 3 samples

Difference in means

Tot

al e

rror

and

pia

e−va

lue

Total error p

iae−value

Figure 3.5: True t-test total error and nonparametric piae-value (mean of threesamples) for two normal distributions N(µ1, 1) and N(µ2, 1) where µ1 = 0 andµ2 = 0, 1, 2, . . . , 7. The nonparametric probability density function estimates weregenerated from samples of 1000 points from each normal distribution. Table 3.3presents the numeric values for these data.

152

Table 3.3: True t-test total error and nonparametric piae-value for comparing twonormal distributions N(µ1, 1) and N(µ2, 1) where µ1 = 0 and µ2 = 0, 1, 2, . . . , 7. Thepiae-value for each pair of means is the average of three trials, and the nonparametricprobability density function estimates were generated from samples of 1000 pointsfrom each normal distribution.

Difference in means t-test Nonparametric piae-value

of normal distributions total error mean (std. dev.)

0.0000 1.0000 0.9730 (0.0102)

1.0000 0.6171 0.6705 (0.0173)

2.0000 0.3173 0.3916 (0.0153)

3.0000 0.1336 0.2095 (0.0081)

4.0000 0.0455 0.0882 (0.0024)

5.0000 0.0124 0.0353 (0.0025)

6.0000 0.0027 0.0096 (0.0008)

7.0000 0.0005 0.0028 (0.0005)

are appropriate for its use as a test statistic, a critical piae-value must now be de-

termined, generating a decision rule, before the statistic may be used in practice for

comparing distributions. To guide the selection of a critical piae-value a simulation

of a real sampling scenario was performed using the standard normal distribution

N(0, 1). The simulation consisted of repeating the following three steps 1000 times to

obtain an approximation to the distribution of the piae-value under the null hypoth-

esis of identical distributions for two random samples drawn from a standard normal

distribution.

1. Draw a random integer n between 20 and 50 for a sample size.

2. Generate two independent random samples of size n from the standard normal

distribution N(0, 1).

3. Compute the piae-value for the two samples using Equation 3.1 and compute

the p-value for the two-sided t-test.

The piae-value, as previously mentioned, approximates the total error rate for each

153

sample in this scenario; an error rate near one (1) indicating that the distributions

of the two samples are indistinguishable. The two-sided t-test is being used as a

goodness of fit test, which, though an unorthodox interpretation, is still consistent

with the assumptions and definition of the test. The necessary assumptions for the use

of the t-test are: the two samples are from normal distributions with equal variances,

having possibly different means. If the means are equal, the distributions are identical,

if the means are not equal, the distributions are different, hence the goodness of fit

interpretation of the t-test.

A histogram and an empirical cumulative distribution function for the piae-values

obtained from the simulation are presented in Figure 3.6. The distribution of piae-

values is clearly not symmetric and is strongly skewed to the left, with the bulk of the

piae-values appearing above, or to the right of, a piae-value equal to 0.75. Contrast

the distribution of the piae-values with that of the p-values for the two-sided t-test

using the same pairs of samples, given by the histogram and empirical cumulative

distribution function in Figure 3.7. The distribution of the two-sided t-test p-values

is effectively uniform between zero and one-half, and as such, it cannot provide any

insight into the selection of an appropriate critical p-value without some additional

knowledge or the arbitrary selection of a small number. This clearly identifies one of

the problems of using the t-test for comparing the actual and simulated stands.

The uniform distribution of the p-values for the t-test is to be expected, given the

standard use of the p-value and α-level in the classical two-sided t-test hypothesis

testing framework [1]. A one-sided t-test would have generated p-values that were

uniformly distributed between zero and one. The distinction between a one-sided

or two-sided formulation of the t-test is necessary because it uses point estimates to

determine the similarity or difference of distributions. The piae-value, being derived

from the data distributions themselves, does not have a one-sided or two-sided for-

mulation; two distributions are either similar or they are different, and this is clearly

indicated by the piae-value.

154

0.65 0.7 0.75 0.8 0.85 0.9 0.95 10

50

100

150

200

Histogram of nonparametric piae

−values

piae

−value

Fre

quen

cy c

ount

0.65 0.7 0.75 0.8 0.85 0.9 0.95 10

0.2

0.4

0.6

0.8

1

Cumulative distribution function for nonparametric piae

−values

piae

−value

Cum

ulat

ive

prob

abili

ty

Figure 3.6: Histogram (top) and empirical distribution function (bottom) for 1000piae-values. The piae-values were generated by choosing a random sample size n inthe range of 20 to 50 and then drawing a pair of random samples of size n from astandard normal distribution N(0, 1). A piae-value was then computed for each pairof samples.

155

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.50

10

20

30

40

50

60

70Histogram of t−test p−values

p−value

Fre

quen

cy c

ount

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.50

0.2

0.4

0.6

0.8

1Cumulative distribution function for t−test p−values

p−value

Cum

ulat

ive

prob

abili

ty

Figure 3.7: Histogram (top) and empirical distribution function (bottom) for 1000t-test p-values. The p-values were generated from the same pairs of samples used forthe piae-values in Figure 3.6. The p-values were computed using a two sided t-test.

156

Figure 3.8 and Figure 3.9 present the nonparametric probability density function

estimates, sample size n, p-value, and piae-value for two very similar random samples

and two somewhat different random samples, respectively, from the sampling simula-

tion. The agreement between the two distributions is approximately 98%, a piae-value

of 0.98, but the p-value indicates a 94% agreement, a p-value of 0.47 for Figure 3.8.

The distributions in Figure 3.9 are obviously different, with an agreement of approx-

imately 74%, a piae-value of 0.74, but the t-test p-value indicates that the means,

and hence the distributions, are highly significantly different, a p-value of 0.004. This

demonstrates another common problem with the t-test, the spurious rejection of the

null hypothesis.

Given the a priori knowledge that the two samples used to obtain each piae-value

or t-test p-value were drawn from the same underlying distribution, the standard

normal distribution N(0, 1), the piae-value clearly proves to be more effective as an

indicator of the similarity of the distributions. The piae-value also provides a strong

indication of a similarity threshold value, whereas the t-test p-value does not. This

is not surprising, since the piae-value uses more of the information contained in the

samples than the t-test, so it should perform better in this regard. It is for this

particular reason that the piae-value has been chosen as the primary test statistic,

rather than the t-test and its p-value, for comparisons of the actual and simulated

stands for the validation of the tree list generation database methodology.

An appropriate critical piae-value may be chosen by considering the theoretical

interpretation of the piae-value guided by the results of the sampling simulation. Recall

that the piae-value gives a direct indication of the proportion of probability mass that

two continuous distributions have in common. A piae-value of one indicating that

the two distributions have all of their probability mass in common, i.e., they are

identical, and a piae-value of zero indicating that the two distributions have none of

their probability mass in common, i.e., they are completely different. The sampling

simulation indicates that for two samples drawn from a standard normal distribution,

157

−6 −4 −2 0 2 4 60

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

n = 44

p = 0.4738

piae

= 0.9824

Estimated density functions for two random samples

sample1sample2

Figure 3.8: Nonparametric density function estimates for two samples of size 44 from astandard normal distribution N(0, 1). The two density functions are almost identical,with a piae-value of approximately 0.98. The t-test p-value of 0.47 for the differencein means in this example, indicates that the means are not significantly different, butgives a slightly poorer indication of the actual similarity of the distributions.

most of the piae-values are greater than a piae-value of 0.75, or 75% similarity between

distributions. So, a piae-value of 0.75 seems to be a reasonable critical value to use in

practice; piae-values ≥ 0.75 indicating similarity between distributions, and piae-values

< 0.75 indicating lack of similarity between distributions. The critical piae-value of

0.75 will, in general, be conservative for distributions that are multimodal or that are

not strongly unimodal like the normal distribution; more variation in the distribution

of probability mass for a particular pair of distributions will lead to a lower critical

piae-value when comparing distributions.

The piae-value is the test statistic used to computee the empirical misclassification

158

−6 −4 −2 0 2 4 60

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

n = 37

p = 0.0004

piae

= 0.7360

Estimated density functions for two random samples

sample1sample2

Figure 3.9: Nonparametric density function estimates for two samples of size 37 froma standard normal distribution N(0, 1). The two density functions are obviouslydifferent, but the piae-value of 0.74 indicates that they have 75% of their probabilitymass in common. The t-test p-value of 0.004 for the difference in means in thisexample, indicates that the means, and hence distributions, are highly significantlydifferent.

159

rates from the comparisons between actual stands and simulated stands in the vali-

dation of the tree list generation database using the critical piae-value of 0.75. The

continuous and categorical piae-value formulae, Equation 3.1 and Equation 3.2, are

used to compute the piae-values. The histogram based formula, Equation 3.3 was pro-

vided for completeness, but it could have been used instead of the continuous formula,

Equation 3.1. Comparisons involving the actual and simulated DBH and height dis-

tributions, which are nominally from continuous distributions, Equation 3.1 and the

nonparametric probability density function estimation and numerical integration pro-

cedures described above are used. Comparisons involving the species compositions of

the actual and simulated stands, use Equation 3.2 with the proportion of each species

in either the actual or simulated stand.

For each comparison between an actual and a simulated stand, a piae-value is

computed for each of the DBH and height distributions, and species composition.

For each distribution, a piae-value greater than or equal to the critical value of 0.75

indicates that their distributions are not distinguishable, i.e are the same, and a piae-

value less than the critical value of 0.75 indicates that their distributions are different.

If a piae-value is less than the critical value of 0.75 for a particular attribute and pair

of stands, a misclassification error for that attribute has occurred; the simulated

stand was generated to mimic the characteristics of the actual stand. Empirical

misclassification rates for each attribute, and a total misclassification rate, provide

the primary form of validation for the tree list generation database methodology.

The total misclassification rate is based on the logical or ing of the three individual

classification outcomes. If at least one of DBH, height, or species composition is

misclassified for a particular pair of actual and simulated stands, this is considered

as a misclassified stand for the computation of the total misclassification rate. The

total misclassification rate, therefore, provides an upper bound on the true total

misclassification rate.

There are four tree list generation database testing scenarios. Each of the four

160

testing scenarios is applied to the two treatments in the tree list generation database

tgdb1r00 to obtain a set of misclassification rates. The testing scenarios correspond

to the four possible stand generation modes that may be specified in a stand descrip-

tion file used with TGRAND when generating a simulated stand. The four possible stand

generation modes are listed in Table 3.4, and they are determined by the values of

the RANDOM MF JITTER and RANDOM MF PURE keywords specified in a stand description

file, Section 2.11.2. When presenting the goodness of fit testing results, the misclassi-

fication rates, only figures for scenario RMF are presented. The figures for the other

testing scenarios are similar. Tables summarizing the primary goodness of fit testing

results for all four testing scenarios are presented for comparison with the results of

scenario RMF. The RMF testing scenario is the default stand generation mode and

also provides the worst case scenario for stands generated using TGRAND.

Table 3.4: The four simulated tree list generation modes for the tree list generationdatabase program TGRAND. Jittering heights adds or subtracts a small random amountto or from each height value in an attempt to represent the natural variability of treeheights. The 100% pure stand mode forces generated stands to be 100% of thedominant species if they have a stand type of PURE DF or PURE WH. For details on howto set these properties for a tree list see Section 2.11.2.

Scenario 100% pure stands Jitter heights

RMF (default) No No

RMFJ No Yes

RMFP Yes No

RMFJP Yes Yes

Results are summarized in three different ways for each treatment. First, mis-

classification rates are presented for each testing scenario in DBH, height, species

composition, and a total misclassification rate. These error rates test the within

stand agreement of each actual and simulated stand and represent the bulk of the

tree list generation database validation results. Histograms of the DBH, height, and

species composition piae-values are also presented, with the nominal piae cutoff, or

161

critical value, of 0.75, 75% probability mass overlap indicated. These figures provide

a graphical sense of the misclassification rates for different piae cutoff values. Second,

scatter plots of simulated and actual QMD and mean height are presented with lin-

ear regression coefficients and r2 values. Third, the distributions of QMD and mean

height across all actual and simulated stands are compared. These two comparisons

provide an indication of how well these stand attributes are reproduced by the simu-

lated stands. The comparisons test the ability of the tree list generation database to

reproduce the appropriate regional properties across all of the stands in a database,

or the between stand compatibility of the actual and simulated stands.

The tree list generation database goodness of fit testing results are presented in

the next two sections. First are the results for untreated stands. Second are the

results for thinned stands. See Section 3.4.1 for some additional information on the

interpretation of the results, particularly the relevance of the critical piae-value of 0.75

to forestry applications; this critical piae-value may be quite conservative. Also see

Section 3.4.2 for an example of using the tree list generation database with forest

growth and yield models or forest simulation systems.

3.3.3 Goodness of fit testing results for untreated stands

The results are for the tree list generation database tgdb1r00 which used the stand

index parameters

TREATMENT_TYPE

SITE_INDEX_50

STAND_TOTAL_AGE

STAND_ORIGIN

QMD

STAND_DENSITY

STAND_TYPE

for the stand classification and the nearest neighbor stand selection procedures. See

Section 2.11.6 for definitions of the index parameters, and Section 3.1.1 and Sec-

tion 3.1.2 for the stand classification and stand selection procedures, respectively.

162

The primary results provide empirical misclassification rates for DBH and height dis-

tributions, species composition, and a total misclassification rate derived from the

individual DBH height, and species misclassification rates. The remaining results

compare the stand level attributes of QMD and mean height.

The empirical misclassification rate results were generated by first creating a sim-

ulated stand to mimic each of the untreated stand measurements used to populaste

the tree list generation database tgdb1r00. Each pair of actual and simulated stands

were then compared by computing a set of piae-values for the DBH distribution, the

height distribution, and the species composition for each pair of actual and simulated

stand measurements. Empirical misclassification rates for each distribution were then

generated by performing hypothesis tests using the piae-values for the three individual

distributions, and the critical piae-value of 0.75. Distributions which had piae-values

≥ 0.75 were considered to be similar, and distributions which had piae-values < 0.75

were considered to be different, indicating a misclassification. An overall, or total,

misclassification rate was also computed from the individual distribution misclassi-

fication rates. This process was repeated for each of the four tree list generation

database testing scenarios in Table 3.4.

The empirical misclassification rate results for the DBH distributions, the height

distributions, and the species composition for untreated stands and testing scenario

RMF are presented in Figure 3.10, Figure 3.11, and Figure 3.12, respectively. The

figures clearly show that the bulk of the actual stand vs. simulated stand comparisons

had piae-values that were greater than or equal to the critical piae-value of 0.75, indi-

cating that the majority of simulated untreated stands were indeed similar to their

actual counterparts. The misclassification rates for testing scenario RMF were 2.59%

for DBH distributions, 13.21% for height distributions, 8.25% for species composi-

tion, and 19.72% overall. The larger misclassification rate for height distributions is

most likely due to the reduction in the natural variation in tree heights caused by the

estimation of tree heights from height-diameter relationships. Table 3.5 summarizes

163

the misclassification rate results for the other three testing scenarios, including the

results for scenario RMF for comparison.

The results for testing scenario RMF represent the worst case scenario for the

generated stands, and also represent the default mode of operation for the tree list

generation database program TGRAND. As can be seen, the worst case performance is

quite good, having an overall correct classification rate in excess of 80%, demonstrat-

ing that there is very good agreement between the actual and simulated untreated

stands at the level of individual stand measurements. The results for the other three

testing scenarios prove to be slightly better than the 80% correct classification rate of

testing scenario RMF, as expected, with the two scenarios having jittered tree heights

both performing better than their nonjittered counterparts.

Table 3.5: Misclassification rates in DBH, height, and species for untreated stands.Values are the proportion of stands misclassified for each variable. The total columngives the proportion of stands which were misclassified in at least one of DBH, height,or species, and gives an upper bound on the actual misclassification rate.

Scenario DBH Height Species Total

RMF 0.0259 0.1321 0.0825 0.1972

RMFJ 0.0261 0.1244 0.0781 0.1851

RMFP 0.0255 0.1298 0.0793 0.1929

RMFJP 0.0257 0.1221 0.0731 0.1801

Results for the comparison of the aggregate properties of the actual and simulated

untreated stands was performed using QMD and average tree height as the stand

level attributes. The comparison consisted of performing a simple linear regression

of the simulated QMD and mean height values against the actual QMD and mean

height values, respectively. For this analysis, an intercept of zero (0) and a slope of

one (1) indicate exact agreement between the simulated and actual QMD or mean

height values.

The QMD data used in the simple linear regression analysis for testing scenario

RMF are given in Figure 3.13. A cursory examination of the figure clearly indicates

164

0.4 0.5 0.6 0.7 0.8 0.9 10

200

400

600

800

1000

1200

1400

Untreated stand: DBH piae

−value histogram for scenario RMF

DBH piae

−value

Fre

quen

cy

Figure 3.10: Histogram of piae-values for tree DBH in untreated stands for scenarioRMF. The vertical line at 0.75 represents the boundary between the similar (piae-value≥ 0.75) and different (piae-value< 0.75) regions for the piae-values. The cutoffvalue of 0.75 indicates a 75% agreement between the actual and simulated DBHdistributions. The misclassification rate is 2.59%. See Table 3.5 for error rates for theother scenarios.

165

0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

200

400

600

800

1000

1200

Untreated stand: Height piae


Height piae

−value

Fre

quen

cy

Figure 3.11: Histogram of piae-values for tree height in untreated stands for scenarioRMF. The vertical line at 0.75 represents the boundary between the similar (piae-value≥ 0.75) and different (piae-value< 0.75) regions for the piae-values. The cutoffvalue of 0.75 indicates a 75% agreement between the actual and simulated heightdistributions. The misclassification rate is 13.21%. See Table 3.5 for error rates forthe other scenarios.

166

0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

200

400

600

800

1000

1200

1400

1600

1800

Untreated stand: species piae


Species piae

−value

Fre

quen

cy

Figure 3.12: Histogram of piae-values for species composition in untreated stands forscenario RMF. The vertical line at 0.75 represents the boundary between the similar(piae-value≥ 0.75) and different (piae-value< 0.75) regions for the piae-values. Themisclassification rate is 8.25%. See Table 3.5 for error rates for the other scenarios.

167

0 10 20 30 40 50 60 70 800

10

20

30

40

50

60

70

80Untreated stand: simulated vs. actual QMD for scenario RMF

Actual QMD (cm)

Sim

ulat

ed Q

MD

(cm

)

Figure 3.13: Plot of simulated vs. actual QMD values (cm) from untreated stands forscenario RMF. Linear regression coefficients for the model y = a+ bx applied to thesedata yields the model y = 0.7766 + 0.9578x with r2= 0.9558. See Table 3.6 for linearregression coefficients and r2 values for the other scenarios.

a strong linear relationship between the actual and simulated values for this stand

attribute. The QMD intercept of 0.7766, the slope of 0.9578 and the r2 value of 0.9558

support the observation of strong linearity. Table 3.6 presents the linear regression

coefficients and r2 values for the other three testing scenarios, as well as the results

for scenario RMF for comparison. These results indicate that there is very good

agreement for QMD measurements between the actual and simulated untreated stands

for all four testing scenarios.

The mean tree height data used in the simple linear regression analysis for testing

scenario RMF are given in Figure 3.14. Again, a cursory examination of the figure

clearly indicates a strong linear relationship between the actual and simulated values

168

Table 3.6: Linear regression coefficients and r2 values for the model y = a+bx appliedto the simulated (y) and actual (x) QMD values (cm) for untreated stands. A perfectfit would give values a = 0 and b = 1.

Scenario Intercept (a) Slope (b) r2

RMF 0.7766 0.9578 0.9558

RMFJ 0.8073 0.9547 0.9534

RMFP 0.7223 0.9648 0.9564

RMFJP 0.7743 0.9603 0.9535

for this stand attribute. The mean height intercept of 1.1419, the slope of 0.9403 and

the r2 value of 0.8720 support the observation of strong linearity. Table 3.7 presents

the linear regression coefficients and r2 values for the other three testing scenarios,

as well as the results for scenario RMF for comparison. These results indicate that

there is very good agreement for mean height measurements between the actual and

simulated untreated stands for all four testing scenarios.

Table 3.7: Linear regression coefficients and r2 values for the model y = a+bx appliedto the simulated (y) and actual (x) mean height values (m) for untreated stands. Aperfect fit would give values a = 0 and b = 1.


RMF 1.1419 0.9403 0.8720

RMFJ 1.1248 0.9400 0.8686

RMFP 1.0018 0.9535 0.8721

RMFJP 0.9919 0.9529 0.8693

Finally, the distributions of the aggregate stand level attributes QMD and mean

tree height for untreated stands were compared. The distributions for these stand

attributes were compared by computing a piae-value for each pair of actual and sim-

ulated QMD and mean height distributions. Figure 3.15 and Figure 3.16 present the

nonparametric probability density estimates for the actual and simulated QMD and

mean height distributions for testing scenario RMF. Visually, the actual and simu-

lated distributions are quite similar for both QMD and mean height. The piae-values

169

0 10 20 30 40 50 600

5

10

15

20

25

30

35

40

45

50Untreated stand: simulated vs. actual mean height for scenario RMF

Actual mean height (m)

Sim

ulat

ed m

ean

heig

ht (

m)

Figure 3.14: Plot of simulated vs. actual mean height values (m) from untreatedstands for scenario RMF. Linear regression coefficients for the model y = a + bxapplied to these data yields the model y = 1.1419 + 0.9403x with r2= 0.8720. SeeTable 3.7 for linear regression coefficients and r2 values for the other scenarios.

170

0 10 20 30 40 50 60 70 80 900

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

0.045

0.05Untreated stand: QMD density estimates for scenario RMF

QMD (cm)

Den

sity

Actual Simulated

Figure 3.15: Nonparametric probability density function estimates for the actual andsimulated QMD distributions for all untreated stands and scenario RMF. The piae-value for the actual vs. simulated QMD distributions is 0.9778. See Table 3.8 for theoverall QMD distribution piae-values for the other scenarios.

for the QMD and mean height distributions for scenario RMF are 0.9778 and 0.9779,

respectively, indicating that the actual and simulated QMD and mean height distri-

butions are almost identical, having approximately 98% of their probability mass in

common. Table 3.8 presents the results for the other three testing scenarios as well

as the results for scenario RMF for comparison. These results reinforce the results

from the simple linear regressions, and provide another indication of how well the

tree list generation database methodology performs. Notice, in particular, that the

distribution of mean height for the simulated stands reproduces the kink in the actual

mean height distribution near the mean height of 15 m (49 ft).

171

0 10 20 30 40 50 60 70 800

0.01

0.02

0.03

0.04

0.05

0.06Untreated stand: mean height density estimates for scenario RMF

Mean height (m)

Den

sity

Actual Simulated

Figure 3.16: Nonparametric probability density function estimates for the actualand simulated mean height distributions for all untreated stands and scenario RMF.The piae-value for the actual vs. simulated mean height distributions is 0.9779. SeeTable 3.8 for the overall mean height distribution piae-values for the other scenarios.

Table 3.8: Overall QMD and mean height distribution piae-values for actual andsimulated untreated stands for all four scenarios. See Figure 3.15 and Figure 3.16.

Scenario QMD piae-value Mean height piae-value

RMF 0.9778 0.9779

RMFJ 0.9762 0.9741

RMFP 0.9813 0.9804

RMFJP 0.9799 0.9790

172

3.3.4 Goodness of fit testing results for thinned stands

The results are for the tree list generation database tgdb1r00 which used the index

parameters

TREATMENT_TYPE

SITE_INDEX_50

STAND_TOTAL_AGE

STAND_ORIGIN

QMD

STAND_DENSITY

STAND_TYPE

NUMBER_OF_THINNINGS

PRE_THIN_DENSITY

PRE_THIN_BASAL_AREA


YEARS_SINCE_TREATMENT

for the stand classification and the nearest neighbor stand selection. See Section 2.11.6

for definitions of the index parameters, and Section 3.1.1 and Section 3.1.2 for the

stand classification and stand selection procedures, respectively. The primary results

provide empirical misclassification rates for DBH and height distributions, species

composition, and a total misclassification rate derived from the individual DBH

height, and species misclassification rates. The remaining results compare the stand

level attributes of QMD and mean height.

The empirical misclassification rate results were generated by first creating a sim-

ulated stand to mimic each of the thinned stand measurements used to populate the

tree list generation database tgdb1r00. Each pair of actual and simulated stands

were then compared by computing a set of piae-values for the tree DBH distribution,

the tree height distribution, and the species composition. Empirical misclassification

rates for each distribution were then generated by performing hypothesis tests us-

ing the piae-values for the three individual distributions, and the critical piae-value of

0.75. Distributions which had piae-values ≥ 0.75 were considered to be similar, and

distributions which had piae-values < 0.75 were considered to be different, indicating

a misclassification. An overall, or total, misclassification rate was also computed from

173

the individual distribution misclassification rates. This process was repeated for each

of the four tree list generation database testing scenarios in Table 3.4.

The empirical misclassification rate results for the DBH distributions, the height

distributions, and the species composition for thinned stands and testing scenario

RMF are presented in Figure 3.17, Figure 3.18, and Figure 3.19, respectively. The

figures clearly show that the bulk of the actual stand vs. simulated stand comparisons

had piae-values that were greater than or equal to the critical piae-value of 0.75, indi-

cating that the majority of the simulated thinned stands were indeed similar to their

actual counterparts. The misclassification rates for testing scenario RMF were 5.68%

for DBH distributions, 15.02% for height distributions, 2.75% for species composition,

and 18.47% overall. The larger misclassification rate for height distributions is most

likely due to the reduction in the natural variation in tree heights caused by the esti-

mation of tree heights from height-diameter relationships, and the further reduction

in the height variation caused by the thinning. The higher DBH misclassification rate,

compared to that of untreated stands, 2.59%, is also most likely due to smaller vari-

ation in the DBH measurements caused by the thinning. The lower misclassification

rate for species composition is also directly attributable to the thinning: undesirable

tree species would have been removed. Table 3.9 summarizes the misclassification

rate results for the other three testing scenarios, including the results for scenario

RMF for comparison.

The results for testing scenario RMF represent the worst case scenario for the

generated stands, and also represent the default mode of operation for the tree list

generation database program TGRAND. As can be seen, the worst case performance is

quite good, with an overall correct classification rate in excess of 81%, demonstrat-

ing that there is very good agreement between the actual and simulated stands at

the level of individual stand measurements for the thinned stands. The results for

two of the other three testing scenarios prove to be slightly better than the 81% cor-

rect classification rate of testing scenario RMF, as expected,but the results for the

174

0.4 0.5 0.6 0.7 0.8 0.9 10

100

200

300

400

500

600

700

800

900

1000

Thinned stand: DBH piae


DBH piae

−value

Fre

quen

cy

Figure 3.17: Histogram of piae-values for tree DBH in thinned stands for scenarioRMF. The vertical line at 0.75 represents the boundary between the similar (piae-value≥ 0.75) and different (piae-value< 0.75) regions for the piae-values. The cutoffvalue of 0.75 indicates a 75% agreement between the actual and simulated DBHdistributions. The misclassification rate is 5.68%. See Table 3.9 for error rates for theother scenarios.

remaining scenario, scenario RMFJ, are nearly identical to the results for scenario

RMF. The two testing scenarios having jittered heights performed better than their

nonjittered counterparts, but scenario RMFJ slightly outperformed scenario RMFJP

which should have been the best case scenario. The difference is small, 0.45%, and is

most likely a result of the luck of the draw when generating simulated trees

The results for the comparison of the aggregate properties of the actual and simu-

lated thinned stands was performed using QMD and average tree height as the stand

level attributes. The comparison consisted of performing a simple linear regression

175

0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

100

200

300

400

500

600

700

800

900

Thinned stand: Height piae


Height piae

−value

Fre

quen

cy

Figure 3.18: Histogram of piae-values for tree height in thinned stands for scenarioRMF. The vertical line at 0.75 represents the boundary between the similar (piae-value≥ 0.75) and different (piae-value< 0.75) regions for the piae-values. The cutoffvalue of 0.75 indicates a 75% agreement between the actual and simulated heightdistributions. The misclassification rate is 15.02%. See Table 3.9 for error rates forthe other scenarios.

Table 3.9: Misclassification rates in DBH, height, and species for thinned stands.Values are the proportion of stands misclassified for each variable. The total columngives the proportion of stands which were misclassified in at least one of DBH, height,or species, and gives an upper bound on the actual misclassification rate.

Scenario DBH Height Species Total

RMF 0.0568 0.1502 0.0275 0.1800

RMFJ 0.0595 0.1502 0.0225 0.1730

RMFP 0.0588 0.1529 0.0230 0.1804

RMFJP 0.0590 0.1509 0.0223 0.1775

176

0.4 0.5 0.6 0.7 0.8 0.9 10

500

1000

1500

2000

2500

Thinned stand: species piae


Species piae

−value

Fre

quen

cy

Figure 3.19: Histogram of piae-values for species composition in thinned stands forscenario RMF. The vertical line at 0.75 represents the boundary between the similar(piae-value≥ 0.75) and different (piae-value< 0.75) regions for the piae-values. Themisclassification rate is 2.75%. See Table 3.9 for error rates for the other scenarios.

177

of the simulated QMD and mean height values against the actual QMD and mean

height values, respectively. For this analysis, an intercept of zero (0) and a slope of

one (1) indicate exact agreement between the simulated and actual QMD or mean

height values.

The QMD data used in the simple linear regression analysis for testing scenario

RMF are given in Figure 3.20. A cursory examination of the figure clearly indicates

a strong linear relationship between the actual and simulated values for this stand

attribute. The QMD intercept of 0.6959, the slope of 0.9635 and the r2 value of 0.9496

support the observation of strong linearity. Table 3.10 presents the linear regression

coefficients and r2 values for the other three testing scenarios, as well as the results

for scenario RMF for comparison. These results indicate that there is very good

agreement for QMD measurements between the actual and simulated thinned stands

for all four testing scenarios.

Table 3.10: Linear regression coefficients and r2 values for the model y = a + bxapplied to the simulated (y) and actual (x) QMD values (cm) for thinned stands. Aperfect fit would give values a = 0 and b = 1.


RMF 0.6959 0.9635 0.9496

RMFJ 0.8613 0.9556 0.9492

RMFP 0.6912 0.9652 0.9526

RMFJP 0.6032 0.9693 0.9492

The mean tree height data used in the simple linear regression analysis for testing

scenario RMF are given in Figure 3.21. Again, a cursory examination of the figure

clearly indicates a strong linear relationship between the actual and simulated values

for this stand attribute. The mean height intercept of 0.4151, the slope of 0.9791 and

the r2 value of 0.8806 support the observation of strong linearity. Table 3.11 presents

the linear regression coefficients and r2 values for the other three testing scenarios,

as well as the results for scenario RMF for comparison. These results indicate that

178

0 10 20 30 40 50 60 70 80 900

10

20

30

40

50

60

70

80

90Thinned stand: simulated vs. actual QMD for scenario RMF

Actual QMD (cm)

Sim

ulat

ed Q

MD

(cm

)

Figure 3.20: Plot of simulated vs. actual QMD values (cm) from thinned stands forscenario RMF. Linear regression coefficients for the model y = a+ bx applied to thesedata yields the model y = 0.6959+0.9635x with r2= 0.9496. See Table 3.10 for linearregression coefficients and r2 values for the other scenarios.

179

0 10 20 30 40 50 600

10

20

30

40

50

60Thinned stand: simulated vs. actual mean height for scenario RMF

Actual mean height (m)

Sim

ulat

ed m

ean

heig

ht (

m)

Figure 3.21: Plot of simulated vs. actual mean height values (m) from thinned standsfor scenario RMF. Linear regression coefficients for the model y = a + bx applied tothese data yields the model y = 0.4151+0.9791x with r2= 0.8806. See Table 3.11 forlinear regression coefficients and r2 values for the other scenarios.

there is very good agreement for mean height measurements between the actual and

simulated thinned stands for all four testing scenarios.

Table 3.11: Linear regression coefficients and r2 values for the model y = a+bx appliedto the simulated (y) and actual (x) mean height values (m) for thinned stands. Aperfect fit would give values a = 0 and b = 1.


RMF 0.4151 0.9791 0.8806

RMFJ 0.5230 0.9744 0.8837

RMFP 0.3184 0.9866 0.8840

RMFJP 0.3342 0.9859 0.8845

180

Finally, the distributions of the aggregate stand level attributes QMD and mean

tree height for thinned stands were compared. The distributions for these stand

attributes were compared by computing a piae-value for each pair of actual and sim-

ulated QMD and mean height distributions. Figure 3.22 and Figure 3.23 present the

nonparametric probability density estimates for the actual and simulated QMD and

mean height distributions for the testing scenario RMF. Visually, the actual and sim-

ulated distributions are quite similar for both QMD and mean height. The piae-values

for the QMD and mean height distributions for scenario RMF are 0.9811 and 0.9777,

respectively, indicating that the actual and simulated QMD and mean height distri-

butions are almost identical, having approximately 98% of their probability mass in

common. Table 3.12 presents the results for the other three testing scenarios as well

as the results for scenario RMF for comparison. These results reinforce the results

from the simple linear regressions and provide another indication of how well the tree

list generation database methodology performs for thinned stands.

Table 3.12: Overall QMD and mean height distribution piae-values for actual andsimulated thinned stands for all four scenarios. See Figure 3.22 and Figure 3.23.

Scenario QMD piae-value Mean height piae-value

RMF 0.9811 0.9777

RMFJ 0.9778 0.9755

RMFP 0.9794 0.9784

RMFJP 0.9812 0.9791

181

0 10 20 30 40 50 60 70 80 900

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04Thinned stand: QMD density estimates for scenario RMF

QMD (cm)

Den

sity

Actual Simulated

Figure 3.22: Nonparametric probability density function estimates for the actual andsimulated QMD distributions for all thinned stands and scenario RMF. The piae-valuefor the actual vs. simulated QMD distributions is 0.9811. See Table 3.12 for the overallQMD distribution piae-values for the other scenarios.

182

0 10 20 30 40 50 60 70 800

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

0.045Thinned stand: mean height density estimates for scenario RMF

Mean height (m)

Den

sity

Actual Simulated

Figure 3.23: Nonparametric probability density function estimates for the actualand simulated mean height distributions for all thinned stands and scenario RMF.The piae-value for the actual vs. simulated mean height distributions is 0.9777. SeeTable 3.12 for the overall mean height distribution piae-values for the other scenarios.

183

3.3.5 Software testing procedures

When testing software there are two criteria that must be satisfied. First is that the

software meets the requirements of its specification, i.e., does it simply work correctly.

This is a verification the programs work correctly in a mechanical sense. Second is

that the software actually solves the problem that it was intended to solve. This is a

verification of the intent of the software. Both of these tasks are generally nontrivial to

accomplish with any reasonably complex software system [23, 7, 19, 14]. When testing

the tree list generation database software, both of these issues were addressed, but the

latter was considered to be of much greater importance and therefore comprises the

bulk of the testing. For more information on software testing and the software testing

ideology adopted for the tree list generation database software, see the references

cited.

The testing of the tree list generation database software consisted of four major

phases. The first phase was concerned with the design and rapid prototyping of a

tree list generation database to detect any early design flaws. The second phase was

concerned with the correct operation of the subroutines and modules implemented

for the tree list generation database subroutine library, commonly called unit testing.

The third phase was concerned with integrating the modules in the tree list genera-

tion database subroutine library into the programs that access and manipulate a tree

list generation database, commonly integration testing. The fourth phase consisted

of validating that the tree list generation database programs worked, i.e., that the

programs performed as expected in terms of their input and output, generating rea-

sonable tree lists for the specified input stand descriptions. This last phase is the

most important, as it demonstrates the effectiveness of the overall tree list generation

database methodology, rather than a specific implementation, and encompasses the

bulk of the testing.

In the first phase of testing a written specification of the capabilities desired for

184

the tree list generation database was created, and a basic partitioning of the tree list

generation problem into more basic components and their implementation. These two

activities are naturally coupled, so a prototype of the tree list generation database

was implemented to determine design flaws. A more or less complete prototype of

the tree list generation databse methodology for untreated stands was implemented

using MATLAB [34] as a rapid prototyping environment. This rapid prototyping

allowed modifications to the tree list generation database design to be performed

easily, allowing a short turnaround time when problems with the initial tree list

generation database specification were discovered. This phase may be thought of as

generating a very detailed specification of the tree list generation problem with a

particular software design as a solution. After the MATLAB prototype for untreated

stands was completed and shown to be effective, the development of the tree list

generation database Fortran 90/95 subroutine library was begun, and the second

phase of testing was entered.

The second phase of testing the tree list generation database consisted of individ-

ual subroutine testing, to the extent possible or feasible, in the tree list generation

database subroutine library. During this phase, the nominal input, output, and data

processing activities for each subroutine were tested to verify correct behavior. In ad-

dition, groups of subroutines, or modules, that were required to operate together were

also tested for their correct behavior. Once the individual subroutines and modules

were demonstrated to be operating correctly, the third phase of testing began.

The third phase of testing the tree list generation database consisted of imple-

menting the executable programs and integrating the subroutines and modules to

perform the various tasks, the integration testing. Each program has had its nominal

execution path thoroughly tested. The executable programs were also used to gener-

ate all of the tree lists used in this documentation and in phase four, the validation

testing.

The final phase of the tree list generation database testing consisted of verifying

185

that the simulated tree lists generated by the system were reasonable, that is that

the tree list generation methodology as implemented actually worked. This is the

validation testing that comprises the bulk of this chapter.

3.4 Conclusions, caveats and other considerations

The tree list generation database methodology that has been described and imple-

mented meets the objectives outlined in Section 3.1. Maximum use of the measured

tree data is attained by storing the actual tree measurements. Trees are naturally

represented and generated as multidimensional objects having a DBH, a height, and

a species identifier. The nearest neighbor tree generation procedure provides the flex-

ibility to represent a wide variety of stand types and tree size distributions. The

trees generated for simulated stands are physically realizeable; they are generated

from actual tree measurements, so they must be realizeable. The tree list generation

database methodology represents different treatments in a consistent manner. Fi-

nally, the addition of new stand measurement data to a tree list generation database

is simple and does not affect the overall stand classification.

The tree list generation database methodology that has been described and imple-

mented appears to work quite well for both untreated and thinned stands. Simulated

stands or tree lists generated using the tree list generation database methodology

generally agreed with the actual stands they were generated to mimic at the indi-

vidual tree level, defined by the DBH distribution, the height distribution, and the

species composition. At least 80% of the simulated stands were found to be statis-

tically indistinguishable from their actual, measured counterparts for both untreated

and thinned stands using a critical piae-value of 0.75; the piae-value based misclassi-

fication or error rates were at most 20%. The regional, or overall, distributions of

the stand level attributes QMD and mean height of the simulated and actual stands

also agreed quite well, having approximately 98% agreement in their distributions for

186

both untreated and thinned stands, with the agreement in species composition also

well in excess of 90%.

The results as presented are not broken down by stand type. The data used

to populate the tree list generation database tgdb1r00 consisted predominantly of

pure Douglas-fir and pure western hemlock stands with much smaller percentages of

Douglas-fir and western hemlock mixtures, see Table 3.2. Given the high degree of

agreement between the actual and simulated stands, for both untreated stands and

thinned stands, it seems unlikely that partitioning the data by stand type would give

results that differ greatly from the aggregated results given here. A visual exami-

nation of the results broken down by stand type was performed to verify this claim.

The visual assessment confirmed the statement that no benefit would be gained by

partitioning the results by stand type. If the overall agreement were poorer than it

is, splitting the results by stand type could have been useful for identifying situations

where the tree list generation database methodology performed poorly.

In closing, there are two issues that remain to be considered. First, the total

misclassification or error rates are conservative, i.e., too large. Given this, how much

do they over estimate the actual total misclassification rates. A brief examination of

this issue is presented in Section 3.4.1. Second is the use of the tree list generation

database with forest growth and yield models and forest simulation systems. This

will be addressed briefly in Section 3.4.2.

3.4.1 How conservative are the total error rates?

The 20% total misclassification rates observed for simulated untreated and thinned

stands may be quite conservative, i.e., too large, for three reasons. First, the total

misclassification rates were computed by performing a logical or ing of the three in-

dividual attribute misclassification events. This method of computing the total error

rates must inflate their values when compared to computing an error simultaneously

rate in the three dimensions DBH, height, and species simultaneously; the probability

187

mass in three dimensions is spread much more thinly than in one dimensional dis-

tributions, which decreases the probability that the actual and simulated stands will

be statistically different. There is simply much more room to move around in two or

more dimensions.

Second, in all of the piae-value based comparisons, the misclassification rates for

tree height distributions were much greater than the misclassification rates for DBH

distributions or species composition. An explanation of this phenomenon for both

untreated stands and thinned stands is readily available: most of the tree heights

are estimated from height-diameter models. The estimation of tree heights from a

height-diameter model will dramatically reduce the variation of tree heights for a par-

ticular set of stand measurements relative the actual height variation for the stand

at the time of its measurement. This supposition is partially supported by the fact

that the height distribution misclassification rate is approximately 0.75% lower for

the testing scenarios which jitter heights than for their non-jittered equivalents. For

thinned stands there is an additional component that influences the height distribu-

tion misclassification rate: thinnings remove the undesirable, generally smaller, trees

from a stand, which also reduces the variation in height, hence increasing the height

distribution misclassification rate. This effect may be seen in the slightly higher height

error rate for the thinned stands. If the height distribution misclassification rates were

comparable to the DBH distribution error rates, the total error rates could be reduced

by a factor of up to two, becoming approximately 10% to 15% misclassification.

Third, the critical piae-value may be too large, causing stands which are actually

similar to be identified as different. The critical piae-value was determined through

a sampling simulation based on the standard normal distribution which is strongly

unimodal. In many forestry situations, both the DBH and height distributions may

be multimodal or even somewhat uniform in shape. In both cases the critical piae-

value for distinguishing between similar and dissimilar stands would be smaller than

the value of 0.75 that was used.

188

The magnitude of the conservative effect on the misclassification rate caused by

too large a critical piae-value may be examined using actual stand measurement data.

Douglas-fir diameter data from the Stand Management Cooperative (SMC) were used

to determine how small an appropriate critical piae-value may be for pure Douglas-fir

stands. The diameter data consisted of individual tree DBH measurements from two

to four permanent study plots on each of 30 SMC Type I pure Douglas-fir installations.

The plots chosen from each installation had one to three commensurate measurement

dates. The selected data provided 352 within stand comparison which were used to

compute piae-values. The SMC Type I installations are all on commercial forest land,

are naturally regenerated or planted juvenile stands, and have at least 90% Douglas-

fir measured in stems per unit area. The site index at 50 years for the installations

ranged from 25.9 m (85 ft) to 44.2 m (145 ft), with stand densities of approximately

250 tph (100 tpa) to 1850 tph (750 tpa), and breast height ages ranging from three

years to 21 years.

The individual plot DBH distributions were compared for plots within the same

installation and having the same measurement years to obtain piae-values. The piae-

values were computed using the same procedures outlined in Section 3.3.1. A his-

togram of the computed piae-values is given in Figure 3.24. The histogram clearly

indicates that the piae-value of 0.75 is conservative, and that a reasonable critical

piae-value may be as small as 0.55, though there is little data for this small a piae-

value. Thus, a critical piae-value of 0.70 or 0.65 may be more appropriate, and still

somewhat conservative, choice for determining the similarity or difference of forest

stands than the value of 0.75. Figure 3.25 plots the computed piae-values against

the stand breast height ages. Notice the relatively uniform range of the piae-values

between 0.70 and 1.00 over time. This would seem to indicate that the diameter

distributions of the sample plots are at least maintaining their relative differences, as

measured by the piae-value, over time. The Douglas-fir stands used for this compar-

ison are quite young with relatively high site index values, and have not entered or

189

0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 10

10

20

30

40

50

60

piae

−value histogram comparing plots within homogeneous stands

piae

−value

Cou

nt

Figure 3.24: Histogram of DBH piae-values for actual stands. This plot indicates thatthe piae cutoff for comparing stands may be as low as 0.55, and that a value of 0.70 oreven 0.65 may be more appropriate than the critical piae-value of 0.75. See Table 3.13for the effect of lower piae cutoff values on the misclassification rates.

have just begun to enter a size differentiation stage of growth. As the size differention

progresses within these stands, the critical piae-value for determining their similarity

or difference would continue to decrease, possibly to a piae-value near 0.50.

The use of a lower critical piae-value would greatly reduce the observed total

misclassification rates. See Table 3.13 for the misclassification rates that would be

obtained for critical piae-values of 1.00 through 0.00 in steps of 0.05. Note in par-

ticular the very rapid decline of the misclassification rates to zero for critical piae-

valuesbetween 0.75 and 0.50. A reduction of the critical piae-value from 0.75 to 0.70

cuts the total stand misclassification rate in half, reducing it to approximately 10%.

190

2 4 6 8 10 12 14 16 18 20 220.55

0.6

0.65

0.7

0.75

0.8

0.85

0.9

0.95

1

piae

−value vs. breast height age for the comparison plots

Breast height age (years)

p iae−

valu

e

Figure 3.25: Plot of DBH piae-values vs. breast height age for actual stands. This plotindicates that the DBH piae-values may consistently maintain a fairly wide distributionthrough time.

191

A further reduction of the critical piae-value to 0.65, which seems justifiable given the

DBH distribution comparison just performed, would again halve the total misclassi-

fication rate, reducing it to approximately 6%. If the critical piae-valueis as small is

0.50, which is a strong possibility, the total misclassification rate could be as small

as 1%. Given this investigation, a critical piae-value of 0.65 seems reasonable, and

obtians a total percentage of correctly classified simulated stands well in excess of

90%.

Table 3.13: Total misclassification rates for different critical piae-values and tree listgeneration scenarios. Multiplying the misclassification rate by 100 gives the percent-age of simulated stands that would be considered different from the actual standsthey were generated to mimic. Note the rapid decline in the misclassification rate asthe critical piae-value is reduced.

Untreated stands Thinned stands

Cutoff RMF RMFJ RMFP RMFJP RMF RMFJ RMFP RMFJP

1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

0.9500 0.9791 0.9823 0.9781 0.9806 0.9743 0.9694 0.9705 0.9712

0.9000 0.7946 0.7955 0.7802 0.7846 0.7628 0.7601 0.7568 0.7601

0.8500 0.5285 0.5256 0.5145 0.5139 0.5032 0.4863 0.4917 0.4858

0.8000 0.3219 0.3258 0.3148 0.3166 0.3054 0.2982 0.2957 0.3000

0.7500 0.1972 0.1851 0.1929 0.1801 0.1800 0.1730 0.1804 0.1775

0.7000 0.1094 0.1071 0.1064 0.1052 0.0993 0.0977 0.1011 0.0980

0.6500 0.0603 0.0618 0.0584 0.0609 0.0523 0.0495 0.0550 0.0500

0.6000 0.0317 0.0313 0.0328 0.0305 0.0284 0.0243 0.0275 0.0259

0.5500 0.0157 0.0144 0.0157 0.0150 0.0140 0.0122 0.0144 0.0124

0.5000 0.0073 0.0056 0.0071 0.0056 0.0061 0.0045 0.0063 0.0063

0.4500 0.0029 0.0027 0.0035 0.0027 0.0029 0.0027 0.0036 0.0029

0.4000 0.0010 0.0010 0.0010 0.0010 0.0014 0.0018 0.0020 0.0016

0.3500 0.0006 0.0006 0.0008 0.0008 0.0011 0.0005 0.0007 0.0005

0.3000 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002

0.2500 0.0000 0.0000 0.0000 0.0000 0.0002 0.0000 0.0000 0.0000

0.2000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

0.1500 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

0.1000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

0.0500 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

192

3.4.2 Simulating forest stands using a tree list generation database

A typical forest growth and yield simulation would project a stand, generally repre-

sented by the trees on a single hectare (acre), some specified number of years into the

future to obtain, for example, an estimate of the standing wood volume for harvest

scheduling or other economic analyses. The results of this analysis would then be

extended to the entire stand in order to make the appropriate business decisions. The

tree list generation database has been designed to provide an easy way to generate

forest stands as input to individual tree based forest growth and yield simulators,

such as ORGANON [11], FVS [8, 35, 36] or TASS [20], or for use with other tree

based forest stand or forest economics simulation systems.

In a classic forest growth and yield simulation setting, the simulation or growth

projection of a single stand is used to represent all similar stands by scaling the

simulation results to agree with the appropriate land area [5]. Unfortunately, this

single simulation run does not provide any indication of the variation within stands

that may be due to genetics, climate, differences in site quality, spatial distribution

of the trees, or micro site variation. The tree list generation database may be used in

conjunction with a forest growth and yield model or forest stand simulation system

to obtain an indication of the possible variation in an actual stand through repeated

simulations. Each simulation would use a different simulated stand, generated to

match a particular set of stand attributes, as its input. The variation in the output

from the repeated simulations should then approximate the variation for an actual

stand after a similar time duration.

Figure 3.26 presents nonparametric probability density function estimates for the

DBH and height distributions of an actual stand and ten simulated stands that were

generated to mimic it using the tree list generation database tgdb1r00. The ten simu-

lated stands for this comparison were generated with the default options of TGRAND, so

the 100% species composition and jitter height options were both not used. Notice in

193

−40 −20 0 20 40 60 80 100 1200

0.005

0.01

0.015

0.02

0.025

0.03

0.035Actual and simulated DBH distributions

DBH (cm)

Den

sity

Actual Simulated

−30 −20 −10 0 10 20 30 40 50 600

0.02

0.04

0.06

0.08

0.1Actual and simulated height distributions

Height (m)

Den

sity

Actual Simulated

Figure 3.26: Plot of DBH and height density function estimates for 10 simulatedstands for a particular stand and the DBH and height density function estimates forthe actual stand. The piae-values for DBH, height, and species composition for eachsimulation are given in Table 3.14.

particular that the simulated stands seem to be generally reasonable approximations

of the actual stand. This is supported by the DBH distribution, height distribution,

and species composition piae-values obtained by comparing the actual stand with each

of the simulated stands presented in Table 3.14. Nonzero values of the nonparametric

density estimates in the figure for negative DBH or height values is an artifact of the

probability density estimation technique; there were no negative tree diameters or

heights generated.

There are several different interpretations that may be placed on the variation

obtained by using different simulated initial stands to generate growth and yield

predictions. The simulated initial stands may be thought of as accounting for local

194

Table 3.14: piae-values for DBH height, and species for 10 simulations of a particularstand. The DBH and height density function estimates for each simulation are givenin Figure 3.26.

Simulation DBH piae-value Height piae-value Species piae-value

1 0.9302 0.8745 0.9592

2 0.8985 0.9038 0.8980

3 0.9539 0.9293 0.9592

4 0.9531 0.8704 0.9592

5 0.8733 0.8619 0.9592

6 0.9406 0.9233 0.9592

7 0.9705 0.9441 0.8571

8 0.9142 0.8972 0.9184

9 0.8762 0.9100 0.9592

10 0.9236 0.9132 0.9592

climate variation, accounting to some degree for the influences of climate on tree and

stand growth, since most growth and yield models do not include climate. The tree

list generation database is particularly apropos for this interpretation since it makes

use of actual tree measurements, which contain an integrated record of local climate,

to generate simulated stands. Not every hectare (acre) in a stand is the same, due

to genetics, tree spcing, and micro site variation. Simulated stands from the tree list

generation database may be interpreted as representing the actual variability within

stands, and hence giving, possibly, more reliable estimates of standing wood volume,

or some other stand characteristic.

Given the simultaneous similarity and variation in the ten simulated stands, it

seems likely that a particular, but small, set of stand attributes, e.g., QMD, average

height, and species composition, could simultaneously be met at some point with

repeated draws. Thus, if it was important to nearly exactly match a few particular

numerical attributes of a stand, say to within ±10%, this could be done by using the

following algorithm.

1. Generate a simulated stand.

195

2. Compare the numerical values of the simulated attributes with those desired.

3. If the attributes agree to within the appropriate tolerances, stop; otherwise

repeat the process.

If arbitrarily small tolerances are used it may not be possible to generate a stand with

similar attributes, so some care must be taken when choosing a reasonable tolerance.

Finally, if a large gap exists in the data contained in a tree list generation database,

a trusted forest growth and yield model or forest stand simulation system may be used

to simulate the missing stand conditions and augment a tree list generation database.

This would be acceptable if data were very expensive to collect, or if simulated stands

for a very severe, and hence impractical to perform, set of silvicultural practices were

desired. In either case, access to the resulting tree list generation database must be

carefully controlled since it will contain a mixture of actual and simulated data, and

results derived from such a database may not be reliable.

3.4.3 Limitations of the tree list generation database

As is true with any large or sufficiently complex data processing or simulation system,

there are a variety of limitations affecting its use, and the tree list generation database

is no different in this regard. The limitations of the tree list generation database may

be broadly grouped into two categories: software limitations and data limitations,

though there is some unavoidable overlap in these categories.

The primary software limitation of the tree list generation database is that it

currently only supports pure Douglas-fir stands, pure western hemlock stands, and

stands containing one of these two species as a dominant component in a mixed species

stand. This limitation may be overcome by extending the tree list generation database

software to include stands with arbitrary species composition. A secondary software

limitation is that all of the tree data are stored, so a tree list generation database could

become quite large. Given the availability low cost, high volume storage devices, this

196

limitation is of little practical concern. If the database size becomes large enough to

degrade performance, more efficient sorting and searching algorithms may, however,

need to be put into place.

The data limitations for the tree list generation database methodology should be

obvious: if there are no data near a desired stand description, no simulated stand

may be generated, or a generated simulated stand may not adequately match the

attributes of the desired stand. For the near future this may be of some concern, but

over time this limitation should effectively disappear as new stand measurement data

are added to a tree list generation database to fill in gaps in its data coverage. This is

particularly true for commercial forestry, where the range of possible stand conditions

is rather tightly constrained. This limitation would be of much greater importance for

a tree list generation database used to simulate natural forests, though this limitation

may be overcome here as well by the collection of new data.

The tree list generation database methodology has been shown to be quite ef-

fective in simulating forest stands. Neither the software limitations of the current

implementation nor the data limitations are insurmountable; they may both be over-

come through the application of time and effort, and, should not impede the use of

the tree list generation database methodology in practice.

197

Chapter 4

TREE LIST GENERATION DATABASE

PROGRAMMER’S REFERENCE

A Fortran 95 library implementing the basic tree list generation database capabil-

ities has been developed and tested. The library consists of 132 function points and

approximately 39,000 lines of well documented Fortran 95. The library is approxi-

mately 99% portable to other Fortran 95 compilers, with all nonportable code isolated

in subroutines or modules. The tree list generation database subroutine library was

used to implement the eight tree list generation database programs.

This chapter presents a brief, one or two line, description of each subroutine,

function, or entry point, broken down by library module. A brief description of each

major subroutine group is also provided. These descriptions provide a basic refer-

ence manual for the tree list generation databse subroutine library. The executable

programs are not documented here; the descriptions provided in Chapter 2 should

provide adequate introduction.

4.1 Tree list generation database subroutine library

The tree list generation database subroutine library subroutines, functions, and entry

points are briefly described in this section. The individual subroutine, function, and

entry point descriptions are broken down by their function, or use, in the tree list

generation database subroutine library. For low level details on the file formats or im-

plementation, see the appropriate Fortran 95 source files; they are the sole repository

of the specific details.

198

4.1.1 Subroutines for reading and writing measurement files

The subroutines listed define the interface for reading and writing tree list generation

database stand measurement files. Stand measurement files are the primary input

and output file format for actual and simulated stand measurement data.

TGMFKMGR SUBROUTINE This is the umbrella procedure to allow the sharing of data

among the tree list generator measurement file keyword type conversion proce-

dures. This procedure should not be called directly.

TGMFKI2N ENTRY in TGMFKMGR Tree list generator convert measurement file keyword

integer ID code to its equivalent measurement file keyword name for a specified

treatment.

TGMFKN2I ENTRY in TGMFKMGR Tree list generator convert measurement file keyword

name to its equivalent integer ID code for a specified treatment.

TGMFKNUM ENTRY in TGMFKMGR Tree list generator get the number of measurement file

keywords for a specified treatment.

TGMFRD SUBROUTINE Tree list generator, read a stand measurement file. Output units

for numeric values are metric.

TGMFWR SUBROUTINE Write a tree list generator measurement file. Input units are

assumed to be metric.

4.1.2 Subroutines for reading and writing description files

The subroutines listed define the interface for reading and writing tree list generation

database stand description files. Stand description files are used to specify the stand

attributes for generating a simulated stand or tree list.

199

TGDFRD SUBROUTINE Tree list generator, read a stand description file. Output units

for numeric values are metric.

TGDFWR SUBROUTINE Write a tree list generator description file. Input units are as-

sumed to be metric.

TGDFWRT SUBROUTINE Write a tree list generator description file template.

4.1.3 Subroutines for reading and writing schema files

The subroutines listed define the interface for reading and writing tree list genera-

tion database schema files. Schema files are used to specify the treatments, index

parameters, and multidimensional histogram bin widths used in a tree list generation

database.

TGSCHRD SUBROUTINE Tree list generator, read a schema file.

TGSCHWR SUBROUTINE Tree list generator, write a new schema file.

TGSCHWRT SUBROUTINE Tree list generator, write a new schema template file.

4.1.4 Binary search tree subroutines

The subroutines listed define the binary search tree interface used by the tree list

generation database for classifying stands by their attributes. These subroutines

maintain the index of unique multidimensional histogram bins defining the stand

classification. This index is also used to determine the nearest neighbors for pooling

tree data to create a canonical stand.

TGBSTADD SUBROUTINE Add the key and data portions of a double precision record to

a binary search tree implemented on the DAS file architecture.

200

TGBSTARD SUBROUTINE Add a double precision record to a binary search tree file.

TGBSTCLS SUBROUTINE Close an open binary search tree file.

TGBSTFRD INTEGER FUNCTION Find a double precision record in a binary search tree

file.

TGBSTOPN SUBROUTINE Open and initialize a new binary search tree file.

TGBSTOPR SUBROUTINE Open an existing binary search tree file for reading.

TGBSTOPW SUBROUTINE Open an existing binary search tree file for writing.

TGBSTRDD SUBROUTINE Read the data portion of a double precision record from a

binary search tree implemented on the DAS file architecture.

TGBSTRDK SUBROUTINE Read the key portion of a double precision record from a

binary search tree implemented on the DAS file architecture.

TGBSTRRD SUBROUTINE Read a double precision record from a binary search tree im-

plemented on the DAS file architecture.

TGBSTRSZ SUBROUTINE Read the number of records and record sizes from a binary

search tree implemented on the DAS file architecture.

TGBSTWRD SUBROUTINE Write a double precision record to a binary search tree imple-

mented on the DAS file architecture.

TGBSTWSZ SUBROUTINE Write the number of records and record sizes to a binary search

tree implemented on the DAS file architecture.

201

4.1.5 Bucket data mapping subroutines

The subroutines listed define the interface to the tree data bucket mapping used by

the tree list generation database for mapping multidimensional histogram bins, or

data buckets, to the individual tree data for that histogram bin. These subroutines

provide an index from the multidimensional histogram bins to the individual tree data

for each stand measurement file in the bins.

TGBDMABD SUBROUTINE Add bucket mapping data to the bucket mapping file.

TGBDMCLS SUBROUTINE Close an open bucket data mapping file.

TGBDMFND SUBROUTINE Search for a bucket index, returning the beginning and ending

bucket data mapping records if the id is found in a bucket data mapping file.

TGBDMIDX SUBROUTINE Index the bucket data mapping records in a bucket data map-

ping file.

TGBDMOPN SUBROUTINE Open and initialize a new bucket data mapping file.

TGBDMOPR SUBROUTINE Open an existing bucket data mapping file for reading.

TGBDMOPW SUBROUTINE Open an existing bucket data mapping file for writing.

TGBDMRDR SUBROUTINE Tree list generator bucket data mapping read data record.

TGBDMRMR SUBROUTINE Read a bucket data mapping record from a bucket data map-

ping file.

TGBDMSRT SUBROUTINE Sort the bucket data mapping in a bucket data mapping file.

TGBDMWDR SUBROUTINE Tree list generator bucket data mapping write data record.

202

TGBKTNN SUBROUTINE Find the nearest buckets to the requested index parameter

vector in a specified parameter index data set.

TGBKTSCR DOUBLE PRECISION FUNCTION Compute a bucket separation score for two

sets of index parameter bucket centers and a a treatment description.

4.1.6 Tree measurement data subroutines

The subroutines listed define the interface to the tree measurement data file used by

the tree list generation database for storing and retrieving the individual tree data for

each stand. The random or simulated tree lists are also created using the TGTMDRND

subroutine from this section.

TGTMDAMD SUBROUTINE Tree list generator tree measurement data add tree measure-

ment data to a tree measurement data file.

TGTMDCLS SUBROUTINE Close an open tree measurement data file.

TGTMDEXT SUBROUTINE Extract tree measurement data for the specified bucket IDs

from a tree measurement data file to a another tree measurement data file.

TGTMDOPN SUBROUTINE Open and initialize a new tree measurement data file.

TGTMDOPR SUBROUTINE Open an existing tree measurement data file for reading.

TGTMDOPS SUBROUTINE Open and initialize a new tree measurement data file as a

scratch file.

TGTMDOPW SUBROUTINE Open an existing tree measurement data file for writing.

TGTMDRMD SUBROUTINE Tree list generator tree measurement data read tree measure-

ment data from a tree measurement data file.

203

TGTMDRMR SUBROUTINE Tree list generator tree measurement data read a data record.

TGTMDRND SUBROUTINE Compute a set of random trees.

TGTMDWMR SUBROUTINE Tree list generator tree measurement data write a data record.

4.1.7 Database information subroutines

The subroutines listed define the interface to the tree list generation database in-

formation file. The information file contains the basic information about a tree list

generation database, such as the treatments, index parameters used for each treat-

ment, creation date, and other general information.

TGINFCLS SUBROUTINE Close an open tree list generator database information file.

TGINFMGR SUBROUTINE This is the umbrella procedure to allow the sharing of data

among the tree list generator information file data structures and access proce-

dures. This procedure should not be called directly.

TGINFLDF ENTRY in TGINFMGR Load a tree list generator information file into memory.

TGINFUDF ENTRY in TGINFMGR Update a tree list generator information file from mem-

ory.

TGINFTRT ENTRY in TGINFMGR Get or set the tree list generator information for a

treatment.

TGINFBKT ENTRY in TGINFMGR Get or set the tree list generator primary or secondary

bucket count for a treatment.

TGINFSRC ENTRY in TGINFMGR Get or set the tree list generator source file count for

a treatment.

204

TGINFTRS ENTRY in TGINFMGR Get or set the tree list generator tree count for a treat-

ment.

TGINFLCK ENTRY in TGINFMGR Get or set the tree list generation database lock status.

TGINFTIM ENTRY in TGINFMGR Get one of the tree list generator times: create time,

modified time, or locked/unlocked time.

TGINFHVT ENTRY in TGINFMGR Return a logical flag indicating whether a treatment is

available.

TGINFOPN SUBROUTINE Open and initialize a new tree list generation database infor-

mation file.

TGINFOPR SUBROUTINE Open an existing tree list generation database information file

for reading.

TGINFOPW SUBROUTINE Open an existing tree list generation database information file

for writing.

4.1.8 Unit conversion subroutines

The subroutines listed define the interface to the tree list generation database units

conversion procedures. The units conversion procedures provide a relatively painless

interface for performing units conversions between the Imperial and metric systems

of measurement.

TGCHUNIT SUBROUTINE Convert measurement units to another scale, either within a

particular system of measurements, e.g., metric, or between systems of mea-

surement, e.g., imperial and metric.

205

TGMUNITS SUBROUTINE This is the umbrella procedure to allow the sharing of data

between the tree list generator measurement unit conversion procedures. This

procedure should not be called directly.

TGUNTI2N ENTRY in TGMUNITS Tree list generator convert measurement units integer

ID code to its equivalent measurement units name.

TGUNTN2I ENTRY in TGMUNITS Tree list generator convert measurement units name to

its equivalent integer ID code.

TGPRMCF SUBROUTINE This is the umbrella procedure to allow the sharing of data

among the tree list generator index parameter unit conversion procedures. This


TGPCFI2M ENTRY in TGPRMCF Tree list generator index parameter conversion factor for

Imperial to metric conversion

TGPCFM2I ENTRY in TGPRMCF Tree list generator index parameter conversion factor for

metric to conversion.

4.1.9 Index parameter keyword subroutines

The subroutines listed define the interface to the tree list generation database index

parameter keywords and their associated integer ID codes. The index parameter

names are used externally in files and reports, but integer ID codes are used internally

for speed.

TGPRMCTR SUBROUTINE Compute the tree list generator bin centers for a parameter

vector.

TGPRMPK SUBROUTINE Pack a full tree list generator vector of stand parameter values

into the vector of parameter values that are in use.

206

TGPRMTYP SUBROUTINE This is the umbrella procedure to allow the sharing of data

between the tree list generator parameter type conversion procedures. This


TGPTI2N ENTRY in TGPRMTYP Tree list generator convert parameter type integer ID

code to its equivalent parameter type name.

TGPTN2I ENTRY in TGPRMTYP Tree list generator convert parameter type name to its

equivalent parameter type integer ID code.

TGPRMUPK SUBROUTINE Unpack the stand parameter values that are in use into the a

full tree list generator vector of vector of parameter values.

TGPRMVEC SUBROUTINE Compute an index parameter vector for a data set for a stand

measurement.

4.1.10 Species mapping subroutines

The subroutines listed define the interface for using the tree list generation database

species mapping files. Species mapping files are the means by which specific numeric

ID codes may be associated with individual tree species.

TGSPMMGR SUBROUTINE This is an umbrella procedure that contains all of the species

map manipulation procedures.

TGSPMLD ENTRY in TGSPMMGR Load a species map file into memory.

TGSPCC2I ENTRY in TGSPMMGR Convert a species two character code into its corre-

sponding integer ID code.

TGSPI2CC ENTRY in TGSPMMGR Convert a species integer ID code into its corresponding

two character code.

207

TGSPI2CN ENTRY in TGSPMMGR Convert a species integer ID code into its corresponding

Common name.

TGSPI2LC ENTRY in TGSPMMGR Convert a species integer ID code into its corresponding

4-6 character code.

TGSPI2LN ENTRY in TGSPMMGR Convert a species integer ID code into its corresponding

Latin name.

TGSPLC2I ENTRY in TGSPMMGR Convert a species 4-6 character code into its corre-

sponding integer ID code.

TGSPMNUM ENTRY in TGSPMMGR Return the number of entries in the species map table.

TGSPMRD SUBROUTINE Read a tree list generator species map file.

TGSPMWR SUBROUTINE Write a tree list generator species map file.

4.1.11 Stand type subroutines

The subroutines listed define the interface to the tree list generation database stand

types and their associated ID codes.

TGSTTYP SUBROUTINE This is the umbrella procedure to allow the sharing of data be-

tween the tree list generator stand type conversion procedures. This procedure

should not be called directly.

TGSTTI2N ENTRY in TGSTTYP Tree list generator convert stand type integer ID code

to its equivalent stand origin name.

TGSTTN2I ENTRY in TGSTTYP Tree list generator convert stand type name to its equiv-

alent integer ID code.

208

4.1.12 Index parameter subroutines

The subroutines listed define the interface to the tree list generation database index

parameters and their associated ID codes.

TGIDXPRM SUBROUTINE This is the umbrella procedure to allow the sharing of data

AMONG the tree list generator index parameter type conversion procedures.

This procedure should not be called directly.

TGPRMI2N ENTRY in TGIDXPRM Tree list generator convert index parameter integer ID

code to its equivalent index parameter name for a specified treatment.

TGPRMN2I ENTRY in TGIDXPRM Tree list generator convert index parameter name to

its equivalent index parameter integer ID code for a specified treatment.

TGPRMNUM ENTRY in TGIDXPRM Tree list generator get the number of index parameters

for a specified treatment.

4.1.13 Stand origin subroutines

The subroutines listed define the interface to the tree list generation database stand

origins and their associated ID codes.

TGSTORIG SUBROUTINE This is the umbrella procedure to allow the sharing of data be-

tween the tree list generator stand origin conversion procedures. This procedure

should not be called directly.

TGSTOI2N ENTRY in TGSTORIG Tree list generator convert stand origin integer ID code

to its equivalent stand origin name.

TGSTON2I ENTRY in TGSTORIG Tree list generator convert stand origin name to its

equivalent integer ID code.

209

4.1.14 Treatment type subroutines

The subroutines listed define the interface to the tree list generation database treat-

ment types and their associated ID codes.

TGTRTTYP SUBROUTINE This is the umbrella procedure to allow the sharing of data

between the tree list generator treatment type conversion procedures. This


TGTRTI2D ENTRY in TGTRTTYP Tree list generator convert treatment type integer ID

code to its treatment directory name.

TGTRTI2N ENTRY in TGTRTTYP Tree list generator convert treatment type integer ID

code to its equivalent treatment type name.

TGTRTN2I ENTRY in TGTRTTYP Tree list generator convert parameter type name to its

equivalent parameter type integer ID code.

TGTRTHVN ENTRY in TGTRTTYP Tree list generator return logical flag indicating whether

a treatment is valid.

TGTRTHVI ENTRY in TGTRTTYP Tree list generator return logical flag indicating whether

a treatment ID is valid.

4.1.15 Miscellaneous supporting subroutines

The subroutines listed are the miscellaneous supporting subroutines and functions

used by the tree list generation database software. They include subroutines to

compute standard forest measurements, e.g., QMD or basal area, and routines for

performing various sorting and searching tasks. Some of the routines listed here also

define some of the nonportable parts of the tree list generation database subroutine

library.

210

BASARE SUBROUTINE Compute the individual tree basal area for a set of diameter at

breast height measurements.

CHKFNM 1 SUBROUTINE Check a filename or path to be sure that it is valid.

DOMHT DOUBLE PRECISION FUNCTION Compute the dominant height for a list of trees.

Dominant height is the average height of the 40 tallest trees per acre or the

average height of the 100 tallest trees per hectare.

DSORTD SUBROUTINE Sort in place double precision records packed into a single dimen-

sioned double precision array according to a dictionary ordering on the record

components.

DSORTI SUBROUTINE Sort in place integer records packed into a single dimensioned

integer array according to a dictionary ordering on the record components.

DSRTCD INTEGER FUNCTION Compare two arrays of double precision numbers using

dictionary order on the array elements.

DSRTCI INTEGER FUNCTION Compare two arrays of integers using dictionary order on

the array elements.

DUNI DOUBLE PRECISION FUNCTION This routine generates double precision uniform

random numbers on [0,1).

DUSTAR ENTRY in DUNI This routine sets the seed for DUNI.

DUNIB ENTRY in DUNI This routine sets the mantissa precision for DUNI.

FUNIQI SUBROUTINE Find the unique integer values in an array of integers.

211

PRSCML 1 SUBROUTINE Parse a program command line into options, option values, if

present, and command line arguments.

QMD DOUBLE PRECISION FUNCTION Compute the quadratic mean diameter for an ar-

ray of diameter at breast height measurements.

TGBSLASH CHARACTER*(1) FUNCTION Return the backslash character, ’

’.

TGCOMCHR CHARACTER*(1) FUNCTION Return the tree list generator comment charac-

ter. The comment character appears as the first nonblank character on a line

and indicates that the line is to be ignored.

TGGETMU SUBROUTINE Scan a Tree list generator input file looking for the keyword

’MEASUREMENT UNITS’ and returning its value.

TGGETTRT SUBROUTINE Scan a Tree list generator input file looking for the keyword

’TREATMENT TYPE’ and returning its value.

TGPTHCHR CHARACTER*(1) FUNCTION Return the tree list generator path element sep-

arator character. The path element separator is usually a backslash character, ’

’, or a forward slash character, ’/’, depending on the computing environment.

TGPTHFIX SUBROUTINE Fix the path element separation character in a path or file-

name. This subroutine allows either a forward slash, ’/’, or a backslash, ’

’, to be used as the path separation character.

TGTABCHR CHARACTER*(1) FUNCTION Return the tab character for use in the tree list

generator. The tab character is not part of the FOrtran character set, so this

function is provided to easily permit its use in programs.

212

TOPHT DOUBLE PRECISION FUNCTION Compute the top height for a list of trees. Top

height is the average height of the 40 largest diameter trees per acre or the

average height of the 100 largest diameter trees per hectare.

TXTNLN INTEGER FUNCTION This routine will scan the text file specified by FNAME,

returning the number of lines in the file.

TXTNLU INTEGER FUNCTION This routine will scan the text file attached to the Fortran

logical unit UNIT, returning the number of lines in the file.

TXTOPS SUBROUTINE Open a scratch text file for subsequent write access.

VMEANG DOUBLE PRECISION FUNCTION Compute the mean of a vector of double pre-

cision numbers.

VSTDG DOUBLE PRECISION FUNCTION Compute the standard deviation of a vector of

double precision numbers.

VSUMG DOUBLE PRECISION FUNCTION Compute the sum of a vector of double precision

numbers.

4.2 Future extensions and enhancements

This section contains a list of enhancements that could be made to the tree list

generation database subroutine library. The list of possible enhancements is provided

primarily as a guide to be used, or ignored, by the next programmer who modifies or

enhances the tree list generation database subroutine library. The enhancements or

modifications are not listed in any particular order.

• Add more information to the random stand measurement files when they are

generated, such as a statistical summary of the trees used in the random tree

213

generation procedure and statistics for the bin characteristics of the selected

data. Tree list generation databse version information should also be added to

the files.

• Create a subroutine TGTMDADD for adding tree measurement data to a tree list

generation database. This would provide a symmetry of interface for generating

simulated stands and adding data. The program TGADD is currently the only

way to add stand measurement data to a tree list generation database; there

is no direct subroutine interface. For the near term, this should only prove a

minor inconvenience – generating simulated stand via a single subroutine call

was considered to be much more important.

• Hide the details of opening and closing the individual files used to represent a

tree list generation database by creating a file manager. When working with a

tree list generation database, it is the database that is either open or not open,

not the individual files that comprise the database. The file manager would

greatly simplify the high level interface.

• Add a delayed database commit for new data by staging the data before adding

it to an existing tree list generation database. This entails creating a temporary

tree list generation database that is always searched before the existing database

is searched when adding data. In this case the existing tree list generation

database would only be updated if all of the stand measurement files had been

successfully added to the temporary database. The two database could then be

merged to produce an updated database.

• For thinned stand measurement or description files, verify that the pre-thin

stand density and basal area values are less than their post-thin values. This is

not currently done.

214

• For each stand description keyword or index parameter keyword, define a set of

valid input values. For categorical parameters with a small number of possible

values, this could be an explicit list. For numeric values, this could be a valid

range for the inputs. This information could be stored in a tree list generation

database and used for input or output validation. This could possibly be done

by creating a valid values entry point in the index parameter definition module.

• TGDFRD, TGDFWR, TGMFRD, TGMFWR, TGSCHRD, TGSCHWR all need do more stringent

input and output value checking.

• Improve the error handling.

– Not all errors are fatal errors.

– Add a logging capability to the software so that errors during batch pro-

cessing have a place to go, allowing processing to continue.

• Version information and modification information need to be added to the tree

list generation database binary files. There is currently no version checking

(there is only a single version). This would obviate some of the naming con-

vention issues by potentially allowing a software detection of version or revision

changes and changes in file contents.

• Add a way of annotating a tree list generation database. This capability exists

in the NAIF SPICE DAS file architecture, but an interface to using it with a

tree list generation database has not been implemented. This would allow a

tree list generation database to carry a pedigree along with its data.

• Allow TRUE and FALSE as substitutes for YES and NO, respectively, for the ap-

propriate index parameter keywords.

215

• Add default or required types for index parameters to prevent improper param-

eter type definitions. This would allow the detection of possible parameter type

related problems, e.g., using a type of INTERVAL for a categorical variable.

• Allow the generation of dead but standing trees as well as live trees for a stand.

Currently only live trees may be generated. This would involve a backwardly

incompatible change by storing an additional data column with each tree indi-

cating the status of the tree. It is well within the real of possibility.

• Along similar lines to the previous item, the definition of a tree as a multidimen-

sional object needs to be generalized. Currently a tree may only be represented

by a DBH, height, and species. This seems a bit too restrictive for the long

run, however, and should be generalized. Missing value problems will need to

be dealt with if this sort of generalization is performed.

• Permit arbitrary stand types to be defined by the dominant or codominant tree

species. This would affect the stand type and percentage by basal area and

stems index parameters in a backwardly incompatible manner. This could be

implemented by storing a stand type, PURE, DOMINANT, or MIXTURE with the

species ID code for the tree species having the greatest percentage of the stand

basal area. The interpretations of the PURE, DOMINANT, or MIXTURE would remain

unchanged, but there would now be two index items representing stand type

rather than a single stand type index parameter.

– If this modification is made, the six percentage based index parameters

for species composition should also be generalized. The same pattern will

work: store the percentage and the tree species ID as a two part index

item.

• Store the index parameter keywords used in the tree list generation database file

216

rather than coding them in a set of subroutines. This would increase the flexi-

bility and overall applicability of the tree list generation database methodology,

possibly, trivially, opening up applications in nonforestry fields.

– This feature could even be extended to developing a mini- language for

specifying index parameters based on input data types and simple com-

mands for performing computations with data columns. The measurement

file format appears to be quite flexible in this regard.

• The addition of a pre-thinning stand type could be useful. It is possible that

during a thinning the stand type could change, e.g., from Douglas-fir dominant,

≥ 50% by basal area, to pure Douglas-fir, ≥ 80% by basal area. Most of the

stands used did not have pre-thinning tree lists, so a pre-thin stand type could

not be determined.

– If this modification is made, the six percentage based index parameters for

refining the species composition of a stand, should also be implemented,

allowing the pre-thin stand to be as well defined as the post-thin stand.

• Addition of other index parameter keywords, such as GIS polygon ID, latitude

and longitude, or other mechanisms to identify geographically similar data.

• Site index may prove to be problematic as a stand classification variable, so its

continued use should in this regard considered carefully. Different species will

have different reference ages, and for mixed species stands the definition of a site

index value may be difficult or impossible. The use of an appropriate real stand

measurement, such as top height, should be a more than adequate replacement

for site index, given a multidimensional classification scheme such as that used

in a tree list generation database.

217

• Improve the accuracy of the Imperial to metric and metric to Imperial con-

versions. The values used were obtained from convenient sources and are not

necessarily the most accurate values available.

218

GLOSSARY

BACK-UP: A copy of a file or database to be used when catastrophe strikes the

original. Always make one before adding data to a tree list generation database

COMMAND LINE ARGUMENT: A program argument specified on a command line

to provide information, such as input or output file names, to a program. Com-

mand line arguments always follow the command line options.

COMMAND LINE OPTION: a program option identified by a short name preceded

by a dash or minus sign that is specified on the command line for a program.

Command line options typically modify the behavior of the program or provide

additional information to the program. Command line options must precede

any command line arguments.

CANONICAL STAND: This represents an aggregation of trees selected as the nearest

neighbors to a desired set of stand attributes from a tree list generation database.

The trees in a canonical stand are used as a lookup table to generate simulated

tree lists that are similar to the specified stand attributes.

DATA BUCKET: Describes a multidimensional histogram bin defined by the in-

dex parameters and their interval widths as specified in a tree list generation

database schema file.

DELIMITER: Something, often a single character, that indicates the beginning or

ending of an argument or field in a file or on a command line. More generally, a

delimiter is a field separator. Commonly used delimter characters are a space,

comma, tab, or semi- colon.

219

DESCRIPTION FILE: A text file with a keyword equals value based format that

provides a description of a stand of trees that is to be generated. The descrip-

tion of the stand is accomplished by defining appropriate values for the index

parameter keywords for a particular tree list generation treatment and database.

IMPERIAL UNITS: System of meaurements based on the foot and pound.

INDEX PARAMETER: A stand attribute, e.g., stand type, site index, or trees per

hectare or acre, that is used to create an index for the tree measurement data

stored in the database. The index parameters are used to classify and select

appropriate tree measurement data for the generation of a simulated stand of

trees.

KEYWORD: An identifiable token in a file that may be used as a marker for an

associated value. The keyword is usually a human understandable item.

KEYWORD EQUALS VALUE: A file format which makes use of an equal sign to

associate a value, whether text or numeric, with a keyword.

LATEX: A typsetting macro package developed by Leslie Lamport for computerized

typesetting based on TEX [17]. The document prepartaion system used for this

document.

MEASUREMENT FILE: A text file which contains the stand description keywords

and tree measurement data for anobserved or simulated stand. A simulated,

or random, stand measurement file may be distinguished from an actual stand

measurement file by the word random which appears in its header comments.

METRIC UNITS: System of measurement based on the meter and kilogram.

220

PATH: A path is defined to be a string of path elements separated by a path

separation character, commonly a forward slash ‘/’ or a backslash ‘\’, and ending

in a valid path element or filename. A valid path may not end in a path

separation character.

RANDOM STAND MEASUREMENT FILE: A text file which contains a set of stand

description keywords and simulated tree measurement data for a simulated

stand. A simulated, or random, stand measurement file may be distinguished

from an actual stand measurement file by the word random which appears in

its header comments.

SCHEMA FILE: A text file defining the index parameters and their attributes for a

tree list generation database. A schema file is used to create a tree list generation

database.

SPECIES MAP: A mapping from short species identification codes, e.g., DF or WH,

to numeric codes for effecient storage and use of species information within the

the tree list generation database and its supporting software.

SPECIES MAPPING FILE: A text file which defines a species mapping for a tree

list generation database. The file contains five columns: a unique integer tree

species ID code, a unique two (2) letter tree species code, a unique 4-6 letter

tree species code, the common name for a species, and the Latin name for a

species.

STAND DESCRIPTION KEYWORD: A keyword such as SITE INDEX, STAND ORIGIN,

or TREATMENT TYPE, which identifies information about a stand of trees that

cannot be computed from a tree list. This information must appear in any

stand measurement file that is to be added to a tree list generation database.

The specific stand description keywords will vary depending on treatment type.

221

STAND DESCRIPTION FILE: See description file.

STAND INDEX PARAMETER: See index parameter.

STAND MEASUREMENT FILE: See measurement file.

TEX: A computerized typesetting system developed by Donald E. Knuth [16].

TREE LIST: A list of compatible diameter at breast height (DBH) and height mea-

surements with an associated tree species.

VALUE: The text or numeric item on the right hand side of the equal sign in a

keyword equals value format.

222

BIBLIOGRAPHY

[1] P.J. Bickel and K.A. Doksum. Mathematical Statistics. Holden-Day, 1978.

[2] G.S. Biging, T.A. Robards, E.C. Turnblom, and P.C. VanDeusen. The predictive

models and procedures used in the forest stand generator (STAG). Hilgardia,

61(1):36 pp., 1994.

[3] Bruce E. Borders and William D. Patterson. Projecting stand tables: A compar-

ison of the weibull diameter distribution method, a percentile-based projection

method, and a basal area growth projection method. Forest Science, 36(2):413–

424, 1990.

[4] David Bruce. Consistent height-growth and growth-rate estimates for remeasured

plots. Forest Science, 27(4):711–725, 1981.

[5] Jerome L. Clutter, James C. Fortson, Leon V. Pienaar, Graham H. Brister,

and Robert L. Bailey. Timber Management: A Quantitative Approach. Kreiger

Publishing Company, reprint edition, 1992.

[6] William G. Cochran. The χ2 test of goodness of fit. Annals of Mathematical

Statistics, 23(3):315–345, 1952.

[7] John E. Dennis and Robert B. Schnabel. Numerical methods for unconstrained

optimization and nonlinear equations. Prentice-Hall, Inc., 1983.

[8] D.M. Donnelly. Pacific Northwest coast variant of the forest vegetation simulator.

WO-Forest Management Service Center, USDA-Forest Service, Fort Collins, CO.,

1997. Available on the Web.

223

[9] Kevin R. Gehringer. Nonparametric probability density estimation using nor-

malized B-Splines. Master’s thesis, The University of Tulsa, 1990.

[10] Kevin R. Gehringer and Richard A. Redner. Nonparametric probability den-

sity estimation using normalized B-splines. Comm. Statist. Simulation Comput.,

21(3):849–878, 1992.

[11] D.W. Hann, A.S. Hester, and C.L. Olsen. ORGANON User’s manual Edition

6.0. Dept. Forest Resources, Oregon State University, Corvallis, OR 97331-5703,

1997.

[12] D. Kahaner, C. Moler, and S. Nash. Numerical Methods and Software. Prentice

Hall, 1988.

[13] David Kahaner and George Marsaglia. double precision function duni. NIST

Guide to Available Math Software, package NMS., 1988. Computes double pre-

cision uniform numbers on [0, 1).

[14] Brian W. Kernighan and Rob Pike. The Practice of Programming. Addison-

Wesley Professional Computing Series. Addison-Wesley, 1999.

[15] James E. King. Site Index Curves for Douglas-fir in the Pacific Northwest.

Number 8 in Weyerhaeuser Forestry Paper. Weyerhaeuser Company, 1966.

[16] Donald E. Knuth. The TEX book. Addison-Wesley, 1984.

[17] Leslie Lamport. A Document Preparation System LATEX: User’s Guide and Ref-

erence Manual. Addison-Wesley, 2nd edition, 1994.

[18] George Marsaglia. Comments on the perfect uniform random number generator.

Unpublished notes, Washington State University.

224

[19] Steve C. McConnell. Code Complete : A Practical Handbook of Software Con-

struction. Microsoft Press, 1993.

[20] Kenneth J. Mitchell. Dynamics and Simulated Yield of Douglas-fir. Monograph

17. Forest Science, 1975.

[21] Gordon D. Nigh. The geometric mean regression line: A method for develop-

ing site index conversion equations for species in mixed stands. Forest Science,

41(1):84–98, 1995.

[22] Chadwick D. Oliver and Bruce C. Larson. Forest Stand Dynamics. John Wiley

& Sons, Inc., update edition, 1996.

[23] P.J. Plauger. The Standard C library. Prentice-Hall, 1991.

[24] Richard A. Redner. Convergence rates for uniform B-spline density estimators.

I. One dimension. SIAM J. Sci. Comput., 20(6):1929–1953 (electronic), 1999.

[25] Richard A. Redner and Kevin Gehringer. Function estimation using partitions

of unity. Comm. Statist. Theory Methods, 23(7):2059–2078, 1994.

[26] Richard A. Redner and Homer F. Walker. Mixture densities, maximum likelihood

and the EM algorithm. SIAM Review, 26(2):195–239, 1984.

[27] Marion R. Reynolds, Jr., Thomas E. Burk, and Won-Chin Huang. Goodness-of-

fit tests and model selection procedures for diameter distribution models. Forest

Science, 34(2):373–399, 1988.

[28] B.W. Silverman. Density Estimation for Statistics and Data Analysis. London:

Chapman Hall, 1986.

225

[29] Albert R. Stage and Melinda Moeur. Most similar neighbor: An improved

sampling inference procedure for natural resource planning. Forest Science,

41(2):337–359, 1995.

[30] Lahey Computer Systems. Lahey/Fujitsu Fortran 95 Express User’s Guide. La-

hey Computer Systems, LF95 express, version 5.5 edition, 1999.

[31] Malcolm S. Taylor and James R. Thompson. A data based algorithm for the

generation of random vectors. Computational Statistics & Data Analysis, 4:93–

101, 1986.

[32] James R. Thompson and Richard A. Tapia. Nonparametric Function Estimation,

Modeling, and Simulation. SIAM: Society for Industrial and Applied Mathemat-

ics, 1990.

[33] D.M. Titterington, A.F.M. Smith, and U.E. Makov. Statistical Analysis of Finite

Mixture Distributions. John Wiley & Sons, 1985.

[34] The Math Works. Using Matlab, Version 5.3. The Math Works, Inc., 1999.

[35] W. R. Wykoff. Supplement to the User’s Guide for the stand PROGNOSIS model

- Version 5.0. USDA FS Intermountain For. and Range Exp. Stn., Ogden, UT.

Gen. Tech. Rep. INT-208, 1986.

[36] W. R. Wykoff, N.L. Crookston, and A.R. Stage. User’s guide to the stand prog-

nosis model. USDA FS Intermountain For. and Range Exp. Stn., Ogden, UT.

Gen. Tech. Rep. INT-133, 1982.

226

Appendix A

TGSUMRY OUTPUT FOR tgdb1r00

This appendix contains the contents of the summary file created by the TGSUMRY

program command given in Figure 2.18 in Section 2.5.2. The TGSUMRY command

executed was

PROMPT> tgsumry -file summary.txt tgdb1r00

and the contents of the file summary.txt appears below.

Tree list generation database summary.

Summary created on: 06/28/2000 at 13:06:44.32

Tree list generation database path: current directoryTree list generation database name: tgdb1r00

Tree list generation database created on : 06/27/2000 at 17:17:27.82Tree list generation database modified on: 06/28/2000 at 13:06:12.85

Tree list generation database lock status : lockedTree list generation database lock/unlock date: 06/28/2000 at 13:06:12.85

The units used in this summary are metric.

Number of treatments: 2

Treatment : UNTREATED

Number of index parameters : 7

227

Number of tree data buckets : 3139Number of source files : 5209Number of tree records : 573036

Index parameters and parameter types:

Parameter Name Used Type Width-------------------------------- ------- --------- ----------TREATMENT_TYPE YES FIXED N/ASITE_INDEX_50 YES INTERVAL 3.00STAND_TOTAL_AGE YES INTERVAL 4.00STAND_ORIGIN YES FIXED N/AMEAN_DBH NO INTERVAL 4.00MEAN_HEIGHT NO INTERVAL 3.00QMD YES INTERVAL 4.00TOP_HEIGHT NO INTERVAL 3.00STAND_DENSITY YES INTERVAL 200.00STAND_BASAL_AREA NO INTERVAL 1.50PCT_DF_STEMS NO INTERVAL 20.00PCT_WH_STEMS NO INTERVAL 20.00PCT_OT_STEMS NO INTERVAL 20.00PCT_DF_BASAL_AREA NO INTERVAL 20.00PCT_WH_BASAL_AREA NO INTERVAL 20.00PCT_OT_BASAL_AREA NO INTERVAL 20.00STAND_TYPE YES FIXED N/ANUMBER_OF_SPECIES NO INTERVAL 5.00

Treatment : THINNED

Number of index parameters : 12

Number of tree data buckets : 3716Number of source files : 4440Number of tree records : 237245

Index parameters and parameter types:

Parameter Name Used Type Width-------------------------------- ------- --------- ----------TREATMENT_TYPE YES INTERVAL 2.00SITE_INDEX_50 YES INTERVAL 3.00STAND_TOTAL_AGE YES INTERVAL 4.00STAND_ORIGIN YES FIXED N/ANUMBER_OF_THINNINGS YES FIXED N/A

228

PRE_THIN_DENSITY YES INTERVAL 200.00PRE_THIN_BASAL_AREA YES INTERVAL 1.50PCT_STEMS_REMOVED NO INTERVAL 10.00PCT_BASAL_AREA_REMOVED YES INTERVAL 10.00YEARS_SINCE_TREATMENT YES INTERVAL 4.00MEAN_DBH NO INTERVAL 4.00MEAN_HEIGHT NO INTERVAL 3.00QMD YES INTERVAL 4.00TOP_HEIGHT NO INTERVAL 3.00STAND_DENSITY YES INTERVAL 200.00STAND_BASAL_AREA NO INTERVAL 1.50PCT_DF_STEMS NO INTERVAL 20.00PCT_WH_STEMS NO INTERVAL 20.00PCT_OT_STEMS NO INTERVAL 20.00PCT_DF_BASAL_AREA NO INTERVAL 20.00PCT_WH_BASAL_AREA NO INTERVAL 20.00PCT_OT_BASAL_AREA NO INTERVAL 20.00STAND_TYPE YES FIXED N/ANUMBER_OF_SPECIES NO INTERVAL 4.00

229

Appendix B

TREE SPECIES COMMON AND LATIN NAMES

This appendix contains the tree list generation database species common and

Latin names. These names were used with the tree list generation database tgdb1r00.

This species mapping file is valid for the tree list generation database distributions

tlgv1r0d and tlgv1r0d.

Table B.1: Tree list generation database tree species codes and names. The tablepresents a short two (2) character tree species codes, a long 4-6 character tree speciescodes, the common name for a tree species and the latin name for a tree species.

Short Long Common Name Latin Name

BC POTR Black Cottonwood Populus trichocarpa

BR BEPA Birch Betula papyrifera

CA RHPU Cascara Rhamnus purshiana

CH PREM Bitter Cherry Prunus emarginata

DF PSME Douglas-fir Pseudotsuga menziesii

DM ACGL Douglas Maple Acer glabrum

DW CONU Western Flowering Dogwood Cornus nuttallii

ES PIEN Englemann Spruce Picea engelmannii

GC CACH Golden Chinkapin Castanopsis chrysophylla

GF ABGR Grand Fir Abies grandis

HD HD Other hardwoods None

HW CRDO Hawthorn Crataegus douglasii

HZ COCO2 Hazelnut Corylus cornuta

IC LIDE Incense Cedar Libocedrus decurrens

KP PIAT Knobcone Pine Pinus attenuata

LP PICO Lodgepole Pine Pinus contorta

230

Table B.1: (continued)

MA ACMA Bigleaf Maple Acer macrophyllum

MD ARME Madrone Arbutus menziesii

MY UMCA Oregon Myrtle Umbellularia californica

NF ABPR Noble Fir Abies procera

OA FRLA Oregon Ash Fraxinus latifolia

OO QUGA Garry Oak Quercus garryana

OT OT Other species or mixtures None

PC CHLA Port Orford Cedar Chamaecyparis lawsoniana

PP PIPO Ponderosa Pine Pinus ponderosa

RA ALRU Red Alder Alnus rubra

RC THPL Western Red Cedar Thuja plicata

RF ABMA California Red Fir Abies magnifica

SA ALCR2 Sitka Alder Alnus crispa

SB AMELA Servicberry Amelachier spp.

SF ABAM Pacific Silver Fir Abies amabalis

SP PILA Sugar Pine Pinus lambertiana

SS PISI Sitka Spruce Picea sitchensis

SX SALIX Willow Salix spp.

TO LIDE2 Tan Oak Lithocarpus densiflorus

VM ACCI Vine Maple Acer circinatum

WF ABCO White Fir Abies concolor

WH TSHE Western Hemlock Tsuga heteraphylla

WL LAOC Western Larch Larix occidentalis

WM MYCA Waxmyrtle Myrica californica

WP PIMO Western White Pine Pinus monticola

WY TABR Western Yew Taxus brevifolia

YC CHNO Alaska Yellow Cedar Chamaecyparis nootkatensis

AF ABLA Alpine fir Abies lasiocarpa

BO QUKE Black Oak Quercus kelloggii

231

Appendix C

A SAMPLE TREE LIST GENERATION DATABASE

SPECIES MAPPING FILE

This appendix contains the tree list generation database species mapping file used

to create the tree list generation database tgdb1r00. This species mapping file is

valid for the tree list generation database distributions tlgv1r0d and tlgv1r0s.

%% Tree list generation species mapping file.%% File Created: 08/05/1999 at 05:00:00%% DO NOT MODIFY THIS FILE UNLESS YOU KNOW WHAT YOU ARE DOING.%% This is the internal tree species ID to name mapping file for the% tree list generation database software. This file should NOT be% modified by any users of the program. Any additions or modifications% made to this file may cause unpredictable results if the file is used% with the tree list generation database software. This file is used to% map two letter and four to six letter tree species ID codes into% small integers for use in the tree list generation database software.%% It contains five columns:%% 1) The unique integer tree species ID code% 2) The unique two (2) letter tree species code% 3) The unique 4-6 letter tree species code% 4) The common name for the species.% 5) The Latin name for the species.%% The file is a comma delimited file, and the commas ARE NECESSARY.%

232

% Comment lines and blank lines are ignored. Comment lines are lines% whose first nonblank character is a percent sign (%).%% Note: If different integer ID codes are desired for a particular tree% list generation database, the integer codes in the file may be% changed and used in a NEW tree list generation database. Use of% the modified integer ID codes with an existing tree list% generation database will cause incorrect identification of the% species codes.%

1,BC,POTR,Black Cottonwood,Populus trichocarpa2,BR,BEPA,Birch,Betula papyrifera3,CA,RHPU,Cascara,Rhamnus purshiana4,CH,PREM,Bitter Cherry,Prunus emarginata5,DF,PSME,Douglas-fir,Pseudotsuga menziesii6,DM,ACGL,Douglas Maple,Acer glabrum7,DW,CONU,Western Flowering Dogwood,Cornus nuttallii8,ES,PIEN,Englemann Spruce,Picea engelmannii9,GC,CACH,Golden Chinkapin,Castanopsis chrysophylla10,GF,ABGR,Grand Fir,Abies grandis11,HD,HD,Other hardwoods,None12,HW,CRDO,Hawthorn,Crataegus douglasii13,HZ,COCO2,Hazelnut,Corylus cornuta14,IC,LIDE,Incense Cedar,Libocedrus decurrens15,KP,PIAT,Knobcone Pine,Pinus attenuata16,LP,PICO,Lodgepole Pine,Pinus contorta17,MA,ACMA,Bigleaf Maple,Acer macrophyllum18,MD,ARME,Madrone,Arbutus menziesii19,MY,UMCA,Oregon Myrtle,Umbellularia californica20,NF,ABPR,Noble Fir,Abies procera21,OA,FRLA,Oregon Ash,Fraxinus latifolia22,OO,QUGA,Garry Oak,Quercus garryana23,OT,OT,Other species or mixtures,None24,PC,CHLA,Port Orford Cedar,Chamaecyparis lawsoniana25,PP,PIPO,Ponderosa Pine,Pinus ponderosa26,RA,ALRU,Red Alder,Alnus rubra27,RC,THPL,Western Red Cedar,Thuja plicata28,RF,ABMA,California Red Fir,Abies magnifica29,SA,ALCR2,Sitka Alder,Alnus crispa30,SB,AMELA,Servicberry,Amelachier spp31,SF,ABAM,Pacific Silver Fir,Abies amabalis32,SP,PILA,Sugar Pine,Pinus lambertiana33,SS,PISI,Sitka Spruce,Picea sitchensis

233

34,SX,SALIX,Willow,Salix spp.35,TO,LIDE2,Tan Oak,Lithocarpus densiflorus36,VM,ACCI,Vine Maple,Acer circinatum37,WF,ABCO,White Fir,Abies concolor38,WH,TSHE,Western Hemlock,Tsuga heteraphylla39,WL,LAOC,Western Larch,Larix occidentalis40,WM,MYCA,Waxmyrtle,Myrica californica41,WP,PIMO,Western White Pine,Pinus monticola42,WY,TABR,Western Yew,Taxus brevifolia43,YC,CHNO,Alaska Yellow Cedar,Chamaecyparis nootkatensis44,AF,ABLA,Alpine fir, Abies lasiocarpa45,BO,QUKE,Black Oak,Quercus kelloggii

234

Appendix D

A SAMPLE TREE LIST GENERATION DATABASE

SCHEMA FILE

This appendix contains the tree list generation database schema file used to create

the tree list generation database tgdb1r00. This schema file is valid for the tree list

generation database distributions tlgv1r0d and tlgv1r0s.

%% Tree list generator schema file.%% Created on 08/08/1999 at 05:04:00%% A properly formatted tree list generation database schema file is% used to create a new tree list generation database. A schema file% specifies the treatment types that are used in a tree list generation% database, and for each treatment type it specifies which index,% parameters are to be used and the attributes for each index parameter.%% More details concerning the use of a tree list generatorschema file% and how to modify this schema template to create a valid tree list% generator schema file will be added to the file at a future date.%

MEASUREMENT_UNITS = METRICTREATMENT_TYPE = UNTREATED

INDEX_PARAMETER = SITE_INDEX_50USE_PARAMETER = YESPARAMETER_TYPE = INTERVALINTERVAL_WIDTH = 3.0

END_INDEX_PARAMETERINDEX_PARAMETER = STAND_TOTAL_AGE

USE_PARAMETER = YES

235

PARAMETER_TYPE = INTERVALINTERVAL_WIDTH = 4.0

END_INDEX_PARAMETERINDEX_PARAMETER = STAND_ORIGIN

USE_PARAMETER = YESPARAMETER_TYPE = FIXED

END_INDEX_PARAMETERINDEX_PARAMETER = MEAN_DBH

USE_PARAMETER = NOPARAMETER_TYPE = INTERVALINTERVAL_WIDTH = 4.0

END_INDEX_PARAMETERINDEX_PARAMETER = MEAN_HEIGHT


END_INDEX_PARAMETERINDEX_PARAMETER = QMD

USE_PARAMETER = YESPARAMETER_TYPE = INTERVALINTERVAL_WIDTH = 4.0

END_INDEX_PARAMETERINDEX_PARAMETER = TOP_HEIGHT


END_INDEX_PARAMETERINDEX_PARAMETER = STAND_DENSITY

USE_PARAMETER = YESPARAMETER_TYPE = INTERVALINTERVAL_WIDTH = 200

END_INDEX_PARAMETERINDEX_PARAMETER = STAND_BASAL_AREA


END_INDEX_PARAMETERINDEX_PARAMETER = PCT_DF_STEMS


END_INDEX_PARAMETERINDEX_PARAMETER = PCT_WH_STEMS

USE_PARAMETER = NO

236

PARAMETER_TYPE = INTERVALINTERVAL_WIDTH = 20

END_INDEX_PARAMETERINDEX_PARAMETER = PCT_OT_STEMS


END_INDEX_PARAMETERINDEX_PARAMETER = PCT_DF_BASAL_AREA


END_INDEX_PARAMETERINDEX_PARAMETER = PCT_WH_BASAL_AREA


END_INDEX_PARAMETERINDEX_PARAMETER = PCT_OT_BASAL_AREA


END_INDEX_PARAMETERINDEX_PARAMETER = STAND_TYPE


END_INDEX_PARAMETERINDEX_PARAMETER = NUMBER_OF_SPECIES

USE_PARAMETER = NOPARAMETER_TYPE = INTERVALINTERVAL_WIDTH = 5


TREATMENT_TYPE = THINNEDINDEX_PARAMETER = SITE_INDEX_50




237



END_INDEX_PARAMETERINDEX_PARAMETER = NUMBER_OF_THINNINGS


END_INDEX_PARAMETERINDEX_PARAMETER = PRE_THIN_DENSITY


END_INDEX_PARAMETERINDEX_PARAMETER = PRE_THIN_BASAL_AREA


END_INDEX_PARAMETERINDEX_PARAMETER = PCT_STEMS_REMOVED


END_INDEX_PARAMETERINDEX_PARAMETER = PCT_BASAL_AREA_REMOVED


END_INDEX_PARAMETERINDEX_PARAMETER = YEARS_SINCE_TREATMENT






END_INDEX_PARAMETER

238

INDEX_PARAMETER = QMDUSE_PARAMETER = YESPARAMETER_TYPE = INTERVALINTERVAL_WIDTH = 4

















239








240

Appendix E

A TREE LIST GENERATION DATABASE SCHEMA

TEMPLATE FILE

This appendix contains the schema template file for the tree list generation

database distributions tlgv1r0s and tlgv1r0d.

%% Tree list generator schema file template.%% File created: 03/24/2000 at 05:20:51.85%% This file is a schema template file for the tree list generation% database software. As written, this schema template file is not a valid% input file for the tree list generation database software, but it may% be modified using any text editor and transformed into a valid tree% list generation database schema file. Read the rest of the file header% for information which describes how to transofrm this tree list% generation database schema template file into a valid tree list% generation database schema file.%% A properly formatted tree list generation database schema file is used% to create a new tree list generation database. A tree list generation% database schema file specifies the treatment types that will be% contained in a tree list generation database, and for each treatment% type it specifies the index parameters to be used for each treatment% and the index parameter attributes that are to be used to classify each% stand measurement file that is stored in a tree list generation% databse. The index parameters that are defined in a tree list% generation database schema file, and used to create a tree list% generation database, are specify the attributes of a stand of trees% that is to be generated in the same manner as a query of a database us% used to select the data that is to be returned.%

241

% Each treatment that is defined by a tree list generation schema file% must contain the STAND_TYPE index parameter. The tree list generation% database software has been designed to work with forest stands that are% pure Douglas-fir, pure western hemlock, or Douglas-fir/western hemlock% mixtures. In order to properly classify the stand types in a tree list% generation database the STAND_TYPE index parameter must be present in% the schema file for each treatment that is defined.%% Additionally, there are two sets of three index parameters for refining% the stand type by percent basal area by species and percent of stems by% species for Douglas-fir, western hemlock, and other tree species. If% one of the three index parameters from either of these sets is used% all three must be used to provide a consistent classification of the% stands that are stored in a tree list generation database. It is also% recommended that the same interval widths be used for each of the three% index parameters if either the percent basal area or percent stems by% species index parameters are used, again for consistency, though this% is not enforced.%% More details concerning the use of a tree list generation database% schema file and how to modify a schema template file to create a valid% tree list generation database schema file may be found in the tree list% generation users guide.%

MEASUREMENT_UNITS = IMPERIAL | METRICTREATMENT_TYPE = UNTREATED

INDEX_PARAMETER = SITE_INDEX_50USE_PARAMETER = YES | NOPARAMETER_TYPE = FIXED | INTERVALINTERVAL_WIDTH = < number >


USE_PARAMETER = YES | NOPARAMETER_TYPE = FIXED | INTERVALINTERVAL_WIDTH = < number >




USE_PARAMETER = YES | NO

242

PARAMETER_TYPE = FIXED | INTERVALINTERVAL_WIDTH = < number >


















243











TREATMENT_TYPE = THINNEDINDEX_PARAMETER = SITE_INDEX_50






END_INDEX_PARAMETERINDEX_PARAMETER = NUMBER_OF_THINNINGS

USE_PARAMETER = YES | NO

244

PARAMETER_TYPE = FIXED | INTERVALINTERVAL_WIDTH = < number >

END_INDEX_PARAMETERINDEX_PARAMETER = PRE_THIN_DENSITY


END_INDEX_PARAMETERINDEX_PARAMETER = PRE_THIN_BASAL_AREA


END_INDEX_PARAMETERINDEX_PARAMETER = PCT_STEMS_REMOVED


END_INDEX_PARAMETERINDEX_PARAMETER = PCT_BASAL_AREA_REMOVED


END_INDEX_PARAMETERINDEX_PARAMETER = YEARS_SINCE_TREATMENT









245


















END_INDEX_PARAMETER

246

INDEX_PARAMETER = STAND_TYPEUSE_PARAMETER = YES | NOPARAMETER_TYPE = FIXED | INTERVALINTERVAL_WIDTH = < number >




247

Appendix F

THE SPICE SYSTEM - A BRIEF OVERVIEW

The Navigation Ancillary Information Facility (NAIF), located at the Jet Propul-

sion Laboratory in Pasadena, California, and acting under the directions of NASA’s

Office of Space Science, has built a data system, the SPICE system, to assist scientists

in planning and interpreting scientific observations from space-borne instruments.

The principal objective of this information system is to provide three dimensional

geometric and other ancillary information used to plan space science missions and

subsequently recover the full value of science instrument data collected and returned

from these missions. This includes the correlation of individual instrument data sets

with data from other instruments on the same or other spacecraft and the archiving

of data in stable well documented file formats.

The primary SPICE data sets, often called kernels or kernel files. SPICE kernels

are composed of navigation and other ancillary information that has been structured

and formatted for easy access and correct use by the space science and engineering

communities. SPICE kernel files are produced by the most knowledgeable sources of

such information, usually located at a mission operations center. They must include

or be accompanied by metadata – consistent with flight project data system standards

– that provide pedigree and other descriptive information needed by prospective users.

SPICE kernel file types and their basic contents are summarized below. In the

kernel type summaries an ephemeris is the three dimensional trajectory of a spacecraft,

solar system body, or point in space, as seen from a particular point of view through

time. The orbits of the planets around the Sun or the Moon around the Earth provide

well known examples of ephemerides. The data types of the SPICE kernel files are

248

indicated by the letters of the SPICE acronym.

S- Spacecraft ephemeris, or more generally, location of an observer relative to another

object, given as a function of time. These data are stored in a SPICE SPK,

spacecraft and planet ephemeris kernel file.

P- Planet, satellite, comet, or asteroid ephemerides, or more generally, location of a

target body, given as a function of time. These data are also stored in a SPICE

SPK, spacecraft and planet ephemeris kernel file. The P-kernel also includes a

second component containing certain physical, dynamical and cartographic con-

stants for target bodies, such as size and shape specifications, and orientation of

the spin axis and prime meridian. These target body physical and cartographic

constants are found in the SPICE PCK file.

I- Instrument description kernel, containing descriptive and operational data pecu-

liar to a particular scientific instrument, such as mounting alignment, internal

timing relative to the spacecraft clock, and field-of-view model parameters. This

instrument information is contained in the SPICE IK file.

C- Pointing kernel, containing a transformation traditionally called the C-matrix that

provides time-tagged pointing (orientation) angles for a spacecraft structure

upon which scientific instruments are mounted. Attitude data are contained in

the SPICE CK file.

E- Events kernel, summarizing mission activities-both planned and unanticipated.

Events data are contained in the SPICE EK file set which consists of three

components: Science Plans, Sequences, and Notes.

The ephemeris data for spacecraft and target bodies are normally combined in a

single file called the SPICE SPK file, though this is not necessary. An SPK file could

249

contain only target ephemerides, or only a spacecraft ephemeris, depending on the

context and the particular data access requirements.

In addition to the standard SPICE kernel files, there are several miscellaneous ker-

nel files, the spacecraft clock kernel (SCLK) file and the leapseconds kernel (LSK) file.

These files are used to facilitate time conversions between various time measurement

systems.

Also available is the SPICE Database Kernel (DBK) file. This provides a simple,

portable relational database functionality used for several purposes, such as the Se-

quence component of the E-kernel, star catalogs, image catalogs and SPICE kernel

file management. The Spice Database Kernel may also be used by customers in place

of a commercial product when wide portability and low cost are key requirements.

Perhaps the SPICE acronym should have been SPICES, with the final S standing

for Software. The SPICE system includes the NAIF Toolkit, a large collection of allied

software. The principal component of this Toolkit is a library of portable subroutines

for reading and writing the kernel files, calculating the most common observation

geometry parameters, and performing a wide variety of text and data processing tasks.

Users integrate these SPICE Toolkit subroutines into their own application programs

to compute observation geometry parameters and related information where, and as,

needed. The NAIF Toolkit was originally implemented in ANSI FORTRAN 77, now

a subset of Fortran 90/95, but is becoming available in the C language as well. A

C++ version is planned for some time in the future.

NAIF has designed the kernel file and software Toolkit architectures with porta-

bility and multimission application as principal goals. In addition, because extensive

software documentation and examples are provided with the Toolkit, with a reason-

able learning effort the software can be confidently used by the full spectrum of the

NASA-supported space science community.

A flight project’s mission operations center concentrates on producing, cataloging

and distributing complete and accurate kernels on a timely basis. Kernel updates

250

should be made promptly if/as improved data sources become available.

Users may order those SPICE kernels of interest fo use in application programs

hosted at their home sites to compute needed geometric and related ancillary in-

formation. Users may even update some kernels and produce their own versions of

other kernels to support their own analyses or to provide their colleagues with any

improvements in ancillary information resulting from their work.

Each flight project generally delivers well documented copies of all SPICE kernels

and a set of developed Toolkit software to the appropriate permanent archive facility,

assuring ready availability of this data for future use. Ideally this archive is open to

the international community of scientists and engineers. User-produced kernels may

also be similarly archived. For NASA planetary missions, the NAIF node of NASA’s

Planetary Data System (PDS) is the archive site.

Because generic ephemerides for most solar system target bodies are available at

the NAIF PDS node, SPICE is frequently used for mission design and for planning

solar system and stellar observations. In these cases the observer could be a pre-

dict spacecraft ephemeris produced by a mission design organization, a terrestrial

telescope or a user-provided instrument location. In some cases predict versions of

other SPICE kernels are also made to help simulate a full data processing system.

With this flexibility scientists may use SPICE throughout the experiment life cycle –

from mission planning to detailed observation design to instrument data analysis and

finally to correlation of results other instruments on a spacecraft or other missions.

A large set of core SPICE components is in place. Extension and adaptation of

this core system to encompass broader functionality and to meet specific needs of new

projects is an ongoing endeavor. This work will include provision of some broadly

useful application programs and development of additional kernel types. Examples

of new applications are: adaptation of SPICE for landers and rovers, and, use of

SPICE to point and tune NASA’s Deep Space Network antennas to track all customer

spacecraft.

251

The SPICE system is, or soon will be, used on numerous national and interna-

tional space missions, such as Galileo, Mars Global Surveyor, Near Earth Asteroid

Rendezvous, Mars Surveyor 98, Stardust, and Cassini. Limited data from older mis-

sions such as Voyager, Viking, and Pioneer have been restored and converted into

SPICE data formats. While the principle use of SPICE has been in the planetary

science discipline, astrophysics, space physics and even Earth science projects are also

using this technology.

In addition to providing SPICE technology to NASA and various international

space missions the NAIF Group serves as the Ancillary Data Node of NASA’s Plane-

tary Data System. In this role NAIF provides permanent archive, data distribution,

and consultation functions for planetary project ancillary data sets.

July, 1998

2.1 introductiontothetreelistgenerationdatabase............ 21 2.1.1...

Documents