2002- mcbratney- from pedotransfer functions to soil inference systems

8/10/2019 2002- Mcbratney- From Pedotransfer Functions to Soil Inference Systems

1/33

From pedotransfer functions to soil inference systems

Alex B. McBratney, Budiman Minasny*, Stephen R. Cattle,R. Willem Vervoort

Australian Cotton CRC, Department of Agricultural Chemistry and Soil Science, The University of Sydney,

Ross St. Building A03, Sydney, NSW 2006, Australia

Received 28 May 2001; received in revised form 14 December 2001; accepted 20 December 2001

Abstract

Pedotransfer functions (PTFs) have become a white-hot topic in the area of soil science and

environmental research. Most current PTF research focuses only on the development of new

functions for predicting soil physical and chemical properties for different geographical areas or soil

types while there are also efforts to collate and use the available PTFs. This paper reviews the brief

history of the use of pedotransfer functions and discusses types of PTFs that exist. Differentapproaches to developing PTFs are considered and we suggest some principles for developing and

using PTFs. We propose the concept of the soil inference systems (SINFERS), where pedotransfer

functions are the knowledge rules for inference engines. A soil inference system takes measurements

we more-or-less know with a given level of (un)certainty, and infers data that we do not know with

minimal inaccuracy, by means of properly and logically conjoined pedotransfer functions. The soil

inference system has a source, an organiser and a predictor. The sources of knowledge to predict soil

properties are collections of pedotransfer functions and soil databases. The organiser arranges and

categorises the PTFs with respect to their required inputs and the soil types from which they were

generated. The inference engine is a collection of logical rules selecting the PTFs with the minimum

variance. Uncertainty of the prediction can be assessed using Monte Carlo simulations. The inferencesystem will return the predictions of soil physical and chemical properties with their uncertainties

based on the information provided. Uncertainty in the prediction can be quantified in terms of the

model uncertainty and input data uncertainty. In order to avoid extrapolation, a method was

developed to quantify the degree of belonging of a soil sample within the training set of a PTF. With

the first approach to a soil inference system, we can optimally predict various important physical and

chemical properties from the information we have utilising PTFs as the knowledge rules. D 2002

Elsevier Science B.V. All rights reserved.

Keywords:Expert systems; Soil surveys; Decision-support systems; Error propagation; Information management

0016-7061/02/$ - see front matterD 2002 Elsevier Science B.V. All rights reserved.

PII: S 0 0 1 6 - 7 0 6 1 ( 0 2 ) 0 0 1 3 9 - 8

* Corresponding author. Tel./fax: +61-2-9351-3706.

E-mail address:[email protected] (B. Minasny).

www.elsevier.com/locate/geoderma

Geoderma 109 (2002) 4173


2/33

1. Introduction

The development of models simulating soil processes has increased rapidly in recent

years. These models have been developed to improve the understanding of important soilprocesses and also to act as tools for evaluating agricultural and environmental problems.

Consequently, simulation models are now regularly used in research and management.

However, models usually require a large number of parameters to describe the transport

coefficient, matter content in the soil, or other physical and chemical properties. Thus,

collecting soil properties become an urgent need to feed the very hungry, almost insatiable,

environmental (simulation) management models. As soil properties can be highly variable

spatially and temporally, measuring these properties is both time consuming and

expensive. As a result, the most difficult and expensive step towards the process of

environmental modelling is the collection of data.

The termpedotransfer function(PTF) was coined by Bouma (1989) as translating data

we have into what we need. The most readily available data come from soil survey, such as

field morphology, texture, structure and pH. Pedotransfer functions add value to this basic

information by translating them into estimates of other more laborious and expensively

determined soil properties. These functions fill the gap between the available soil data and

the properties which are more useful or required for a particular model or quality

assessment. Pedotransfer functions can be defined as predictive functions of certain soil

properties from other easily, routinely, or cheaply measured properties.

Currently, many pedotransfer functions are being developed to predict certain soil

properties for a geographical area. Reviews on the development and the use ofpedotransfer functions, particularly for predicting soil hydraulic properties have been

given by Rawls et al. (1991), Wosten (1997), Pachepsky et al. (1999b), and most recently

by Wosten et al. (2001).

Developing new PTFs is an arduous task, so it is sensible to utilise functions that have

already been developed. However, commonsensically a given pedotransfer function

should not be extrapolated beyond the geomorphic region or soil type from which it

was developed. Most current research focuses only on the development of new functions

for different areas or a group of soil types. Little effort has been made to integrate all the

available functions into a system that is able to tell us which function is the most suitable

to use for a particular soil type. Furthermore, soil data are bound to have uncertainties dueto measurement errors, and sometimes the data are incomplete, but uncertainty is seldom

quantified in pedotransfer functions.

This paper has the aim of describing a brief introduction to pedotransfer functions: what

has been done, principles, and approaches. We propose a soil inference system, wherein

pedotransfer functions are the knowledge rules. This approach allows one to treat all kinds

of uncertainty in a logically consistent way.

2. A brief history of pedotransfer functions

Although not formally recognised and named until 1989, the concept of the pedo-

transfer function has long been applied to estimate soil properties that are difficult to

A.B. McBratney et al. / Geoderma 109 (2002) 417342


3/33

determine. Many soil science agencies have their own (unofficial) rule of thumb for

estimating difficult-to-measure soil properties. Probably because of the particular diffi-

culty, cost of measurement, and availability of large databases, the most comprehensive

research in developing PTFs has been for the estimation of water retention. The firstattempt to use such predictions came from the study of Briggs and McLane (1907), which

was later refined by Briggs and Shantz (1912). They determined the wilting coefficient,

which is defined as percentage water content of a soil when the plants growing in that soil

are first reduced to a wilted condition from which they cannot recover in an approximately

saturated atmosphere without the addition of water to the soil, as a function of particle-

size:

Wilting coefficient0:01 sand0:12 silt0:57 clay

1F0:025

: 1

In their studies, the sand fraction is defined as particles with a diameter of 502000 Am,

silt as 550 Am, and clay as < 5 Am diameter particles.

With the introduction of the field capacity (FC) and permanent wilting point (PWP)

concepts by Veihmeyer and Hendrikson (1927), research during the period 19501980

attempted to correlate particle-size distribution, bulk density and organic matter content

with water content at field capacity (FC, h at 33 kPa), permanent wilting point (PWP, hat 1500 kPa), and available water content (AWC = FC PWP). From a study in NorthQueensland, Australia, Stirk (1957) suggested an estimate of water content at 1500 kPa

for soil containing up to 60% clay as:

PWP 2=5 clay: 2

Nielsen and Shaw (1958) presented a parabolic relationship between clay content and

PWP for 730 Iowa soils.

In the 1960s, various papers dealt with the estimation of FC, PWP, and AWC, notably

in a series of papers by Salter and Williams (1965a,b, 1967, 1969), and Salter et al. (1966).

They explored relationships between texture classes and available water capacity, which

are now known as class PTFs. They also developed functions relating the particle-sizedistribution to AWC, now known as continuous PTFs. They asserted that their functions

could predict AWC to a mean accuracy of 16%.

In the 1970s, more comprehensive research using large databases was developed. A

particularly good example is the study by Hall et al. (1977) from soil in England and

Wales; they established field capacity, permanent wilting point, available water content,

and air capacity as a function of textural class, as well as deriving continuous functions

estimating these soilwater properties. In the USA, Gupta and Larson (1979) developed

12 functions relating particle-size distribution and organic matter content to water content

at potentials ranging from 4 to 1500 kPa.

With the flourishing development of models describing soil hydraulic properties (e.g.van Genuchten, 1980) and computer modelling of soilwater and solute transport (De Wit

and van Keulen, 1975), the need for hydraulic properties as inputs to these models became

A.B. McBratney et al. / Geoderma 109 (2002) 4173 43


4/33

more evident. Clapp and Hornberger (1978) derived average values for the parameters of a

power function for the water retention curve, sorptivity and saturated hydraulic con-

ductivity (Ks) for different texture classes. In probably the first research of its kind,

Bloemen (1977, 1980) derived empirical equations relating parameters of the Brooks andCorey hydraulic model to particle-size distribution.

Lamp and Kneib (1981) formally introduced the term pedofunction, while Bouma and

van Lanen (1986) used the term transfer function. To avoid confusion with the term

transfer function used in soil physics (and in many other disciplines), Bouma (1989) later

called it a pedotransfer function.

Since then, the development of hydraulic PTFs has become a boom industry, mostly in

the US and Europe. Results of such research have been reported widely, including in the

USA (Rawls et al., 1982), the UK (Mayr and Jarvis, 1999), the Netherlands (Wosten et al.,

1995), and Germany (Scheinost et al., 1997b).

In Australia, the first comprehensive attempt to develop hydraulic PTFs appears to

have come from the study of Williams et al. (1983). They classified water retention

based on textural class, and provided average parameter values for a power function

model of the water retention curve based on texture class. Further studies by Williams

et al. (1992) estimated Campbells parameters from texture and structure information.

More recently, many hydraulic PTFs have been developed in Australia, such as those

by Cresswell and Paydar (1996), McKenzie and Jacquier (1997), and Minasny et al.

(1999).

Although most PTFs have been developed to predict soil hydraulic properties, they are

not restricted to hydraulic properties. PTFs for estimating soil physical, mechanical,chemical and biological properties have also been developed (Table 1).

The application of PTFs is not limited to predicting soil properties. The properties

predicted from PTFs are used further in modelling such as assessing regional water

flow and pesticide leaching (Petach et al., 1991), predicting pesticide leaching to

groundwater in a watershed (De Jong and Reynolds, 1995), developing regional

groundwater vulnerability maps that indicate the impact of leaching (Soutter and

Pannatier, 1996), assessing the magnitude of cadmium accumulation on a regional

scale (Tiktak et al., 1999), estimating soya bean yields in a field (Timlin et al., 1996),

estimating regional crop yields (Haskett et al., 1995), and evaluating crop yield and

nitrate leaching as a result of different management practices (Droogers and Bouma,1997).

According to Pachepsky and Rawls (1999), current research areas in pedotransfer

functions include formulating a better physical model, finding the most influential soil

properties as predictive variables, grouping soil into classes in order to minimise the

variance of prediction, and developing alternative methods to derive or fit the PTFs.

Schaap et al. (1998) developed a hierarchical approach for predicting soil hydraulic

properties. Different functions have been developed to accommodate the amount of

information we have.

There is also interest in creating large databases of soil properties, which can be

potentially used to develop pedotransfer functions. The World Inventory of SoilEmission Potentials (WISE) holds a database of 4353 soil profiles around the world

(Batjes, 1996). Databases focused on soil hydraulic properties have been created such as



5/33

UNSODA (Nemes et al., 2001) and GRIZZLY (Haverkamp et al., 1997) with data fromdifferent parts of the world, and HYPRES (Wosten et al., 1999) from European

countries.

Table 1

Examples of pedotransfer functions

Predicted soil properties Predictor variables Authors

Physical propertiesInfiltration rate at a given time initial water content, moisture deficit,

total porosity, non capillary porosity,

hydraulic conductivity

Canarache et al. (1968)

Soil thermal conductivity texture, organic matter content,

water content

De Vries (1966)

Water repellency organic matter content, clay content Harper and Gilkes (1994)

Bulk density particle size distribution Rawls (1983)

Infiltration parameters particle size distribution,

bulk density, organic C content,

initial water content, root content

van de Genachte et al. (1996)

Mechanical properties

Soil mechanical resistance organic carbon content, clay content Da Silva and Kay (1997)

Volumetric shrinkage,

liquid limit, plastic limit,

plasticity index

organic matter content,

clay content, CEC

Mbagwu and Abeh (1998)

Degree of overconsolidation bulk density, void ratio McBride and Joosse (1996)

Rate of structural change organic matter content,

clay content, pH

Rasiah and Kay (1994)

Soil erodibility factor geometric mean particle-size,

clay and organic matter content

Torri et al. (1997)

Chemical propertiesCation exchange

capacity (CEC)

clay content, organic matter content Bell and van Keulen (1995)

Critical P level,

P buffer coefficient

clay content Cox (1994)

Soil organic matter soil colour Fernandez et al. (1988)

P sorption pH in NaF Gilkes and Hughes (1994)

pH buffering capacity organic matter content, clay content Helyar et al. (1990)

Al saturation base saturation,

organic carbon content, pH

Jones (1984)

P saturation extractable P, Al Kleinman et al. (1999)

K/Ca exchange clay content, extractable K Scheinost et al. (1997a)

Nitrogen-mineralizationparameters

CEC, total N, organic carbon content,silt and clay content

Rasiah (1995)

As and Cd sorption clay content, pH , organic carbon

content, dithionite extractable Fe

Schug et al. (1999)

Phosphorous (P) adsorption clay content, pH, soil colour Sheinost and Schwertmann

(1995)

Cd sorption coefficient clay content, organic carbon

content, pH

Springob and Bottcher

(1998)

Haematite content soil colour Torrent et al. (1983)



6/33

3. Principles

As seen in Table 1, many pedotransfer functions have been developed over the past few

decades. At this point, it seems helpful to define two principles of useful or efficient PTFs,in order to avoid misuse and abuse of the concept. The principles relate to effort and

uncertainty.

3.1. Principle 1effort

Our first principle of PTFs is: Do not predict something that is easier to measure than

the predictor.

Since the objective of pedotransfer functions is to predict properties that are difficult or

expensive to measure in general, the predictor should be more easily measured. The cost

and effort to obtain the information on the predictor should be much less than that to

obtain information on the predicted. In other words, if we define efficiency (Minasny and

McBratney, 2002a) as:

Efficiency 1 quality of information=effort

Efficiency 2 quality of information=cost of information 3

the ratio between the efficiency of the predicted over the efficiency of the predictor shouldbe > 1 for justification as a PTF.

For example, to predict the saturated hydraulic conductivity (Ks) of a soil from its

structural features, as measured by image-analysis, would not constitute an efficient PTF.

Although there is a good relationship between the image-analysis parameters and Ks, it

takes more effort to use the image-analysis technique.

Pachepsky et al. (1999a) predicted Ks from the Brooks Corey water retention

parameters. If we measure the water retention curve only to predict Ks, this is not an

efficient PTF, as the cost of measuring a water retention curve is greater than measuringKsitself. However, if water retention data are already available, then using them to predict

both water retention and Ksconstitutes a PTF. Using the available soil information to fill inthe gaps of other soil properties also constitutes a PTF.

3.2. Principle 2uncertainty

Do not use PTFs unless you can evaluate the uncertainty, and for a given problem, if a

set of alternative PTFs is available, use the one with minimum variance.

3.2.1. Sub-principles

Principle 2 implies two sub-principles, namely,

Uncertainty of PTFs should be quantified, and if a set of alternative PTFs is available, use the one with minimum variance.



7/33

Many PTFs have been developed to predict the same or similar soil properties. For

example, in Australia at least 10 functions are available for the prediction of water

retention, while worldwide there are more than 100 functions for this property.

Therefore, it is wise to choose the function where training data is similar to theone we wish to predict. We should choose the function that has the smallest error

variance.

Many papers deal with the reliability of published PTFs to predict properties in certain

areas or soil types (Ungaro and Calzolari, 2001; Wagner et al., 2001; Cornelis et al., 2001).

These investigations point out that published PTFs may not always be applicable to data

from particular areas. Therefore, it is wise to select PTFs that have similar properties to the

one we wish to predict.

The uncertainty of a PTF can be due to the uncertainty of the model, and

uncertainty in the input data. The uncertainty associated with the model can be

calculated from the non-parametric bootstrap method (Efron and Tibshirani, 1993), or

first-order analysis if the PTFs are generated using the least-squares method. The

uncertainty of input data can be easily computed using Monte Carlo simulation. This is

done by sampling repeatedly from the assumed distribution of the input data and

evaluating the output of the PTF.

The uncertainty of the model is usually not defined and sometimes the statistics of the

training data are also not defined. We should minimise extrapolation of the soil properties,

and therefore we propose an uncertainty assessment to consider the range of the data in the

training set.

3.2.2. A scheme for defining uncertainties of data inside/outside the training set

In order to assess whether a sample is within or outside the training set, to avoid

extrapolation, a method that penalizes the prediction uncertainty when the sample is

outside the training set should be developed. We use a simplification of fuzzy k-means

with extragrades (De Gruijter and McBratney, 1988) to achieve this. N the fuzzy 1-

mean with extragrades (F1me), we define only one class, within the class is the

training set, while samples outside the class boundary are considered as extragrades

(outliers).

The membership is controlled by a fuzziness exponent/, which defines the smoothness

of the membership, and the parametera which defines the distance from the data centre(mean) for a data point to be considered as an outlier, (0 < a< 1). The smaller the value of

a, the further the distance before a point is classified as an outlier. The membership of a

sample in the defined dataset is defined as

m d2=/1

d2=/1 1aa

d2=/1 1=/1 4

while the membership defining a sample as being outside of the dataset is:

m*1aa d

2

1=/1

d2=/1 1aa

d2=/1 1=/1 5



8/33

where d is the Mahalanobis distance between the sample x and the center (mean) of the

data c:

d

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffixcTV1xc

q 6

whereV is the variance-covariance matrix of the data. This distance takes into account the

correlation among variables.

The standard deviation of the prediction ra can be calculated as:

ra ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffir2p

1m* 0:0001r2ps 7

where rp2 is the variance of the PTF. The first term of the denominator in Eq. (7)

calculates the belongingness of a new sample in the training set, while the second

term avoids infinite calculation when m* = 1. When a sample is within the training

set, ra =rp; but the prediction error will increase if the data is outside the training

set.

We seek the values for the / and a parameters such that 95% of the training data is

considered as within the data set and 5% is considered outside (outliers). We found that a

/ value of 1.3 provides a good representation of the membership. The value a can befound by optimization such that it provides an average m* value of 0.05. We found that

values around 0.01 to 0.02 satisfy this criterion.

As an example, we have a dataset of sandy materials and we wish to develop a PTF

to predict field capacity from bulk density and clay and sand content (mean

values = 1.5 Mg m 3; 16%, 70%). We use the value of /= 1.3 and a = 0.014. The

extragrade membership m* for different clay contents at constant mean values of sand

and bulk density is given in Fig. 1a. As the clay content moves away from the mean

value, m* increases rapidly. The resulting prediction error is shown in Fig. 1b. The

standard deviation of prediction, as calculated from the least-squares estimates, only has

a value of 0.013 m3

m 3

for a clay content of 40%; the suggested modificationaccounts for the extrapolation (m* = 0.906) and penalizes the prediction error as it is

outside the boundary of training set (the standard deviation of prediction

becomes = 0.042 m3 m 3).

3.2.3. PTF quality assurance

In accordance with our second principle, we need to consider the quality assurance

of published PTFs. We suggest that we need a minimum level of repetition when

publishing newly developed PTFs. When describing new PTFs, the authors should

give the statistics of the training data set (mean, standard deviation, median,

minimum and maximum, and correlations among variables). Authors should alsocalculate the uncertainty of the model using the standard first-order Taylor analysis

(Heuvelink, 1998) or the bootstrap method (Efron and Tibshirani, 1993). If the PTFs



9/33

were generated using a least-squares or maximum likelihood method, the authors

should list the standard error of the parameters along with their variance covariance

matrix.

4. Classification of PTFs

Wosten et al. (1995) recognized two types of PTF based on the amount of available

information, namely, class and continuous PTFs. Class PTFs predict certain soil properties

based on the class (textural, horizon, etc.) to which the soil sample belongs. Continuous

PTFs predict certain soil properties as a continuous function of one or more measured

variables.

However, the situation is a little more complex. We can classify PTFs further by

looking at the predictor and predicted variables. The predictor/predicted variables can be ahard or fuzzy class, continuous or fuzzy variables, or mixed class-continuous. Thus, we

have 25 possibilities, but in accordance with the principles, it is not sensible to consider all

Fig. 1. (a) Extragrade membership as a function of clay defining the outliers from the training dataset; (b)

uncertainty in the prediction as a function of clay content.



10/33

the combinations as efficient PTFs. The predictor variables should be acquired with

substantially less effort than the predicted variables (Efficiency equation (Eq. (3))>1).

Using different continuous variables (e.g. sand and clay content) to predict a class or

category (e.g. texture class) does not constitute a PTF. The interesting combinations are

marked as * in Table 2 and the uninteresting ones are labelled o.

The most common PTFs are continuous variables predicting continuous soil properties,

and are usually called continuous PTFs. Other examples are shown below.

.A look-up table that predicts soil properties from the class to which the soil belongs is

a hard class (predictor), continuous (predicted) PTF. For example, Carsel and Parrish

(1988) presented the van Genuchten parameters (predicted) based on texture classes

(predictor).

. McKenzie and Jacquier (1997) presented a regression tree predicting Ks from field

morphology classes (texture classes, structural grade, areal porosity). This is classified as a

hard class (predictor), continuous (predicted) PTF. In other words, the regression tree is a

multi-level look-up table.

. Pachepsky and Rawls (1999) predicted water content at potentials of 33 and 1500 kPa from basic soil properties by grouping the soil on the basis of soil taxonomicunit, soil moisture regime, soil temperature regime, and texture class. This is classified as a

mixed: hard class and continuous (predictor), continuous (predicted) PTF.

. Minasny et al. (1999) predicted the van Genuchten parameters from basic soil

properties and fuzzy texture classes. This is a mixed: fuzzy class and continuous(predictor), continuous (predicted) PTF.

. Torri et al. (1997) predicted the soil erodibility factor (K) from mean geometric

particle diameter, organic matter content, and clay content. The input variables were

fuzzified using the membership function of each variable and results were aggregated to

predict the fuzzy number of K. A hard value can be obtained by defuzzifying the

membership output. This is a: fuzzy (predictor) and fuzzy (predicted) PTF.

5. Approaches for the various classes

There are various means to arrive at the different kinds of PTF. Generally, these can be

classified into two approaches.

Table 2

Different combinations of predictor and predicted variables

Predictor

Class Continuous Fuzzy Mixed

Hard Fuzzy

Predicted hard class * o o o o

fuzzy class * * o o o

continuous * * * * o

fuzzy * * * * o

mixed * * * * o

The combinations that are considered as PTFs are marked as * and the ones that are not PTFs are marked as o.



11/33

. Mechanistic-empirical approach. This approach attempts to describe a physical or

chemical model relating the basic properties to the predicted properties. For example, Arya

and Paris (1981) translated particle-size distribution data into a water retention curve by

means of the capillary equation.. Empirical approach. This is the most common method, i.e. correlating the basic soil

properties to the more difficult-to-measure soil properties by using different numerical

fitting methods. The most frequently used technique for fitting or deriving PTFs is

multiple linear regression (e.g. Wosten et al., 1999), and a more recent one is artificial

neural networks (e.g. Schaap et al., 1998). Other modern numerical and statistical methods

can also be applied such as Generalised Linear Models (McCullagh and Nelder, 1989),

General Additive Models (GAM, Hastie, 1992), the Group Method of Data Handling

(GMDH, Farrow, 1984; Pachepsky and Rawls, 1999), Multiple Adaptive Regression

Splines (MARS, Friedman, 1991), and regression trees (Breiman et al., 1984).

5.1. Regression trees

A regression tree is a special type of decision tree that can predict continuous variables.

As an example, we present a regression tree from the data of Forrest et al. (1985). We wish

to predict log10(Ks) from field texture, clay content and bulk density. Using the regression-

tree function from the program S-plus (Mathsoft, 1999), we arrive at the model shown in

Fig. 2a. Although the prediction is a continuous variable, the predicted value is a constant

for each node (leaf). The sharp cut-off for bulk density (1.43 Mg m 3) in the clay

texture grade results in an abrupt prediction ofKs

. The difference in predicted Ks

for claysoil with bulk density less or greater than 1.43 Mg m 3 is about 35 fold (4 and 158 mm

h 1). The response surface can be seen in Fig. 3a.

An alternative to improve the regression trees is to build multivariate linear models in

each node (leaf). This type of model tree is analogous to using piecewise linear functions

(Quinlan, 1992). Using the same data, a more continuous hybrid regression-tree model, as

computed using the program Cubist (RuleQuest Research, 2000), is given in Fig. 2b. The

surface of this prediction is shown in Fig. 3b. Although the hybrid regression-tree model

allows a smooth transition from bulk density 1.3 to 1.4 Mg m 3, both methods still

exhibit a sharp cut-off at a clay content of 48%.

To further improve the prediction, we could apply fuzzy membership to allow asmoother transition from one class to another (Jang, 1997). Here we use the sigmoidal

membership function for clay content:

mx>c

0, if xVcw

12

xcww

2

if cw< xVc

1 12

cwxw

2if c< xVcw

1, if x> cw

8>>>>>>>>>>>>>>>:

8



12/33

wheremx>cis the membership function for the class at which clay content >c, andw is the

parameter controlling the width of the transition. The membership for the class where

clay < cis given by 1 mx>c. The prediction is the sum of the predictive function at each

class weighted by its membership. The resulting prediction is given in Fig. 3c.

5.2. Fuzzy systems

An alternative to using continuous functions is to use fuzzy systems. Torri et al.

(1997) defined the eight membership functions of the soil erodibility factor K as a

function of geometric mean particle diameter, clay content and organic matter content.

The predicted K from each input variable is aggregated and the centroid of the output is

computed. The adaptive neuro-fuzzy inference system (ANFIS; Jang, 1993) can be

applied as a further alternative technique. Similar to neural networks, the neuro-fuzzy

system constructs inputoutput membership function relationships. The fuzzy system isset to learn from the training data by adjusting the parameters of the membership

functions using the least-squares method. The trained system can be used as a PTF to

Fig. 2. (a) A regression tree to predict saturated hydraulic conductivity from field texture grade (Text: S = sand,

LS = loamy sand, L = loam, CL = clay loam, LiC = Light clay, C = clay), bulk density (BD), and clay content.

Values in nodes are the predicted log10(Ks) (in mm/day), values underneath the nodes are the standard deviation of

prediction. (b) A hybrid regression tree with continuous piecewise functions.



13/33

predict properties of an unknown soil sample. An example is given in Fig. 4a, where we

used the anfis function of the fuzzy logic toolbox from the program Matlab (Mathworks,

1998). Using the same data as in Section 5.1., we predict saturated hydraulic con-

ductivity (Ks) as a function of clay content and bulk density. Three classes with

membership functions for each input variable were generated and with the fuzzy rules

of the Takagi Sugeno inference system (Takagi and Sugeno, 1985). Each of the

membership functions is correlated with a linear model output. The memberships of

all classes were aggregated and defuzzified to obtain an estimate of log10(Ks). The outputof this model as a function of clay content and bulk density is shown in Fig. 4b, which is

more variable when compared with the result given by the regression tree. Because there

is a correlation between bulk density and clay content, the training data only covers part

of the response surface (represented by a line surrounding the data points); hence,

prediction outside the training set is not accurate.

The problem of using this approach is the large number of parameters required to build

a fuzzy system; each input variable requires a certain number of membership functions and

each function is made up of two to three parameters. The output model also consists of

linear functions of the input membership values. However, the fuzzy system can be

potentially useful for incorporating structure classes or grades. For example, PTFspredicting Ks from structure classes of Lilly (2000) can be extended by incorporating

membership functions to the structure classes.

Fig. 3. A comparison of saturated hydraulic conductivity prediction as a function of clay and bulk density for a

light clay using the various regression trees with terminal nodes characterised by: (a) constants, (b) linear

functions, and (c) fuzzy linear functions.



14/33

5.3. Parametric PTFs

Looking at the type of data we wish to predict, PTFs can also be classified as single

point regressions, parametric and physico-empirical PTFs. Single point PTFs predict a soilproperty, while parametric PTFs predict parameters of a model. A parametric PTF is

usually a relationship between dependent (x) and independent variables (y), e.g. the water

Fig. 4. (a) A fuzzy system that consists of three rules of a membership function for each predicted variables to

predict saturate hydraulic conductivity (log10 (mm/day)). The system calculates the membership of the input

variables for each of the rule, then apply fuzzy operator to obtain the membership of the output function. The

output for each rule is calculated according to a linear model, and the prediction is the weighted average of each

output. (b) Surface of the prediction using the fuzzy system, the dots represent the training data. Area surrounded

by the line represents where prediction is more accurate.



15/33

retention curve (water content and potential relationship). It assumes that a model

y=f(x,p), which is a closed-form equation with parameters p, can adequately represent

the data.

Estimating the parameters of a soilwater retention function is the most commonlyformulated parametric PTF. Nevertheless, other relationships have also been studied, e.g.

phosphorous sorption characteristics (Scheinost and Schwertmann, 1995), potassium

characteristics (Scheinost et al., 1997a), soil mechanical resistance (Da Silva and Kay,

1997), and the soil shrinkage curve (Crescimanno and Provenzano, 1999).

The usual steps for deriving parametric PTFs are estimating the parameters of a model

by fitting it to data, and then forming empirical relationships between basic soil properties

and the parameters. The latter step can be achieved by various mathematical methods, e.g.

multiple linear regression, or artificial neural networks. However, many authors have

reported difficulties when trying to correlate the parameters to the basic soil properties.

This is because the parameters are merely fitting coefficients and could be highly

correlated. Different authors have come up with alternative solutions.

Williams et al. (1992) assumed a power function model for the water retention curve

and developed PTFs from texture and organic matter content. The prediction variables

were inserted into the water retention model and the parameters were estimated using

multiple linear regression. Their model is:

lnhh a0 a1v1a2v2. . . c0 c1v1c2v2. . .lnh

where a, c are parameters and vare the predictor variables.On the other hand, Scheinost et al. (1997b) proposed the following approach: set-up the

expected relationship between the parameters of the model and soil properties, and then

insert the relationship into the model and estimate the parameters of the relationship by

fitting the extended model using nonlinear regression. Minasny and McBratney (2002b)

also suggested a similar approach using neural networks. As a general guide, parametric

PTFs should be modelled to fit the independent variables rather than the estimated

parameters, as is illustrated in Fig. 5.

Fig. 5. A scheme for deriving parametric PTFs.



16/33

Other approaches that can be used to predict the water retention curve are interpola-

tionextrapolation using one or two points from a curve (McQueen and Miller, 1974), or

using a one-parameter function (Gregson et al., 1987).

6. A scheme for developing new PTFs

Knowing whether to develop new pedotransfer functions requires the answer to the

question: Under what circumstances might one wish to develop a new PTF?. The

answers to this question could be: I have a model and it needs certain parameters. Do I

have them? Do I need PTFs?

A scheme is illustrated in Fig. 6, where one might wish to perform the following steps:

Literature search, to search for available pedotransfer functions. Database compilation: search for an existing, or create a new, database.

Fig. 6. A scheme for developing pedotransfer functions.



17/33

Search for the best numerical method, including grouping the soil properties based

on, e.g. geographic location, texture, or functional layer. Generate the new pedotransfer functions and perform uncertainty analysis.

7. Soil inference systems

7.1. Theory

Currently, there are many similar pedotransfer functions generated using new or

existing datasets. This seems to be a continuing process, while there seems to be much

less effort in gathering and using the available PTFs. The only approach to utilizing

published PTFs is to compile the available PTFs that predict soil hydraulic properties in a

computer program, such as SoilPar (Donatelli et al., 1996) and SH-Pro (Cresswell et al.,

2000). These programs simply predict hydraulic properties given the required inputs and

do not have a system that can select which function to use that will produce estimates with

the minimum variance.

Here we propose the concept of soil inference system (SINFERS), where pedotransfer

functions are the knowledge rules for soil inference engines. A soil inference system takes

measurements we more-or-less know with a given level of (un)certainty, and infers data

that we do not know with minimal inaccuracy, by means of properly and logically

conjoined pedotransfer functions.

Dale et al. (1989) discussed the role of expert systems in soil classification, and similarprinciples can be applied to the proposed inference system. This is illustrated in Fig. 7,

where our system has a source, an organiser and a predictor. The sources of knowledge to

predict soil properties are collections of pedotransfer functions and soil databases. The

organiser arranges and categorizes the pedotransfer functions with respect to their required

inputs and the soil types from which they were generated. The inference engine is a

Fig. 7. A soil inference system (modified and updated from Dale et al., 1989, Fig. 1).



18/33

collection of logical rules selecting the pedotransfer functions with the minimum variance.

The rules can simply be a collection of ifthen statements, or based on probabilistic

Bayesian inference. Uncertainty of the prediction can be assessed using Monte Carlo

simulations. The inference system operates through a user interface that will return thepredictions of soil physical and chemical properties with their uncertainties based on the

information provided.

The first approach we have taken towards building a soil inference system using a

collection of PTFs is to create a very rudimentary system in the form of a specially adapted

MicrosoftR Excel spreadsheet (see Table 3), which has no formal interface. It has two

essentially new features; firstly, it contains a suite of published pedotransfer functions. The

output of one PTF can act as the input to another functions (if no measured data are

available). Secondly, the uncertainties in estimates are inputs and the uncertainties of

subsequent calculations are performed.

In the first column, the spreadsheet consists of the essential soil properties, followed

by their estimates. Typically, hydraulic PTFs require four basic soil properties: clay and

sand content, bulk density and organic carbon content. The steps in SINFERS are the

following.

(1) Make sure that there are estimates for clay and sand content, bulk density and

organic carbon content.

(2) If one or more of these properties are not available, then estimate them from the

available information using a PTF. For example, clay and sand content can be estimated

from the average values of a texture class, and bulk density can be estimated from sand

and clay content.(3) Once these values have been obtained, other derived values can be calculated

arithmetically, such as the silt content (silt % = 100 clay % sand %), the geometricmean of particle diameter, the sand-sized fraction according to the USDA system, etc.

(4) The final step is to predict more difficult-to-measure soil properties such as the

water retention curve, hydraulic conductivity, soil strength, cation exchange capacity, P

buffering capacity and pH of buffering capacity.

Uncertainty can be quantified in terms of the model uncertainty and input uncertainty. If

the PTFs are trained/calibrated adequately, the uncertainty in the model is usually smaller

than the uncertainty in the inputs. In the case where a predictor in a PTF is not measured

but estimated from other PTFs, the overall uncertainty is important. Using the risk-analysissoftware @RISK (Palisade, 2000), which is attached to the MS Excel spreadsheet as an

add-in, the overall uncertainties in the prediction can be calculated. The user can define the

distribution for each input variable and their correlations. We use a normal distribution for

the input variables, but some distributions need to be truncated to avoid negative values

(such as clay and organic matter content). This probably suggests that these variables have

a logistic normal distribution, i.e. normal after log-ratio transformation. The program

@RISK uses Latin hypercube sampling to sample the multivariate joint distribution of the

input variable and a Monte Carlo simulation to calculate the distribution of the prediction.

This is achieved by sampling repeatedly from the assumed probability distribution of the

input variables and evaluating the result of the PTF for each sample. The distribution ofthe results, along with the mean, standard deviation and other statistical measures can then

be estimated.



19/33

Table 3

Soil properties measured and predicted from the soil inference system

Soil Properties Units Measured/

estimated

References

Clay


20/33

Table 4

Predicted soil properties of a clay using the soil inference system (Example 1)

Soil properties Measurement SD meas. Primary

Estimate SD model SD input % Error

Clay < 2 60 1 60.00 1.00 1.7

Silt 2 20 24 1 24.00 0.62 2.6

Silt 2 50 33.76 0.728 0.88 2.6

Sand 20 2000 16 1 16.00 1.00 6.3

Sand 50 2000 6.24 0.728 0.58 9.3

dg 0.01

sg 12.08

Bulk density 1.30 0.05 1.30 0.05 3.8

Particle density 2.65 0.05 2.65

Porosity 0.51 0.02 4.3

h 10 kPa 0.448 0.007 0.008 1.7h 33 kPa 0.426 0.006 0.007 1.7h 1500 kPa 0.313 0.007 0.003 1.0AWC 0.135 0.009 0.009 6.8

Air capacity (at 10 kPa) 0.061 0.015 25.0

Ks 0.714 1.305 0.204 28.6

Water retention

hr 0.006 0.002 28.9

hs 0.534 0.010 1.8

a 0.029 0.006 19.5n 1.078 0.002 0.2

m 0.072

h 10 kPa 0.526 0.008 1.5h 1500 kPa 0.399 0.005 1.3AWC 0.126 0.007 5.7

Tillage h

h OPT 0.441 0.009 1.9

h dry 0.417 0.008 1.9

h wet 0.478 0.009 1.9

h Tillage range 0.061 0.009 14.9

Soil resistance

c1 0.0000

c2 7.69c3 9.63

SR at 10 kPa 0.06 10000 0.038 63.5SR at 1500 kPa 0.93 10000 0.395 42.3h at 3 MPa 0.27 10000 0.015 5.5

LLWR

lower limit 0.41 0.022 5.3

upper limit 0.31 0.003 1.0LLWR 0.10 0.023 23.8



21/33

The inference engine will work in the following manner:

1. Predict all the soil properties using all possible combinations of inputs and PTFs.

2. Select the combination that leads to a prediction with the minimum variance.

7.2. Examples

7.2.1. Example 1

We present three examples of the use of the rudimentary soil inference system,

SINFERS, to predict some important soil physical and chemical properties. In the first

example, we wish to predict important soil chemical and physical properties, especially

hydraulic properties with respect to the availability of water to plants and water transport

in a clay soil. The data available from the laboratory measurement of this soil are particle-

size distribution (sand, silt, and clay content), bulk density, organic carbon content andfield pH (Table 4). These inputs were placed in the estimate column along with their

standard deviation (r).

The program calculates physical properties such as water retention at 10, 33 and 1500 kPa from point regression PTFs (Minasny et al., 1999) and also the van Genuchtenwater retention parameters using neural networks (Minasny and McBratney, 2002b).

Using the point regression method, the inference system predicts an available water

capacity of 0.135 m3 m 3, while using the parametric PTFs (which used the van

Genuchten function) the value is 0.126 m3 m 3. It is preferable to use point regression,

because the uncertainty is smaller (5.7% as opposed to 6.8%), and it is calibrated on the

specific potentials while the parametric PTF is a compromise to predict the whole curve.The predicted van Genuchten parameters can be used to estimate other useful proper-

ties, such as the optimum water content for tillage (Dexter and Bird, 2001). The optimum h

Table 4 (continued)

Soil properties Measurement SD meas. Primary

Estimate SD model SD input % Error

Organic C (Walkley & Black) 1.10 0.5 1.10 0.50 45.5

Total C 1.43 0.67 46.6

OM 2.51 1.17 46.6

CEC 475.8 3.28 11.99 2.5

pH field 6.0 0.1 6.00 0.10 1.7

pH 1:5 water 6.11 0.06 1.0

pH 1:5 CaCl2 5.03 0.06 1.3

pH buffering capacity 32.24 3.49 10.8

Net acid addition rate 2.5Time for pH to drop by 1 unit 25.15 2.69 10.7

Critical P level 4.09 0.141 3.4

P buffer coefficient 0.13 0.004 2.8



22/33

for tillage is defined as the inflection point of the water retention curve, with the lower

(dry) and upper (wet) limits calculated according to Dexter and Bird (2001). The plastic

limit occurs at a water content of 0.44 m3 m 3, and the optimal range for cultivation is

between 0.42 and 0.49 m3 m 3, which is drier than field capacity (0.53 m3 m 3).The least limiting water range (LLWR) (Letey, 1985) is the range in soil water content

least limiting to plants. The limit at the upper end is defined by the soil water content at a

value of non-limiting air-filled porosity (0.10 m3 m 3) or at field capacity ( 10 kPa),whichever is smaller, and at the lower end by the water content at a value of non-limiting

soil strength (3.0 MPa) or the water content at wilting point ( 1500 kPa), whichever islarger. Soil strength at field capacity and wilting point were evaluated using the soil

mechanical resistance PTFs of Da Silva and Kay (1997):

Rs c1hc2

qc

3 R2

0:86, n 256 9

with the following values forc1, c2, c3:

c1 exp3:670:765OC0:145P


23/33

Other chemical properties that can be predicted are soil pH buffering capacity (pHBC;

Noble et al., 1997):

pHBC mmol H

kg1

pH1

6:380:08P


24/33

Table 5

Predicted soil properties of a clay loam using the soil inference system (Example 2)

Soil Properties Measurement SD Primary Secondary

meas. Estimate SDinput

% Errorinput

Estimate SDmodel

SDinput

% Errorinput

Clay < 2 38 8 38.00 8.00 21.1 38.00 8.00 21.1

Silt 2 20 35 5 35.00 5.25 15.0 35.00 5.25 15.0

Silt 2 50 52.91 5.95 11.2 52.91 0.406 5.95 11.2

Sand 20 2000 27 8 27.00 8.00 29.6 27.00 8.00 29.6

Sand 50 2000 9.09 5.09 56.0 9.09 0.406 5.09 56.0

Bulk density 1.28 0.07 5.3 1.28 0.028 0.07 5.3

Particle density 2.65 0.05 2.65 2.65

Porosity 0.52 0.03 5.2 0.52 0.03 5.2

h 10 kPa 0.440 0.005 0.021 4.8h 33 kPa 0.401 0.005 0.026 6.6h 1500 kPa 0.255 0.004 0.029 11.3AWC 0.185 0.006 0.015 7.9

Air capacity

(at 10 kPa)0.077 0.013 17.5

Ks 5.517 1.2147 5.007 90.8

Water retention

hr 0.017 0.007 42.5 0.004 0.006 131.0

hs 0.497 0.036 7.2 0.496 0.027 5.4

a 0.053 0.003 5.1 0.223 0.007 3.1n 1.150 0.047 4.1 1.100 0.020 1.8

h 10 kPa 0.473 0.038 8.1 0.444 0.028 6.4h 33 kPa 0.435 0.039 8.9 0.403 0.031 7.7h 1500 kPa 0.265 0.039 14.6 0.279 0.037 13.4AWC 0.207 0.044 21.4 0.165 0.016 9.5

Tillage h

h optimal 0.379 0.027 7.3 0.396 0.030 7.6

h dry 0.341 0.042 12.4 0.369 0.034 9.3

h wet 0.426 0.032 7.4 0.436 0.029 6.6

h Tillage range 0.085 0.030 35.6 0.067 0.013 19.5

Soil resistance

c1 0.0002

c2 4.96c3 7.51

SR at 10 kPa 0.09 3.015 0.096 107.5SR at 1500 kPa 1.34 3.579 1.074 80.0h at 3 MPa 0.22 0.028 12.8

LLWR

lower limit 0.42 0.027 6.5

upper limit 0.26 0.028 10.9LLWR 0.16 0.023 14.3



25/33

constructed inference engine should recognise these possibilities and logically select the

PTF that generates the smallest variance. Ideally, the inference engine should trace back

through the process to generate the target variable with minimum variance.

The soil strength at wilting point predicted from the PTF of Da Silva and Kay (Eq. (9))

is 1.34 MPa, which is close to the potential itself ( 1.5 MPa). Since the soil is a clayloam, it partly belongs to the training set of this PTF with outlier membership m*=0.53.

Hence, prediction using this PTF for the clay loam is more accurate than using it for the

clay soil (Example 1).Since the prediction is based on a field texture, the uncertainty in clay and sand content

estimates are high, and therefore the uncertainty of prediction due to error in the input

variables is higher than Example 1, which uses input from laboratory measurements.

7.2.3. Example 3

An ideal inference system would have a user interface, be populated with initial values

and would return the minimum variance prediction of desired quantities. In this case, the

input interface would prompt for:

1. The contextual (general) information. Basically, the region, soil class or textureclass of interest. Depending on what is known, the mean values for up to three

would be generated and the minimum variance estimate filled in.

Table 5 (continued)

Soil Properties Measurement SD Primary Secondary

meas.Estimate SD

input

% Error

input

Estimate SD

model

SD

input

% Error

input

Organic C

(Walkley &

Black)

1.10 0.5 1.10 0.50 45.5 1.10 0.50 45.5

Total C 1.43 0.67 46.8 1.43 0.67 46.8

OM 2.51 1.17 46.8 2.51 1.17 46.8

CEC 290.63 69.34 23.9 290.63 2.315 69.34 23.9

pH field 6 0.2 6.00 0.20 3.3 6.00 0.20 3.3

pH 1:5 water 6.11 0.12 2.0 6.11 0.12 2.0

pH 1:5 CaCl2 5.03 0.13 2.5 5.03 0.13 2.5

pH buffering

capacity

31.33 6.65 21.2 31.33 6.65 21.2

Net acid

addition rate

2.5

Time for pH to

drop by 1 unit

24.06 4.33 18.0 24.06 4.33 18.0

Critical P level 8.56 2.45 28.6 8.56 2.45 28.6

P buffer

coefficient

0.24 0.05 23.0 0.24 0.05 23.0



26/33

Table 6

Predicted soil properties of an average soil using the soil inference system (Example 3)

Soil properties Measurement SD Primary Secondary

Estimate SDinput

% Errorinput

Estimate SDmodel

SDinput

% Error

Clay < 2 29 17.3 29.00 17.30 59.2 29.00 17.30 59.2

Silt 2 20 17 12 17.00 12.49 72.7 17.00 12.49 72.7

Silt 2 50 36.48 18.80 51.5 36.48 0.407 18.80 51.5

Sand 20 2000 54 21.5 54.00 21.50 40.1 54.00 21.50 40.1

Sand 50 2000 34.29 23.02 67.1 34.29 0.407 23.02 67.1

Bulk density 1.45 0.21 1.49 0.10 6.8 1.49 0.10 6.8

Particle density 2.65 0.05 2.65 0.05 1.9 2.65 0.05 1.9

Porosity 0.44 0.04 9.1 0.44 0.04 9.1

h 10 kPa 0.33 0.10 0.330 0.10 30.3 0.351 0.004 0.080 22.8h 33 kPa 0.29 0.10 0.290 0.10 34.5 0.309 0.003 0.083 26.8h 1500 kPa 0.18 0.09 0.180 0.09 51.4 0.186 0.003 0.070 37.8AWC 0.12 0.05 0.120 0.05 41.7 0.165 0.005 0.030 17.9

Air capacity

(at 10 kPa)0.108 0.11 103.0 0.087 0.063 72.1

Ks 12.68 12.68 12.68 12.68 100.0 9.406 1.138 37.33 396.8

Water retention

hr 0.05 0.08 0.052 0.028 54.8 0.018 0.041 226.1

hs 0.41 0.08 0.429 0.064 14.9 0.418 0.053 12.7

a 0.276 0.441 0.069 0.016 22.6 0.203 0.025 12.3n 1.215 1.264 1.175 0.081 6.9 1.150 0.061 5.3

h 10 kPa 0.402 0.080 20.0 0.361 0.077 21.4h 1500 kPa 0.219 0.077 35.1 0.187 0.075 40.2AWC 0.183 0.047 25.6 0.173 0.025 14.5

Tillage h

h optimal 0.330 0.058 17.5 0.320 0.053 16.5

h dry 0.296 0.091 30.7 0.288 0.086 29.8

h wet 0.369 0.059 16.1 0.359 0.053 14.6

h Tillage range 0.073 0.051 70.3 0.071 0.045 62.9

Soil resistance

c1 0.0011 0.0011

c2 3.82 3.82c3 6.66 6.66

SR at 10 kPa 0.41 1.50 366.6 0.82 2.119 1.427 173.8SR at 1500 kPa 4.15 15.21 366.4 9.22 2.423 12.817 138.9h at 3 MPa 0.25 0.088 35.1 0.25 0.087 34.8

LLWR

lower limit 0.33 0.067 20.2 0.34 0.070 20.7

upper limit 0.25 0.085 34.1 0.25 0.081 32.4

LLWR 0.08 0.107 134.2 0.09 0.075 85.2



27/33

2. Any better estimates of any of the variables would be filled in along with the

uncertainty, e.g. field pH or laboratory-measured clay. The system should have

standardised uncertainty estimates for routine laboratory methods.

3. The target variables are defined.

In the third example (Table 6), for illustrative purposes we will assume that we do not

have any information at all. The mean sand and clay contents from our Australian soil

database are 33% and 55%, with a mean organic carbon content of 13.8 g kg 1 (Baldock

and Skjemstad, 1999) and a pH of 5.9. As in the second example, we need two steps toestimate the soil properties. The estimate column contains average values of soil properties

from the Australian database. The first prediction takes input from the estimate column and

makes a prediction of all soil properties. The second prediction uses predicted values from

the first prediction (e.g. bulk density and CEC) and makes a prediction of all soil properties.

The resulting predictions show that the uncertainties are very high when using input

information that has very large variances. Uncertainty in the prediction of water retention

ranges from 15% to 40%. Some predictions have lower uncertainty when using the

average value from the soil database compared to using PTFs (such asKs), but most of the

other properties have lower uncertainties when predicted using PTFs.

The soil strength at wilting point is 9.2 MPa, which is much higher than the waterpotential itself. This average soil is within the training set of Da Silva and Kays PTF (Eq.

(9)), with outlier membership ofm* = 0.08; hence, the prediction is believed to be accurate.

Table 6 (continued)

Soil properties Measurement SD Primary Secondary

Estimate SD

input

% Error

input

Estimate SD

model

SD

input

% Error

Organic C

(Walkley &

Black)

1.38 1 1.38 1.00 72.2 1.38 1.00 72.2

Total C 1.80 1.10 61.3 1.80 1.10 61.3

OM 3.16 1.93 61.3 3.16 1.93 61.3

CEC 345 155 218.92 136.89 62.5 218.92 2.905 136.89 62.5

pH field 6.0 1 6.00 1.00 16.7 6.00 1.00 16.7

pH 1:5 water 6.11 0.62 10.1 6.11 0.62 10.1

pH 1:5 CaCl2 5.03 0.63 12.6 5.03 0.63 12.6

pH buffering

capacity

17.83 10.50 58.9 17.83 10.50 58.9

Net acid

addition rate

2.5

Time for pH to

drop by 1 unit

15.93 7.87 49.4 15.93 7.87 49.4

Critical P level 11.50 6.35 55.2 11.50 6.35 55.2

P buffer

coefficient

0.30 0.13 43.6 0.30 0.13 43.6



28/33

8. General discussion and conclusions

We have shown a first approach to a soil inference system. Using pedotransfer

functions as the knowledge rules for an inference engine, we can predict various importantphysical and chemical properties from the limited information we have. Although it is not

a full inference engine, we have built the essential frame. As a metaphor, by means of an

engine, a small force can be applied to move a much greater resistance, or load. In doing

so, however, the applied force must move through a much greater distance than it would if

it could move the load directly. The mechanical advantage (MA) of the engine is the factor

by which it multiplies any applied force. The inference engine has information advantage.

We foresee an inference system with a user interface that prompts questions such as

what soil properties do you have? The system should have a database of mean soil

properties, such as particle-size distribution and organic matter content, for different soil

types. The inference engine has knowledge rules that will determine what functions to use

realizing their uncertainties. The output will be the predicted physical and chemical

properties along with their uncertainties. This prediction can be based on classes, where

soil properties are obtained for different classes such as texture or soil type, or another

prediction based on measurements. This can be incorporated into a spatial framework, where

a point in space can be predicted from the neighbourhood basic soil properties or soil class.

Bouma (1989) defined pedotransfer functions in terms of data translation. We can

describe this translation function as information. This information, when properly and

logically conjoined, constitutes knowledge. Knowledge can generate various data. Soil

inference systems take measurements we know with a certain precision and infer proper-ties we do not know with given precision, by means of properly and logically conjoined

pedotransfer functions.

Most of the important soil physical properties can be predicted by PTFs, but only a few

soil chemical property PTFs exist and little work has been done on soil biological properties.

Acknowledgements

We thank the Australian Cotton Cooperative Research Centre for funding this work. We

dedicate this paper to Professor Johan Bouma for his prescience in recognizing and

formalizing the concept of pedotransfer functions.

References

Arya, L.M., Paris, J.F., 1981. A physicoempirical model to predict soil moisture characteristics from particle-size

distribution and bulk density data. Soil Science Society of America Journal 45, 10231030.

Baldock, J.A., Skjemstad, J.O., 1999. Soil organic carbon/soil organic matter. In: Peverill, K.I., Sparrow, L.A.,

Reuter, D.J. (Eds.), Soil Analysis: An Interpretation Manual. CSIRO Publishing, Collingwood, Australia, pp.

159170.

Batjes, N.H., 1996. Development of a world data set of soil water retention properties using pedotransfer rules.

Geoderma 71, 3152.

Bell, M.A., van Keulen, H., 1995. Soil pedotransfer functions for four Mexican soils. Soil Science Society of

America Journal 59, 865871.



29/33

Bloemen, G.W., 1977. Calculation of capillary conductivity and capillary rise from grainsize distribution. ICW

Wageningen note no. 952, 962, 1013.

Bloemen, G.W., 1980. Calculation of hydraulic conductivities from texture and organic matter content. Zeitschrift

fur Pflanzenernahrung und Bodenkunde 143, 581 605.

Bouma, J., 1989. Using soil survey data for quantitative land evaluation. Advances in Soil Science 9, 177

213.

Bouma, J., van Lanen, H.A.J., 1986. Transfer functions and threshold values: from soil characteristics to land

qualities. Quantified Land Evaluation Procedures, Proceedings of the International Workshop on Quantified

Land Evaluation Procedures, 27 April2 May, Washington, DC.

Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification and Regression Trees. Wadsworth,

Belmont, CA.

Briggs, L.J., McLane, J.W., 1907. The moisture equivalent of soils. USDA Bureau of Soils Bulletin 45, 123.

Briggs, L.J., Shantz, H.L., 1912. The wilting coefficient and its indirect measurement. Botanical Gazette 53, 20 37.

Canarache, A., Motoc, E., Dumitriu, R., 1968. Infiltration rate as related to hydraulic conductivity, moisture

deficit and other soil properties. In: Rijtema, P.E., Wassink, H. (Eds.), Water in the Unsaturated Zone.

Proceedings of the Wageningen Symposium, vol. 1, pp. 392401.Carsel, R.F., Parrish, R.S., 1988. Developing joint probability distributions of soil water retention characteristics.

Water Resources Research 24, 755769.

Clapp, R.B., Hornberger, G.M., 1978. Empirical equations for some soil hydraulic properties. Water Resources

Research 14, 601604.

Cornelis, W.M., Ronsyn, J., Van Meirvenne, M., Hartmann, R., 2001. Evaluation of pedotransfer functions for

predicting the soil moisture retention curve. Soil Science Society of America Journal 65, 638 648.

Cox, F.R., 1994. Current phosphorous availability indices: characteristics and shortcomings. In: Havlin, J.L.,

Jacobsen, J.S. (Eds.), Soil Testing: Prospects for Improving Nutrient Recommendation. SSSA Special Pub-

lications, vol. 40. Soil Science Society of America, Madison, WI, pp. 101113.

Crescimanno, G., Provenzano, G., 1999. Soil shrinkage characteristic curve in clay soils. Soil Science Society of


Cresswell, H.P., Paydar, Z., 1996. Water retention in Australian soils: I. Description and prediction using para-metric functions. Australian Journal of Soil Research 34, 195212.

Cresswell, H.P., Pierret, C., Brebner, P., Paydar, Z., 2000. The SH-Pro V1.03 Software for Predicting and

Analysing Soil Hydraulic Properties. CSIRO Land and Water, Canberra, Australia.

Dale, M.B., McBratney, A.B., Russell, J.S., 1989. On the role of expert systems and numerical taxonomy in soil

classification. Journal of Soil Science 40, 223234.

Da Silva, A., Kay, B.D., 1997. Estimating the least limiting water range of soils from properties and management.

Soil Science Society of America Journal 61, 877883.

De Gruijter, J.J., McBratney, A.B., 1988. A modified fuzzy k-means method for predictive classification. In:

Bock, H.H. (Ed.), Classification and Related Methods of Data Analysis. Elsevier, North-Holland.

De Jong, R., Reynolds, W.D., 1995. Methodology for predicting agrochemical leaching on a watershed basis.

Site-Specific Management for Agricultural Systems. Proceedings of Second International Conference, Min-

neapolis, 2730 March 1994. American Society of Agronomy, Madison, WI, pp. 795818.De Vries, D.A., 1966. Thermal properties of soils. In: van Wijk, W.R. (Ed.), Physics of Plant Environment, 2nd

edn. North-Holland, Amsterdam, pp. 231234.

De Wit, C.T., van Keulen, H., 1975. Simulation of Transport Processes in Soils, 2nd edn. Pudoc, Wageningen.

Dexter, A.R., Bird, N.R.A., 2001. Methods for predicting the optimum and the range of soil water contents for

tillage based on the water retention curve. Soil and Tillage Research 57, 203212.

Donatelli, M., Acutis, M., Kosovan, S., 1996. SOILPAR, Soil Parameters Estimate v1.1. ISA ModenaMiR-

AAF. DASGTUniv. Torino.

Droogers, P., Bouma, J., 1997. Soil survey input in exploratory modelling of sustainable soil management

practices. Soil Science Society of America Journal 61, 1704 1710.

Efron, B., Tibshirani, R.J., 1993. An introduction to the bootstrap. Monographs on Statistics and Applied

Probability, vol. 57. Chapman & Hall, London, UK.

Farrow, S.J., 1984. The GMDH algorithm. In: Farrow, S.J. (Ed.), Self-organizing Methods in Modeling: GMDH

Type Algorithms. Marcel-Dekker, New York, pp. 124.



30/33

Fernandez, R.N., Schulze, D.G., Coffin, D.L., van Scoyoc, G.E., 1988. Color, organic matter, and pesticide

adsorption relationships in a soil landscape. Soil Science Society of America Journal 52, 10231026.

Forrest, J.A., Beatty, J., Hignett, C.T., Pickering, J.H., Williams, R.G.P., 1985. A Survey of the Physical

Properties of Wheatland Soils in Eastern Australia. CSIRO Australia Division of Soils, Divisional Report

No. 78.

Friedman, J.H., 1991. Multivariate Adaptive Regression Splines. Annals of Statistics 19, 1141.

Gilkes, R.J., Hughes, J.C., 1994. Sodium fluoride pH of south-western Australian soils as an indicator of P-

sorption. Australian Journal of Soil Research 32, 755766.

Gregson, K., Hector, D.J., McGowan, M., 1987. A one-parameter model for the soil water characteristic. Journal

of Soil Science 38, 483486.

Gupta, S.C., Larson, W.E., 1979. Estimating soil water retention characteristics from particle size distribution,

organic matter content, and bulk density. Water Resources Research 15, 16331635.

Hall, D.G., Reeve, M.J., Thomasson, A.J., Wright, V.F., 1977. Water retention, porosity and density of field soils.

Technical Monograph, vol. 9. Soil Survey of England and Wales, Harpenden.

Harper, R.J., Gilkes, R.J., 1994. Soil attributes related to water repellency and the utility of soil survey for

predicting its occurrence. Australian Journal of Soil Research 32, 1109 1124.Haskett, J.D., Pachepsky, Y.A., Acock, B., 1995. Estimation of soybean yields at county and state level using

GLYCIM: a case study for Iowa. Agronomy Journal 87, 926931.

Hastie, T.J., 1992. Generalized additive models. In: Chambers, J.M., Hastie T.J. (Eds.). Statistical Models in S.

Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, pp. 249307.

Haverkamp, R., Zammit, C., Bouraoui, F., Rajkai, K., Arrue, J.L., Heckman, N., 1997. GRIZZLY, Grenoble Soil

Catalogue. Soil survey of field data and description of particle size, soil water retention and hydraulic

conductivity functions. Laboratoire dEtude des Transfers en Hydrologie et Environnement, LTHE,

UMR5564, CNRS, INPG, ORSTOM, UJF, BP 53, 38041 Grenoble Cedex 09, France.

Helyar, K.R., Cregan, P.D., Godyn, D.L., 1990. Soil acidity in New South Walescurrent pH values and

estimates of acidification rate. Australian Journal of Soil Research 28, 523527.

Heuvelink, G.B.M., 1998. Error Propagation in Environmental Modelling with GIS. Taylor & Francis, London.

Jang, J.-S.R., 1993. ANFIS: Adaptive-network based fuzzy inference systems. IEEE Transactions on Systems,Man, and Cybernetics 23, 665685.

Jang, J.-S.R., 1997. Classification and regression trees. In: Jang, J.-S.R., Sun, C.-T., Mizutani, E. (Eds.), Neuro-

Fuzzy and Soft Computing. A Computational Approach to Learning and Machine Intelligence. Prentice Hall,

Upper Saddle River, NJ, pp. 403422.

Jones, C.A., 1984. Estimation of percent aluminium saturation from soil chemical data. Communications in Soil

Science and Plant Analysis 15, 327335.

Kleinman, P.J.A, Bryant, R.B., Reid, W.S., 1999. Development of pedotransfer functions to quantify phosphorus

saturation of agricultural soils. Journal of Environmental Quality 28, 20262030.

Lamp, J., Kneib, W., 1981. Zur quantitativen Erfassung und Bewertung von Pedofunktionen. Mitteilungen der

Deutschen Bodenkundlichen Gesellschaft 32, 695711.

Letey, J., 1985. Relationship between soil physical properties and crop production. Advances in Soil Science 1,

277294.Lilly, A., 2000. The relationship between field-saturated hydraulic conductivity and soil structure: development of

class pedotransfer functions. Soil Use and Management 16, 5660.

Little, I.P., 1992. The relationship between soil pH measurements in calcium chloride and water suspensions.

Australian Journal of Soil Research 30, 587592.

Mathsoft, 1999. S-Plus 2000 Users Guide. Data Analysis Products Division, Mathsoft, Seattle, WA.

Mathworks, 1998. Fuzzy Logic Toolbox Users Guide. The Mathworks, Natick, MA.

Mayr, T., Jarvis, N.J., 1999. Pedotransfer functions to estimate soil water retention parameters for a modified

BrooksCorey type model. Geoderma 91, 19.

Mbagwu, J.S.C., Abeh, O.G., 1998. Prediction of engineering properties of tropical soils using intrinsic pedo-

logical parameters. Soil Science 163, 93102.

McBride, R.A., Joosse, P.J., 1996. Overconsolidation in agricultural soils: II. Pedotransfer functions for estimat-

ing preconsolidation stress. Soil Science Society of America Journal 60, 373380.

McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models. Chapman & Hall, London.



31/33

McGarry, D., Ward, W.T., McBratney, A.B., 1989. Soil Studies in the Lower Namoi Valley: Methods and Data.

The Edgeroi Data Set. CSIRO Division of Soils, Glen Osmond, South Australia.

McKenzie, N.J., Jacquier, D.W., 1997. Improving the field estimation of saturated hydraulic conductivity in soil

survey. Australian Journal of Soil Research 35, 803825.

McQueen, I.S., Miller, R.F., 1974. Approximating soil moisture characteristics from limited data: empirical

evidence and tentative model. Water Resources Research 10, 521527.

Minasny, B., McBratney, A.B., 2000. Hydraulic conductivity pedotransfer functions for Australian soil. Austral-

ian Journal of Soil Research 38, 905926.

Minasny, B., McBratney, A.B., 2001. The Australian soil texture boomerang: a comparison of the Australian

USDA/FAO soil particle-size classification systems. Australian Journal of Soil Research 39, 14431451.

Minasny, B., McBratney, A.B., 2002a. The efficiency of various approaches to obtaining estimates of soil

hydraulic properties. Geoderma 107, 5570.

Minasny, B., McBratney, A.B., 2002b. The neuro-m method for fitting neural-network parametric pedotransfer

functions. Soil Science Society of America Journal 66, 352361.

Minasny, B., McBratney, A.B., Bristow, K.L., 1999. Comparison of different approaches to the development of

pedotransfer functions for water-retention curves. Geoderma 93, 225 253.Nemes, A., Schaap, M.G., Leij, F.J., Wosten, J.H.M., 2001. Description of the unsaturated soil hydraulic database

UNSODA version 2.0. Journal of Hydrology 251, 151162.

Nielsen, D.R., Shaw, R.H., 1958. Estimation of the 15-atmosphere moisture percentage from hydrometer data.

Soil Science 86, 103105.

Noble, A.D., Cannon, M., Muller, D., 1997. Evidence of accelerated soil acidification under Stylosanthes-

dominated pastures. Australian Journal of Soil Research 35, 13091322.

Pachepsky, Ya., Rawls, W.J., 1999. Accuracy and reliability of pedotransfer functions as affected by grouping

soils. Soil Science Society of America Journal 63, 17481756.

Pachepsky, Ya., Timlin, D.J., Ahuja, L.R., 1999a. Estimating saturated soil hydraulic conductivity using water

retention data and neural networks. Soil Science 164, 552560.

Pachepsky, Ya., Timlin, D.J., Ahuja, L.R., 1999b. The current status of pedotransfer functions: their accuracy,

reliability and utility in field- and regional-scale modeling. In: Corwin, D.L., Loage, K., Ellsworth, T.R.(Eds.), Assessment of Non-Point Source Pollution in Vadose Zone. Geophysical Monograph, vol. 108.

American Geophysical Union, Washington, DC, pp. 223234.

Palisade, 2000. @RISK Version 4. Palisade, NY, USA.

Petach, M.C., Wagenet, R.J., DeGloria, S.D., 1991. Regional water flow and pesticide leaching using simulations

with spatially distributed data. Geoderma 48, 245 269.

Quinlan, J.R., 1992. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference on

Artificial Intelligence. World Scientific, Singapore, pp. 343348.

Rasiah, V., 1995. Comparison of pedotransfer functions to predict nitrogen-mineralization parameters of one- and

two-pool models. Communications in Soil Science and Plant Analysis 26, 18731884.

Rasiah, V., Kay, B.D., 1994. Characterizing changes in aggregate stability subsequent to introduction of forages.

Soil Science Society of America Journal 58, 935942.

Rawls, W.J., 1983. Estimating soil bulk density from particle size analysis and organic matter content. SoilScience 135, 123125.

Rawls, W.J., Brakensiek, D.L., Saxton, K.E., 1982. Estimation of soil water properties. Transactions of the ASAE

25, 13161320, 1328.

Rawls, W.J., Gish, T.J., Brakensiek, D.L., 1991. Estimating soil water retention from soil physical properties and

characteristics. Advances in Soil Science 16, 213234.

RuleQuest Research, 2000. Cubist v. 1.09. Rulequest Research, Sydney, Australia.

Salter, P.J., Williams, J.B., 1965a. The influence of texture on the moisture characteristics of soils: I. A critical

comparison for determining the available water capacity and moisture characteristics curve of a soil. Journal

of Soil Science 16, 115.

Salter, P.J., Williams, J.B., 1965b. The influence of texture on the moisture characteristics of soils: II Available

water capacity and moisture release characteristics. Journal of Soil Science 16, 310317.

Salter, P.J., Williams, J.B., 1967. The influence of texture on the moisture characteristics of soils: IV. A method of

estimating available water capacities of profiles in the field. Journal of Soil Science 18, 174181.



32/33

Salter, P.J., Williams, J.B., 1969. The influence of texture on the moisture characteristics of soils: V. Relationships

between particle-size composition and moisture contents at the upper and lower limits of available water.

Journal of Soil Science 20, 126131.

Salter, P.J., Berry, G., Williams, J.B., 1966. The influence of texture on the moisture characteristics of soils: III.

Quantitative relationships between particle size, composition and available water capacity. Journal of Soil

Science 17, 93 98.

Schaap, M.G., Leij, F.L., van Genuchten, M.Th., 1998. Neural network analysis for hierarchical prediction of soil

hydraulic properties. Soil Science Society of America Journal 62, 847855.

Scheinost, A.C., Schwertmann, U., 1995. Predicting phosphate adsorptiondesorption in a soilscape. Soil Sci-

ence Society of America Journal 59, 15751580.

Scheinost, A.C., Sinowski, W., Auerswald, K., 1997a. Regionalization of soil buffering functions: a new concept

applied to K/Ca exchange curves. Advances in GeoEcology 30, 2338.

Scheinost, A.C., Sinowski, W., Auerswald, K., 1997b. Regionalization of soil water retention curves in a highly

variable soilscape: I. Developing a new pedotransfer function. Geoderma 78, 129143.

Schug, B., Hoss, T., During, R.A., 1999. Regionalization of sorption capacities for arsenic and cadmium. Plant

and Soil 213, 181187.Shirazi, M.A., Boersma, L., 1984. A unifying quantitative analysis of soil texture. Soil Science Society of


Slattery, W.J., Edwards, D.G., Bell, L.C., Conventry, D.R., 1995. Liming effects on soil exchangeable and soil

cations of four soil types in north-eastern Victoria. Australian Journal of Soil Research 33, 277295.

Stirk, G.B., 1957. Physical properties of soils of the lower Burdekin valley, North Queensland. CSIRO Division

of Soils Divisional Report 1/57. CSIRO, Australia.

Soutter, M., Pannatier, Y., 1996. Groundwater vulnerability to pesticide contamination on a regional scale. Journal

of Environmental Quality 25, 439444.

Springob, G., Bottcher, J., 1998. Parameterization and regionalization of Cd sorption characteristics of sandy

soils: II. Regionalization: Freundlich k estimates by pedotransfer functions. Zeitschrift fur Pflanzenernahrung

und Bodenkunde 161, 689 696.

Takagi, G.T., Sugeno, M., 1985. Fuzzy identification of systems and its applications to modeling and control.IEEE Transactions on Systems, Man, and Cybernetics 15, 116132.

Tiktak, A., Leijnse, A., Vissenberg, H., 1999. Uncertainty in a regional-scale assessment of cadmium accumu-

lation in the Netherlands. Journal of Environmental Quality 28, 461470.

Timlin, D.J., Pachepsky, Ya.A., Acock, B., Whisler, F., 1996. Indirect estimation of soil hydraulic properties to

predict soybean yield using GLYCIM. Agricultural Systems 52, 331 353.

Torrent, J., Schwertmann, U., Fechter, H., Alferez, F., 1983. Quantitative relationships between soil color and

hematite content. Soil Science 136, 354358.

Torri, D., Poesen, J., Borselli, L., 1997. Predictability and uncertainty of the soil erodibility factor using a global

dataset. Catena 31, 122.

Ungaro, F., Calzolari, C., 2001. Using existing soil databases for estimating retention properties for soils of the

Pianura Padano-Veneta region of North Italy. Geoderma 99, 99121.

van de Genachte, G., Mallants, D., Ramos, J., Deckers, J.A., Feyen, J., 1996. Estimating infiltration parametersfrom basic soil properties. Hydrological Processes 10, 687 701.

van Genuchten, M.Th., 1980. A closed-form equation for predicting the hydraulic conductivity of unsaturated

soils. Soil Science Society of America Journal 44, 892898.

Veihmeyer, F.J., Hendrikson, A.H., 1927. The relation of soil moisture to cultivation and plant growth. Proc. 1st

Int. Cong. Soil Science Washington, vol. 3, pp. 498513.

Wagner, B., Tarnawski, V.R., Hennings, V., Muller, U., Wessolek, G., Plagge, R., 2001. Evaluation of pedo-transfer

functions for unsaturated soil hydraulic conductivity using an independent data set. Geoderma 102, 275 297.

Williams, J., Prebble, J.E., Williams, W.T., Hignett, C.T., 1983. The influence of texture, structure and clay

mineralogy on the soil moisture characteristic. Australian Journal of Soil Research 21, 1532.

Williams, J., Ross, P.J., Bristow, K.L., 1992. Prediction of the Campbell water retention function from texture,

structure and organic matter. In: van Genuchten, M.Th., Leij, F.J., Lund, L.J. (Eds.), Proceedings of the

International Workshop on Indirect Methods for Estimating the Hydraulic Properties of Unsaturated Soils.

University of California, Riverside, CA, pp. 427441.



33/33

Wosten, J.H.M., 1997. Pedotransfer functions to evaluate soil quality. In: Gregorich, E.G., Carter, M.R. (Eds.),

Soil Quality for Crop Production and Ecosystem Health. Developments in Soil Science, vol. 25. Elsevier,

Amsterdam, pp. 221245.

Wosten, J.H.M., Finke, P.A., Jansen, M.J.W., 1995. Comparison of class and continuous pedotransfer functions to

generate soil hydraulic characteristics. Geoderma 66, 227 237.

Wosten, J.H.M., Lilly, A., Nemes, A., Le Bas, C., 1999. Development and use of a database of hydraulic

properties of European soils. Geoderma 90, 169 185.

Wosten, J.H.M., Pachepsky, Y.A., Rawls, W.J., 2001. Pedotransfer functions: bridging gap between available

basic soil data and missing soil hydraulic characteristics. Journal of Hydrology 251, 123 150.


2002- mcbratney- from pedotransfer functions to soil inference systems

Documents