
ORIGINAL PAPER

Utilization of Gaussian Process Regression for Determination of Soil Electrical Resistivity

Pijush Samui

Received: 20 February 2013 / Accepted: 5 October 2013 / Published online: 11 October 2013

© Springer Science+Business Media Dordrecht 2013

Abstract Soil electrical resistivity (RE) is an important parameter for geotechnical engineering projects. This article employs Gaussian process regression (GPR) for prediction of RE of soil based on soil thermal resistivity (RT), percentage sum of the gravel and sand size fractions (F), and degree of saturation (Sr). GPR is derived from the perspective of Bayesian nonparametric regression. Two models (Model I and Model II) have been developed. The developed GPR has been compared with an artificial neural network (ANN) model, and it also gives the variance of the predicted RE. The results show that the developed GPR is an efficient tool for prediction of RE of soil.

Keywords Soil electrical resistivity · Gaussian process regression · Artificial neural network · Variance · Soil thermal resistivity

1 Introduction

Geotechnical engineers use soil electrical resistivity (RE) for prediction of different soil properties such as soil water content, degree of compaction, saturation, liquefaction potential, soil salinity, etc. (McCollum and Logan 1930; Shea and Luthin 1961; Tagg 1964; Butterfield and Jhonston 1980; Ronald and Ronald 1982; Schultz et al. 1984; McCarter 1984; Mazac et al. 1990; Kalinski and Kelly 1993; Gunnink and El-Jayyousi 1993). RE depends on many parameters such as the frequency of the current used, geometry and type of the electrodes used, resistivity of water, soil type, porosity, temperature, organic matter, etc. If the water content of soil increases, then the value of RE of soil drops. The value of RE is reduced for well graded soil. If the density of soil increases, then the value of RE of soil decreases. Therefore, the determination of RE is an essential task for geotechnical engineers. However, it is a difficult task (Abu-Hassanein 1994). Geotechnical engineers use different correlations for determination of RE (Singh et al. 2001; Sreedeep et al. 2005). The available correlations have their own limitations (Erzin et al. 2010). Recently, Erzin et al. (2010) successfully adopted an artificial neural network (ANN) for determination of RE based on soil thermal resistivity (RT). However, the ANN model has some limitations such as its black box approach, low generalization capability, overtraining problem, slow convergence speed, etc. (Park and Rilett 1999; Kecman 2001).

This study examines the capability of Gaussian process regression (GPR) for prediction of RE of soil based on RT, percentage sum of the gravel and sand size fractions (F), and degree of saturation (Sr). GPR is a probabilistic and non-parametric model (Azman and Kocijan 2007). It infers the parameters from the given training dataset. It has been successfully adopted for solving different problems in engineering (Bazi and Melgani 2010; Kim et al. 2011; Zhang et al. 2012). Two models (Model I and Model II) have been developed. In Model I, the input variables are F and RT, whereas Model II uses F, Sr and RT as inputs. The results of GPR have been compared with the ANN model. The developed GPR also gives the variance of the predicted RE. The article is organized as follows: Sect. 2 gives the details of GPR, Sect. 3 describes the results and discussion, and Sect. 4 draws the conclusion.

P. Samui (✉)
Centre for Disaster Mitigation and Management, VIT University, Vellore 632014, India
e-mail: [email protected]

Geotech Geol Eng (2014) 32:191–195
DOI 10.1007/s10706-013-9705-8

2 Details of GPR

This section will describe the GPR for prediction of RE

of soil. Let us consider the following sample.

$$L = \{(x_i, y_i)\}_{i=1}^{M}, \quad x_i \in \mathbb{R}^N \ \text{and} \ y_i \in \mathbb{R} \tag{1}$$

where x is the input, y is the output, $\mathbb{R}^N$ is the N-dimensional vector space and $\mathbb{R}$ is the one-dimensional vector space.

For Model I, the input variables are F and RT. So, x = [F, RT] and y = [RE].

For Model II, the input variables are F, Sr and RT. So, x = [F, Sr, RT] and y = [RE].

In GPR, the relation between the latent function $f(x_i)$ and the output variable ($y_i$) is given below:

$$y_i = f(x_i) + \varepsilon_i \tag{2}$$

where $\varepsilon_i$ is Gaussian noise with zero mean and variance $\sigma^2$ (Rasmussen and Williams 2006). The predictive distribution of $y_{M+1}$ corresponding to a new given input $x_{M+1}$ is given by the following expression.

$$\begin{pmatrix} y \\ y_{M+1} \end{pmatrix} \sim N(0, K_{M+1}) \tag{3}$$

with covariance matrix

$$K_{M+1} = \begin{bmatrix} K & K(x_{M+1}) \\ K(x_{M+1})^T & k(x_{M+1}) \end{bmatrix} \tag{4}$$

where K is the covariance matrix of the training data, $K(x_{M+1})$ are the covariances between the training inputs and the test input, and $k(x_{M+1})$ is the autocovariance of the test input.

The distribution of $y_{M+1}$ is Gaussian with mean and variance:

$$\mu = K(x_{M+1})^T K^{-1} y \tag{5}$$

$$\sigma^2 = k(x_{M+1}) - K(x_{M+1})^T K^{-1} K(x_{M+1}) \tag{6}$$

The hyperparameters of the GPR model and their optimal values for the data set have been derived by maximizing the log marginal likelihood.
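For reference, the log marginal likelihood mentioned above has a standard closed form (Rasmussen and Williams 2006). Written in the notation of this section, and assuming that K already carries the noise variance on its diagonal (an assumption, since the text does not say so explicitly), it reads:

$$\log p(y \mid X) = -\frac{1}{2} y^T K^{-1} y - \frac{1}{2} \log |K| - \frac{M}{2} \log 2\pi$$

Here X denotes the M training inputs collected together (a symbol introduced only for this note). The first term rewards fitting the training outputs, the second penalizes overly flexible covariance structures, and the third is a normalization constant.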

The above GPR model has been used to determine

RE of soil.
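The prediction step can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the MATLAB program used in this study: the function names (`rbf`, `cholesky`, `solve_spd`, `gp_predict`) are hypothetical, the radial basis covariance is assumed, and the noise variance is assumed to be added to the diagonal of K and to the autocovariance of the test input.

```python
import math

def rbf(a, b, width):
    """Radial basis covariance exp(-||a - b||^2 / (2 * width^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_spd(L, b):
    """Solve (L L^T) z = b by forward and then backward substitution."""
    n = len(b)
    w = [0.0] * n
    for i in range(n):
        w[i] = (b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (w[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def gp_predict(X, y, x_new, width, noise_var):
    """Predictive mean (Eq. 5) and variance (Eq. 6) at the new input x_new."""
    M = len(X)
    # K: covariance of the training outputs; noise on the diagonal (assumption).
    K = [[rbf(X[i], X[j], width) + (noise_var if i == j else 0.0)
          for j in range(M)] for i in range(M)]
    # Covariances between the training inputs and the test input.
    k_vec = [rbf(X[i], x_new, width) for i in range(M)]
    # Autocovariance of the test input, including noise (assumption).
    k_self = rbf(x_new, x_new, width) + noise_var
    L = cholesky(K)
    alpha = solve_spd(L, y)      # K^{-1} y
    beta = solve_spd(L, k_vec)   # K^{-1} K(x_new)
    mean = sum(kv * a for kv, a in zip(k_vec, alpha))
    var = k_self - sum(kv * b for kv, b in zip(k_vec, beta))
    return mean, var
```

For example, `gp_predict([[0.1, 0.2], [0.4, 0.5], [0.8, 0.9]], [0.15, 0.45, 0.85], [0.4, 0.5], 0.3, 0.01)` returns a mean close to 0.45 and a small variance, while a query far from all training points returns a mean near zero and a variance near its prior value. The sample numbers are made up for illustration and are not taken from the dataset of this study.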

To develop the GPR, the datasets have been divided

into the following groups:

Training Dataset: This is used to develop the GPR.

This article adopts 165 datasets out of 236 datasets as

training dataset.

Testing Dataset: This is used to verify the devel-

oped GPR. The remaining 71 datasets have been used

as testing dataset.

The data are normalized between 0 and 1. The

following formula has been adopted for normalization.

$$d_{normalized} = \frac{d - d_{min}}{d_{max} - d_{min}} \tag{7}$$

where d = any data (input or output), $d_{min}$ = minimum value of the entire dataset, $d_{max}$ = maximum value of the entire dataset, and $d_{normalized}$ = normalized value of the data. The radial basis function, $\exp\left\{-\frac{(x_i - x)(x_i - x)^T}{2\sigma^2}\right\}$, where σ is the width of the radial basis function, has been used as the covariance function. The GPR program has been developed in MATLAB.
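The data preparation described above can be sketched as follows. This is a minimal Python illustration rather than the MATLAB code of this study. The helper names are hypothetical, and because the text does not state how the 165 training records were selected, the shuffled split with a fixed seed is an assumption.

```python
import random

def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] using Eq. 7."""
    d_min, d_max = min(values), max(values)
    if d_max == d_min:
        raise ValueError("all values are equal; Eq. 7 is undefined")
    return [(d - d_min) / (d_max - d_min) for d in values]

def split_dataset(records, n_train, seed=0):
    """Shuffle a copy of the records and split it into training and testing parts.

    The random selection is an assumption; the source only gives the sizes.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]
```

With 236 records and `n_train=165`, `split_dataset` returns 165 training and 71 testing records, matching the sizes reported here. Each variable (F, Sr, RT and RE) would be normalized separately with `min_max_normalize` over the entire dataset, as Eq. 7 prescribes.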

3 Results and Discussion

For GPR, the design values of σ and ε have been determined by a trial and error approach. For Model I, the design values of σ and ε are 0.3 and 0.01, respectively. Figure 1 illustrates the performance on the training dataset. The performance on the testing dataset is shown in Fig. 2. The performance of the developed GPR has been assessed in terms of the coefficient of correlation (R). For a good model, the value of R should be close to one. It is observed from Figs. 1 and 2 that the value of R is close to one for the training (R = 0.992) as well as the testing (R = 0.991) dataset.
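The text does not give the formula used for R. Assuming it is the usual Pearson correlation coefficient between actual and predicted values, a minimal Python sketch (with the hypothetical function name `correlation_coefficient`) is:

```python
import math

def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values (assumed form of R)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)
```

Note that R measures only the linear association: predictions that are a scaled or shifted copy of the actual values still give R equal to one, which is why error measures are also needed when judging a model.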

For Model II, the design values of σ and ε are 0.1 and 0.02, respectively. Figures 1 and 2 depict the performance on the training and testing datasets, respectively. In the case of Model II, the value of R is close to one for the training as well as the testing dataset (R = 0.997 for both). So, the developed GPR predicts RE reasonably well for Models I and II.

A comparative study has been carried out between

the developed GPR and ANN model developed by

Erzin et al. (2010). Comparison has been done in terms

of root mean square error (RMSE) and mean absolute

error (MAE). RMSE and MAE have been determined

by using the following equation:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(\mathrm{RE}_{ai} - \mathrm{RE}_{pi}\right)^2}{n}} \tag{8}$$

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} \left|\mathrm{RE}_{ai} - \mathrm{RE}_{pi}\right|}{n} \tag{9}$$

where $\mathrm{RE}_{ai}$ and $\mathrm{RE}_{pi}$ are the actual and predicted RE values, respectively, and n is the number of data points.
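Equations 8 and 9 translate directly into code. The following Python sketch uses the hypothetical function names `rmse` and `mae`, and the numbers in the usage note are made up for illustration.

```python
import math

def rmse(actual, predicted):
    """Root mean square error, Eq. 8."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error, Eq. 9."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n
```

For actual values `[1.0, 2.0, 3.0, 4.0]` and predictions `[1.0, 2.0, 3.0, 8.0]`, the single error of 4 gives an RMSE of 2.0 but an MAE of 1.0, which shows that RMSE weights large errors more heavily than MAE.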

Figure 3 shows the bar chart of RMSE and MAE of the GPR and ANN models. It is observed from Fig. 3 that the developed GPR produces lower RMSE and MAE values than the ANN. So, the developed GPR outperforms the ANN model. GPR uses only two tuning parameters (σ and ε), whereas ANN uses many tuning parameters such as the number of hidden layers, number of neurons, transfer function, number of epochs, etc.

Fig. 1 Performance of training dataset for Model I and II (x-axis: Actual Normalized RE; y-axis: Predicted Normalized RE; legend: MODEL I (R = 0.992), MODEL II (R = 0.997), Actual = Predicted)

Fig. 2 Performance of testing dataset for Model I and II (x-axis: Actual Normalized RE; y-axis: Predicted Normalized RE; legend: MODEL I (R = 0.991), MODEL II (R = 0.997), Actual = Predicted)

Fig. 3 Comparison between the ANN and GPR models (bar chart of RMSE and MAE for MODEL I and MODEL II; legend: ANN, GPR)

Fig. 4 Variance of training dataset for Model I (x-axis: Training Dataset; y-axis: Variance)

Fig. 5 Variance of testing dataset for Model I (x-axis: Testing Dataset; y-axis: Variance)


The developed GPR gives the variance of the predicted RE. For Model I, Figs. 4 and 5 depict the variance of the training and testing datasets, respectively. In the case of Model II, the variances of the training and testing datasets are shown in Figs. 6 and 7, respectively. The predicted variance can be used to quantify the uncertainty of each prediction.
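One common way to turn a predicted variance into an uncertainty statement is an interval around the predicted mean. The sketch below is an illustration, not a procedure reported in this study: it assumes the Gaussian predictive distribution of Sect. 2, a 95 % level (z = 1.96), and the hypothetical helper names `prediction_interval` and `denormalize`.

```python
import math

def prediction_interval(mean, variance, z=1.96):
    """Symmetric interval mean +/- z * sqrt(variance) for a Gaussian prediction."""
    if variance < 0:
        raise ValueError("variance must be non-negative")
    half_width = z * math.sqrt(variance)
    return mean - half_width, mean + half_width

def denormalize(d_normalized, d_min, d_max):
    """Invert Eq. 7 to express a normalized value in the original units."""
    return d_normalized * (d_max - d_min) + d_min
```

Since the model works on normalized data, the interval bounds are also normalized; applying `denormalize` to each bound with the minimum and maximum RE of the dataset expresses them in the original units.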

4 Conclusion

This study presents GPR for prediction of RE of soil. Two models (Model I and Model II) have been developed. The developed GPR gives acceptable results and outperforms the ANN model. It also gives the risk of prediction in terms of variance. The obtained variance can also be used for validation. The performance of Models I and II is almost the same. The developed GPR can be used as a quick tool for determination of RE of soil. The success of the GPR model depends on the quality of the dataset. The developed GPR allows the inclusion of covariates in both the covariance structure and the mean structure. This article shows that GPR can be used for solving different problems in geotechnical engineering.

References

Abu-Hassanein ZS (1994) Use of electrical resistivity measurement as a quality control tool for compacted clay liners. M.S. Thesis, University of Wisconsin, Madison

Azman K, Kocijan J (2007) Application of Gaussian processes for black-box modelling of biosystems. ISA Trans 46(4):443–457

Bazi Y, Melgani F (2010) Semisupervised Gaussian process regression for biophysical parameter estimation. Int Geosci Remote Sens Symp (IGARSS) 5652686:4248–4251

Butterfield R, Jhonston IW (1980) The influence of electro-osmosis on metallic piles in clay. Geotechnique 30(1):17–38

Erzin Y, Rao BH, Patel A, Gumaste SD, Gupta K, Singh DN (2010) Artificial neural network models for predicting of electrical resistivity of soils from their thermal resistivity. Int J Therm Sci 49:118–130

Gunnink BW, El-Jayyousi J (1993) Soil-fabric measurement using conduction phase porosimetry. J Geotech Eng Div 119(6):1019–1035

Kalinski RJ, Kelly WE (1993) Estimating water content of soils from electrical resistivity. Geotech Test J 16(3):323–329

Kecman V (2001) Learning and soft computing: support vector machines, neural networks and fuzzy logic models. The MIT Press, Cambridge

Kim K, Lee D, Essa I (2011) Gaussian process regression flow for analysis of motion trajectories. Proc IEEE Int Conf Comput Vis 6126365:1164–1171

Mazac O, Cislerove M, Kelly WE, Landa I, Venhodova D (1990) Determination of hydraulic conductivities by surface geoelectrical methods. In: Ward S (ed) Geotechnical and environmental geophysics, Soc. Explor. Geophysics, vol 2, pp 125–131

McCarter W (1984) The electrical resistivity characteristics of compacted clays. Geotechnique 34(2):263–267

McCollum B, Logan KH (1930) Electrolytic corrosion of iron in soils. Bureau of Standards, Technologic Paper, pp 24

Park D, Rilett LR (1999) Forecasting freeway link travel times with a multi-layer feed forward neural network. Comput Aided Civ Infrastruct Eng 14:358–367

Rasmussen CE, Williams CK (2006) Gaussian processes for machine learning. MIT Press, Cambridge

Ronald AE, Ronald CG (1982) Electrical resistivity used to measure liquefaction of sand. J Geotech Eng 108(GT5):779–783

Schultz DW, Duff BM, Peters WR (1984) Performance of an electrical resistivity technique for detecting and locating geomembrane failures. International Conference on Geomembranes, Denver, pp 445–449

Shea PF, Luthin JN (1961) An investigation of the use of the four electrode probe for measuring soil salinity in situ. Soil Sci 92:331–339

Fig. 6 Variance of training dataset for Model II (x-axis: Training Dataset; y-axis: Variance)

Fig. 7 Variance of testing dataset for Model II (x-axis: Testing Dataset; y-axis: Variance)


Singh DN, Kuriyan SJ, Manthena KC (2001) A generalized relationship between soil electrical and thermal resistivities. Exp Thermal Fluid Sci 25:175–181

Sreedeep S, Reshma AC, Singh DN (2005) Generalized relationship for determining soil electrical resistivity from its thermal resistivity. Exp Thermal Fluid Sci 29:217–226

Tagg GF (1964) Earth resistances. Newnes, London

Zhang Y, Su GS, Yan LB (2012) Analysis of slope stability based on Gaussian Process Regression. Appl Mech Mat 170–173:1330–1333
