ORIGINAL PAPER
Utilization of Gaussian Process Regression for Determination of Soil Electrical Resistivity
Pijush Samui
Received: 20 February 2013 / Accepted: 5 October 2013 / Published online: 11 October 2013
© Springer Science+Business Media Dordrecht 2013
Abstract Soil electrical resistivity (RE) is an important
parameter for geotechnical engineering projects. This
article employs Gaussian process regression (GPR) for
prediction of RE of soil based on soil thermal resistivity
(RT), percentage sum of the gravel and sand size fractions
(F), and degree of saturation (Sr). GPR is derived from the
perspective of Bayesian nonparametric regression. Two
models (Model I and Model II) have been developed. The
developed GPR has been compared with an artificial neural
network (ANN) model. The GPR also gives the variance of the
predicted RE. The results show that the developed GPR is an
efficient tool for prediction of RE of soil.
Keywords Soil electrical resistivity · Gaussian process
regression · Artificial neural network · Variance · Soil
thermal resistivity
1 Introduction
Geotechnical engineers use soil electrical resistivity
(RE) for prediction of different soil properties such as
soil water content, degree of compaction, saturation,
liquefaction potential, soil salinity, etc. (McCollum and
Logan 1930; Shea and Luthin 1961; Tagg 1964;
Butterfield and Jhonston 1980; Ronald and Ronald
1982; Schultz et al. 1984; McCarter 1984; Mazac et al.
1990; Kalinski and Kelly 1993; Gunnink and El-Jayyousi
1993). RE depends on many parameters such as the
frequency of the current used, geometry and type of the
electrodes used, resistivity of water, soil type, porosity,
temperature, organic matter, etc. If the water content of
soil increases, then the value of RE of soil drops. The
value of RE is reduced for well-graded soil. If the density
of soil increases, then the value of RE of soil decreases.
Therefore, the determination of RE is an essential task for
geotechnical engineers. The determination of RE is a
difficult task (Abu-Hassanein 1994). Geotechnical engineers
use different correlations for determination of RE (Singh
et al. 2001; Sreedeep et al. 2005). The available
correlations have their own limitations (Erzin et al. 2010).
Recently, Erzin et al. (2010) successfully adopted an
artificial neural network (ANN) for determination of RE
based on soil thermal resistivity (RT). However, the ANN
model has some limitations such as a black-box approach,
low generalization capability, overtraining problem, slow
convergence speed, etc. (Park and Rilett 1999; Kecman 2001).
This study examines the capability of Gaussian process
regression (GPR) for prediction of RE of soil based on RT,
percentage sum of the gravel and sand size fractions (F),
and degree of saturation (Sr). GPR is a probabilistic and
non-parametric model (Azman and Kocijan 2007). It infers
the parameters from the given training dataset. It has been
successfully adopted for solving different problems in
engineering (Bazi and Melgani 2010; Kim et al. 2011; Zhang
et al. 2012). Two models (Model I and Model II) have been
developed. In Model I, the input variables are F and RT,
whereas Model II uses F, Sr and RT as inputs. The results of
GPR have been compared with the ANN model. The developed
GPR gives the variance of the predicted RE. The article is
organized as follows: Sect. 2 gives the details of GPR. The
results and discussion are described in Sect. 3. The
conclusion is drawn in Sect. 4.
P. Samui (✉)
Centre for Disaster Mitigation and Management, VIT
University, Vellore 632014, India
e-mail: [email protected]
Geotech Geol Eng (2014) 32:191–195
DOI 10.1007/s10706-013-9705-8
2 Details of GPR
This section will describe the GPR for prediction of RE
of soil. Let us consider the following sample.
$$L = \{(x_i, y_i)\}_{i=1}^{M}, \quad x_i \in \mathbb{R}^N \text{ and } y_i \in \mathbb{R} \tag{1}$$
where x is the input, y is the output, $\mathbb{R}^N$ is the
N-dimensional vector space and $\mathbb{R}$ is the
one-dimensional vector space.
For Model I, the input variables are F and RT. So,
$x = [F, RT]$ and $y = [RE]$.
For Model II, the input variables are F, Sr and RT. So,
$x = [F, Sr, RT]$ and $y = [RE]$.
In GPR, the relation between the latent function $f(x_i)$
and the output variable $y_i$ is given below:
$$y_i = f(x_i) + \varepsilon_i \tag{2}$$
where $\varepsilon_i$ is Gaussian noise with zero mean and
variance $\sigma^2$ (Rasmussen and Williams 2006). The
distribution of $y_{M+1}$ corresponding to a new input
$x_{M+1}$ follows from the joint distribution of the training
outputs y and $y_{M+1}$, which is given by the following
expression.
$$\begin{bmatrix} y \\ y_{M+1} \end{bmatrix} \sim N(0, K_{M+1}) \tag{3}$$
with covariance matrix
$$K_{M+1} = \begin{bmatrix} K & K(x_{M+1}) \\ K(x_{M+1})^T & k(x_{M+1}) \end{bmatrix} \tag{4}$$
where K is the covariance matrix, $K(x_{M+1})$ are the
covariances between the training inputs and the test input,
and $k(x_{M+1})$ is the autocovariance of the test input.
The distribution of $y_{M+1}$ is Gaussian with mean and
variance:
$$\mu = K(x_{M+1})^T K^{-1} y \tag{5}$$
$$\sigma^2 = k(x_{M+1}) - K(x_{M+1})^T K^{-1} K(x_{M+1}) \tag{6}$$
The hyperparameters of a GPR model and their optimal values
for the dataset have been derived by maximizing the log
marginal likelihood.
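For reference, the log marginal likelihood of a zero-mean Gaussian process has the following standard form (Rasmussen and Williams 2006). This is a sketch under stated assumptions rather than the exact objective used in this study: K is taken to be the covariance matrix of the M training outputs with the noise variance included on its diagonal, X denotes the training inputs, and $\theta$ (a symbol introduced here) collects the hyperparameters.

$$\log p(y \mid X, \theta) = -\frac{1}{2} y^T K^{-1} y - \frac{1}{2} \log |K| - \frac{M}{2} \log 2\pi$$

The first term rewards data fit, the second penalizes model complexity, and the third is a normalization constant.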
The above GPR model has been used to determine
RE of soil.
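The predictive mean and variance of Eqs. (5) and (6) can be sketched in Python with NumPy as follows. This is a minimal illustration, not the MATLAB program used in this study. The function names, the toy data, and the choice to add the noise variance both to the diagonal of K and to the autocovariance of the test input are assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, width):
    """Radial basis function covariance exp(-||a - b||^2 / (2 * width^2))."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * width**2))

def gpr_predict(X_train, y_train, X_new, width, noise_var):
    """Predictive mean (Eq. 5) and variance (Eq. 6) for each row of X_new."""
    # K: training covariance with noise variance on the diagonal (assumption).
    K = rbf_kernel(X_train, X_train, width) + noise_var * np.eye(len(X_train))
    # Covariances between training inputs and test inputs, shape (M, n_new).
    K_star = rbf_kernel(X_train, X_new, width)
    # Autocovariance of a test input: RBF value 1 plus noise (assumption).
    k_self = 1.0 + noise_var
    mean = K_star.T @ np.linalg.solve(K, y_train)
    var = k_self - np.sum(K_star * np.linalg.solve(K, K_star), axis=0)
    return mean, var

# Toy normalized data (hypothetical values, not the dataset of this study).
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
y_train = np.array([0.1, 0.4, 0.3, 0.9, 0.5])
X_new = np.array([[0.25, 0.75]])
mean, var = gpr_predict(X_train, y_train, X_new, width=0.3, noise_var=0.01)
print(mean, var)
```

Solving the linear systems with `np.linalg.solve` avoids forming the explicit inverse of K, which is usually the more stable choice.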
To develop the GPR, the datasets have been divided
into the following groups:
Training Dataset: This is used to develop the GPR.
This article adopts 165 datasets out of 236 datasets as
training dataset.
Testing Dataset: This is used to verify the devel-
oped GPR. The remaining 71 datasets have been used
as testing dataset.
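The 165/71 division described above can be sketched as follows. The article does not state how the records were assigned to each group, so the random permutation, the seed, and the placeholder arrays are assumptions made only for illustration.

```python
import numpy as np

def split_dataset(X, y, n_train, seed=0):
    """Randomly assign n_train records to training and the rest to testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    train_idx, test_idx = order[:n_train], order[n_train:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Placeholder arrays standing in for the 236 records (hypothetical values).
X_all = np.arange(236 * 2, dtype=float).reshape(236, 2)
y_all = np.arange(236, dtype=float)
X_tr, y_tr, X_te, y_te = split_dataset(X_all, y_all, n_train=165)
print(len(X_tr), len(X_te))
```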
The data are normalized between 0 and 1. The
following formula has been adopted for normalization.
$$d_{normalized} = \frac{d - d_{min}}{d_{max} - d_{min}} \tag{7}$$
where d = any data (input or output), $d_{min}$ = minimum
value of the entire dataset, $d_{max}$ = maximum value of
the entire dataset, and $d_{normalized}$ = normalized value
of the data. The radial basis function
$\exp\left\{-\frac{(x_i - x)(x_i - x)^T}{2\sigma^2}\right\}$,
where $\sigma$ is the width of the radial basis function, has
been used as the covariance function. The GPR program has
been developed in MATLAB.
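The min-max normalization of Eq. (7) can be sketched in Python as follows. The helper names and the sample values are hypothetical, and the inverse mapping is included only as a convenience for converting normalized predictions back to physical units.

```python
import numpy as np

def normalize(d, d_min, d_max):
    """Map data to the range [0, 1] following Eq. (7)."""
    return (d - d_min) / (d_max - d_min)

def denormalize(d_norm, d_min, d_max):
    """Inverse of Eq. (7): recover the original scale."""
    return d_norm * (d_max - d_min) + d_min

# Hypothetical values of one variable (not taken from the dataset of this study).
values = np.array([2.0, 4.0, 6.0])
scaled = normalize(values, values.min(), values.max())
restored = denormalize(scaled, values.min(), values.max())
print(scaled, restored)
```

For a multi-column input matrix, the same functions apply column by column when `d_min` and `d_max` are computed with `axis=0`.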
3 Results and Discussion
For GPR, the design values of $\sigma$ and $\varepsilon$
have been determined by a trial-and-error approach. For
Model I, the design values of $\sigma$ and $\varepsilon$ are
0.3 and 0.01, respectively. Figure 1 illustrates the
performance on the training dataset. The performance on the
testing dataset is shown in Fig. 2. The performance of the
developed GPR has been assessed in terms of the coefficient
of correlation (R). For a good model, the value of R should
be close to one. It is observed from Figs. 1 and 2 that the
value of R is close to one for the training (R = 0.992) as
well as the testing (R = 0.991) dataset.
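The article does not give a formula for R. A minimal sketch, assuming that R is the usual Pearson correlation coefficient between actual and predicted values, is shown below. The function name and the sample arrays are hypothetical.

```python
import numpy as np

def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values (assumed definition of R)."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    a_c = a - a.mean()
    p_c = p - p.mean()
    return float(np.sum(a_c * p_c) / np.sqrt(np.sum(a_c**2) * np.sum(p_c**2)))

# Hypothetical normalized RE values, for illustration only.
actual = [0.10, 0.35, 0.50, 0.80]
predicted = [0.12, 0.33, 0.52, 0.79]
print(correlation_coefficient(actual, predicted))
```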
For Model II, the design values of $\sigma$ and
$\varepsilon$ are 0.1 and 0.02, respectively. Figures 1 and
2 depict the performance on the training and testing
datasets, respectively. In the case of Model II, the value
of R is also close to one for the training (R = 0.997) as
well as the testing (R = 0.997) dataset. So, the developed
GPR predicts RE reasonably well for Models I and II.
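The trial-and-error selection of $\sigma$ and $\varepsilon$ described in this section could be organized as a small grid search. The sketch below is only one possible arrangement: the candidate grids, the synthetic data, the use of RMSE on a held-out set as the selection criterion, and the treatment of the noise parameter as a value added to the diagonal of K are all assumptions, since the article does not describe the procedure in detail.

```python
import itertools
import numpy as np

def rbf(A, B, width):
    """Radial basis function covariance between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width**2))

def gpr_mean(X_tr, y_tr, X_te, width, noise):
    """GPR predictive mean with `noise` added to the diagonal of K."""
    K = rbf(X_tr, X_tr, width) + noise * np.eye(len(X_tr))
    return rbf(X_te, X_tr, width) @ np.linalg.solve(K, y_tr)

def grid_search(X_tr, y_tr, X_val, y_val, widths, noises):
    """Return (rmse, width, noise) for the pair with the lowest held-out RMSE."""
    best = None
    for width, noise in itertools.product(widths, noises):
        pred = gpr_mean(X_tr, y_tr, X_val, width, noise)
        rmse = float(np.sqrt(np.mean((y_val - pred) ** 2)))
        if best is None or rmse < best[0]:
            best = (rmse, width, noise)
    return best

# Synthetic data standing in for normalized inputs and outputs (hypothetical).
rng = np.random.default_rng(0)
X = rng.random((60, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1]
widths = [0.1, 0.3, 1.0]
noises = [0.01, 0.02]
best_rmse, best_width, best_noise = grid_search(
    X[:40], y[:40], X[40:], y[40:], widths, noises
)
print(best_rmse, best_width, best_noise)
```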
A comparative study has been carried out between
the developed GPR and ANN model developed by
Erzin et al. (2010). Comparison has been done in terms
of root mean square error (RMSE) and mean absolute
error (MAE). RMSE and MAE have been determined
by using the following equation:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(RE_{ai} - RE_{pi}\right)^2}{n}} \tag{8}$$
$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} \left|RE_{ai} - RE_{pi}\right|}{n} \tag{9}$$
where REai and REpi are the actual and predicted RE
values, respectively, and n is number of data points.
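Equations (8) and (9) translate directly into a few lines of Python. The function names and the sample values below are hypothetical and serve only to illustrate the two error measures.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error, Eq. (8)."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def mae(actual, predicted):
    """Mean absolute error, Eq. (9)."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(a - p)))

# Hypothetical actual and predicted RE values, for illustration only.
re_actual = [1.0, 2.0, 3.0]
re_predicted = [1.0, 2.0, 5.0]
print(rmse(re_actual, re_predicted), mae(re_actual, re_predicted))
```

Because RMSE squares the errors before averaging, it weights large individual errors more heavily than MAE does.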
Figure 3 shows the bar chart of RMSE and MAE of
the GPR and ANN models. It is observed from Fig. 3
that the developed GPR produces lower RMSE and MAE
values than the ANN. So, the developed GPR outperforms
the ANN model. GPR uses two tuning parameters ($\sigma$
and $\varepsilon$), whereas ANN uses many tuning
parameters such as the number of hidden layers, number of
neurons, transfer function, number of epochs, etc.
Fig. 1 Performance of training dataset for Model I and II (predicted versus actual normalized RE; Model I: R = 0.992, Model II: R = 0.997; reference line Actual = Predicted)
Fig. 2 Performance of testing dataset for Model I and II (predicted versus actual normalized RE; Model I: R = 0.991, Model II: R = 0.997; reference line Actual = Predicted)
Fig. 3 Comparison between the ANN and GPR models (bar chart of RMSE and MAE for Model I and Model II)
Fig. 4 Variance of training dataset for Model I (variance versus training dataset index)
Fig. 5 Variance of testing dataset for Model I (variance versus testing dataset index)
The developed GPR gives the variance of the
predicted RE. For Model I, Figs. 4 and 5 depict the
variance of training and testing dataset respectively. In
case of Model II, the values of variance of training and
testing datasets have been shown in Figs. 6 and 7
respectively. The predicted variance can be used to
determine the uncertainty.
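One common way to turn a predicted variance into an uncertainty statement is an approximate confidence interval around the predicted mean. The sketch below assumes the Gaussian predictive distribution of Sect. 2 and uses z = 1.96 for an interval of roughly 95 %. The function name and the numerical values are hypothetical, and the article does not state that intervals were computed this way.

```python
import numpy as np

def prediction_interval(mean, variance, z=1.96):
    """Lower and upper bounds mean -/+ z * sqrt(variance) for Gaussian predictions."""
    mean = np.asarray(mean, dtype=float)
    # Guard against tiny negative variances caused by round-off.
    std = np.sqrt(np.maximum(np.asarray(variance, dtype=float), 0.0))
    return mean - z * std, mean + z * std

# Hypothetical normalized prediction and variance, for illustration only.
lower, upper = prediction_interval([0.5], [0.01])
print(lower, upper)
```

A wider interval indicates a prediction that is less well supported by the training data.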
4 Conclusion
This study presents GPR for prediction of RE of soil.
Two models (Model I and Model II) have been developed.
The developed GPR gives acceptable results and outperforms
the ANN model. It also gives the risk of prediction in
terms of variance. The obtained variance can also be used
for validation. The performance of Models I and II is
almost the same. The developed GPR can be used as a quick
tool for determination of RE of soil. The success of a GPR
model depends on the quality of the dataset. The developed
GPR allows the inclusion of covariates in both the
covariance structure and the mean structure. This article
shows that GPR can be used for solving different problems
in geotechnical engineering.
References
Abu-Hassanein ZS (1994) Use of electrical resistivity mea-
surement as a quality control tool for compacted clay lin-
ers. M.S. Thesis, University of Wisconsin, Madison
Azman K, Kocijan J (2007) Application of Gaussian processes
for black-box modelling of biosystems. ISA Trans
46(4):443–457
Bazi Y, Melgani F (2010) Semisupervised Gaussian process
regression for biophysical parameter estimation. Int Geosci
Remote Sens Symp (IGARSS) 5652686:4248–4251
Butterfield R, Jhonston IW (1980) The influence of electro-
osmosis on metallic piles in clay. Geotechnique 30(1):17–38
Erzin Y, Rao BH, Patel A, Gumaste SD, Gupta K, Singh DN
(2010) Artificial neural network models for predicting of
electrical resistivity of soils from their thermal resistivity.
Int J Therm Sci 49:118–130
Gunnink BW, El-Jayyousi J (1993) Soil-fabric measurement
using conduction phase porosimetry. J Geotech Eng Div
119(6):1019–1035
Kalinski RJ, Kelly WE (1993) Estimating water content of soils
from electrical resistivity. Geotech Test J 16(3):323–329
Kecman V (2001) Learning and soft computing: support vector
machines, neural networks and fuzzy logic models. The
MIT Press, Cambridge
Kim K, Lee D, Essa I (2011) Gaussian process regression flow
for analysis of motion trajectories. Proc IEEE Int Conf
Comput Vis 6126365:1164–1171
Mazac O, Cislerove M, Kelly WE, Landa I, Venhodova D
(1990) Determination of hydraulic conductivities by sur-
face geoelectrical methods, In: Ward S (ed) Geotechnical
and environmental geophysics, Soc. Explor. Geophysics,
vol 2, pp 125–131
McCarter W (1984) The electrical resistivity characteristics of
compacted clays. Geotechnique 34(2):263–267
McCollum B, Logan KH (1930) Electrolytic corrosion of iron in
soils. Bureau of Standards, Technologic Paper, pp 24
Park D, Rilett LR (1999) Forecasting freeway link travel times
with a multi-layer feed forward neural network. Comput
Aided Civ Infrastruct Eng 14:358–367
Rasmussen CE, Williams CK (2006) Gaussian processes for
machine learning. MIT Press, Cambridge
Ronald AE, Ronald CG (1982) Electrical resistivity used to
measure liquefaction of sand. J Geotech Eng 108(GT5):
779–783
Schultz DW, Duff BM, Peters WR (1984) Performance of an
electrical resistivity technique for detecting and locating
geomembrane failures. International Conference on Geo-
membranes, Denver, pp 445–449
Shea PF, Luthin JN (1961) An investigation of the use of the
four electrode probe for measuring soil salinity in situ. Soil
Sci 92:331–339
Fig. 6 Variance of training dataset for Model II (variance versus training dataset index)
Fig. 7 Variance of testing dataset for Model II (variance versus testing dataset index)
Singh DN, Kuriyan SJ, Manthena KC (2001) A generalized
relationship between soil electrical and thermal
resistivities. Exp Thermal Fluid Sci 25:175–181
Sreedeep S, Reshma AC, Singh DN (2005) Generalized rela-
tionship for determining soil electrical resistivity from its
thermal resistivity. Exp Thermal Fluid Sci 29:217–226
Tagg GF (1964) Earth resistances. Newnes, London
Zhang Y, Su GS, Yan LB (2012) Analysis of slope stability
based on Gaussian Process Regression. Appl Mech Mat
170–173:1330–1333