Post on 24-Jan-2018
A Brief Introduction to Gaussian Process
Eric Xihui Lin
December 19, 2014
Eric Xihui Lin A Brief Introduction to Gaussian Process December 19, 2014 1 / 14
Stochastic Process
For any finite collection of indices $t_1, \dots, t_k$, the vector $(X_{t_1}, \dots, X_{t_k})$ follows a multivariate distribution.
A stochastic process is a generalization to infinite dimension.
In short, a stochastic process is a random function: $X_t$.
It can be used as a prior distribution over functions (explained later).
Gaussian Process (GP)
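A process is a Gaussian process if, for any $t_1, \dots, t_k$, the vector $(X_{t_1}, \dots, X_{t_k})$ is Gaussian distributed.
A GP is completely defined by a mean function $\mu(t)$ and a covariance/kernel function $K(s, t) := \mathrm{Cov}(X_s, X_t)$, i.e.,
$X_t \sim \mathcal{GP}(\mu(\cdot), K(\cdot, \cdot))$
Usually $\mu(t) \equiv 0$;
$K(s, t) = \exp\left(-\frac{\theta}{2} \|x_s - x_t\|^2\right)$ or $\exp\left(-\theta \|x_s - x_t\|\right)$, where $x_s, x_t$ are the input locations indexed by $s$ and $t$.
In practice, only finitely many $t$ are considered, so the GP is equivalent to a multivariate normal distribution.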
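The finite-dimensional view can be sketched in a few lines of plain Python: build the covariance matrix of a zero-mean GP on a handful of index points and draw one sample from the resulting multivariate normal. This is only an illustration, assuming scalar inputs and the squared-exponential form exp(-(theta/2)(s - t)^2); the function names, the value theta = 1, and the small diagonal jitter are choices made here, not part of the slides.

```python
import math
import random

def sq_exp_kernel(s, t, theta=1.0):
    """Squared-exponential kernel exp(-(theta/2) * (s - t)^2) for scalar inputs."""
    return math.exp(-0.5 * theta * (s - t) ** 2)

def cov_matrix(ts, kernel, jitter=1e-8):
    """K[i][j] = kernel(ts[i], ts[j]); a tiny jitter on the diagonal aids stability."""
    n = len(ts)
    return [[kernel(ts[i], ts[j]) + (jitter if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular L with L L^T = a, for symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def sample_gp_prior(ts, kernel, rng):
    """Draw (X_t1, ..., X_tk) ~ N(0, K) as L z, with z standard normal."""
    low = cholesky(cov_matrix(ts, kernel))
    z = [rng.gauss(0.0, 1.0) for _ in ts]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(ts))]

ts = [0.0, 0.5, 1.0, 1.5, 2.0]          # a finite set of index points
K = cov_matrix(ts, sq_exp_kernel)        # covariance of (X_t1, ..., X_tk)
draw = sample_gp_prior(ts, sq_exp_kernel, random.Random(0))
```

Each call to sample_gp_prior returns one realization of the random function evaluated at ts; nearby index points receive similar values because their covariance is close to 1.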
Covariance function
GP as Linear Regression
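Mapping $\phi: \mathbb{R}^n \to \mathbb{R}^m$, where usually $m > n$.
Bayesian linear regression on $\mathbb{R}^m$:
$y \mid \beta = \beta^T \phi(x)$
$\beta \sim N(0, \alpha^{-1} I)$
$E(y) = 0$ and $\mathrm{cov}(y) = \frac{1}{\alpha} \Phi \Phi^T =: K$, which can be specified by the kernel function.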
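The mean and covariance of y follow in a few lines. The derivation below assumes the usual convention that Φ is the N × m design matrix whose i-th row is φ(x_i)^T, and that y stacks the N responses; the symbol k(x_i, x_j) for the induced kernel is notation introduced here.

```latex
\begin{aligned}
y &= \Phi \beta, \qquad \beta \sim N(0, \alpha^{-1} I) \\
E(y) &= \Phi \, E(\beta) = 0 \\
\mathrm{cov}(y) &= E\left(\Phi \beta \beta^T \Phi^T\right)
                 = \Phi \, E\left(\beta \beta^T\right) \Phi^T
                 = \frac{1}{\alpha} \Phi \Phi^T = K \\
K_{ij} &= \frac{1}{\alpha} \phi(x_i)^T \phi(x_j) =: k(x_i, x_j)
\end{aligned}
```

So the model depends on φ only through inner products, which is why a kernel function can be specified directly without writing down φ.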
Prediction
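Given observations $(x_1, y_1), \dots, (x_N, y_N)$ and a new input $x_0$: since $y_x \sim \mathcal{GP}$,
$(y_1, \dots, y_N, y_0) \sim N\left(0, \begin{pmatrix} C_N & k \\ k^T & c \end{pmatrix}\right)$.
$y_0 \mid y_1, \dots, y_N \sim N\left(k^T C_N^{-1} (y_1, \dots, y_N)^T,\; c - k^T C_N^{-1} k\right)$.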
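The conditional distribution above translates directly into code. The following is a minimal sketch in plain Python, assuming scalar inputs, a squared-exponential kernel with theta = 1, and a small noise term on the diagonal for numerical stability; the names kernel, solve, and gp_predict are illustrative and do not come from any library.

```python
import math

def kernel(a, b, theta=1.0):
    """Squared-exponential kernel for scalar inputs (an assumed choice)."""
    return math.exp(-0.5 * theta * (a - b) ** 2)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                        # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def gp_predict(xs, ys, x0, noise=1e-8):
    """Mean and variance of y0 | y1, ..., yN for a zero-mean GP."""
    n = len(xs)
    C = [[kernel(xs[i], xs[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]         # C_N
    k = [kernel(x, x0) for x in xs]                     # cross-covariances
    c = kernel(x0, x0) + noise                          # prior variance at x0
    mean = sum(ki * ai for ki, ai in zip(k, solve(C, ys)))   # k^T C_N^{-1} y
    var = c - sum(ki * wi for ki, wi in zip(k, solve(C, k)))  # c - k^T C_N^{-1} k
    return mean, var

mean, var = gp_predict([0.0, 1.0, 2.0], [0.0, 0.8, 0.9], 1.5)
```

At an observed input the predictive mean reproduces the observation and the variance shrinks to roughly the noise level, while far from all observations the prediction reverts to the prior mean 0 with the prior variance.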
GP: Example
[Figure: GP example. Picture from scikit-learn.]
Gaussian Process Regression
Assume Gaussian noise, $y = f + \varepsilon_n$, i.e.,
$y \mid f \sim N(f, \sigma^2)$.
Assign a Gaussian process prior to $f$, i.e.,
$f \sim \mathcal{GP}(0, k(\cdot, \cdot; \theta))$
Classification can be done through a link function.
Usually $\theta$ is specified in advance, but it can be estimated by maximum likelihood.
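Maximum likelihood here is usually taken to mean maximizing the marginal likelihood of the N observations, with f integrated out. Under the noise model and prior above, and writing K_θ for the N × N kernel matrix with entries k(x_i, x_j; θ) (notation assumed here), the standard expression is:

```latex
\begin{aligned}
y &\sim N\left(0,\; K_\theta + \sigma^2 I\right) \\
\log p(y \mid \theta) &= -\frac{1}{2} y^T \left(K_\theta + \sigma^2 I\right)^{-1} y
  - \frac{1}{2} \log \left| K_\theta + \sigma^2 I \right|
  - \frac{N}{2} \log 2\pi
\end{aligned}
```

The first term rewards fit to the data and the second penalizes model complexity, so maximizing over θ trades the two off.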
GP Regression: Example
GP Regression in R
library(kernlab)

# DATA is assumed to be a data frame with columns x and y
gp.f <- gausspr(y ~ x, data = DATA,
                type = 'regression',        # default is inferred from y
                scaled = TRUE,              # default
                kernel = 'rbfdot',
                kpar = list(sigma = 0.1),
                var = 0.001,                # initial noise variance
                variance.model = FALSE)
Application in Mining
In geostatistics, GP regression is called Kriging.
X is 2-D or 3-D geographic information.
Given some observations, find the distribution of reserves of oil, gold, or other resources.
Application in Optimization
Areas: Geographics, Experimental and Clinical Design, Hyper-parameter tuning
Problem: given an implicit function f(x) that is expensive to evaluate, find the x that maximizes f(x).
Need to avoid evaluating the function frequently.
Steps:
1 Fit a GP to initial points
2 Decide the next point to explore: maximize h(µ̂(x), σ̂(x))
3 Evaluate at the new point and update the GP
4 Stop or go to step 2 based on some criterion
The metric h is chosen to balance exploitation (high mean) and exploration (high variance, possibly an even better solution).
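The four steps can be sketched as a short loop in plain Python. This is an illustration under several assumptions that are not in the slides: h is taken to be the upper confidence bound mean + kappa * sd, candidates come from a fixed grid on [0, 1], the stopping criterion is a fixed evaluation budget, and the objective is a hypothetical toy function. All function names are illustrative.

```python
import math

def kernel(a, b, theta=25.0):
    """Squared-exponential kernel; theta = 25 is an assumed length scale."""
    return math.exp(-0.5 * theta * (a - b) ** 2)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def gp_posterior(xs, ys, x0, noise=1e-6):
    """Posterior mean and standard deviation at x0 for a zero-mean GP."""
    n = len(xs)
    C = [[kernel(xs[i], xs[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k = [kernel(x, x0) for x in xs]
    mean = sum(ki * ai for ki, ai in zip(k, solve(C, ys)))
    var = kernel(x0, x0) + noise - sum(ki * wi for ki, wi in zip(k, solve(C, k)))
    return mean, math.sqrt(max(var, 0.0))   # clip tiny negative round-off

def bayes_opt(f, init_xs, candidates, n_iter=10, kappa=2.0):
    xs = list(init_xs)
    ys = [f(x) for x in xs]                  # step 1: data for the initial GP
    for _ in range(n_iter):                  # step 4: stop after a fixed budget
        pool = [x for x in candidates if x not in xs]
        if not pool:
            break

        def h(x):                            # assumed metric: mean + kappa * sd
            mu, sd = gp_posterior(xs, ys, x)
            return mu + kappa * sd

        x_next = max(pool, key=h)            # step 2: most promising candidate
        xs.append(x_next)                    # step 3: evaluate and update the GP
        ys.append(f(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], xs

def objective(x):
    """Hypothetical stand-in for an expensive function."""
    return -(x - 0.3) ** 2

grid = [i / 100 for i in range(101)]
best_x, best_y, history = bayes_opt(objective, [0.0, 1.0], grid, n_iter=10)
```

A larger kappa pushes the search toward unexplored regions (high variance), while a smaller one concentrates evaluations near the current best mean. Already evaluated points are excluded from the pool so the covariance matrix stays well conditioned.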
Optimization: illustration
Reference
1 Dr. Ruslan Salakhutdinov's course notes: http://www.cs.toronto.edu/~rsalakhu/sta4273_2013/
2 Brochu, E., Cora, M., and de Freitas, N. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. TR-2009-23, UBC, 2009.
3 Wikipedia: http://en.wikipedia.org/wiki/Kriging
4 Roustant, O., Ginsbourger, D., and Deville, Y. (2012). DiceKriging, DiceOptim: Two R Packages for the Analysis of Computer Experiments by Kriging-Based Metamodeling and Optimization. Journal of Statistical Software, Vol. 51, Issue 1.
5 Karatzoglou, A., Smola, A. kernlab - An S4 Package for Kernel Methods in R.