gaussian processes: from the basics to the...

382
Gaussian Processes: From the Basics to the State-of-the-Art Dr. Richard E. Turner ([email protected]) Computational and Biological Learning Lab, Department of Engineering, University of Cambridge with Thang Bui, Yingzhen Li, Jos´ e Miguel Hern´ andez Lobato, Daniel Hern´ andez Lobato, Josiah Jan, Alex Navarro, Felipe Tobar, and Maneesh Sahani 1 / 335

Upload: others

Post on 10-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian Processes:From the Basics to the State-of-the-Art

Dr. Richard E. Turner ([email protected])Computational and Biological Learning Lab, Department of

Engineering, University of Cambridge

with Thang Bui, Yingzhen Li, Jose Miguel Hernandez Lobato,Daniel Hernandez Lobato, Josiah Jan, Alex Navarro,

Felipe Tobar, and Maneesh Sahani

1 / 335

Page 2: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: non-linear regression

1 5 10 15 20x

−2.0

−1.5

−1.0

−0.5

0.0

0.5

1.0

1.5

2.0

y

2 / 335

Page 3: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: non-linear regression

1 5 10 15 20x

−2.0

−1.5

−1.0

−0.5

0.0

0.5

1.0

1.5

2.0

y

?

3 / 335

Page 4: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: non-linear regression

1 5 10 15 20x

−2.0

−1.5

−1.0

−0.5

0.0

0.5

1.0

1.5

2.0

y

4 / 335

Page 5: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: non-linear regression

1 5 10 15 20x

−2.0

−1.5

−1.0

−0.5

0.0

0.5

1.0

1.5

2.0

y

5 / 335

Page 6: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: non-linear regression

1 5 10 15 20x

−2.0

−1.5

−1.0

−0.5

0.0

0.5

1.0

1.5

2.0

y

Can we do this with a plain old Gaussian? 6 / 335

Page 7: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y|Σ) ∝ exp(−1

2yTΣ−1y)

Σ =[

1 .7.7 1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

7 / 335

Page 8: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y|Σ) ∝ exp(−1

2yTΣ−1y)

Σ =[

1 .7.7 1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

8 / 335

Page 9: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y|Σ) ∝ exp(−1

2yTΣ−1y)

Σ =[

1 .6.6 1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

9 / 335

Page 10: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y|Σ) ∝ exp(−1

2yTΣ−1y)

Σ =[

1 .4.4 1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

10 / 335

Page 11: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y|Σ) ∝ exp(−1

2yTΣ−1y)

Σ =[

1 .1.1 1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

11 / 335

Page 12: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y|Σ) ∝ exp(−1

2yTΣ−1y)

Σ =[

1 .0.0 1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

12 / 335

Page 13: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y|Σ) ∝ exp(−1

2yTΣ−1y)

Σ =[

1 .7.7 1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

13 / 335

Page 14: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y|Σ) ∝ exp(−1

2yTΣ−1y)

Σ =[

1 .7.7 1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

14 / 335

Page 15: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y2|y1,Σ) ∝ exp(−1

2(y2 − µ∗)Σ∗−1(y2 − µ∗)) [ 1

1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

15 / 335

Page 16: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y2|y1,Σ) ∝ exp(−1

2(y2 − µ∗)Σ∗−1(y2 − µ∗)) [ 1

1

]

y1

y 2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

16 / 335

Page 17: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y2|y1,Σ) ∝ exp(−1

2(y2 − µ∗)Σ∗−1(y2 − µ∗)) [ 1

1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

17 / 335

Page 18: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y2|y1,Σ) ∝ exp(−1

2(y2 − µ∗)Σ∗−1(y2 − µ∗)) [ 1

1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

18 / 335

Page 19: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y2|y1,Σ) ∝ exp(−1

2(y2 − µ∗)Σ∗−1(y2 − µ∗)) [ 1

1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

19 / 335

Page 20: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y2|y1,Σ) ∝ exp(−1

2(y2 − µ∗)Σ∗−1(y2 − µ∗)) [ 1

1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

20 / 335

Page 21: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Gaussian distribution

p(y2|y1,Σ) ∝ exp(−1

2(y2 − µ∗)Σ∗−1(y2 − µ∗)) [ 1

1

]

y1

y2

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3

21 / 335

Page 22: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y 2

Σ =[

1 .9.9 1

]

22 / 335

Page 23: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y 2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

23 / 335

Page 24: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y 2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

24 / 335

Page 25: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

25 / 335

Page 26: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

26 / 335

Page 27: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

27 / 335

Page 28: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

28 / 335

Page 29: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

29 / 335

Page 30: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

30 / 335

Page 31: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

31 / 335

Page 32: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

32 / 335

Page 33: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

33 / 335

Page 34: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

34 / 335

Page 35: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

35 / 335

Page 36: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

36 / 335

Page 37: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

37 / 335

Page 38: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

38 / 335

Page 39: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

39 / 335

Page 40: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

40 / 335

Page 41: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

41 / 335

Page 42: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

42 / 335

Page 43: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

43 / 335

Page 44: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

44 / 335

Page 45: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

45 / 335

Page 46: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

46 / 335

Page 47: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

47 / 335

Page 48: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

48 / 335

Page 49: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

49 / 335

Page 50: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

50 / 335

Page 51: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

51 / 335

Page 52: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

52 / 335

Page 53: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

53 / 335

Page 54: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

54 / 335

Page 55: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

55 / 335

Page 56: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

56 / 335

Page 57: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

57 / 335

Page 58: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

58 / 335

Page 59: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

59 / 335

Page 60: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

60 / 335

Page 61: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

61 / 335

Page 62: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

62 / 335

Page 63: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

63 / 335

Page 64: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y2

1 2−3

−2

−1

0

1

2

3

variable index

y

Σ =[

1 .9.9 1

]

64 / 335

Page 65: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

65 / 335

Page 66: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

66 / 335

Page 67: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

67 / 335

Page 68: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

68 / 335

Page 69: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

69 / 335

Page 70: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

70 / 335

Page 71: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

71 / 335

Page 72: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

72 / 335

Page 73: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

73 / 335

Page 74: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

74 / 335

Page 75: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

75 / 335

Page 76: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

76 / 335

Page 77: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

77 / 335

Page 78: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

78 / 335

Page 79: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

79 / 335

Page 80: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

80 / 335

Page 81: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

81 / 335

Page 82: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

82 / 335

Page 83: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

83 / 335

Page 84: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

84 / 335

Page 85: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

85 / 335

Page 86: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

86 / 335

Page 87: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

87 / 335

Page 88: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

88 / 335

Page 89: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

89 / 335

Page 90: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

90 / 335

Page 91: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

91 / 335

Page 92: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

92 / 335

Page 93: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

93 / 335

Page 94: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

94 / 335

Page 95: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

95 / 335

Page 96: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

96 / 335

Page 97: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

97 / 335

Page 98: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

98 / 335

Page 99: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

99 / 335

Page 100: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

100 / 335

Page 101: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

101 / 335

Page 102: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

102 / 335

Page 103: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

103 / 335

Page 104: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4.9 1 .9 .8 .6.8 .9 1 .9 .8.6 .8 .9 1 .9.4 .6 .8 .9 1

104 / 335

Page 105: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

105 / 335

Page 106: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

106 / 335

Page 107: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

107 / 335

Page 108: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

108 / 335

Page 109: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

109 / 335

Page 110: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

110 / 335

Page 111: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

111 / 335

Page 112: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

112 / 335

Page 113: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

113 / 335

Page 114: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

114 / 335

Page 115: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

115 / 335

Page 116: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

116 / 335

Page 117: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

117 / 335

Page 118: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

118 / 335

Page 119: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

119 / 335

Page 120: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

120 / 335

Page 121: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

121 / 335

Page 122: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

122 / 335

Page 123: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

123 / 335

Page 124: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

124 / 335

Page 125: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

125 / 335

Page 126: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

126 / 335

Page 127: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

127 / 335

Page 128: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

128 / 335

Page 129: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

129 / 335

Page 130: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

130 / 335

Page 131: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

131 / 335

Page 132: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

132 / 335

Page 133: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

133 / 335

Page 134: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

134 / 335

Page 135: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

135 / 335

Page 136: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

136 / 335

Page 137: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

137 / 335

Page 138: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

138 / 335

Page 139: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

139 / 335

Page 140: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

140 / 335

Page 141: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

141 / 335

Page 142: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

142 / 335

Page 143: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

143 / 335

Page 144: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y20

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

144 / 335

Page 145: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression using Gaussians

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

145 / 335

Page 146: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression using Gaussians

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

146 / 335

Page 147: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression using Gaussians

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 10 20

1

10

20

147 / 335

Page 148: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression using Gaussians

1 2 3 4 5 6 7 8 9 1011121314151617181920−3

−2

−1

0

1

2

3

variable index

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

K(x1, x2) = σ2e−1

2l2(x1−x2)2

1 10 20

1

10

20

148 / 335

Page 149: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression: probabilistic inference in function space

−3

−2

−1

0

1

2

3

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

K(x1, x2) = σ2e−1

2l2(x1−x2)2

1 10 20

1

10

20

149 / 335

Page 150: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression: probabilistic inference in function space

−3

−2

−1

0

1

2

3

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)

K(x1, x2) = σ2e−1

2l2(x1−x2)2

1 10 20

1

10

20

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)

150 / 335

Page 151: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression: probabilistic inference in function space

−3

−2

−1

0

1

2

3

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

Non-parametric (∞-parametric)

p(y|θ) = N (0,Σ) function estimatewith uncertainty

K(x1, x2) = σ2e−1

2l2(x1−x2)2

1 10 20

1

10

20

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)

151 / 335

Page 152: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression: probabilistic inference in function space

−3

−2

−1

0

1

2

3

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)

noise

K(x1, x2) = σ2e−1

2l2(x1−x2)2

1 10 20

1

10

20

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)

152 / 335

Page 153: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression: probabilistic inference in function space

−3

−2

−1

0

1

2

3

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)

horizontal-scale

K(x1, x2) = σ2e−1

2l2(x1−x2)2

1 10 20

1

10

20

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)

153 / 335

Page 154: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression: probabilistic inference in function space

−3

−2

−1

0

1

2

3

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

1 10 20

1

10

20

horizontal-scalevertical-scale

154 / 335

Page 155: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Definition

Gaussian process = generalisation of multivariate Gaussian distribution toinfinitely many variables.

Definition: a Gaussian process is a collection of random variables, anyfinite number of which have (consistent) Gaussian distributions.

A Gaussian distribution is fully specified by a mean vector, µ, andcovariance matrix Σ:

f = (f1, . . . , fn) ∼ N (µ,Σ), indices i = 1, . . . ,n

A Gaussian process is fully specified by a mean function m(x) andcovariance function K(x, x′):

f (x) ∼ GP(m(x),K(x, x′)

), indices x

155 / 335

Page 156: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Regression

Q1. What's the formal justification for how we were using GPs for regression?

156 / 335

Page 157: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Regression

Q1. What's the formal justification for how we were using GPs for regression?

generative model (like non-linear regression)

156 / 335

Page 158: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Regression

Q1. What's the formal justification for how we were using GPs for regression?

generative model (like non-linear regression)

place GP prior over the non-linear function

(smoothly wiggling functions expected)

156 / 335

Page 159: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Regression

Q1. What's the formal justification for how we were using GPs for regression?

generative model (like non-linear regression)

place GP prior over the non-linear function

sum of Gaussian variables = Gaussian: induces a GP over

(smoothly wiggling functions expected)

156 / 335

Page 160: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Marginalisation

Q2. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

157 / 335

Page 161: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Marginalisation

Q2. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

We are saved by the marginalisation property:

157 / 335

Page 162: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Marginalisation

Q2. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

We are saved by the marginalisation property:

157 / 335

Page 163: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Marginalisation

Q2. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

We are saved by the marginalisation property:

Only need to represent finite dimensional projections of GPs on computer.

157 / 335

Page 164: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Prediction

Q3. How do we make predictions?

158 / 335

Page 165: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Prediction

Q3. How do we make predictions?

158 / 335

Page 166: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Prediction

Q3. How do we make predictions?

158 / 335

Page 167: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Prediction

Q3. How do we make predictions?

158 / 335

Page 168: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Prediction

Q3. How do we make predictions?

predictive mean

158 / 335

Page 169: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Prediction

Q3. How do we make predictions?

linear in the data

predictive mean

158 / 335

Page 170: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Mathematical Foundations: Prediction

Q3. How do we make predictions?

prior uncertainty

predictive uncertainty

reduction inuncertainty

linear in the data

predictive mean predictive covariance

predictions more confident than prior

158 / 335

Page 171: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Regression: probabilistic inference in function space

−3

−2

−1

0

1

2

3

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

1 10 20

1

10

20

horizontal-scalevertical-scale

159 / 335

Page 172: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

160 / 335

Page 173: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

161 / 335

Page 174: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

162 / 335

Page 175: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

163 / 335

Page 176: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

164 / 335

Page 177: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

165 / 335

Page 178: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

166 / 335

Page 179: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

167 / 335

Page 180: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

168 / 335

Page 181: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

169 / 335

Page 182: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

170 / 335

Page 183: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

171 / 335

Page 184: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

172 / 335

Page 185: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

173 / 335

Page 186: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

174 / 335

Page 187: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

175 / 335

Page 188: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

176 / 335

Page 189: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

177 / 335

Page 190: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

178 / 335

Page 191: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

179 / 335

Page 192: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

180 / 335

Page 193: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

Σ =

short horizontal length-scale

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

181 / 335

Page 194: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

short horizontal length-scale

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

182 / 335

Page 195: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

short horizontal length-scale

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

183 / 335

Page 196: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

short horizontal length-scale

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

184 / 335

Page 197: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

short horizontal length-scale

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

185 / 335

Page 198: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

short horizontal length-scale

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

186 / 335

Page 199: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

short horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

187 / 335

Page 200: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

short horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

188 / 335

Page 201: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

short horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

189 / 335

Page 202: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

short horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

190 / 335

Page 203: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)short horizontal length-scale

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

191 / 335

Page 204: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

short horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

192 / 335

Page 205: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)short horizontal length-scale

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

193 / 335

Page 206: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

short horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

194 / 335

Page 207: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

short horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

195 / 335

Page 208: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

short horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

196 / 335

Page 209: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)short horizontal length-scale

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

197 / 335

Page 210: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)short horizontal length-scale

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

198 / 335

Page 211: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)short horizontal length-scale

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

199 / 335

Page 212: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)short horizontal length-scale

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

200 / 335

Page 213: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric)short horizontal length-scale

Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

201 / 335

Page 214: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

Σ =

long horizontal length-scale

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

202 / 335

Page 215: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

long horizontal length-scale

Σ(x1, x2) = K(x1, x2) + Iσ2y

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

203 / 335

Page 216: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

204 / 335

Page 217: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

205 / 335

Page 218: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

206 / 335

Page 219: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

207 / 335

Page 220: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

208 / 335

Page 221: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

209 / 335

Page 222: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

210 / 335

Page 223: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

211 / 335

Page 224: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

212 / 335

Page 225: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

213 / 335

Page 226: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

214 / 335

Page 227: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

215 / 335

Page 228: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

216 / 335

Page 229: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

217 / 335

Page 230: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

218 / 335

Page 231: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

219 / 335

Page 232: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

220 / 335

Page 233: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

221 / 335

Page 234: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

222 / 335

Page 235: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

K(x1, x2) = σ2 exp(− 1

2l2 (x1 − x2)2)

Hyper-parameters have a strong effectI l controls the horizontal length-scaleI σ2 controls the vertical scale of the data

=⇒ need automatic learning of hyper-parameters from data e.g.

arg maxl,σ2

log p(y |θ)

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

l = short l = longl = medium

223 / 335

Page 236: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

K(x1, x2) = σ2 exp(− 1

2l2 (x1 − x2)2)

Hyper-parameters have a strong effectI l controls the horizontal length-scaleI σ2 controls the vertical scale of the data

=⇒ need automatic learning of hyper-parameters from data e.g.

arg maxl,σ2

log p(y |θ)

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

l = short l = longl = medium

224 / 335

Page 237: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

225 / 335

Page 238: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

226 / 335

Page 239: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

227 / 335

Page 240: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

228 / 335

Page 241: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

229 / 335

Page 242: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

230 / 335

Page 243: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

231 / 335

Page 244: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

232 / 335

Page 245: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

233 / 335

Page 246: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

234 / 335

Page 247: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

235 / 335

Page 248: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

236 / 335

Page 249: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

237 / 335

Page 250: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

238 / 335

Page 251: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Higher dimensional input spaces

0

20

40

60

80

100

0

20

40

60

80

100

−2

0

2

x1x

2

y

239 / 335

Page 252: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f (x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2e−1

2l2(x1−x2)2

240 / 335

Page 253: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

Σ =

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

241 / 335

Page 254: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

242 / 335

Page 255: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

243 / 335

Page 256: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

244 / 335

Page 257: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

245 / 335

Page 258: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

246 / 335

Page 259: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

247 / 335

Page 260: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

248 / 335

Page 261: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

249 / 335

Page 262: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

250 / 335

Page 263: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

251 / 335

Page 264: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

252 / 335

Page 265: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

253 / 335

Page 266: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

254 / 335

Page 267: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

255 / 335

Page 268: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

256 / 335

Page 269: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

257 / 335

Page 270: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

258 / 335

Page 271: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

259 / 335

Page 272: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

260 / 335

Page 273: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Laplacian covariance function

Browninan motion

Ornstein-Uhlenbeck

K(x1, x2) = σ2 exp(− 1

2l2 |x1 − x2|)

261 / 335

Page 274: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

Σ =

Rational Quadratic

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

262 / 335

Page 275: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

263 / 335

Page 276: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

264 / 335

Page 277: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

265 / 335

Page 278: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

266 / 335

Page 279: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

267 / 335

Page 280: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

268 / 335

Page 281: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

269 / 335

Page 282: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

270 / 335

Page 283: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

271 / 335

Page 284: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

272 / 335

Page 285: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

273 / 335

Page 286: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

274 / 335

Page 287: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

275 / 335

Page 288: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

276 / 335

Page 289: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

277 / 335

Page 290: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

Rational Quadratic

278 / 335

Page 291: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Rational Quadratic

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

279 / 335

Page 292: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Rational Quadratic

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

280 / 335

Page 293: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Rational Quadratic

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

281 / 335

Page 294: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Rational Quadratic

K(x1, x2) = σ2(1 + 1

2αl2 |x1 − x2|)−α

282 / 335

Page 295: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

Σ =

Periodic

sinusoid × squared exponential

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

283 / 335

Page 296: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

284 / 335

Page 297: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

285 / 335

Page 298: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

286 / 335

Page 299: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

287 / 335

Page 300: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

288 / 335

Page 301: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

289 / 335

Page 302: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

290 / 335

Page 303: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

291 / 335

Page 304: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

292 / 335

Page 305: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

293 / 335

Page 306: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

294 / 335

Page 307: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

295 / 335

Page 308: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

296 / 335

Page 309: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

297 / 335

Page 310: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

298 / 335

Page 311: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

299 / 335

Page 312: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

300 / 335

Page 313: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

301 / 335

Page 314: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

302 / 335

Page 315: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2 cos(ω(x1 − x2)) exp(− 1

2l2 (x1 − x2)2)

Periodic

sinusoid × squared exponential

303 / 335

Page 316: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

The covariance function has a large effect

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

OU RQ periodic

304 / 335

Page 317: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

The covariance function has a large effect

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

OU RQ periodic

Bayesian model comparison:

305 / 335

Page 318: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

The covariance function has a large effect

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

OU RQ periodic

Bayesian model comparison:prior over models

306 / 335

Page 319: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

The covariance function has a large effect

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

OU RQ periodic

Bayesian model comparison:prior over models

marginal likelihood

307 / 335

Page 320: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

The covariance function has a large effect

Health warnings: Hard to compute (need approximations)Often results are very sensitive to the priors

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

OU RQ periodic

Bayesian model comparison:prior over models

marginal likelihood

308 / 335

Page 321: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Scaling Gaussian Processes toLarge Datasets

309 / 335

Page 322: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: Gaussian Process Regression

inputs

outputs

310 / 335

Page 323: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: Gaussian Process Regression

inputs

outputs

?

310 / 335

Page 324: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: Gaussian Process Regression

inputs

outputs

?

310 / 335

Page 325: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: Gaussian Process Regression

inputs

outputs

?

inference & learning

310 / 335

Page 326: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: Gaussian Process Regression

inputs

outputs

?

inference & learning

intractabilities

computational

analytic

310 / 335

Page 327: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Motivation: Gaussian Process Regression

310 / 335

Page 328: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

311 / 335

Page 329: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

approximate generative modelexact inference

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

311 / 335

Page 330: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

approximate generative modelexact inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

311 / 335

Page 331: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

approximate generative modelexact inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

FITCPITCDTC

311 / 335

Page 332: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

approximate generative modelexact inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

FITCPITCDTC

A Unifying View of Sparse Approximate Gaussian Process RegressionQuinonero-Candela & Rasmussen, 2005(FITC, PITC, DTC)

311 / 335

Page 333: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

approximate generative modelexact inference

exact generative modelapproximate inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

FITCPITCDTC

A Unifying View of Sparse Approximate Gaussian Process RegressionQuinonero-Candela & Rasmussen, 2005(FITC, PITC, DTC)

311 / 335

Page 334: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

approximate generative modelexact inference

exact generative modelapproximate inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

VFEEPPP

FITCPITCDTC

A Unifying View of Sparse Approximate Gaussian Process RegressionQuinonero-Candela & Rasmussen, 2005(FITC, PITC, DTC)

311 / 335

Page 335: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

approximate generative modelexact inference

exact generative modelapproximate inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

VFEEPPP

FITCPITCDTC

A Unifying View of Sparse Approximate Gaussian Process RegressionQuinonero-Candela & Rasmussen, 2005(FITC, PITC, DTC)

311 / 335

Page 336: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

approximate generative modelexact inference

exact generative modelapproximate inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

VFEEPPP

FITCPITCDTC

A Unifying View of Sparse Approximate Gaussian Process RegressionQuinonero-Candela & Rasmussen, 2005(FITC, PITC, DTC)

A Unifying Framework for Sparse Gaussian Process Approximation using Power Expectation PropagationBui, Yan and Turner, 2016(VFE, EP, FITC, PITC ...)

311 / 335

Page 337: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP pseudo-point approximation

true posterior

312 / 335

Page 338: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP pseudo-point approximation

true posterior

312 / 335

Page 339: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP pseudo-point approximation

true posterior

marginal

likelihood

posterior

312 / 335

Page 340: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP pseudo-point approximation

true posterior approximate posterior

marginal

likelihood

posterior

312 / 335

Page 341: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP pseudo-point approximation

true posterior approximate posterior

marginal

likelihood

posterior

312 / 335

Page 342: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP pseudo-point approximation

true posterior approximate posterior

marginal

likelihood

posterior

312 / 335

Page 343: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP pseudo-point approximation

true posterior approximate posterior

marginal

likelihood

posterior

312 / 335

Page 344: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP pseudo-point approximation

input locations of

'pseudo' data

outputs and covariance

'pseudo' data

true posterior approximate posterior

marginal

likelihood

posterior

exact joint

of new GP

regression

model

312 / 335

Page 345: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP algorithm

313 / 335

Page 346: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP algorithm

1. remove take out onepseudo-observation

likelihoodcavity

313 / 335

Page 347: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP algorithm

1. remove

2. include

take out onepseudo-observation

likelihood

add in onetrue observation

likelihood

cavity

tilted

313 / 335

Page 348: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP algorithm

1. remove

2. include

3. project

take out onepseudo-observation

likelihood

add in onetrue observation

likelihood

project ontoapproximating

family

cavity

tilted KL between unnormalisedstochastic processes

313 / 335

Page 349: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP algorithm

1. remove

2. include

3. project

4. update

take out onepseudo-observation

likelihood

add in onetrue observation

likelihood

project ontoapproximating

family

updatepseudo-observation

likelihood

cavity

tilted KL between unnormalisedstochastic processes

313 / 335

Page 350: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP algorithm

1. remove

2. include

3. project

4. update

take out onepseudo-observation

likelihood

add in onetrue observation

likelihood

project ontoapproximating

family

updatepseudo-observation

likelihood

cavity

tilted

1. minimum: moments matched at pseudo-inputs2. Gaussian regression: matches moments everywhere

KL between unnormalisedstochastic processes

313 / 335

Page 351: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP algorithm

1. remove

2. include

3. project

4. update

take out onepseudo-observation

likelihood

add in onetrue observation

likelihood

project ontoapproximating

family

updatepseudo-observation

likelihood

cavity

tilted

1. minimum: moments matched at pseudo-inputs2. Gaussian regression: matches moments everywhere

KL between unnormalisedstochastic processes

rank 1

313 / 335

Page 352: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

A Brief History of Gaussian Process Approximations

approximate generative modelexact inference

exact generative modelapproximate inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

VFEEPPP

FITCPITCDTC

A Unifying View of Sparse Approximate Gaussian Process RegressionQuinonero-Candela & Rasmussen, 2005(FITC, PITC, DTC)

A Unifying Framework for Sparse Gaussian Process Approximation using Power Expectation PropagationBui, Yan and Turner, 2016(VFE, EP, FITC, PITC ...)

314 / 335

Page 353: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Fixed points of EP = FITC approximation

approximate generative modelexact inference

exact generative modelapproximate inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

VFEEPPP

FITCPITCDTC

A Unifying View of Sparse Approximate Gaussian Process RegressionQuinonero-Candela & Rasmussen, 2005(FITC, PITC, DTC)

A Unifying Framework for Sparse Gaussian Process Approximation using Power Expectation PropagationBui, Yan and Turner, 2016(VFE, EP, FITC, PITC ...)

315 / 335

Page 354: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Fixed points of EP = FITC approximation

approximate generative modelexact inference

exact generative modelapproximate inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

VFEEPPP

FITCPITCDTC

A Unifying View of Sparse Approximate Gaussian Process RegressionQuinonero-Candela & Rasmussen, 2005(FITC, PITC, DTC)

A Unifying Framework for Sparse Gaussian Process Approximation using Power Expectation PropagationBui, Yan and Turner, 2016(VFE, EP, FITC, PITC ...)

315 / 335

Page 355: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Fixed points of EP = FITC approximation

approximate generative modelexact inference

exact generative modelapproximate inference

methods employingpseudo-data

FITC: Snelson et al. “Sparse Gaussian Processes using Pseudo-inputs”PITC: Snelson et al. “Local and global sparse Gaussian process approximations”EP: Csato and Opper 2002 / Qi et al. "Sparse-posterior Gaussian Processes for general likelihoods.” VFE: Titsias “Variational Learning of Inducing Variables in Sparse Gaussian Processes”DTC / PP: Seeger et al. “Fast Forward Selection to Speed Up Sparse Gaussian Process Regression”

VFEEPPP

FITCPITCDTC

interpretation resolves issues with FITC: why does it work so well?

are we allowed to increase M with N

A Unifying View of Sparse Approximate Gaussian Process RegressionQuinonero-Candela & Rasmussen, 2005(FITC, PITC, DTC)

A Unifying Framework for Sparse Gaussian Process Approximation using Power Expectation PropagationBui, Yan and Turner, 2016(VFE, EP, FITC, PITC ...)

315 / 335

Page 356: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

EP algorithm

1. remove

2. include

3. project

4. update

take out onepseudo-observation

likelihood

add in onetrue observation

likelihood

project ontoapproximating

family

updatepseudo-observation

likelihood

cavity

tilted

1. minimum: moments matched at pseudo-inputs2. Gaussian regression: matches moments everywhere

KL between unnormalisedstochastic processes

rank 1

316 / 335

Page 357: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Power EP algorithm (as tractable as EP)

1. remove

2. include

3. project

4. update

take out fraction ofpseudo-observation

likelihood

add in fraction oftrue observation

likelihood

project ontoapproximating

family

updatepseudo-observation

likelihood

cavity

tilted

1. minimum: moments matched at pseudo-inputs2. Gaussian regression: matches moments everywhere

KL between unnormalisedstochastic processes

rank 1

317 / 335

Page 358: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Power EP: a unifying framework

FITCCsato and Opper, 2002

Snelson and Ghahramani, 2005

VFETitsias, 2009

318 / 335

Page 359: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Power EP: a unifying framework

GP Regression GP Classification

PEPVFE

EP

inter-domain

[4] Quiñonero-Candela et al. 2005[5] Snelson et al., 2005[6] Snelson, 2006[7] Schwaighofer, 2002

[10,5,6*]

[14*]

[12*,15*]

[13][17,13]

[9,11,8*]

[16*] inter-domain

structuredapprox.

structuredapprox.

(FITC)

[7,4*,6*](PITC)

[8] Titsias, 2009[9] Csató, 2002[10] Csató et al., 2002[11] Seeger et al., 2003

[12] Naish-Guzman et al, 2007[13] Qi et al., 2010[14] Hensman et al., 2015[15] Hernández-Lobato et al., 2016[16] Matthews et al., 2016[17] Figueiras-Vidal et al., 2009

PEPVFE

EP

* = optimised pseudo-inputs ** = structured versions of VFE recover VFE

** **

319 / 335

Page 360: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

How should I set the power parameter α?

6 UCI classification datasets20 random splitsM = 10, 50, 100

hypers and inducing inputs optimised

8 UCI regression datasets20 random splits

M = 0 - 200hypers and inducing

inputs optimised

0.0 0.2 0.4 0.6 0.8 1.02

3

4

5

6

7

8

0.0 0.2 0.4 0.6 0.8 1.02

3

4

5

6

7

8

MS

Era

nker

ror

rank

log-

loss

ran

klo

g-lo

ss r

ank

= 0.5 does well on average

0.0 0.2 0.4 0.6 0.8 1.02.5

3.0

3.5

4.0

4.5

5.0

5.5

6.0

0.0 0.2 0.4 0.6 0.8 1.01

2

3

4

5

6

7

320 / 335

Page 361: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Deep Gaussian Processes forRegression

321 / 335

Page 362: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Pros and cons of Gaussian Process Regression

Probabilistic Xneed well-calibrated predictive uncertainty (decision making)big models (even with small data)neural networks do not provide well calibrated uncertainty estimates

Non-parametric (∞ - parametric) Xmodel parameters: information bottleneck between training and test data

training data→ θθθ → test data

unbounded model (θθθ grows with data) pruned by data

Deep (and wide) 7

intrinsic hierarchical structure in real world datasimpler to learn a series of weak non-linearities

322 / 335

Page 363: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Pros and cons of Gaussian Process Regression

Probabilistic Xneed well-calibrated predictive uncertainty (decision making)big models (even with small data)neural networks do not provide well calibrated uncertainty estimates

Non-parametric (∞ - parametric) Xmodel parameters: information bottleneck between training and test data

training data→ θθθ → test data

unbounded model (θθθ grows with data) pruned by data

Deep (and wide) 7

intrinsic hierarchical structure in real world datasimpler to learn a series of weak non-linearities

322 / 335

Page 364: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Pros and cons of Gaussian Process Regression

Probabilistic Xneed well-calibrated predictive uncertainty (decision making)big models (even with small data)neural networks do not provide well calibrated uncertainty estimates

Non-parametric (∞ - parametric) Xmodel parameters: information bottleneck between training and test data

training data→ θθθ → test data

unbounded model (θθθ grows with data) pruned by data

Deep (and wide) 7

intrinsic hierarchical structure in real world datasimpler to learn a series of weak non-linearities

322 / 335

Page 365: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Pros and cons of Gaussian Process Regression

Probabilistic Xneed well-calibrated predictive uncertainty (decision making)big models (even with small data)neural networks do not provide well calibrated uncertainty estimates

Non-parametric (∞ - parametric) Xmodel parameters: information bottleneck between training and test data

training data→ θθθ → test data

unbounded model (θθθ grows with data) pruned by data

Deep (and wide) 7

intrinsic hierarchical structure in real world datasimpler to learn a series of weak non-linearities

322 / 335

Page 366: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

From Gaussian Processes to Deep Gaussian Processes

input

323 / 335

Page 367: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

From Gaussian Processes to Deep Gaussian Processes

input

323 / 335

Page 368: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

From Gaussian Processes to Deep Gaussian Processes

}input

input

323 / 335

Page 369: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

From Gaussian Processes to Deep Gaussian Processes

}input

input

long short

323 / 335

Page 370: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

From Gaussian Processes to Deep Gaussian Processes

}input

input

long short

DGP = multi-layer neural network, infinitely wide hidden layer

323 / 335

Page 371: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Deep Gaussian Processes

fl ∼ GP(0, k(., .))yn = g(xn) = fL(fL−1(· · · f2(f1(xn)))) + εn

hL−1,n := fL−1(· · · f1(xn)), yn = fL(hL−1,n) + εn

Deep GPsa aremulti-layer generalisation of Gaussian processes,equivalent to deep neural networks with infinitelywide hidden layers

Questions:How to perform inference and learning tractably?How Deep GPs compare to alternative,e.g. Bayesian neural networks?

aDamianou and Lawrence (2013) [unsupervised learning]

xn

h1,n

h2,n

yn

f1

f2

f3

N

324 / 335

Page 372: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Approximate inference for (Deep) Gaussian Processes

GP Regression GP Classification

PEPVFE

EP

inter-domain

[4] Quiñonero-Candela et al. 2005[5] Snelson et al., 2005[6] Snelson, 2006[7] Schwaighofer, 2002

[10,5,6*]

[14*]

[12*,15*]

[13][17,13]

[9,11,8*]

[16*] inter-domain

structuredapprox.

structuredapprox.

(FITC)

[7,4*,6*](PITC)

[8] Titsias, 2009[9] Csató, 2002[10] Csató et al., 2002[11] Seeger et al., 2003

[12] Naish-Guzman et al, 2007[13] Qi et al., 2010[14] Hensman et al., 2015[15] Hernández-Lobato et al., 2016[16] Matthews et al., 2016[17] Figueiras-Vidal et al., 2009

PEPVFE

EP

* = optimised pseudo-inputs ** = structured versions of VFE recover VFE

** **

325 / 335

Page 373: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Experiment: Value function of the mountain car problem

goalr = 10

car

each stepr=-1

[x,x]

x1

x2

GP fit

x1

x2

Value function

x1

x2

DGP fit

326 / 335

Page 374: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Experiment: Comparison to Bayesian neural networks

Compare DGPs with GPs and Bayesian neural networks with one andtwo hidden layers using:

VI(G): Graves’ VI [diagonal Gaussian, without the reparam. trick]VI(KW): Kingma and Welling’s VI [with the reparam. trick]PBP: ADF with Probablistic BackpropagationDropout: Combining dropout predictions at test timeSGLD: Stochastic gradient Langevin dynamicsHMC: Hamiltonian Monte Carlo [only for small networks]

327 / 335

Page 375: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Experiment: Comparison to Bayesian neural networks

Rankings of all methods across all datasets

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

MLL Average Rank

PBP−1Dropout−1SGLD−1DGP−1 50SGLD−2VI(KW)−1GP 50

DGP−3 100DGP−2 100DGP−3 50DGP−2 50

GP 100HMC−1

VI(KW)−2DGP−1 100

CD

328 / 335

Page 376: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Experiment: Comparison to Bayesian neural networks [Best results]

bostonN = 506D = 13

−2.5

−2.4

−2.3

−2.2

−2.1

−2.0

aver

age

test

log-

likel

ihoo

d/na

ts

concreteN = 1030

D = 8

−3.2

−3.0

−2.8

−2.6

−2.4

−2.2

−2.0

−1.8

energyN = 768D = 8

−2.0

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

kin8nmN = 8192

D = 8

1.6

1.8

2.0

2.2

2.4

2.6

2.8

3.0

3.2

navalN = 11934

D = 16

5.0

5.5

6.0

6.5

7.0

7.5

powerN = 9568

D = 4

−2.65

−2.60

−2.55

−2.50

−2.45

−2.40

−2.35

−2.30

−2.25

proteinN = 45730

D = 9

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

red wineN = 1588D = 11

−0.8

−0.6

−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

yachtN = 308D = 6

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

−0.4

yearN = 515345

D = 90

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

0.5

BNN-deterministic BNN-sampling GP DGP

329 / 335

Page 377: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Experiment: Comparison to Bayesian neural networks [Best results]

bostonN = 506D = 13

−2.5

−2.4

−2.3

−2.2

−2.1

−2.0

aver

age

test

log-

likel

ihoo

d/na

ts

concreteN = 1030

D = 8

−3.2

−3.0

−2.8

−2.6

−2.4

−2.2

−2.0

−1.8

energyN = 768D = 8

−2.0

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

kin8nmN = 8192

D = 8

1.6

1.8

2.0

2.2

2.4

2.6

2.8

3.0

3.2

navalN = 11934

D = 16

5.0

5.5

6.0

6.5

7.0

7.5

powerN = 9568

D = 4

−2.65

−2.60

−2.55

−2.50

−2.45

−2.40

−2.35

−2.30

−2.25

proteinN = 45730

D = 9

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

red wineN = 1588D = 11

−0.8

−0.6

−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

yachtN = 308D = 6

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

−0.4

yearN = 515345

D = 90

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

0.5

BNN-deterministic BNN-sampling GP DGP

330 / 335

Page 378: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Experiment: Comparison to Bayesian neural networks [Best results]

bostonN = 506D = 13

−2.5

−2.4

−2.3

−2.2

−2.1

−2.0

aver

age

test

log-

likel

ihoo

d/na

ts

concreteN = 1030

D = 8

−3.2

−3.0

−2.8

−2.6

−2.4

−2.2

−2.0

−1.8

energyN = 768D = 8

−2.0

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

kin8nmN = 8192

D = 8

1.6

1.8

2.0

2.2

2.4

2.6

2.8

3.0

3.2

navalN = 11934

D = 16

5.0

5.5

6.0

6.5

7.0

7.5

powerN = 9568

D = 4

−2.65

−2.60

−2.55

−2.50

−2.45

−2.40

−2.35

−2.30

−2.25

proteinN = 45730

D = 9

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

red wineN = 1588D = 11

−0.8

−0.6

−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

yachtN = 308D = 6

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

−0.4

yearN = 515345

D = 90

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

0.5

BNN-deterministic BNN-sampling GP DGP

331 / 335

Page 379: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Experiment: Comparison to Bayesian neural networks [Best results]

bostonN = 506D = 13

−2.5

−2.4

−2.3

−2.2

−2.1

−2.0

aver

age

test

log-

likel

ihoo

d/na

ts

concreteN = 1030

D = 8

−3.2

−3.0

−2.8

−2.6

−2.4

−2.2

−2.0

−1.8

energyN = 768D = 8

−2.0

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

kin8nmN = 8192

D = 8

1.6

1.8

2.0

2.2

2.4

2.6

2.8

3.0

3.2

navalN = 11934

D = 16

5.0

5.5

6.0

6.5

7.0

7.5

powerN = 9568

D = 4

−2.65

−2.60

−2.55

−2.50

−2.45

−2.40

−2.35

−2.30

−2.25

proteinN = 45730

D = 9

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

red wineN = 1588D = 11

−0.8

−0.6

−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

yachtN = 308D = 6

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

−0.4

yearN = 515345

D = 90

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

0.5

BNN-deterministic BNN-sampling GP DGP

332 / 335

Page 380: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Experiment: Comparison to Bayesian neural networks [Best results]

bostonN = 506D = 13

−2.5

−2.4

−2.3

−2.2

−2.1

−2.0

aver

age

test

log-

likel

ihoo

d/na

ts

concreteN = 1030

D = 8

−3.2

−3.0

−2.8

−2.6

−2.4

−2.2

−2.0

−1.8

energyN = 768D = 8

−2.0

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

kin8nmN = 8192

D = 8

1.6

1.8

2.0

2.2

2.4

2.6

2.8

3.0

3.2

navalN = 11934

D = 16

5.0

5.5

6.0

6.5

7.0

7.5

powerN = 9568

D = 4

−2.65

−2.60

−2.55

−2.50

−2.45

−2.40

−2.35

−2.30

−2.25

proteinN = 45730

D = 9

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

red wineN = 1588D = 11

−0.8

−0.6

−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

yachtN = 308D = 6

−1.8

−1.6

−1.4

−1.2

−1.0

−0.8

−0.6

−0.4

yearN = 515345

D = 90

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

0.5

BNN-deterministic BNN-sampling GP DGP

333 / 335

Page 381: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

Pros and cons of Gaussian Process Regression

Probabilistic Xneed well-calibrated predictive uncertainty (decision making)big models (even with small data)neural networks do not provide well calibrated uncertainty estimates

Non-parametric (∞ - parametric) Xmodel parameters: information bottleneck between training and test data

training data→ θθθ → test data

unbounded model (θθθ grows with data) pruned by data

Deep (and wide) Xintrinsic hierarchical structure in real world datasimpler to learn a series of weak non-linearities

334 / 335

Page 382: Gaussian Processes: From the Basics to the State-of-the-Artcbl.eng.cam.ac.uk/pub/Public/Turner/News/imperial-gp-tutorial.pdf · Gaussian Processes: From the Basics to the State-of-the-Art

References (hyperlinked)

Deep Gaussian Processes:Deep Gaussian Processes for Regression using ApproximateExpectation Propagation, ICML 2016

Approximate inference in GPs:

A Unifying Framework for Sparse Gaussian Process Approximationusing Power Expectation Propagation, arXiv preprint 2016

Scalable Approximate inference:Stochastic Expectation Propagation, NIPS 2015Black-box α-divergence Minimization, ICML 2016

335 / 335