INLA and inlabru with spatial patterns · 2021. 6. 9.
TRANSCRIPT
www.INBO.be
INLA and inlabru with spatial patterns
Thierry Onkelinx
Overview
1 Checking spatial autocorrelation
  Pearson residuals
  Variogram
2 Prepare the model
  Creating a mesh
  Creating an SPDE model
3 Fitting the model
  Only the data
  Predictions
Checking spatial auto-correlation
Checking spatial auto-correlation: Pearson residuals
Definition
▶ components?
  ▶ observed value ($y_i$), fitted value ($\hat{y}_i$), mean squared error (MSE)
▶ formula
  ▶ $pr_i = \frac{y_i - \hat{y}_i}{\sqrt{MSE}}$
▶ MSE (variance) depends on distribution! Check it using inla.doc("name_of_your_distribution").
Example data: rainfall in Parana state, Brazil
[Figure: map of the rainfall observations; x-axis: Longitude, y-axis: Latitude; legend: Rain]
Calculate Pearson residuals
model_iid <- inla(
  Rain ~ Xc + Yc, family = "gamma", data = dataset,
  control.compute = list(waic = TRUE)
)
dataset %>%
  mutate(
    mu = model_iid$summary.fitted.values$mean, ## fitted mean
    ## gamma likelihood: variance = mean ^ 2 / precision parameter
    sigma2 = mu ^ 2 / model_iid$summary.hyperpar[1, "mean"],
    Pearson_iid = (Rain - mu) / sqrt(sigma2)
  ) -> dataset
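The `sigma2` line in the code above uses the mean-variance relation of the gamma likelihood. As a sketch, assuming INLA's gamma parametrisation with precision parameter $\phi$ and the default scale of 1 (worth checking against `inla.doc("gamma")`), the Pearson residual can be written as:

```latex
\operatorname{Var}(y_i) = \frac{\mu_i^2}{\phi}
\qquad\Longrightarrow\qquad
pr_i = \frac{y_i - \mu_i}{\sqrt{\mu_i^2 / \phi}}
     = \sqrt{\phi}\,\frac{y_i - \mu_i}{\mu_i}
```

Here $\mu_i$ is the fitted mean (`mu`) and $\phi$ is the posterior mean of the first hyperparameter (`summary.hyperpar[1, "mean"]`).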
Challenge 1
▶ What is the mean for your model?
▶ What is the variance for your model? Hint: inla.doc("your distribution")
▶ Calculate the Pearson residuals for your model
Checking spatial auto-correlation: Variogram
Definition
vg_default <- variogram(
  Pearson_iid ~ 1, locations = ~X + Y,
  data = as.data.frame(dataset), cressie = TRUE
)
[Figure: empirical variogram of the Pearson residuals; x-axis: distance (km), y-axis: variance]
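To make the definition concrete, here is a minimal base R sketch that computes an empirical variogram by hand. It uses the classical (Matheron) estimator, not the robust Cressie estimator that `cressie = TRUE` selects above, and all data are simulated. The function `empirical_variogram()` and its arguments are hypothetical names used only for this illustration.

```r
## Classical (Matheron) empirical variogram, computed by hand.
## All data are simulated for illustration only.
set.seed(42)
n <- 200
x <- runif(n, 0, 100) # hypothetical coordinates (km)
y <- runif(n, 0, 100)
z <- rnorm(n)         # hypothetical residuals without spatial structure

empirical_variogram <- function(x, y, z, width, cutoff) {
  d <- as.matrix(dist(cbind(x, y)))          # pairwise distances
  sq_diff <- outer(z, z, "-") ^ 2            # squared differences
  pair <- upper.tri(d) & d > 0 & d <= cutoff # use every point pair once
  bin <- cut(d[pair], breaks = seq(0, cutoff, by = width))
  data.frame(
    dist = as.vector(tapply(d[pair], bin, mean)),            # mean distance per bin
    gamma = as.vector(tapply(sq_diff[pair], bin, mean)) / 2, # semivariance per bin
    np = as.vector(table(bin))                               # number of point pairs
  )
}

vg_hand <- empirical_variogram(x, y, z, width = 10, cutoff = 50)
vg_hand
```

Because the simulated residuals are independent with unit variance, the semivariance should hover around 1 in every bin. The `np` column gives the number of point pairs behind each estimate.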
Important characteristics
[Figure: empirical variogram annotated with its four important characteristics: range, nugget, sill and partial sill; x-axis: distance (km), y-axis: variance]
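As a reminder of how these four characteristics relate, here is a sketch based on an exponential variogram model. The exponential shape and the symbols $c_0$, $c_1$ and $a$ are assumptions chosen for illustration; the slides do not prescribe a particular model.

```latex
\gamma(h) = c_0 + c_1 \left(1 - e^{-h / a}\right), \quad h > 0
\\[4pt]
\text{nugget} = c_0, \qquad
\text{partial sill} = c_1, \qquad
\text{sill} = c_0 + c_1, \qquad
\text{practical range} \approx 3a
```

The nugget is the variance at very short distances, the sill is the plateau, the partial sill is their difference, and the range is the distance at which the plateau is (nearly) reached.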
Projected example data
[Figure: map of the rainfall observations in projected coordinates (m); legend: Rain]
Increased cutoff
vg_large <- variogram(
  Pearson_iid ~ 1, locations = ~X + Y, cressie = TRUE,
  data = as.data.frame(dataset), cutoff = 600e3
)
[Figure: empirical variogram with a cutoff of 600 km; x-axis: distance (km), y-axis: variance]
Number of point pairs is important
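[Figure: empirical variogram with a cutoff of 600 km, each point labelled with its number of point pairs; x-axis: distance (km), y-axis: variance]
Number of point pairs per bin: 3010, 8473, 11968, 14375, 15839, 16722, 15970, 14653, 12415, 9514, 6920, 4493, 2781, 1370, 546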
Too small width leads to unstable variograms
vg_small <- variogram(
  Pearson_iid ~ 1, locations = ~X + Y, cressie = TRUE,
  data = as.data.frame(dataset), width = 1e3
)
[Figure: empirical variogram with a width of 1 km; x-axis: distance (km), y-axis: variance; legend: number of point pairs, in classes (0,10], (10,20], (20,50], (50,100], (100,200], (200,500]]
Sensible small width yields the most informative variogram
vg_final <- variogram(
  Pearson_iid ~ 1, locations = ~X + Y, cressie = TRUE,
  data = as.data.frame(dataset), width = 10e3
)
[Figure: empirical variogram with a width of 10 km, each point labelled with its number of point pairs; x-axis: distance (km), y-axis: variance]
Number of point pairs per bin: 150, 590, 945, 1325, 1694, 2008, 2232, 2539, 2724, 2948, 3128, 3168, 3372, 3518, 3762, 3723, 3836, 3907, 4014, 4082, 4211, 4065, 4235, 4211, 4096, 3882, 2206
Challenge 2
▶ What is the minimum bin width for your data?
▶ Calculate the variogram for your model
▶ What is the approximate range of the variogram?
▶ What are the nugget, sill and partial sill?
Prepare the model
Prepare the model: Creating a mesh
Size of a mesh I to V
[Figures: five example meshes]
Guidelines
▶ equilateral triangles work best
▶ edge length should be around a third to a tenth of the range
▶ avoid narrow triangles
▶ avoid small edges
▶ add extra, larger triangles around the border
▶ simplify the border
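The edge length guideline translates into simple arithmetic. The helper `edge_bounds()` below is a hypothetical name introduced only for this sketch; it returns the interval "a tenth to a third of the range" for a given range estimate, for example one read from the variogram.

```r
## Hypothetical helper: bounds for the inner max.edge given a range estimate
edge_bounds <- function(range) {
  c(lower = range / 10, upper = range / 3)
}

## a range of 100 km, expressed in metres
edge_bounds(100e3)
```

For a range of 100 km this gives edge lengths between 10 km and roughly 33 km. The inner `max.edge` values of 30e3 and 10e3 used for the rainfall meshes both fall inside that interval.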
Mesh only within the border
mesh <- inla.mesh.2d(boundary = border, max.edge = 0.15)
ggplot() + gg(mesh) + coord_fixed() + theme_map() +
  ggtitle(paste("Vertices: ", mesh$n))
[Figure: mesh restricted to the border. Vertices: 261]
Mesh going outside the border
mesh <- inla.mesh.2d(boundary = border, max.edge = c(0.15, 0.3))
ggplot() + gg(mesh) + coord_fixed() + theme_map() +
  ggtitle(paste("Vertices: ", mesh$n))
[Figure: mesh extending outside the border. Vertices: 417]
Mesh for rainfall data
mesh <- inla.mesh.2d(boundary = boundary, max.edge = c(30e3, 100e3))
ggplot(dataset) + gg(mesh) + geom_sf() +
  ggtitle(paste("Vertices: ", mesh$n)) +
  coord_sf(datum = st_crs(5880))
[Figure: mesh for the rainfall data with the observations; axes: x, y. Vertices: 8531]
Use cutoff to simplify mesh
mesh1 <- inla.mesh.2d(
  boundary = boundary, max.edge = c(30e3, 100e3), cutoff = 10e3
)
ggplot(dataset) + gg(mesh1) + geom_sf() +
  ggtitle(paste("Vertices: ", mesh1$n)) + coord_sf(datum = st_crs(5880))
[Figure: simplified mesh with the observations; axes: x, y. Vertices: 844]
Finer mesh for final model run
mesh2 <- inla.mesh.2d(
  boundary = boundary, max.edge = c(10e3, 30e3), cutoff = 5e3
)
ggplot(dataset) + gg(mesh2) + geom_sf() +
  ggtitle(paste("Vertices: ", mesh2$n)) + coord_sf(datum = st_crs(5880))
[Figure: finer mesh with the observations; axes: x, y. Vertices: 5920]
Challenge 3
▶ What are the relevant max.edge and cutoff for a coarse mesh?
▶ What are the relevant max.edge and cutoff for a fine mesh?
▶ Create a coarse and a fine mesh for your data
Prepare the model: Creating an SPDE model
SPDE using penalised complexity priors
SPDE: Stochastic Partial Differential Equations
▶ prior.range = c(r, alpha_r): $P(\rho < r) = \alpha_r$
▶ prior.sigma = c(s, alpha_s): $P(\sigma > s) = \alpha_s$
spde1 <- inla.spde2.pcmatern(
  mesh1, prior.range = c(100e3, 0.5), prior.sigma = c(0.9, 0.05)
)
spde2 <- inla.spde2.pcmatern(
  mesh2, prior.range = c(100e3, 0.5), prior.sigma = c(0.9, 0.05)
)
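The two prior statements can be checked numerically. The sketch below assumes the usual construction of the penalised complexity priors for a two-dimensional Matérn field: an exponential prior on sigma and an exponential prior on 1 / range. The variable names are hypothetical; only base R is used, no INLA call.

```r
## Values taken from the slide
r <- 100e3; alpha_r <- 0.5  # prior.range = c(r, alpha_r)
s <- 0.9; alpha_s <- 0.05   # prior.sigma = c(s, alpha_s)

## Assumption: sigma ~ Exponential(rate = lambda_sigma)
lambda_sigma <- -log(alpha_s) / s
p_sigma <- pexp(s, rate = lambda_sigma, lower.tail = FALSE) # P(sigma > s)

## Assumption (2 dimensions): 1 / rho ~ Exponential(rate = lambda_rho)
lambda_rho <- -log(alpha_r) * r
p_rho <- pexp(1 / r, rate = lambda_rho, lower.tail = FALSE) # P(rho < r)

c(p_sigma = p_sigma, p_rho = p_rho)
```

Read this way, `prior.range = c(100e3, 0.5)` places the prior median of the range at 100 km, and `prior.sigma = c(0.9, 0.05)` gives a prior probability of 5% that the standard deviation of the field exceeds 0.9.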
Challenge 4
▶ What are relevant priors for the range and sigma for your data?
  ▶ Hint: see challenge 2
▶ Make the SPDE models for your data
Fitting the model
Fitting the model: Only the data
The stack for the observed data
A1 <- inla.spde.make.A(mesh = mesh1, loc = st_coordinates(dataset))
stack1 <- inla.stack(
  tag = "estimation", ## tag
  data = list(Rain = dataset$Rain), ## response
  A = list(A1, 1), ## projector matrices (SPDE and fixed effects)
  effects = list(
    list(site = seq_len(spde1$n.spde)), ## random field index
    dataset %>%
      as.data.frame() %>%
      transmute(Intercept = 1, Xc, Yc) ## fixed effect covariates
  )
)
Model fit
INLA
model_spde1 <- inla(
  Rain ~ 0 + Intercept + Xc + Yc + f(site, model = spde1),
  family = "gamma", data = inla.stack.data(stack1),
  control.predictor = list(A = inla.stack.A(stack1)),
  control.compute = list(waic = TRUE)
)
inlabru
## with dataset as an sf object
bru_spde1 <- bru(
  Rain ~ Xc + Yc + site(map = st_coordinates, model = spde1),
  family = "gamma", data = dataset
)
## alternative: with dataset converted to an sp object
bru_spde1 <- bru(
  Rain ~ Xc + Yc + site(map = coordinates, model = spde1),
  family = "gamma", data = as_Spatial(dataset)
)
Comparison of fixed effect parameters
[Figure: fixed effect parameters of the INLA and inlabru fits, one panel each for Intercept, Xc and Yc; x-axis: model, y-axis: mean]
Comparing hyperparameters
[Figure: hyperparameters of the INLA and inlabru fits, one panel each for "Precision parameter for the Gamma observations", "Range for site" and "Stdev for site"; x-axis: model, y-axis: mean]
Correlation structure
spde.posterior(bru_spde1, "site", what = "matern.covariance") -> covplot
spde.posterior(bru_spde1, "site", what = "matern.correlation") -> corplot
multiplot(plot(covplot), plot(corplot))
[Figure: two panels with the Matérn covariance and the Matérn correlation as a function of distance x (m); y-axis: median]
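The shape of the correlation curve follows from the range alone. The sketch below evaluates the Matérn correlation function in base R, assuming smoothness nu = 1 (which corresponds to the default `alpha = 2` of `inla.spde2.pcmatern()` in two dimensions) and the range definition rho = sqrt(8 * nu) / kappa. `matern_cor()` is a hypothetical helper name, and the range of 100 km is only an illustrative value.

```r
## Matérn correlation with smoothness nu, parametrised by the range rho
matern_cor <- function(d, rho, nu = 1) {
  kappa <- sqrt(8 * nu) / rho
  out <- 2 ^ (1 - nu) / gamma(nu) * (kappa * d) ^ nu * besselK(kappa * d, nu)
  out[d == 0] <- 1 # limit at distance zero
  out
}

## correlation at 0, 50, 100 and 150 km for a range of 100 km
matern_cor(c(0, 50e3, 100e3, 150e3), rho = 100e3)
```

With this definition the correlation at a distance equal to the range is roughly 0.14, which is how the range parameter of the fitted model can be read.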
Calculate Pearson residuals
dataset %>%
  mutate(
    mu = model_spde1$summary.fitted.values$mean,
    sigma2 = mu ^ 2 / model_spde1$summary.hyperpar[1, "mean"],
    Pearson_iid = (Rain - mu) / sqrt(sigma2)
  ) -> dataset
## Error: Problem with `mutate()` input `mu`.
## x Input `mu` can't be recycled to size 528.
## i Input `mu` is `model_spde1$summary.fitted.values$mean`.
## i Input `mu` must be size 528 or 1, not 1664.
Using the stack index
si <- inla.stack.index(stack1, "estimation")$data
dataset %>%
  mutate(
    mu = model_spde1$summary.fitted.values$mean[si],
    sigma2 = mu ^ 2 / model_spde1$summary.hyperpar[1, "mean"],
    Pearson_spde = (Rain - mu) / sqrt(sigma2)
  ) -> dataset
Using inlabru
fit <- predict(bru_spde1, as_Spatial(dataset), ~exp(Intercept + Xc + Yc + site))
dataset %>%
  mutate(
    mu = fit$mean,
    sigma2 = mu ^ 2 / model_spde1$summary.hyperpar[1, "mean"],
    Pearson_spde = (Rain - mu) / sqrt(sigma2)
  ) -> dataset
Variogram
vg_fit <- variogram(
  Pearson_spde ~ 1, cressie = TRUE,
  data = as_Spatial(dataset), width = 10e3
)
[Figure: empirical variogram of the Pearson residuals of the SPDE model; x-axis: distance (km), y-axis: variance]
Interpolate GMRF
A1.grid <- inla.mesh.projector(mesh1, dims = c(41, 41))
inla.mesh.project(A1.grid, model_spde1$summary.random$site) %>%
  as.matrix() %>%
  as.data.frame() %>%
  bind_cols(
    expand.grid(x = A1.grid$x, y = A1.grid$y)
  ) %>%
  filter(!is.na(ID)) -> eta_spde
Plot GMRF
ggplot(dataset) +
  geom_tile(data = eta_spde, aes(x = x, y = y, fill = mean)) +
  geom_sf() + scale_fill_gradient2()
[Figure: map of the mean of the GMRF with the observations; axes: x, y; legend: mean]
Fitting the model: Predictions
Prediction stack for SPDE grid + fixed effects
expand.grid(X = A1.grid$x, Y = A1.grid$y) %>%
  mutate(Intercept = 1, Xc = X / 1e5 - 53, Yc = Y / 1e5 - 71) -> grid_data
stack1_grid <- inla.stack(
  tag = "grid", ## tag
  data = list(Rain = NA), ## response
  A = list(A1.grid$proj$A, 1), ## projector matrices (SPDE and fixed effects)
  effects = list(
    list(site = seq_len(spde1$n.spde)), ## random field index
    grid_data ## covariates at grid locations
  )
)
Refit the model with the combined stack
stack_all <- inla.stack(stack1, stack1_grid)
model_grid <- inla(
  Rain ~ 0 + Intercept + Xc + Yc + f(site, model = spde1),
  family = "gamma", data = inla.stack.data(stack_all),
  control.predictor = list(A = inla.stack.A(stack_all), link = 1),
  control.compute = list(waic = TRUE),
  control.mode = list(theta = model_spde1$mode$theta, restart = FALSE),
  control.results = list(
    return.marginals.random = FALSE, return.marginals.predictor = FALSE
  )
)
Plot grid I
si <- inla.stack.index(stack_all, "grid")$data
grid_data %>%
  bind_cols(model_grid$summary.fitted.values[si, ]) %>%
  `coordinates<-`(~X + Y) %>%
  `proj4string<-`(CRS(SRS_string = "EPSG:5880")) -> gd
gd[!is.na(over(gd, boundary)), ] %>%
  as.data.frame() %>%
  ggplot() + geom_tile(aes(x = X, y = Y, fill = mean)) + coord_fixed()
Plot grid II
[Figure: map of the fitted mean on the grid within the boundary; axes: X, Y; legend: mean]
Using inlabru
pred_mesh <- predict(bru_spde1, pixels(mesh1), ~exp(Intercept + Xc + Yc + site))
ggplot() + gg(pred_mesh) + gg(boundary)
[Figure: map of the inlabru predictions on the mesh pixels with the boundary; axes: x, y; legend: mean]
Challenge 5
▶ Fit the model using the SPDE
▶ Plot a map of the GMRF
▶ Plot a map of the predictions and their credible interval