Just Another Gibbs Sampler: JAGS

Inglis, A., Ahmed, A., Wundervald, B. and Prado, E.
PhD students at Hamilton Institute

14 November 2018



Agenda

1. Introduction

2. R scripts

3. Python scripts

4. New models in JAGS
   4.1 Poisson model
   4.2 Exponential survival model
   4.3 Gaussian Mixture model

5. Final remarks

6. Acknowledgments


Introduction

- JAGS is a program to perform inference for Bayesian hierarchical models. It was proposed by Martyn Plummer in 2003 as an alternative to the BUGS software;

- The Gibbs sampling engine of JAGS uses ARMS (Adaptive Rejection Metropolis Sampling), which is flexible for dealing with univariate target densities;

- The main advantages of JAGS over BUGS are its programming language and its interfaces with other software, such as R and Python;

- Also, JAGS does not require the specification of the full conditional distributions.


Introduction

- Although JAGS is very useful software to learn, there are few good examples online that explain both the theory and the code;

- The goals of this project were to:
  - Learn how to use JAGS to perform Bayesian modelling and;
  - Write new scripts;

- Basically, JAGS requires only the sampling distribution and the prior distribution for each parameter.


Introduction

model {
  # Likelihood
  for (i in 1:n) {
    y[i] ~ dnorm(mu[i], precision)   # dnorm is parameterised by precision
    mu[i] <- alpha + beta * x[i]
  }
  # Prior distributions
  alpha ~ dnorm(0.0, 1.0E-3)
  beta ~ dnorm(0.0, 1.0E-3)
  aux ~ dgamma(0.001, 0.001)         # prior on the variance
  precision <- 1.0 / aux
}
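To run a model like this, a script only needs the data and the parameters to monitor. Below is a minimal sketch using the R2jags package on simulated data; the data values and variable names are illustrative, not from the original slides.

library(R2jags)

# Illustrative data simulated from the linear model above
set.seed(42)
n <- 100
x <- runif(n, 0, 10)
y <- rnorm(n, mean = 2 + 0.5 * x, sd = 1)

# The model above, stored as a string
model_code <- "
model {
  for (i in 1:n) {
    y[i] ~ dnorm(mu[i], precision)
    mu[i] <- alpha + beta * x[i]
  }
  alpha ~ dnorm(0.0, 1.0E-3)
  beta ~ dnorm(0.0, 1.0E-3)
  aux ~ dgamma(0.001, 0.001)
  precision <- 1.0 / aux
}
"

# Compile, run and summarise
fit <- jags(data = list(y = y, x = x, n = n),
            parameters.to.save = c("alpha", "beta", "precision"),
            model.file = textConnection(model_code))
print(fit)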


R scripts

- Our main contributions were to add mathematical details and provide real-data examples for 5 existing R scripts;

- The models involved were:
  - Random effects model;
  - Multivariate Normal model;
  - Beta regression;
  - Time series Beta Auto-Regressive model of order 1 and;
  - Mixture model.


R script - Beta regression

Let $\{Y_i\}_{i=1}^n$ be independent and identically distributed random variables and $X_i = (1, x_{i,1}, \ldots, x_{i,p})$ a row vector with all covariates of individual $i$. We assume that $Y_i$ is distributed according to a Beta distribution, denoted by $Y_i \sim \text{Beta}(\mu, \phi)$, which may be written in the form

$$f(y_i \mid \mu, \phi) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma((1-\mu)\phi)}\, y_i^{\mu\phi - 1}\,(1 - y_i)^{(1-\mu)\phi - 1},$$

where $0 < y_i < 1$, $E(Y_i) = \mu$, $V(Y_i) = \mu(1-\mu)/(1+\phi)$, $0 < \mu < 1$ and $\phi > 0$. Thus, it is possible to model $g(\mu) = X_i\beta$, where $g(\cdot)$ is the link function that maps the unit interval into $\mathbb{R}$.

This parameterization was proposed by Ferrari and Cribari-Neto (2004).


R script - Beta regression

model {
  # Likelihood
  for (t in 1:T) {
    y[t] ~ dbeta(a[t], b[t])
    a[t] <- mu[t] * phi
    b[t] <- (1 - mu[t]) * phi
    logit(mu[t]) <- alpha + beta * x[t]
  }
  # Priors
  alpha ~ dnorm(0, 10^-2)
  beta ~ dnorm(0, 10^-2)
  phi ~ dunif(0, 10)
}


R script - Beta regression

library(datasets)
library(R2jags)

head(attenu)

# Set up the data; the covariate x (earthquake magnitude) is an assumed choice
acc <- with(attenu, list(y = attenu$accel,
                         x = attenu$mag,
                         T = nrow(attenu)))

# Parameters to monitor
model_parameters <- c("alpha", "beta", "phi")

# Set up the JAGS model (model_code holds the model string from the previous slide)
jags_model <- jags(data = acc,
                   parameters.to.save = model_parameters,
                   model.file = textConnection(model_code),
                   n.chains = 4, n.iter = 1000,
                   n.burnin = 200, n.thin = 2)

# Print the JAGS output
print(jags_model)


R script - Beta regression

Simulation results for the Beta regression, where three parameters were considered. The true values were set to α = −1, β = 0.2 and φ = 5.
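A sketch of how such data can be simulated in R under these true values (the sample size and covariate range are illustrative assumptions):

# Simulate data from the Beta regression model with the true values above
set.seed(123)
T <- 500
x <- sort(runif(T, -3, 3))
alpha <- -1
beta <- 0.2
phi <- 5
mu <- 1 / (1 + exp(-(alpha + beta * x)))                    # inverse logit link
y <- rbeta(T, shape1 = mu * phi, shape2 = (1 - mu) * phi)   # matches the JAGS model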


Python scripts

- 3 Python scripts were created, providing mathematical details and examples on simulated and real datasets;

- The models involved were:
  - Bayesian linear regression;
  - Logistic regression;
  - Beta regression.


Python script - Logistic regression

The logistic regression model assumes that a sequence of independent and identically distributed random variables $\{Y_i\}_{i=1}^n$ has a Bernoulli distribution, denoted by $Y_i \sim \text{Binomial}(1, p_i)$, in the form of

$$f(y_i \mid p_i) = p_i^{y_i}\,(1 - p_i)^{1 - y_i},$$

where $y_i \in \{0, 1\}$, $\log\left(\frac{p_i}{1 - p_i}\right) = X_i\beta$, $X_i = (1, x_{i,1}, \ldots, x_{i,p})$ is the row vector of covariates associated with individual $i$ and $\beta$ is the vector of unknown parameters.


Python script - Logistic regression

model {
  # Likelihood
  for (t in 1:n) {
    y[t] ~ dbin(p[t], 1)
    logit(p[t]) <- beta_0 + beta_1 * x_1[t] + beta_2 * x_2[t]
  }
  # Priors
  beta_0 ~ dnorm(0.0, 0.01)
  beta_1 ~ dnorm(0.0, 0.01)
  beta_2 ~ dnorm(0.0, 0.01)
}


Python script - Logistic regression

The data for this section were adapted from a study that modelled the mortality of moths exposed to identical doses of a particular insecticide;

- Response variable: the number of deaths observed in each batch;
- X1 is defined as the sex code (i.e. male or female);
- X2 is defined as the dose administered to each moth.


Python script - Logistic regression

import pyjags

# Set up the data (code holds the model string from the previous slide)
model = pyjags.Model(code, data=dict(n=n, y=y, x_1=x_1, x_2=x_2))

# Number of iterations to remove at the start (burn-in)
model.sample(200, vars=[])

# Choose the parameters to watch and the number of iterations
samples = model.sample(1000, vars=['beta_0', 'beta_1', 'beta_2'])

print(samples)


New models

Here we present 3 new models that we implemented in both R and Python;

The models are:

- Poisson regression model;
- Exponential survival model;
- Gaussian Mixture model.


Poisson regression model

A random variable $Y$ is said to have a Poisson distribution with parameter $\lambda$ if it takes integer values $y = 0, 1, 2, \ldots$ with probability mass function

$$P(Y = y) = \frac{e^{-\lambda}\,\lambda^y}{y!},$$

where $\lambda > 0$ is the mean. The mean can be modelled via a link function applied to a systematic component. For the Poisson regression case, the most widely used link function is the natural log, resulting in an equation of the form

$$\log(\hat{\lambda}) = \beta_0 + \beta_1 \phi(x_1) + \cdots + \beta_n \phi(x_n).$$


Poisson regression model

model {
  # Likelihood
  for (i in 1:T) {
    y[i] ~ dpois(p[i])
    log(p[i]) <- alpha + beta_1 * x_1[i] + beta_2 * x_2[i]
  }
  # Priors
  alpha ~ dnorm(0.0, 0.01)
  beta_1 ~ dnorm(0.0, 0.01)
  beta_2 ~ dnorm(0.0, 0.01)
}


Poisson regression model

# Simulate data ------------------------------------------------
# Some R code to simulate data from the Poisson model
T <- 1000
set.seed(123)
x_1 <- sort(runif(T, 0, 5))
x_2 <- sort(runif(T, 0, 5))
alpha <- 1
beta_1 <- 1.2
beta_2 <- -0.3
mu <- alpha + beta_1 * x_1 + beta_2 * x_2
lambda <- exp(mu)
y <- rpois(n = T, lambda = lambda)
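Fitting the simulated data follows the same pattern as the Beta regression example; a minimal sketch, assuming R2jags and that model_code holds the Poisson model string from the previous slide:

library(R2jags)

# Fit the Poisson model to the simulated data
fit <- jags(data = list(y = y, x_1 = x_1, x_2 = x_2, T = T),
            parameters.to.save = c("alpha", "beta_1", "beta_2"),
            model.file = textConnection(model_code),
            n.chains = 4, n.iter = 1000, n.burnin = 200, n.thin = 2)
print(fit)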


Poisson regression model

[Figure: y versus the explanatory variables x_1 and x_2, with the predicted line.]


Exponential Survival models

In survival analysis, we are usually interested in modelling the time until a certain event occurs. Let $T$ be a random variable representing the survival times of individuals in some population, with distribution function

$$F(t) = P(T \le t) = \int_0^t f(u)\,du.$$

Survival data are also often censored. In this case, the likelihood is written as

$$L(t) = \prod_{i=1}^n [f(t_i)]^{\delta_i}\,[S(t_i)]^{1 - \delta_i},$$

where $\delta_i$ is the indicator variable that takes the value 1 for failures and 0 for censored observations, and $S(t) = 1 - F(t)$ is the survival function. We consider for the failure time the Exponential distribution, given by

$$f(t) = \frac{1}{\alpha}\exp\left\{-\frac{t}{\alpha}\right\}, \quad \alpha > 0.$$


Exponential Survival models

model {
  # Likelihood
  for (i in 1:T) {
    mu[i] <- exp(beta_1 * x_1[i] + beta_2 * x_2[i])
    t[i] ~ dexp(mu[i] * lambda_0)
  }
  # Priors
  lambda_0 ~ dgamma(1, 1)
  beta_1 ~ dnorm(0.0, 0.01)
  beta_2 ~ dnorm(0.0, 0.01)
}
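A sketch of simulating survival times from this model in R, with illustrative parameter values. Note that the model above treats every observation as a failure; censored observations would typically be handled in JAGS with dinterval, which we do not cover here.

# Simulate (uncensored) survival times from the exponential model
set.seed(123)
T <- 300
x_1 <- runif(T, 0, 2)
x_2 <- rbinom(T, size = 1, prob = 0.5)
beta_1 <- 0.8
beta_2 <- -0.5
lambda_0 <- 1
rate <- exp(beta_1 * x_1 + beta_2 * x_2) * lambda_0   # Exponential rate
t <- rexp(T, rate = rate)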


Gaussian Mixture model

The mixture model is viewed hierarchically: the observations $y$ are modelled conditionally on the vector $z$, with $z$ itself having a probabilistic specification.

$$p(y_i \mid \theta, \lambda) = \lambda_1 f(y_i \mid \theta_1) + \cdots + \lambda_H f(y_i \mid \theta_H),$$

where the vector $\lambda = (\lambda_1, \ldots, \lambda_H)$ represents the proportions of the population taken as being drawn from each $f_h(y_i \mid \theta_h)$ distribution, for $h = 1, \ldots, H$, with $\sum_{h=1}^H \lambda_h = 1$. Usually, the mixture components are assumed to be part of the same parametric family, such as the Gaussian, but with different parameter vectors. The unobserved indicator variables are written as

$$z_{ih} = \begin{cases} 1, & \text{if the } i\text{th unit is drawn from the } h\text{th component} \\ 0, & \text{otherwise.} \end{cases}$$


Gaussian Mixture model

model {
  # Likelihood:
  for (i in 1:N) {
    y[i] ~ dnorm(mu[i], 1 / sigma_inv)       # precision = 1 / sigma_inv
    mu[i] <- mu_clust[clust[i]]
    clust[i] ~ dcat(lambda_clust[1:Nclust])  # latent component label
  }
  # Priors:
  sigma_inv ~ dgamma(0.01, 0.01)
  mu_clust[1] ~ dnorm(0, 10)
  mu_clust[2] ~ dnorm(5, 10)
  lambda_clust[1:Nclust] ~ ddirch(ones)      # Dirichlet prior on the weights
}
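The data list must supply N, Nclust and the Dirichlet weights ones. A minimal sketch of simulating two-component data and fitting the model, again assuming R2jags and that model_code holds the mixture model string above:

library(R2jags)

# Simulate data from a two-component Gaussian mixture
set.seed(123)
N <- 500
z <- rbinom(N, size = 1, prob = 0.4)                # latent component labels
y <- rnorm(N, mean = ifelse(z == 1, 5, 0), sd = 1)

# 'ones' gives a uniform Dirichlet prior on the mixture weights
Nclust <- 2
mix_data <- list(y = y, N = N, Nclust = Nclust, ones = rep(1, Nclust))

fit <- jags(data = mix_data,
            parameters.to.save = c("mu_clust", "lambda_clust", "sigma_inv"),
            model.file = textConnection(model_code))
print(fit)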


Gaussian Mixture model

[Figure: estimated density of y, with the two component means µ1 and µ2 marked.]


Final remarks

- In this project we learned how to use JAGS to perform Bayesian analysis;

- We had some challenges: i) JAGS was new to the whole group; ii) difficulties running the pyjags package;

- Our contributions were to provide mathematical details and code for existing and new models. In the end, we produced:
  - 8 R scripts (3 new);
  - 3 Python scripts (all of them new);

- As future work, other models can be implemented, such as other GLMs, geostatistical models, more complex survival models and other GLMMs with/without longitudinal structure.


Acknowledgments

This work was supported by a Science Foundation Ireland Career Development Award, grant number 17/CDA/4695.


References

Denison, David G. T., Christopher C. Holmes, Bani K. Mallick, and Adrian F. M. Smith. 2002. Bayesian Methods for Nonlinear Classification and Regression. Wiley.

Gelman, Andrew, John B. Carlin, Hal S. Stern, and Donald B. Rubin. 2004. Bayesian Data Analysis. 2nd ed. Chapman & Hall/CRC.

Kalbfleisch, J. D., and R. L. Prentice. 2002. The Statistical Analysis of Failure Time Data. 2nd ed. John Wiley & Sons.

Ntzoufras, Ioannis. 2008. Bayesian Modeling Using WinBUGS. Wiley. doi:10.1002/9780470434567.

Royle, J. Andrew, and Robert M. Dorazio. 2008. Hierarchical Modeling and Inference in Ecology: The Analysis of Data from Populations, Metapopulations and Communities. Elsevier.

Ferrari, Silvia, and Francisco Cribari-Neto. 2004. "Beta Regression for Modelling Rates and Proportions." Journal of Applied Statistics 31 (7): 799–815.
