Bayesian Variable Selection and the (Ab)use of Priors

DESCRIPTION

An introduction to some Bayesian variable selection techniques, plus some ideas about how to (mis-)use them. Talk given at IWSM 2012 in Prague (which is rather rainy).

TRANSCRIPT

1. Bayesian Variable Selection and the (Ab)use of Priors. Bob O'Hara, BiK-F, Frankfurt am Main, Germany. blogs.nature.com/boboh/2012/07/16/abusing_a_prior (this is mainly a review of work by other people)

2. The Bad Old Days: ANOVA tables.

3. Not Useful for Modern Applications. GWAS: $10^5$ variables. Ikram MK et al. (2010) Four Novel Loci (19q13, 6q24, 12q24, and 5q14) Influence the Microcirculation In Vivo. PLoS Genet 6(10): e1001184.

4. Anyway, we want to be Bayesian. Could use DIC, but it has the same problems. So, let's build variable selection into the model.

5. Health Warnings I: I am not saying you should use variable selection.

6. Health Warnings I: I am not saying you should use variable selection. p-values are EVIL.

7. Health Warnings II: The methods I am about to describe are sensitive to the priors.

8. The Regression Problem. Our model: $y_i = \beta_0 + \sum_{k=1}^{K} \beta_k X_{ik} + \epsilon_i$ (everything else is just a variant).

9. Bayesian Approach. Posterior: $P(\beta, \beta_0, \sigma \mid y) \propto P(y \mid \beta, \beta_0, \sigma)\, P(\beta_0)\, P(\sigma) \prod_{k=1}^{K} P(\beta_k)$.

10. Bayesian Approach. The same posterior, with $P(y \mid \beta, \beta_0, \sigma)$ highlighted as the likelihood.

11. Bayesian Approach. The same posterior, with $P(\beta_0)$, $P(\sigma)$ and $\prod_{k=1}^{K} P(\beta_k)$ highlighted as the priors for the regression parameters.

12. Fitting: use MCMC. Creates a Markov chain. Loops through the parameters. Simply drop uninteresting parameters: marginalisation.

13. The advantage (for us) of MCMC. We can over-parameterise. Some MCMC samplers (e.g. Gibbs) are more efficient: they run faster & mix better.

14. The advantage (for us) of MCMC. With some imagination, we can design priors that will work for us.

15. Variable Selection. Which of the Xs should be in the model? Alternatively: which of the βs should be zero?

16. Choosing Xs: rjMCMC. A general method for moving between models with different numbers of dimensions.

17. Setting βs to 0. Easier to implement, but can be slower: the sampler stays in the full, large-dimensional space.

18. Slab and Spike Priors. [figure: the spike and slab prior densities]

19. Slab and Spike Posteriors. [figure: the resulting bimodal posterior]

20. Several ways of getting priors.

21. Method I: Point Mass at 0. $P(\beta) = (1-p)\,\delta_0 + p\,N(0, \tau^2)$.

22. Indicators. $I_k$: indicator that variable $k$ is in the model, with $P(I_k = 1) = p$. Then $\tilde\beta \sim N(0, \tau^2)$ and $\beta = (1-I)\cdot 0 + I\,\tilde\beta$. And integrate over $P(I = 1)$ by MCMC. Gibbs sampling should work nicely. (A minimal sketch of this sampler appears after the transcript.)

23. A problem with Gibbs Sampling. $\beta = (1-I)\cdot 0 + I\,\tilde\beta$. When $I = 0$, $\tilde\beta$ only depends on its prior, so MCMC draws wide values of $\tilde\beta$. Only rarely will it draw sensible values.

24. A Better Version: GVS. $\tilde\beta \sim N(0, \tau^2(I))$, a pseudo-prior, with $\beta = I\,\tilde\beta$. Now if $I = 0$, generate $\tilde\beta$ from a pseudo-prior tuned to propose sensible values, i.e. select $\tau^2(0)$ to cover likely values of the posterior. (See the second sketch after the transcript.)

25. Another way. The spike can be around 0, not exactly on it.

26. SSVS: Mixture distributions. Stochastic Search Variable Selection: a mixture of normals, a spike and a slab.

27. SSVS. $\beta \sim N(0, \tau^2(I))$, $I \sim \mathrm{Bern}(p)$, $\tau^2(1) \gg \tau^2(0)$. (See the third sketch after the transcript.)
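The indicator scheme of slides 22-23 is easy to write down concretely. Below is a minimal sketch in Python/NumPy of that Gibbs sampler (it is essentially the Kuo & Mallick construction): $\beta = I\,\tilde\beta$, a Bernoulli update for each $I_k$, and $\tilde\beta_k$ drawn from its prior whenever $I_k = 0$. The simulated data, the prior settings (tau2, p_incl) and all names here are illustrative assumptions, not from the talk. The else branch is exactly the problem flagged on slide 23.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: K candidate covariates, only the first two matter
n, K = 100, 5
X = rng.normal(size=(n, K))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0])
sigma = 1.0
y = X @ beta_true + rng.normal(scale=sigma, size=n)

# Illustrative prior settings (assumptions, not from the talk)
tau2 = 10.0    # slab variance: beta_tilde_k ~ N(0, tau2)
p_incl = 0.5   # prior inclusion probability P(I_k = 1)

n_iter = 5000
beta_t = np.zeros(K)           # beta_tilde
ind = np.ones(K, dtype=int)    # indicators I_k
keep_ind = np.zeros((n_iter, K))

for it in range(n_iter):
    for k in range(K):
        # Residuals with variable k's current contribution removed
        fit = X @ (ind * beta_t)
        r = y - (fit - X[:, k] * ind[k] * beta_t[k])

        # Update I_k: likelihood with beta_k = beta_tilde_k vs beta_k = 0;
        # the prior on beta_tilde_k does not depend on I_k, so it cancels
        ll1 = -0.5 * np.sum((r - X[:, k] * beta_t[k]) ** 2) / sigma**2
        ll0 = -0.5 * np.sum(r ** 2) / sigma**2
        logit = np.log(p_incl / (1 - p_incl)) + ll1 - ll0
        ind[k] = rng.random() < 1.0 / (1.0 + np.exp(-np.clip(logit, -500, 500)))

        # Update beta_tilde_k
        if ind[k] == 1:
            # Conjugate update: N(0, tau2) prior times Gaussian likelihood
            prec = np.sum(X[:, k] ** 2) / sigma**2 + 1.0 / tau2
            mean = (X[:, k] @ r) / sigma**2 / prec
            beta_t[k] = rng.normal(mean, np.sqrt(1.0 / prec))
        else:
            # Slide 23's problem: with I_k = 0 the full conditional is just
            # the wide prior, so sensible values are proposed only rarely
            beta_t[k] = rng.normal(0.0, np.sqrt(tau2))

    keep_ind[it] = ind

print("Posterior inclusion probabilities:", keep_ind[n_iter // 2:].mean(axis=0))
```

With this setup the inclusion probabilities for the first two covariates should approach 1, while the indicator chains for the null covariates mix sluggishly, which is the motivation for GVS.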
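For GVS (slide 24) only two conditionals change: the $I_k$ update must now include the ratio of the slab prior to the pseudo-prior (because the prior on $\tilde\beta_k$ depends on $I_k$), and when $I_k = 0$, $\tilde\beta_k$ is drawn from the pseudo-prior instead of the wide slab. One caveat: the slide writes the pseudo-prior as zero-mean with a tuned variance $\tau^2(0)$; the sketch below uses the slightly more general pseudo-prior with a mean taken from a pilot least-squares fit, a common tuning in the GVS literature. Again, all data and settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Same illustrative data-generating setup as the previous sketch
n, K = 100, 5
X = rng.normal(size=(n, K))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0])
sigma = 1.0
y = X @ beta_true + rng.normal(scale=sigma, size=n)

tau2 = 10.0    # slab variance when I_k = 1
p_incl = 0.5

# Pseudo-prior tuned from a pilot fit: least-squares estimates give
# plausible locations; the small variance is an assumed tuning choice
mu_pp = np.linalg.lstsq(X, y, rcond=None)[0]
s2_pp = np.full(K, 0.05)

def norm_logpdf(x, m, v):
    return -0.5 * (np.log(2.0 * np.pi * v) + (x - m) ** 2 / v)

n_iter = 5000
beta_t = mu_pp.copy()
ind = np.ones(K, dtype=int)
keep_ind = np.zeros((n_iter, K))

for it in range(n_iter):
    for k in range(K):
        fit = X @ (ind * beta_t)
        r = y - (fit - X[:, k] * ind[k] * beta_t[k])

        # I_k update now includes the slab-vs-pseudo-prior density ratio,
        # because the prior on beta_tilde_k depends on I_k
        ll1 = -0.5 * np.sum((r - X[:, k] * beta_t[k]) ** 2) / sigma**2
        ll0 = -0.5 * np.sum(r ** 2) / sigma**2
        lp1 = norm_logpdf(beta_t[k], 0.0, tau2)           # slab prior
        lp0 = norm_logpdf(beta_t[k], mu_pp[k], s2_pp[k])  # pseudo-prior
        logit = np.log(p_incl / (1 - p_incl)) + (ll1 + lp1) - (ll0 + lp0)
        ind[k] = rng.random() < 1.0 / (1.0 + np.exp(-np.clip(logit, -500, 500)))

        if ind[k] == 1:
            prec = np.sum(X[:, k] ** 2) / sigma**2 + 1.0 / tau2
            mean = (X[:, k] @ r) / sigma**2 / prec
            beta_t[k] = rng.normal(mean, np.sqrt(1.0 / prec))
        else:
            # Draw from the pseudo-prior: already centred on sensible
            # values, so I_k can switch back on without long waits
            beta_t[k] = rng.normal(mu_pp[k], np.sqrt(s2_pp[k]))

    keep_ind[it] = ind

print("Posterior inclusion probabilities:", keep_ind[n_iter // 2:].mean(axis=0))
```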
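SSVS (slides 25-27) is simpler still, because $\beta_k$ never leaves the model: it just switches between a narrow spike and a wide slab prior, with the spike around 0 rather than exactly on it. Given $\beta_k$, the likelihood cancels out of the $I_k$ update, which then only compares the two prior densities at the current $\beta_k$. A minimal sketch under the same assumed setup (the spike and slab variances are, again, illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# Same illustrative data-generating setup as above
n, K = 100, 5
X = rng.normal(size=(n, K))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0])
sigma = 1.0
y = X @ beta_true + rng.normal(scale=sigma, size=n)

tau2 = np.array([0.001, 10.0])  # tau2[0]: spike, tau2[1]: slab, tau2[1] >> tau2[0]
p_incl = 0.5

def norm_logpdf(x, v):
    return -0.5 * (np.log(2.0 * np.pi * v) + x ** 2 / v)

n_iter = 5000
beta = np.zeros(K)
ind = np.ones(K, dtype=int)
keep_ind = np.zeros((n_iter, K))

for it in range(n_iter):
    for k in range(K):
        r = y - X @ beta + X[:, k] * beta[k]  # residuals without variable k

        # beta_k | I_k: conjugate normal with prior variance tau2[I_k];
        # the spike keeps "excluded" coefficients close to (not exactly) 0
        prec = np.sum(X[:, k] ** 2) / sigma**2 + 1.0 / tau2[ind[k]]
        mean = (X[:, k] @ r) / sigma**2 / prec
        beta[k] = rng.normal(mean, np.sqrt(1.0 / prec))

        # I_k | beta_k: the likelihood cancels, so only the two prior
        # densities at the current beta_k compete
        logit = (np.log(p_incl / (1 - p_incl))
                 + norm_logpdf(beta[k], tau2[1]) - norm_logpdf(beta[k], tau2[0]))
        ind[k] = rng.random() < 1.0 / (1.0 + np.exp(-np.clip(logit, -500, 500)))

    keep_ind[it] = ind

print("Posterior inclusion probabilities:", keep_ind[n_iter // 2:].mean(axis=0))
```

Note that "excluded" coefficients are shrunk towards zero rather than set to zero, so inference usually focuses on the posterior inclusion probabilities rather than on the coefficient draws themselves.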