Hierarchical probabilistic inference of cosmic shear · dlaw/wfirst2020s/slides/schneider.pdf ·...

LLNL-PRES-733055 This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC Hierarchical Probabilistic Inference of Cosmic Shear Astronomy in the 2020s: Synergies with WFIRST Michael D. Schneider with Josh Meyers and Will Dawson June 27, 2017 Collaborators: D. Bard, D. Hogg, D. Lang, P. Marshall, K. Ng


  • LLNL-PRES-733055. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC

    Hierarchical Probabilistic Inference of Cosmic Shear

    Astronomy in the 2020s: Synergies with WFIRST

    Michael D. Schneider with Josh Meyers and Will Dawson

    June 27, 2017

    Collaborators: D. Bard, D. Hogg, D. Lang, P. Marshall, K. Ng

  • Summary

    § WFIRST & LSST are ideally suited for a joint cosmic shear measurement to constrain cosmological parameters and dark energy

    § New shear inference methods are required to fully exploit the sensitivities of WFIRST & LSST

    § A hierarchical probabilistic forward model approach shows the most promise for meeting shear bias requirements

    § These probabilistic algorithms enable both:
      - Exploitation of information in new cosmological statistics
      - More flexible computing pipelines to ingest and interpret new data

  • Introduction

  • Cosmic shear measurement

    § The lensing by large scale structure

    § Looking for a very small signal under a very large amount of noise

    § We don’t know the “unsheared” shapes, but can (roughly) assume they are isotropically distributed

    § Cosmic shear distorts statistical isotropy; galaxy ellipticities become correlated

    § Exquisite probe of DE, if systematics can be controlled

    § LSST: will measure a few billion galaxy ellipticities. Excellent sensitivity to both DE and systematics!

    Cosmic shear signal is comparable to ellipticity of the Earth, ~0.3%

    - D. Wittman

  • What will the WFIRST HLS add to cosmic shear? Improved tomography, reduced bias means tighter dark energy constraints

    § Deblending
      - 30-50% of LSST galaxy detections will be multiple resolved objects as seen by WFIRST
      - Shear bias: Most blends are chance alignments of galaxies at different redshifts
      - Photo-z bias: mixed colors / biased photometry
      - Assert number & properties of blend components given HLS overlap with LSST
      - Train statistical calibration for entire LSST footprint

    § Improved photo-z’s from LSST optical + WFIRST NIR bands

    § Higher fidelity LSS cross-correlations (from grism survey)
      - Break many systematics and cosmological parameter degeneracies

    § Reduced shear bias?

  • Weak lensing of galaxies: the forward model

    [Figure: forward-model diagram with the annotations “Unknown & dominates signal”, “Want this”, “Marginalize”, and “Constrained by”.]

    Image credit: GREAT08, Bridle et al.

  • We won’t just have more data with ‘Stage IV’ surveys. We’re in an era with qualitatively new computing capabilities

    Dijkstra’s Law

    A quantitative difference is also a qualitative difference if the quantitative difference is greater than an order of magnitude.

    Figure credit: Wikipedia

    > 2 orders of magnitude since SDSS era

  • Qualitative changes in computing enable new scientific methods

    - Mark Seager, CTO for the HPC Ecosystem at Intel (interview in Inside HPC on June 6, 2016)

    “…predictive simulation has brought together theory and experiment in such a compelling way that it’s fundamentally extended the scientific method for the first time since Galileo Galilei invented the telescope in 1609…”

  • Data + Compute convergence in cosmology: DOE ASCR initiative, April 2016

    § We’re facing systematics-limited measurements
      - End-to-end simulations of the experiment are the best approach to improve accuracy & precision
      - Ties data and simulation more intricately than in past cosmology pipelines

    § Image and catalog summary statistics are no longer good enough to meet next generation science requirements
      - Probabilistic hierarchical models and related machine-learning approaches show promise but are much more computationally intensive
      - Potential changes to the traditional ‘facility’ / ‘user’ separate analysis stages

    Removing the line between ‘analysis’ and ‘simulation’.

  • Catalog cross-matching between space and ground is confused by significant object blending as seen by LSST

    Ground: Subaru Suprime-Cam

    Space: Hubble ACS

    LSST blend fractions estimated from Subaru & HST overlapping imaging

    Dawson+2015

  • Shear bias

  • Shape to Shear: Noise Bias

    § Ellipticity:

    $$ e = \frac{a - b}{a + b} \exp(2i\theta) $$

    § Ensemble average ellipticity is an unbiased estimator of shear.

    § However, maximum likelihood ellipticity in a model fit is not unbiased.

    § Ellipticity is a non-linear function of pixel values.
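The two claims on this slide can be illustrated with a small numerical sketch. It rests on assumptions that are not in the slides: a and b are read as the semi-major and semi-minor axes and θ as the position angle, the shear acts through the standard reduced-shear mapping e_sh = (e_int + g) / (1 + g* e_int), and noisy estimates of a and b stand in for a noisy per-galaxy model fit. All population parameters are hypothetical.

```python
import numpy as np

def ellipticity(a, b, theta):
    """Complex ellipticity e = (a - b) / (a + b) * exp(2i * theta)."""
    return (a - b) / (a + b) * np.exp(2j * theta)

def apply_shear(e_int, g):
    """Sheared ellipticity for reduced shear g with |g| < 1 (assumed convention)."""
    return (e_int + g) / (1.0 + np.conj(g) * e_int)

rng = np.random.default_rng(42)

# Part 1: with noiseless shapes and isotropic orientations, the ensemble
# average of the sheared ellipticities recovers the shear.
n_gal = 200_000
axis_ratio = rng.uniform(0.3, 1.0, n_gal)   # hypothetical b/a distribution
theta = rng.uniform(0.0, np.pi, n_gal)      # isotropic position angles
e_int = ellipticity(1.0, axis_ratio, theta)
g_true = 0.02 + 0.01j
g_mean = apply_shear(e_int, g_true).mean()

# Part 2: a point estimate of e from noisy inputs is biased, because e is a
# non-linear (ratio) function of the measured quantities.
a_true, b_true = 1.2, 0.8
e_true = (a_true - b_true) / (a_true + b_true)
noise = 0.2                                  # hypothetical measurement noise
a_hat = a_true + noise * rng.standard_normal(1_000_000)
b_hat = b_true + noise * rng.standard_normal(1_000_000)
e_noisy_mean = ((a_hat - b_hat) / (a_hat + b_hat)).mean()

print("ensemble-average shear estimate:", g_mean, "input:", g_true)
print("true |e|:", e_true, "mean of noisy point estimates:", e_noisy_mean)
```

In the second part the noise has zero mean, yet the average of the per-object ratio estimates drifts away from the true value. This is the sense in which a non-linear estimator picks up noise bias.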

  • Mitigating Noise Bias: at least 2 strategies

    1. Calibrate using simulations. (im3shape, sfit)
      - But corrections are up to 50x larger than expected sensitivity!

    2. Propagate entire ellipticity distribution function P(ellip | data)
      - Use Bayes’ theorem: P(ellip | data) ∝ P(data | ellip) P(ellip)
      - Measure P(ellip) in deep fields. (lensfit, ngmix, FDNT).
      - Infer simultaneously with shear in a hierarchical model. (MBI).

  • A hierarchical model for the galaxy distribution

    § σ_e = intrinsic ellipticity dispersion

    § e_int = galaxy intrinsic ellipticity

    § g = shear

    § e_sh = galaxy sheared ellipticity

    § PSF = point spread function

    § D = model image

    § σ_n = pixel noise

    § D = data: observed image

    arXiv:1411.2608

  • Our graphical model tells us how to factor the joint likelihood

    § Use a probabilistic graphical model to encode the factorization of the joint probability distribution of variables in the model.

    § We don’t care about e_sh for cosmology, so integrate it out.

    $$ \Pr\left(g, \sigma_e \mid \{\mathrm{PSF}_j\}, \{\sigma_{n,j}\}, \{D_{ij}\}\right) \propto \int d^{n_{\rm gal}} e^{\rm sh}_i \left[ \prod_{ij} \Pr\left(D_{ij} \mid \mathrm{PSF}_j, \sigma_{n,j}, e^{\rm sh}_i\right) \right] \left[ \prod_i \Pr\left(e^{\rm sh}_i \mid g, \sigma_e\right) \right] \Pr(g)\, \Pr(\sigma_e) $$

    Huge complicated integral to compute for every posterior evaluation.

    arXiv:1411.2608

  • Importance Sampling allows tractable divide & compute. We thus estimate the pseudo-marginal likelihood for shear

    § Don’t go back to pixels every time we sample a new g or σ_e.

    § For each galaxy, draw image model parameter samples under a fixed “interim” prior. This is embarrassingly parallelizable.

    § Use reweighted samples to approximate the integral via Monte Carlo.

    [Equation shown as an image on the slide; its terms are labeled “Conditional Prior”, “Interim Posterior”, “Interim Prior”, and “Likelihood”.]

    Ongoing research question: How many interim samples are needed?
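The reweighting idea can be sketched on a toy problem. This is a minimal illustration under assumptions that are not in the slides: ellipticity is a single real number, shear adds linearly (e_sh = e_int + g), the per-galaxy likelihood and the interim prior are Gaussian so that interim posterior samples can be drawn analytically, and all numerical values are hypothetical. For each galaxy the marginal likelihood of (g, σ_e) is estimated, up to a constant, as the mean over interim samples of the ratio (conditional prior) / (interim prior).

```python
import numpy as np

def log_normal_pdf(x, mean, sigma):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mean) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
n_gal, n_samp = 2000, 50
g_true, sigma_e, sigma_n, sigma_0 = 0.03, 0.25, 0.10, 0.50  # hypothetical values

# Simulated 1D "observations": sheared ellipticity plus measurement noise.
e_sh = g_true + sigma_e * rng.standard_normal(n_gal)
d = e_sh + sigma_n * rng.standard_normal(n_gal)

# Step 1 (independent per galaxy): draw samples from the interim posterior,
# i.e. likelihood N(d | e, sigma_n) times interim prior N(e | 0, sigma_0).
post_var = 1.0 / (1.0 / sigma_n**2 + 1.0 / sigma_0**2)
post_mean = post_var * d / sigma_n**2
samples = post_mean[:, None] + np.sqrt(post_var) * rng.standard_normal((n_gal, n_samp))
log_interim = log_normal_pdf(samples, 0.0, sigma_0)

def log_marginal_likelihood(g, sig_e):
    """Sum over galaxies of log-mean importance weight (constants dropped)."""
    log_w = log_normal_pdf(samples, g, sig_e) - log_interim
    peak = log_w.max(axis=1, keepdims=True)          # stabilize the exponentials
    log_mean_w = peak[:, 0] + np.log(np.exp(log_w - peak).mean(axis=1))
    return log_mean_w.sum()

# Step 2: evaluate the shear likelihood on a grid without revisiting the data.
g_grid = np.linspace(-0.1, 0.1, 201)
log_like = np.array([log_marginal_likelihood(g, sigma_e) for g in g_grid])
g_best = g_grid[np.argmax(log_like)]
print("maximum-likelihood shear:", g_best, "input:", g_true)
```

Only the stored samples and their interim prior values are touched in step 2, which is the point of the "don't go back to pixels" bullet. How many samples per galaxy are enough is left open here, as on the slide.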

  • Source characterization via probabilistic image modeling

    Infer image model parameters via MCMC under an interim prior distribution for the galaxy and PSF parameters.

    MBI GREAT3 analysis with: The Tractor (Lang & Hogg)

    Now use GalSim + MCMC

    GalSim models inside an MCMC chain: can it be made fast enough?

  • Example interim posterior inferences for galaxy stamp images

  • Probabilistic forward modeling can meet LSST shear bias requirements... at least when tested on simulated images

    § GREAT3 CGC-like setup
      - 200 ‘fields’ with constant shear per field
      - 10k galaxies per field

    § Marginalize 7 parameters per galaxy:
      - e1, e2, HLR, flux, dx, dy, n
      - Notable: Sersic index marginalized

    § Have NOT marginalized PSF (yet!)

    Immediate takeaway: Hierarchical inference performs significantly better than ensemble average maximum likelihood ellipticity.

  • Multi-epoch & multi-telescope data sets

  • How do we combine multiple observations of the same galaxy? Naïvely we must jointly fit all epochs simultaneously

    arXiv:1511.03095, Generalized Multiple Importance Sampling, Elvira, Martino, Luengo, & Bugallo

    Problem: Imagine we have fit pixel data from LSST year 1. How do we incorporate year 2 observations without redoing (expensive) calculations?

    Solution: Consider single-epoch samples as draws from a multi-modal importance sampling distribution:

  • Multiple importance sampling (MIS) via weighted pseudo-marginals

    ‘Cross-pollination’ needed: evaluate the likelihood of epoch i given model parameter samples from epoch j, for all combinations of i, j.

    A standard scatter / gather operation

    1. Sample from the conditional posterior for each epoch individually

    2. Evaluate the ratio of the conditional posterior for each epoch i to that of the MIS sampling distribution
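A toy version of the two steps may help. The sketch below assumes things the slides do not specify: a single real parameter θ, Gaussian single-epoch likelihoods with known noise, a flat prior (so each single-epoch posterior is a normalized Gaussian in θ), and an equal-weight mixture of the single-epoch posteriors as the MIS sampling distribution. In a real analysis the single-epoch posteriors are only known up to normalization, which this sketch sidesteps.

```python
import numpy as np

def normal_pdf(x, mean, sigma):
    """Density of a normal distribution."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
theta_true = 0.3
sigmas = np.array([0.20, 0.10, 0.15])   # hypothetical per-epoch noise levels
data = theta_true + sigmas * rng.standard_normal(sigmas.size)
n_samp = 5000

# Step 1: sample each epoch's posterior on its own (embarrassingly parallel).
samples = data[:, None] + sigmas[:, None] * rng.standard_normal((sigmas.size, n_samp))
theta = samples.ravel()                 # pool the samples from all epochs

# Cross-pollination: likelihood of every epoch i at every pooled sample.
like = normal_pdf(theta[None, :], data[:, None], sigmas[:, None])

# Step 2: weight = joint posterior / mixture of single-epoch posteriors.
target = like.prod(axis=0)
mixture = like.mean(axis=0)
weights = target / mixture
weights /= weights.sum()

mis_mean = np.sum(weights * theta)
precision = 1.0 / sigmas**2
exact_mean = np.sum(precision * data) / precision.sum()   # analytic joint mean
effective_samples = 1.0 / np.sum(weights**2)

print("MIS posterior mean:", mis_mean, "analytic:", exact_mean)
print("effective sample size:", effective_samples, "of", theta.size)
```

The `like` array is the scatter / gather step: each row needs one epoch's data evaluated at samples drawn from all the other epochs. The effective sample size gives a rough sense of how many pooled samples carry significant weight.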

  • Multiple importance sampling enables streaming data analysis. Efficiency is significantly enhanced by using old data as a sampling ‘prior’

    § Draw parameter samples from first epoch under a nominal interim prior

    § Draw samples from subsequent epochs with a prior informed by previous epoch samples

    § Simulation studies show:
      - ~10% of samples have significant weight when combining 200 epochs in streaming fashion

  • PSF marginalization

  • Marginalizing PSFs: MIS makes this tractable

    § LSST will have ~200 epochs per object per filter
      - We aim to marginalize the PSF Π_{n,i} in every epoch
      - The marginalization is constrained by:
        • Consistency of PSF realizations over the focal plane for each epoch
        • Consistency of the underlying source model across epochs

    § Simplest approach (statistically, not computationally): Infer galaxy models given all epoch imaging simultaneously
      - “Interim” samples are of size: ~10 galaxy params + 200 * ~4 PSF params = ~1k parameters!

  • The pipeline for PSF marginalization

    1. Fit star footprints in all epochs via probabilistic forward models

    2. Marginalize star image parameters to constrain the global field PSF model for each epoch
      - State of the optics aberrations, and
      - Distribution of atmosphere turbulence statistics

    3. Fit all galaxy footprints in each epoch via forward models
      - Use PSF models drawn from the marginal posterior given the star images

    4. Run Thresher on the interim galaxy samples for all epochs (via ‘cross-pollinator’)

    One approximation needed: Marginalize PSF model independently for each field location

  • Probabilistic cosmological one-point statistics

  • Probabilistic cosmological mass mapping

    Zero E/B mode mixing by construction

    Objective: infer the 3D gravitational potential of the initial conditions

    arXiv:1610.06673

  • Hierarchical inference of cosmological lensing mass distributions

    Validation with simulations

    A real merging galaxy cluster

    New:
      • Linear and nonlinear scales reconstructed in one framework
      • No E/B mode mixing by construction

    Schneider+2017, ApJ

  • Application to data: Weak lensing mass maps for the Abell 781 merging galaxy clusters as seen by the Deep Lens Survey

    [Figure panels: “Our method” and “Previously published method”.]

  • Latent features of the galaxy distribution

  • Pr(e_int) is not Gaussian!

    § Would rather not assert a particular parametric form for P(e_int).

    § Use a “non-parametric” distribution: a Dirichlet Process Mixture Model

    Ellipticities from COSMOS

  • Hierarchical inference of intrinsic galaxy properties

    Specify a Dirichlet Process (DP) for the distribution of intrinsic galaxy property hyper-parameters

    The DP is a ‘non-parametric’ distribution with discrete support

    The DP distribution allows clustering of data points (e.g., galaxies) to infer latent structure in the data.
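The clustering behavior of a DP can be seen by drawing class labels from its Chinese restaurant process representation. This is a standard construction, but the slides do not say which representation the authors use, so treat the sketch as an illustration only. The function name and the concentration value are hypothetical.

```python
import numpy as np

def sample_crp(n_obs, concentration, rng):
    """Draw class labels for n_obs observations from a Chinese restaurant process.

    Observation n joins existing class k with probability proportional to the
    number of members already in k, or opens a new class with probability
    proportional to the concentration parameter.
    """
    labels = np.zeros(n_obs, dtype=int)
    counts = [1]                                  # first observation opens class 0
    for n in range(1, n_obs):
        probs = np.array(counts + [concentration], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)                      # open a new class
        else:
            counts[k] += 1                        # join an existing class
        labels[n] = k
    return labels, np.array(counts)

rng = np.random.default_rng(3)
labels, counts = sample_crp(100, concentration=1.0, rng=rng)
print("number of occupied classes:", len(counts))
print("class sizes:", counts)
```

With 100 observations only a handful of classes are typically occupied, so observations (e.g., galaxies) share hyper-parameters within a class. The number of classes is not fixed in advance.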

  • Gibbs updates in the Dirichlet Process model

    Latent class assignments are updated with different conditional distributions depending on whether any other observations are assigned to the current class.

    $$ \Pr(c_n = c_\ell \mid c_{-n}, \omega_n, \alpha, X) = b\, N_{-n,c}\, \Pr(d_n \mid \alpha_{c_\ell}, X), \quad \forall \ell \neq n $$

    $$ \Pr(c_n \neq c_\ell\ \forall \ell \neq n \mid c_{-n}, \omega_n, X) = b \int \Pr(d_n \mid \alpha, X)\, G_0(\alpha)\, d\alpha $$

    The DP mixture parameters are simply updated with the posterior given all observations currently associated with the given latent class.

    $$ \alpha_{c_n} \sim G_0(\alpha_{c_n}) \prod_{\ell=1}^{N_{c_n}} \Pr(d_\ell \mid \alpha_{c_n}, X) $$

    The integral over α in the new-class probability (highlighted on the slide) is expensive to compute in general.

    With importance sampling we only require the DP base distribution to be conjugate to the distribution of galaxy properties, NOT the likelihood.

    $$ \int \Pr(d_n \mid \alpha, X)\, G_0(\alpha)\, d\alpha = \frac{Z_n}{N} \sum_{k=1}^{N} \frac{\Pr_{\rm marg}(\omega_{nk} \mid a)}{\Pr(\omega_{nk} \mid I_0)} $$

    $$ \Pr_{\rm marg}(\omega_{nk} \mid a) \equiv \int d\alpha_{c_n}\, G_0(\alpha_{c_n} \mid a)\, \Pr(\omega_{nk} \mid \alpha_{c_n}) $$

    Neal (2000)

  • A simulation study with 100 galaxies validates the DP model

    100 galaxies drawn from 1 of 2 Gaussian ellipticity distributions

  • Simulation study: We can beat the traditional ‘shape noise’ statistical error bound by inferring latent structure in the data

    100 galaxies drawn from 1 of 2 Gaussian ellipticity distributions

    3x improvement in cosmic shear precision
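One way to see why latent structure helps: if galaxies come from two populations with different intrinsic ellipticity dispersions, weighting each galaxy by the inverse variance of its class gives a tighter shear estimate than an unweighted average. The sketch below assumes the class labels are known exactly, a linear shear response, and hypothetical dispersions. It is not the DP inference from the slides and does not attempt to reproduce the quoted 3x figure.

```python
import numpy as np

rng = np.random.default_rng(7)
n_gal, n_trials = 100, 4000
g_true = 0.02
sigma_class = np.array([0.1, 0.4])      # hypothetical narrow / broad populations

# Assign each galaxy to one of the two classes (fixed across trials).
labels = rng.integers(0, 2, size=n_gal)
sigma = sigma_class[labels]

# Simulate many realizations of the observed ellipticities.
e_obs = g_true + sigma * rng.standard_normal((n_trials, n_gal))

# Estimator 1: unweighted mean, which ignores the latent classes.
g_pooled = e_obs.mean(axis=1)

# Estimator 2: inverse-variance weighting using the class of each galaxy.
weights = 1.0 / sigma**2
g_weighted = (e_obs * weights).sum(axis=1) / weights.sum()

print("scatter of unweighted estimate:   ", g_pooled.std())
print("scatter of class-aware estimate:  ", g_weighted.std())
print("precision gain:", g_pooled.std() / g_weighted.std())
```

Both estimators are unbiased in this toy setup. The gain comes entirely from down-weighting the broad population, which is only possible once the class structure has been inferred.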

  • GREAT3 results

    § Tested hierarchical approach using simulations from the third GRavitational lEnsing Accuracy Test (GREAT3).

    § Hierarchical inference performs significantly better than ensemble average maximum likelihood ellipticity.

    § The DPMM ellipticity prior performs better than the single Gaussian ellipticity prior.

    [Figure: shear residuals versus input shear for each method.]

    Mean ML ellipticity: 13% shear calibration errors

    Hierarchical Inference (H.I.): 4% shear calibration errors

    Dirichlet Process Inference (DP): 1-2% shear calibration errors
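The slides do not define "shear calibration error". A common convention in the weak lensing literature, assumed here, is to fit the residuals with a multiplicative bias m and an additive bias c, so that g_est = (1 + m) g_true + c. The sketch below fits m and c to synthetic residuals; the input bias values and noise level are hypothetical and are not the results quoted above.

```python
import numpy as np

rng = np.random.default_rng(11)
n_fields = 200                           # one constant shear per field
g_true = rng.uniform(-0.05, 0.05, n_fields)

# Synthetic shear estimates with a known (hypothetical) bias and scatter.
m_in, c_in = 0.04, 1e-4
g_est = (1.0 + m_in) * g_true + c_in + 5e-4 * rng.standard_normal(n_fields)

# Fit residual = m * g_true + c with a straight line.
residual = g_est - g_true
m_fit, c_fit = np.polyfit(g_true, residual, deg=1)

print("multiplicative bias m:", m_fit)
print("additive bias c:", c_fit)
```

Under this convention a plot of shear residuals against input shear has slope m and intercept c. A 4% calibration error would correspond to m of about 0.04.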

  • Multi-variate DP mixture model (in progress): “standardizable” ellipticities.

    § Elliptical galaxies have a narrower intrinsic ellipticity distribution than late-type. Higher sensitivity to shear!

    § Ellipticals / spirals also distinguishable by color and morphology (e.g., Sersic index, Gini coefficient, asymmetry), potentially providing additional variables with which to cluster.

    § Other correlations to exploit?

  • Application to the Deep Lens Survey: real galaxies require at least 2 latent classes (ignoring lensing)

    We infer 2 latent classes given only an ellipticity catalog

    Preliminary: The marginal posterior distribution of ellipticity variance from the Deep Lens Survey

    photo-z: [0.65, 0.7]

  • The probabilistic weak lensing workflow plan for LSST

    Probabilistic pipeline

  • Summary

    § Cosmic shear is systematics limited & signal is dominated by PSF and astrophysics
      - A probabilistic approach is warranted to infer a small signal and mitigate biases

    § A hierarchical probabilistic model for cosmic shear can trade bias for variance, but also can increase precision by learning latent structure in the galaxy distribution.

    § Importance sampling methods allow tractable approaches to a probabilistic forward model of LSST & WFIRST imaging
      - With billions of galaxies and hundreds of epochs per galaxy, modeling LSST or WFIRST imaging requires an approach to separating analyses of data subsets, even though they are statistically correlated

    § We are able to sample from a probabilistic model with multiple hierarchies to marginalize both correlated image systematics and astrophysical properties of galaxies.

    Probabilistic