TRANSCRIPT
Bayesian Estimation & Information Theory
Jonathan Pillow
Mathematical Tools for Neuroscience (NEU 314), Spring 2016
lecture 18
Bayesian Estimation

three basic ingredients:
1. Likelihood
2. Prior
3. Loss function $L(\hat{\theta}, \theta)$

• likelihood and prior jointly determine the posterior
• the loss function gives the “cost” of making an estimate $\hat{\theta}$ if the true value is $\theta$
• together, the three fully specify how to generate an estimate from the data

Bayesian estimator is defined as:
$$\hat{\theta}(m) = \arg\min_{\hat{\theta}} \int L(\hat{\theta}, \theta)\, p(\theta \mid m)\, d\theta$$
where the integral is the “Bayes’ risk”: the expected loss under the posterior.
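A minimal numerical sketch of this definition (not from the lecture; the grid, the toy posterior, and the helper name `bayes_estimate` are my own): discretize $\theta$, then return the candidate estimate with the smallest Bayes’ risk.

```python
import numpy as np

def bayes_estimate(theta_grid, posterior, loss):
    """Return the grid point minimizing the Bayes' risk
    ∫ L(est, θ) p(θ|m) dθ, approximated by a Riemann sum."""
    d = theta_grid[1] - theta_grid[0]
    posterior = posterior / (posterior.sum() * d)   # normalize p(θ|m)
    risks = np.array([(loss(est, theta_grid) * posterior).sum() * d
                      for est in theta_grid])
    return theta_grid[np.argmin(risks)]

# Toy posterior: a Gaussian bump centered at θ = 1
theta = np.linspace(-8.0, 8.0, 801)
post = np.exp(-0.5 * (theta - 1.0) ** 2)

sq_loss = lambda est, th: (est - th) ** 2           # squared-error loss
print(bayes_estimate(theta, post, sq_loss))         # ≈ 1.0 (the posterior mean)
```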
Typical Loss functions and Bayesian estimators
1. squared error loss: $L(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^2$

need to find the $\hat{\theta}$ minimizing the expected loss:
$$E[L] = \int (\hat{\theta} - \theta)^2\, p(\theta \mid m)\, d\theta$$

Differentiate with respect to $\hat{\theta}$ and set to zero:
$$\frac{\partial E[L]}{\partial \hat{\theta}} = 2 \int (\hat{\theta} - \theta)\, p(\theta \mid m)\, d\theta = 0 \quad\Longrightarrow\quad \hat{\theta}(m) = \int \theta\, p(\theta \mid m)\, d\theta$$

the “posterior mean”, also known as the Bayes’ Least Squares (BLS) estimator
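A quick numerical check of this result (my own sketch; the bimodal toy posterior is arbitrary): minimizing the expected squared error over a grid lands on the posterior mean.

```python
import numpy as np

theta = np.linspace(-5.0, 15.0, 4001)
d = theta[1] - theta[0]
# Deliberately asymmetric (bimodal) toy posterior
post = np.exp(-0.5 * (theta - 2.0) ** 2) \
     + 0.5 * np.exp(-0.5 * ((theta - 6.0) / 1.5) ** 2)
post /= post.sum() * d                               # normalize on the grid

posterior_mean = (theta * post).sum() * d            # BLS estimate
risks = [(((est - theta) ** 2) * post).sum() * d for est in theta]
print(posterior_mean, theta[np.argmin(risks)])       # the two agree (≈ 3.71)
```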
Typical Loss functions and Bayesian estimators
2. “zero-one” loss (1 unless $\hat{\theta} = \theta$): $L(\hat{\theta}, \theta) = 1 - \delta(\hat{\theta} - \theta)$

expected loss:
$$E[L] = \int \big(1 - \delta(\hat{\theta} - \theta)\big)\, p(\theta \mid m)\, d\theta = 1 - p(\hat{\theta} \mid m)$$

which is minimized by:
$$\hat{\theta}(m) = \arg\max_{\theta}\, p(\theta \mid m)$$

• the posterior maximum (or “mode”)
• known as the maximum a posteriori (MAP) estimate
MAP vs. Posterior Mean estimate:
[Figure: a gamma pdf posterior on $[0, 10]$ (y-axis 0 to 0.3), with the mode and the mean marked at different locations]
Note: posterior maximum and mean are not always the same!
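To make the note concrete, a small sketch (the gamma shape parameter is my own choice, not taken from the slide) where mode and mean visibly differ:

```python
import numpy as np

k = 3.0                                     # gamma shape parameter (hypothetical)
theta = np.linspace(1e-6, 20.0, 4001)
d = theta[1] - theta[0]
post = theta ** (k - 1) * np.exp(-theta)    # unnormalized gamma(k, 1) pdf
post /= post.sum() * d

map_est = theta[np.argmax(post)]            # posterior mode: k - 1 = 2
bls_est = (theta * post).sum() * d          # posterior mean: k = 3
print(f"MAP: {map_est:.2f}, posterior mean: {bls_est:.2f}")
```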
Typical Loss functions and Bayesian estimators
3. “L1” loss: $L(\hat{\theta}, \theta) = |\hat{\theta} - \theta|$

expected loss:
$$E[L] = \int |\hat{\theta} - \theta|\, p(\theta \mid m)\, d\theta$$

HW problem: What is the Bayesian estimator for this loss function?
Simple Example: Gaussian noise & prior
1. Likelihood: additive Gaussian noise, $m = \theta + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma^2)$
2. Prior: zero-mean Gaussian, $\theta \sim \mathcal{N}(0, \sigma_p^2)$
3. Loss function: doesn’t matter (all estimators agree here, since the posterior is Gaussian and its mean, mode, and median coincide)

posterior distribution:
$$p(\theta \mid m) = \mathcal{N}\!\left(\frac{\sigma_p^2}{\sigma_p^2 + \sigma^2}\, m,\;\; \frac{\sigma_p^2\, \sigma^2}{\sigma_p^2 + \sigma^2}\right)$$
the first argument is the MAP estimate (= posterior mean), the second the posterior variance
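A sketch of the closed-form result above (the parameter names `sig2` and `sigp2` for $\sigma^2$ and $\sigma_p^2$ are mine):

```python
def gaussian_posterior(m, sig2, sigp2):
    """Posterior mean and variance for m = θ + ε, ε ~ N(0, sig2), θ ~ N(0, sigp2)."""
    w = sigp2 / (sigp2 + sig2)      # shrinkage weight on the measurement
    return w * m, w * sig2          # mean w·m, variance sigp2·sig2/(sigp2 + sig2)

print(gaussian_posterior(m=4.0, sig2=1.0, sigp2=4.0))   # -> (3.2, 0.8)
```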
[Figures, built up over several slides: the likelihood $p(m \mid \theta)$ plotted over the $(\theta, m)$ plane with both axes running from $-8$ to $8$, followed by the prior $p(\theta)$ on the same axes]
Computing the posterior
$$p(\theta \mid m) \;\propto\; \underbrace{p(m \mid \theta)}_{\text{likelihood}} \times \underbrace{p(\theta)}_{\text{prior}}$$

[Figure: likelihood $\times$ prior $\propto$ posterior, plotted over $\theta$; the posterior peak ($m^*$) sits between the measurement $m$ and the prior mean at 0, and its offset from $m$ is labeled “bias”]
Making a Bayesian Estimate:
[Figure: likelihood $\times$ prior $\propto$ posterior with a broad likelihood; the posterior shifts far toward the prior, giving a larger bias]
High Measurement Noise: large bias
[Figure: likelihood $\times$ prior $\propto$ posterior with a narrow likelihood; the posterior stays close to the measurement, giving a small bias]
Low Measurement Noise: small bias
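The two figures can be summarized numerically. A sketch with assumed numbers (the measurement `m` and prior variance `sigp2` are illustrative): as the noise variance grows, the posterior-mean estimate is pulled further from the measurement toward the prior.

```python
m, sigp2 = 4.0, 4.0                       # measurement and prior variance (assumed)
for sig2 in (0.25, 1.0, 4.0, 16.0):       # increasing measurement noise
    est = sigp2 / (sigp2 + sig2) * m      # posterior-mean (BLS) estimate
    print(f"noise var {sig2:5.2f}: estimate {est:4.2f}, bias {m - est:4.2f}")
```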
Bayesian Estimation:
• Likelihood and prior combine to form posterior
• Bayesian estimate is always biased toward the prior, relative to the ML estimate
Application #1: Biases in Motion Perception

[Demo, shown twice at different contrasts: drifting gratings around a central fixation cross; “Which grating moves faster?”]
Explanation from Weiss, Simoncelli & Adelson (2002):
• In the limit of a zero-contrast grating, likelihood becomes infinitely broad ⇒ percept goes to zero-motion.
[Figure: prior, likelihood, and posterior compared at high and low contrast]
• Noisier measurements, so the likelihood is broader ⇒ the posterior has a larger shift toward 0 (the prior: no motion)
• Claim: explains why people actually speed up when driving in fog!
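A toy illustration of this account (my own sketch, not the actual Weiss et al. model; the contrast-to-noise mapping is an assumption): lower contrast broadens the likelihood, so the posterior-mean speed estimate shrinks toward the zero-motion prior.

```python
true_speed, sigp2 = 10.0, 25.0            # stimulus speed and prior variance (assumed)
for contrast in (1.0, 0.5, 0.1):
    sig2 = 1.0 / contrast ** 2            # assumed: noise variance grows as contrast falls
    percept = sigp2 / (sigp2 + sig2) * true_speed   # posterior-mean speed estimate
    print(f"contrast {contrast:4.1f}: perceived speed {percept:5.2f}")
```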
summary
• 3 ingredients for Bayesian estimation (prior, likelihood, loss)
• Bayes’ least squares (BLS) estimator (posterior mean)
• maximum a posteriori (MAP) estimator (posterior mode)
• accounts for the stimulus-quality-dependent bias in motion perception (Weiss, Simoncelli & Adelson 2002)