bayesian data analysis

Bayesian Data Analysis and Beta Distribution. Venkatesh Vinayakarao, IIIT Delhi, India. Thomas Bayes, 1701 to 1761. IIIT Delhi – Technical Talk, Ketchup Series. 12th Nov 2014.

Upload: venkatesh-vinayakarao

Post on 02-Jul-2015

Category:

Data & Analytics


DESCRIPTION

It is hard to live without generating and using data. Bayesian Data Analysis builds upon basic probability theory and gives us the necessary tools to deal with realistic data. In this session, I will make no assumptions about your knowledge of probability or random processes. Using the very popular case of coin tosses, I introduce the concepts of Prior, Likelihood and Posterior. This gives me the platform to discuss Bayes' Rule and the Bernoulli Distribution. Further, the algebraic limitation of such an analysis leads us to appreciate the purpose and properties of Beta Distributions. If you have heard of "Weak Priors", "Strong Priors" and "Conjugates", but never quite got a chance to study them, you will find this talk very interesting. Ever since I joined PhD@IIIT, it has been a journey through Randomized Algorithms, Information Retrieval, Intelligent Systems and Statistical Computation, all of which deal with related perspectives on probability theory. General wisdom claims that discussing fundamentals is relatively harder than discussing advanced stuff. I look forward to this discussion and hope that you will find value in it.

TRANSCRIPT

Page 1: Bayesian data analysis

Bayesian Data Analysis and

Beta Distribution

Venkatesh Vinayakarao

IIIT Delhi, India.

Thomas Bayes, 1701 to 1761

IIIT Delhi – Technical Talk, Ketchup Series. 12th Nov 2014.

Page 2: Bayesian data analysis

Agenda

Updating Beliefs using Probability Theory

Role of Beta Distributions

Will Discuss: Concepts, Illustrations, Intuitions, Purpose, Properties

Will not Discuss: Details, Definitions, Formalism, Derivations, Proofs

11/12/2014 Venkatesh Vinayakarao 2

Page 3: Bayesian data analysis

The Case of Coin Flips

General assumption: the coin is fair, i.e., Heads (H) and Tails (T) are equally likely. But a coin need not be fair.

Experiment with Coin-1: HHHTTTHTHT

Experiment with Coin-2: HHHHHHHHHT

Coin-1 is more likely to be fair than Coin-2.
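One quick way to see the contrast between the two experiments is to count the fraction of Heads in each sequence. The helper name `heads_fraction` below is made up for this illustration; it is only a sketch of the intuition, not a test of fairness.

```python
# Observed sequences from the two experiments on this slide.
coin_1 = "HHHTTTHTHT"
coin_2 = "HHHHHHHHHT"

def heads_fraction(flips: str) -> float:
    """Return the fraction of flips that came up Heads ('H')."""
    return flips.count("H") / len(flips)

print(heads_fraction(coin_1))  # 0.5
print(heads_fraction(coin_2))  # 0.9
```

Coin-1 shows Heads half of the time, while Coin-2 shows Heads nine times out of ten, which is why Coin-2 looks skewed.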

Page 4: Bayesian data analysis

Our Beliefs

• Can we find a structured way to determine coin’s nature?

Coin-1
Data Observed:   None    H        T      H        T
Belief:          Fair    Skewed   Fair   Skewed   Fair

Coin-2
Data Observed:   None    H        H             H             H
Belief:          Fair    Skewed   More Skewed   Even More…    Even Even More…

The first column (no data observed) is the Prior Belief; the later columns are Belief Updates.

Page 5: Bayesian data analysis

Priors

• Priors can be strong or weak

Strong Prior: Coin is lab tested for 1 Million tosses. 50% H, 50% T observed. One more observation will not change our belief significantly.

Weak Prior: New coin. A few observations are sufficient to change our belief significantly.

Page 6: Bayesian data analysis

HyperParameter

• Prior probability (of Heads) could be anything:
  • 0.5: Fair Coin
  • 0.25: Skewed towards Tails
  • 0.75: Skewed towards Heads
  • 1: Heads is guaranteed!
  • 0: Both sides are Tails.

We use θ as a HyperParameter to visualize what happens for different values.

Page 7: Bayesian data analysis

World of Distributions

Discrete Distribution of Prior. Since I typically perceive coins as fair, Prior belief peaks at 0.5.
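A discrete prior of this kind can be sketched as a list of probabilities over a grid of candidate θ values. The grid spacing (0.1) and the triangular shape are assumptions chosen for illustration; the slide's own plot may use a different grid.

```python
# Candidate values of theta on a coarse grid (an illustrative choice).
thetas = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

# Triangular weights: largest at theta = 0.5, falling linearly to 0 at both ends.
weights = [min(t, 1 - t) for t in thetas]

# Normalize so that the prior probabilities sum to 1.
total = sum(weights)
prior = [w / total for w in weights]

# The value of theta with the highest prior probability.
peak = thetas[prior.index(max(prior))]
print(peak)  # 0.5
```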

Page 8: Bayesian data analysis

Another Possibility

I may also choose to be unbiased, i.e., θ is equally likely to take any value.

A Continuous Uniform Distribution!

Page 9: Bayesian data analysis

Observations

Let’s flip the coin N = 5 times. We observe z = 3 Heads.

Page 10: Bayesian data analysis

Impact of Data

Belief is influenced by Observations. But, note that:

Belief ≠ observation

Eeks… what’s in the denominator?

Bayes’ Rule
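For reference, Bayes' rule in its standard form, written with θ for the coin's probability of Heads and D for the observed data (the notation used on the slides that follow), reads:

```latex
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}
```

Here p(θ) is the prior, p(D|θ) is the likelihood, p(θ|D) is the posterior, and p(D) is the denominator in question.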

Page 11: Bayesian data analysis

Numerator is easy

• p(θ) was uniform. So, nothing to calculate.

• How to calculate p(D|θ)?

Jacob Bernoulli, 1655 – 1705.

$p(D \mid \theta) = \theta^{z} (1 - \theta)^{N - z}$

If the observed D is HHHTT and θ is 0.5, we have:

$p(D \mid \theta) = (0.5)^{3} (1 - 0.5)^{5 - 3}$

Remember two things: 1) we are interested in the distribution; 2) the order of H and T does not matter.
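The likelihood above is easy to evaluate numerically. The sketch below uses a hypothetical helper named `likelihood` and tries a few values of θ for the same data (3 Heads in 5 flips); the particular θ values are chosen only for illustration.

```python
def likelihood(theta: float, z: int, n: int) -> float:
    """p(D | theta) for one specific sequence with z Heads in n flips."""
    return theta ** z * (1 - theta) ** (n - z)

# The worked example from the slide: D = HHHTT, theta = 0.5.
print(likelihood(0.5, 3, 5))  # 0.03125

# The same data under a few other candidate values of theta.
for theta in (0.25, 0.5, 0.6, 0.75):
    print(theta, likelihood(theta, 3, 5))
```

Because only the counts z and n enter the formula, any ordering of three Heads and two Tails gives the same value.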

Page 12: Bayesian data analysis

Bayesian Update
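A Bayesian update over a discrete set of θ values can be sketched as follows: multiply prior by likelihood at each θ, then divide by the sum so the result adds up to 1. The function name `grid_posterior`, the grid spacing and the uniform prior are assumptions made for this illustration.

```python
def grid_posterior(thetas, prior, z, n):
    """Discrete Bayes update: likelihood times prior at each theta, then normalize."""
    unnormalized = [t ** z * (1 - t) ** (n - z) * p for t, p in zip(thetas, prior)]
    evidence = sum(unnormalized)  # p(D), the denominator in Bayes' rule
    return [u / evidence for u in unnormalized]

# Grid of candidate theta values and a uniform prior over them.
thetas = [i / 10 for i in range(11)]
uniform_prior = [1 / len(thetas)] * len(thetas)

# Update with the observation of z = 3 Heads in N = 5 flips.
posterior = grid_posterior(thetas, uniform_prior, z=3, n=5)
best = thetas[posterior.index(max(posterior))]
print(best)  # 0.6
```

With a uniform prior, the posterior peaks at the observed proportion of Heads (3/5); a prior peaked at 0.5 would pull that peak back towards 0.5.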

Page 13: Bayesian data analysis

Painful Denominator

• Recall, for discrete distributions:

• And, for continuous distributions:
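In standard notation, and assuming the same symbols as in Bayes' rule (θ for the probability of Heads, D for the data), the denominator p(D) for the two cases is usually written as:

```latex
% Discrete prior: sum over all candidate values of theta
p(D) = \sum_{\theta} p(D \mid \theta)\, p(\theta)

% Continuous prior: integrate over the range of theta
p(D) = \int_{0}^{1} p(D \mid \theta)\, p(\theta)\, d\theta
```

The integral in the continuous case is what makes the denominator painful for an arbitrary prior.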

Page 14: Bayesian data analysis

A Simpler Way

Form, Functions and Distributions

[Example distribution plots: Normal (or Gaussian), Poisson]

Page 15: Bayesian data analysis

What form will suit us?

Page 16: Bayesian data analysis

Beta Distribution

Vadivelu, a famous Tamil comedian. This is one of his great expressions: terrified and confused.

Page 17: Bayesian data analysis

Beta Distribution

• Takes two parameters: Beta(a, b)

• For now, assume a = #H + 1 and b = #T + 1.

• After 10 flips, we may end up in one of these three:
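To get a feel for these shapes, the Beta density can be evaluated with the Python standard library alone. The three outcomes below (5H/5T, 9H/1T, 1H/9T) are assumed for illustration and may differ from the three cases plotted on the slide; `beta_pdf` is a hypothetical helper name.

```python
import math

def beta_pdf(theta: float, a: float, b: float) -> float:
    """Density of Beta(a, b) at theta, for 0 < theta < 1."""
    # Log of the normalizing constant 1 / B(a, b), using log-gamma for stability.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta))

# Three hypothetical outcomes of 10 flips (assumed for illustration).
outcomes = {"5H, 5T": (5, 5), "9H, 1T": (9, 1), "1H, 9T": (1, 9)}

for label, (heads, tails) in outcomes.items():
    a, b = heads + 1, tails + 1       # a = #H + 1, b = #T + 1, as on the slide
    mode = (a - 1) / (a + b - 2)      # where the density peaks (valid for a, b > 1)
    print(label, f"Beta({a},{b})", "peaks at", mode, "with density", round(beta_pdf(mode, a, b), 3))
```

With a = #H + 1 and b = #T + 1, the peak of the density sits at the observed proportion of Heads.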

Page 18: Bayesian data analysis

Prior and Posterior

Let’s say we have a Strong Prior: Beta(11, 11). What should happen if we see 10 more observations with 5H and 5T?

Page 19: Bayesian data analysis

Prior and Posterior…

What if we have not seen any data?

Page 20: Bayesian data analysis

Conjugate Prior

So, we see that:

Prior: beta(a, b)

Data: z Heads in N trials

Posterior: beta(a + z, b + N − z)

Priors for which the posterior has the same form as the prior are called Conjugate Priors.
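The update rule above needs no integration, only addition. The sketch below applies it to the strong prior Beta(11, 11) from the earlier slide and, as an assumed contrast, to a weak prior Beta(1, 1); the helper names `update_beta` and `beta_mean` are made up for this illustration.

```python
def update_beta(a: float, b: float, z: int, n: int) -> tuple:
    """Conjugate update: Beta(a, b) prior plus z Heads in n flips."""
    return a + z, b + (n - z)

def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Strong prior Beta(11, 11), followed by 5 H and 5 T.
print(update_beta(11, 11, z=5, n=10))  # (16, 16)

# Ten Heads in a row under a weak prior (assumed Beta(1, 1)) and the strong prior.
weak = update_beta(1, 1, z=10, n=10)
strong = update_beta(11, 11, z=10, n=10)
print(beta_mean(*weak), beta_mean(*strong))
```

The same ten Heads move the weak prior's mean much further from 0.5 than the strong prior's mean, which matches the earlier intuition about strong and weak priors.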

Page 21: Bayesian data analysis

Summary

Prior

Likelihood

Posterior

Bayes’ Rule

Bernoulli Distribution

Beta Distribution

Page 22: Bayesian data analysis

References

John K. Kruschke, Doing Bayesian Data Analysis (book).

Brandon Foltz, Statistics 101: The Binomial Distribution (YouTube video).
