Bayesian wrap-up (probably)
Post on 22-Dec-2015
Administrivia
•Office hours tomorrow on schedule
•Woo hoo!
•Office hours today deferred... [sigh]
•4:30-5:15
Retrospective/prospective
•Last time:
•Maximum likelihood
•IID samples
•The MLE recipe
•Today:
•Finish up MLE recipe
•Bayesian posterior estimation
Exercise
•Find the maximum likelihood estimator of μ for the univariate Gaussian: $f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
•Find the maximum likelihood estimator of β for the degenerate gamma distribution.
•Hint: consider the log of the likelihood functions in both cases
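The first exercise can be sanity-checked numerically. The sketch below is only an illustration: it assumes σ is known, uses made-up values for the true mean and sample size, and compares a brute-force grid search over the log-likelihood with the sample mean (the closed-form candidate that the log trick leads to).

```python
import math
import random

def gaussian_log_likelihood(data, mu, sigma):
    """Log-likelihood of IID data under a univariate Gaussian N(mu, sigma^2)."""
    n = len(data)
    sq_err = sum((x - mu) ** 2 for x in data)
    return -n * math.log(sigma * math.sqrt(2 * math.pi)) - sq_err / (2 * sigma ** 2)

random.seed(0)
true_mu, sigma = 3.0, 2.0            # made-up values for illustration
data = [random.gauss(true_mu, sigma) for _ in range(200)]

# Closed-form candidate: the sample mean.
sample_mean = sum(data) / len(data)

# Brute-force check: evaluate the log-likelihood on a grid of mu values in [0, 6].
step = 0.01
grid = [i * step for i in range(601)]
mu_grid = max(grid, key=lambda mu: gaussian_log_likelihood(data, mu, sigma))

print(f"sample mean = {sample_mean:.3f}, grid argmax = {mu_grid:.2f}")
```

The grid maximizer should land on the grid point nearest the sample mean, which is what setting the derivative of the log-likelihood to zero predicts.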
Putting the parts together
Assumed distribution family (hyp. space) w/ parameters Θ
Parameters for class a:
Specific PDF for class a
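5 minutes of math...
•Recall your friend the Gaussian PDF: $f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
•I asserted that the d-dimensional form is: $f(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\tfrac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^{\top} \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)$
•Let’s look at the parts...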
5 minutes of math...
•Ok, but what do the parts mean?
•Mean vector, μ: mean of the data along each dimension
5 minutes of math...
•Note: covariances on the diagonal of Σ are the same as the standard variances along that dimension of the data
•But what about skewed data?
5 minutes of math...
•Off-diagonal covariances (σij, i ≠ j) describe the pairwise variance
•How much xi changes as xj changes (on average)
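To see what the off-diagonal terms do, here is a minimal sketch of the d-dimensional Gaussian density specialized to d = 2, with the 2x2 inverse written out by hand. The function name and the example covariance values are made up for illustration.

```python
import math

def bivariate_gaussian_pdf(x, mu, cov):
    """Density of a 2-D Gaussian at point x.

    x, mu: length-2 sequences; cov: 2x2 covariance matrix [[s11, s12], [s12, s22]].
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must have a positive determinant")
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    quad = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    norm = 1.0 / (2 * math.pi * math.sqrt(det))   # (2*pi)^(d/2) * |Sigma|^(1/2), d = 2
    return norm * math.exp(-0.5 * quad)

mu = [0.0, 0.0]
round_cov = [[1.0, 0.0], [0.0, 1.0]]     # no off-diagonal covariance
skewed_cov = [[1.0, 0.8], [0.8, 1.0]]    # positive covariance between the two dimensions

p_round_diag = bivariate_gaussian_pdf([1.0, 1.0], mu, round_cov)
p_skew_diag = bivariate_gaussian_pdf([1.0, 1.0], mu, skewed_cov)
p_skew_anti = bivariate_gaussian_pdf([1.0, -1.0], mu, skewed_cov)
```

With a positive off-diagonal term, the point (1, 1) is much more probable than (1, -1): the density is stretched along the direction in which the two variables rise together.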
5 minutes of math...
•Calculating Σ from data:
•In practice: you want to measure the covariance between every pair of random variables (dimensions):
•Or, in linear algebra:
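A minimal sketch of the pairwise calculation, in plain Python. It assumes the 1/N (maximum likelihood) normalization; the slide's exact normalization is not recoverable here, and dividing by N - 1 instead would give the unbiased estimate. The data values are made up.

```python
def covariance_matrix(rows):
    """Mean vector and covariance matrix of a data set given as equal-length rows.

    Uses the 1/N normalization (an assumption; 1/(N - 1) is the unbiased variant).
    """
    n = len(rows)
    d = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            # Average product of deviations along dimensions i and j.
            cov[i][j] = sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / n
    return means, cov

data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]   # second column is 2x the first
means, cov = covariance_matrix(data)
```

The diagonal entries are the per-dimension variances, and the matrix is symmetric because the covariance of dimensions i and j does not depend on their order.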
5 minutes of math...
•Marginal probabilities
•If you have a joint PDF:
•... and want to know about the probability of just one RV (regardless of what happens to the others)
•Marginal PDF of one variable or the other:
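For a continuous joint PDF the marginal comes from integrating out the other variable; the discrete analogue, summing over the other variable, is easier to show in a few lines. The table below is hypothetical and only meant to illustrate the operation.

```python
# Hypothetical joint probability table over two discrete variables
# (a height category and a weight category).
joint = {
    ("short", "light"): 0.30,
    ("short", "heavy"): 0.10,
    ("tall", "light"): 0.15,
    ("tall", "heavy"): 0.45,
}

def marginal(joint, axis):
    """Sum the joint table over every variable except the one at position `axis`."""
    out = {}
    for outcome, p in joint.items():
        key = outcome[axis]
        out[key] = out.get(key, 0.0) + p
    return out

p_height = marginal(joint, 0)   # regardless of weight
p_weight = marginal(joint, 1)   # regardless of height
```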
5 minutes of math...
•Conditional probabilities
•Suppose you have a joint PDF, f(H,W)
•Now you get to see one of the values, e.g., H=“183cm”
•What’s your probability estimate of W, given this new knowledge?
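Conditioning can be sketched on a discrete table: keep only the outcomes consistent with the observed value, then renormalize by their total probability. The table and category names are hypothetical stand-ins for the height/weight example.

```python
# Hypothetical joint probability table over (height category, weight category).
joint = {
    ("short", "light"): 0.30,
    ("short", "heavy"): 0.10,
    ("tall", "light"): 0.15,
    ("tall", "heavy"): 0.45,
}

def conditional(joint, given_axis, given_value):
    """Distribution of the other variable, given the variable at `given_axis` equals `given_value`."""
    other_axis = 1 - given_axis
    # Probability of the observed value (the marginal of the evidence).
    evidence = sum(p for outcome, p in joint.items() if outcome[given_axis] == given_value)
    if evidence == 0:
        raise ValueError("cannot condition on a zero-probability value")
    return {
        outcome[other_axis]: p / evidence
        for outcome, p in joint.items()
        if outcome[given_axis] == given_value
    }

p_weight_given_tall = conditional(joint, 0, "tall")
```

Before seeing the height, "heavy" has marginal probability 0.55 in this table; after observing "tall", it rises to 0.45 / 0.60 = 0.75.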
5 minutes of math...
•From the conditional probability rule, it’s 2 steps to Bayes’ rule:
•(It often helps algebraically to think of the “given that” operator, “|”, as a division operation)
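The two steps can be written out as follows. The symbols A and B are generic placeholders chosen here for illustration, not notation taken from the slide.

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A, B)}{P(B)}, \qquad P(B \mid A) = \frac{P(A, B)}{P(A)}
  && \text{(conditional probability rule)} \\
P(A, B) &= P(B \mid A)\, P(A)
  && \text{(step 1: solve the second form for the joint)} \\
P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)}
  && \text{(step 2: substitute into the first form)}
\end{align*}
```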
Everything’s random...
•Basic Bayesian viewpoint:
•Treat (almost) everything as a random variable
•Data/independent var: X vector
•Class/dependent var: Y
•Parameters: Θ
•E.g., mean, variance, correlations, multinomial params, etc.
•Use Bayes’ Rule to assess probabilities of classes
•Allows us to say: “It is very unlikely that the mean height is 2 light years”
Uncertainty over params
•Maximum likelihood treats parameters as (unknown) constants
•Job is just to pick the constants so as to maximize data likelihood
•Full-blown Bayesian modeling treats params as random variables
•PDF over parameter variables tells us how certain/uncertain we are about the location of that parameter
•Also allows us to express prior beliefs (probabilities) about params
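One concrete case of a PDF over a parameter is sketched below. It assumes a Gaussian likelihood with known standard deviation and a Gaussian prior on the mean, the standard conjugate setup, in which the posterior over the mean is again Gaussian. The function name, the height values, and the prior settings are all made up for illustration.

```python
import math

def posterior_over_mean(data, sigma, prior_mean, prior_std):
    """Posterior N(mean, std^2) over a Gaussian mean.

    Assumes known noise standard deviation `sigma` and a Gaussian prior
    N(prior_mean, prior_std^2) on the unknown mean.
    """
    n = len(data)
    prior_precision = 1.0 / prior_std ** 2
    data_precision = n / sigma ** 2
    post_precision = prior_precision + data_precision
    # Precision-weighted combination of the prior mean and the data.
    post_mean = (prior_precision * prior_mean + sum(data) / sigma ** 2) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)

heights_cm = [172.0, 181.0, 168.0, 177.0, 175.0]   # made-up sample
mle_mean = sum(heights_cm) / len(heights_cm)       # maximum likelihood: a single constant

post_mean, post_std = posterior_over_mean(
    heights_cm, sigma=7.0, prior_mean=170.0, prior_std=10.0
)
```

Maximum likelihood returns one number, whereas the posterior returns a distribution: its mean sits between the prior mean and the sample mean, and its standard deviation says how uncertain we still are about where the true mean lies.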