
Probability and Statistics
Part 2. More Probability, Statistics and their Application

Chang-han Rhee

Stanford University

Sep 20, 2011 / CME001


Outline

Statistics
  Estimation Concepts
  Estimation Strategies

More Probability
  Expectation and Conditional Expectation
  Interchange of Limit
  Transforms

Simulation
  Monte Carlo Method
  Rare Event Simulation

Further Reference
  Classes at Stanford
  Books


Probability and Statistics

[Diagram: Probability leads from Model to Data; Statistics leads from Data to Model.]


Estimation

Making the best guess of an unknown parameter from sample data.

e.g., the average height of the West African giraffe


Estimator

An estimator (statistic) is a rule of estimation:

$$\hat{\theta}_n = g(X_1, \dots, X_n)$$


Quality of an Estimator

- Bias: $E\hat{\theta} - \theta$
- Variance: $\operatorname{var}(\hat{\theta})$
- Mean Square Error (MSE): $E[(\hat{\theta} - \theta)^2] = (\text{bias})^2 + (\text{var})$
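The decomposition of the MSE into squared bias plus variance can be checked by simulation. The sketch below is an illustration, not part of the original slides: it assumes normally distributed data and uses the variance estimator that divides by n as an example of a biased estimator. The names `biased_variance` and `estimator_quality` are hypothetical helpers.

```python
import random
import statistics

def biased_variance(sample):
    """Variance estimator that divides by n (not n - 1); biased for sigma^2."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def estimator_quality(estimator, draw_sample, theta, reps=5000):
    """Approximate bias, variance and MSE of `estimator` by repeated sampling."""
    estimates = [estimator(draw_sample()) for _ in range(reps)]
    bias = statistics.fmean(estimates) - theta
    var = statistics.pvariance(estimates)
    mse = statistics.fmean([(e - theta) ** 2 for e in estimates])
    return bias, var, mse

rng = random.Random(0)
n, sigma2 = 10, 4.0  # assumed sample size and true variance

def draw():
    """One sample of n i.i.d. N(0, sigma2) observations."""
    return [rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]

bias, var, mse = estimator_quality(biased_variance, draw, sigma2)
print(f"bias={bias:.3f}  var={var:.3f}  mse={mse:.3f}  bias^2+var={bias**2 + var:.3f}")
```

For this estimator the theoretical bias is -sigma^2/n, so the simulated bias should come out negative, and the last two printed numbers should agree up to rounding.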


Confidence Interval

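Consider the sample mean estimator $\hat{\theta} = \frac{1}{n} S_n$, where $S_n = X_1 + \dots + X_n$ and $\sigma^2 = \operatorname{var}(X_1)$. From the CLT,

$$\frac{S_n - n E X_1}{\sqrt{n}} \xrightarrow{D} \sigma N(0, 1)$$

Rearranging terms (note: this is not a rigorous argument),

$$\frac{1}{n} S_n \overset{D}{\approx} E X_1 + \frac{\sigma}{\sqrt{n}} N(0, 1)$$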
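The normal approximation for the sample mean leads to the usual approximate confidence interval. A minimal sketch, under assumptions that are not in the slides: sigma is replaced by the sample standard deviation, z = 1.96 is used for roughly 95% coverage, and the data are exponential with mean 2. The function name `mean_confidence_interval` is hypothetical.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Approximate CLT-based interval: sample mean +/- z * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    half_width = z * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width, mean + half_width

rng = random.Random(1)
sample = [rng.expovariate(0.5) for _ in range(2000)]  # true mean is 1 / 0.5 = 2
low, high = mean_confidence_interval(sample)
print(f"approximate 95% CI for the mean: ({low:.3f}, {high:.3f})")
```

The interval width shrinks like 1/sqrt(n), which is the sigma/sqrt(n) factor in the approximation.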


Maximum Likelihood Estimation

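Finding the most likely explanation.

$$\hat{\theta}_n = \arg\max_{\theta} f(x_1, x_2, \dots, x_n \mid \theta)$$

where, for independent observations,

$$f(x_1, x_2, \dots, x_n \mid \theta) = f(x_1 \mid \theta) \cdot f(x_2 \mid \theta) \cdots f(x_n \mid \theta)$$

- Gold Standard: Guaranteed to be
- Often computationally challenging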
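As an illustration of maximizing a likelihood numerically, the sketch below fits the rate of an exponential distribution. The model, the ternary search routine and the names `exp_log_likelihood` and `maximize_concave` are assumptions made for this example. The exponential model is convenient because its MLE also has a known closed form, n divided by the sum of the data, which serves as a check.

```python
import math
import random

def exp_log_likelihood(rate, data):
    """Log-likelihood of i.i.d. Exponential(rate) data: sum of log f(x_i | rate)."""
    return len(data) * math.log(rate) - rate * sum(data)

def maximize_concave(f, lo, hi, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1  # the maximizer lies to the right of m1
        else:
            hi = m2  # the maximizer lies to the left of m2
    return (lo + hi) / 2

rng = random.Random(2)
data = [rng.expovariate(3.0) for _ in range(1000)]  # assumed true rate: 3
rate_mle = maximize_concave(lambda r: exp_log_likelihood(r, data), 1e-6, 50.0)
closed_form = len(data) / sum(data)  # analytic MLE for the exponential model
print(f"numerical MLE={rate_mle:.4f}  closed form={closed_form:.4f}")
```

Working with the log-likelihood turns the product of densities into a sum, which avoids numerical underflow for large n.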


Method of Moments

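Matching the sample moments and the parametric moments. If $\theta = (\theta_1, \dots, \theta_k)$, choose $\hat{\theta}_n$ so that

$$\int x^j f_{\hat{\theta}_n}(x)\,dx = \frac{1}{n} \sum_{i=1}^{n} X_i^j \quad \text{for } j = 1, \dots, k$$

or

$$\sum_x x^j p_{\hat{\theta}_n}(x) = \frac{1}{n} \sum_{i=1}^{n} X_i^j \quad \text{for } j = 1, \dots, k$$

- Statistically, less efficient than MLE
- Often computationally efficient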
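A small sketch of moment matching with k = 2, assuming a Gamma(shape, scale) model chosen here for illustration. Its first two moments are shape * scale and shape * scale^2 + (shape * scale)^2, so the two equations can be solved in closed form. The function name `gamma_method_of_moments` is hypothetical.

```python
import random

def gamma_method_of_moments(data):
    """Solve the two moment equations of Gamma(shape, scale) for the parameters."""
    n = len(data)
    m1 = sum(data) / n                  # first sample moment
    m2 = sum(x * x for x in data) / n   # second sample moment
    scale = (m2 - m1 * m1) / m1         # variance / mean
    shape = m1 / scale                  # mean / scale
    return shape, scale

rng = random.Random(3)
data = [rng.gammavariate(2.0, 1.5) for _ in range(20000)]  # shape 2, scale 1.5
shape_hat, scale_hat = gamma_method_of_moments(data)
print(f"shape={shape_hat:.3f}  scale={scale_hat:.3f}")
```

No optimization is needed here, which illustrates the computational advantage; the gamma MLE, by contrast, has no closed form for the shape parameter.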


Properties of Expectation

- Jensen’s inequality: $g(EX) \le E g(X)$, for $g(\cdot)$ convex
- Markov’s inequality: $P(|X| > x) \le \frac{E|X|}{x}$, for $x > 0$
- Minkowski’s inequality: $(E|X + Y|^p)^{1/p} \le (E|X|^p)^{1/p} + (E|Y|^p)^{1/p}$, for $p \ge 1$
- Hölder’s inequality: $E|XY| \le (E|X|^p)^{1/p} (E|Y|^q)^{1/q}$, for $1/p + 1/q = 1$
- If $X$ and $Y$ are independent, $E g(X) h(Y) = E g(X) \, E h(Y)$
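These inequalities hold for any distribution, including the empirical distribution of a sample, so they can be verified exactly on simulated data. The sketch below does this for Jensen, Markov and Hölder. The normal samples, g(x) = exp(x), the threshold t = 2 and the exponents p = 3, q = 3/2 are arbitrary choices made for illustration.

```python
import math
import random

rng = random.Random(4)
xs = [rng.gauss(0.0, 1.0) for _ in range(10000)]
ys = [rng.gauss(0.0, 1.0) for _ in range(10000)]

def mean(values):
    """Average of a list, i.e. expectation under the empirical distribution."""
    return sum(values) / len(values)

# Jensen with the convex function g(x) = exp(x): g(EX) <= E g(X)
jensen_lhs = math.exp(mean(xs))
jensen_rhs = mean([math.exp(x) for x in xs])

# Markov at threshold t: P(|X| > t) <= E|X| / t
t = 2.0
markov_lhs = mean([1.0 if abs(x) > t else 0.0 for x in xs])
markov_rhs = mean([abs(x) for x in xs]) / t

# Hoelder with 1/p + 1/q = 1: E|XY| <= (E|X|^p)^(1/p) * (E|Y|^q)^(1/q)
p, q = 3.0, 1.5
holder_lhs = mean([abs(x * y) for x, y in zip(xs, ys)])
holder_rhs = (mean([abs(x) ** p for x in xs]) ** (1 / p)
              * mean([abs(y) ** q for y in ys]) ** (1 / q))

print(f"Jensen: {jensen_lhs:.3f} <= {jensen_rhs:.3f}")
print(f"Markov: {markov_lhs:.3f} <= {markov_rhs:.3f}")
print(f"Hoelder: {holder_lhs:.3f} <= {holder_rhs:.3f}")
```

The gaps between the two sides show how loose each bound is for this particular distribution; Markov's bound in particular is typically far from tight.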


Properties of Conditional Expectation

- Jensen’s inequality: $g(E[X \mid Y]) \le E[g(X) \mid Y]$, for $g(\cdot)$ convex
- Markov’s inequality: $P(|X| > x \mid Y) \le \frac{E[|X| \mid Y]}{x}$, for $x > 0$
- Minkowski’s inequality: $(E[|X + Y|^p \mid Z])^{1/p} \le (E[|X|^p \mid Z])^{1/p} + (E[|Y|^p \mid Z])^{1/p}$, for $p \ge 1$
- Hölder’s inequality: $E[|XY| \mid Z] \le (E[|X|^p \mid Z])^{1/p} (E[|Y|^q \mid Z])^{1/q}$, for $1/p + 1/q = 1$


Tower Property

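Tower Property (Law of Iterated Expectation, Law of Total Expectation)

$$E[X] = E\bigl[E[X \mid Y]\bigr]$$

i.e., for a discrete $Y$ taking values in $S$,

$$E[X] = \sum_{y \in S} E[X \mid Y = y] P(Y = y)$$

e.g.
- $Y \sim \mathrm{Unif}(0, 1)$ and $X \sim \mathrm{Unif}(Y, 1)$. What is $EX$?
- Mouse Escape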
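For the uniform example, conditioning on Y gives E[X | Y] = (Y + 1)/2, so the tower property yields EX = (EY + 1)/2 = 3/4. The simulation below is a sketch that compares a direct Monte Carlo average of X with the average of the conditional means; the sample size and seed are arbitrary.

```python
import random

rng = random.Random(5)
n = 200000
total_x = 0.0
total_cond = 0.0
for _ in range(n):
    y = rng.uniform(0.0, 1.0)
    total_x += rng.uniform(y, 1.0)   # a draw of X given Y = y
    total_cond += (y + 1.0) / 2.0    # the conditional mean E[X | Y = y]

direct = total_x / n       # estimates E[X]
via_tower = total_cond / n # estimates E[E[X | Y]]
print(f"direct={direct:.4f}  via tower property={via_tower:.4f}  exact=0.75")
```

Averaging the conditional means has smaller variance than averaging X itself, because the randomness of X given Y has already been integrated out.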


Bayes Rule

- The law of total probability:

$$P(A) = \sum_i P(A \mid B_i) P(B_i)$$

- Bayes’ rule:

$$P(A_i \mid B) = \frac{P(B \mid A_i) P(A_i)}{\sum_j P(B \mid A_j) P(A_j)}$$

where $A_1, A_2, \dots, A_k$ is a disjoint partition of $\Omega$.
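Bayes' rule translates directly into a few lines of code. In this sketch the function name `posterior` and the numbers (1% prior, likelihoods 0.95 and 0.10) are hypothetical and chosen only to show the mechanics.

```python
def posterior(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # law of total probability: P(B)
    return [j / evidence for j in joint]

# Hypothetical two-set partition: A_1 has prior 0.01, A_2 has prior 0.99.
priors = [0.01, 0.99]
likelihoods = [0.95, 0.10]  # P(B | A_1), P(B | A_2)
post = posterior(priors, likelihoods)
print(post)
```

With these numbers P(A_1 | B) is about 0.088: even though B is much more likely under A_1, the small prior keeps the posterior low.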


More Properties of Conditional Expectation

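- $E(X g(Y) \mid Y) = g(Y) E(X \mid Y)$
- $E(E(X \mid Y, Z) \mid Y) = E(X \mid Y)$
- $E(X \mid Y) = X$, if $X = g(Y)$ for some $g$
- $E(h(X, Y) \mid Y = y) = E h(X, y)$, if $X$ and $Y$ are independent
- $E(X \mid Y) = EX$, if $X$ and $Y$ are independent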


Monotone Convergence

Theorem (Monotone Convergence)
If $X_n \ge 0$ and $X_n \uparrow X_\infty$ almost surely (that is, $X_n \le X_{n+1}$ and $X_n \to X_\infty$), then $E X_n \to E X_\infty$.


Dominated Convergence
and bounded convergence as a corollary

Theorem (Dominated Convergence)
If $X_n \to X_\infty$ almost surely and $|X_n| \le Y$ for all $n$ and some $Y$ such that $EY < \infty$, then $X_n \xrightarrow{L^1} X_\infty$.

Corollary (Bounded Convergence)
If $X_n \to X_\infty$ almost surely and $|X_n| \le K$ for all $n$ and some $K \in \mathbb{R}$, then $X_n \xrightarrow{L^1} X_\infty$.
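To see why the domination condition cannot simply be dropped, a standard counterexample may help. It is not from the slides, and the uniform random variable U is introduced here only for illustration.

```latex
X_n = n \,\mathbf{1}\{U < 1/n\}, \qquad U \sim \mathrm{Unif}(0, 1).
% X_n -> 0 almost surely: U > 0 with probability one, and X_n = 0 as soon as n > 1/U.
\mathbb{E} X_n = n \cdot \mathbb{P}(U < 1/n) = 1 \quad \text{for every } n,
\qquad \text{yet} \qquad
\mathbb{E}\bigl[\lim_{n \to \infty} X_n\bigr] = 0.
% No integrable Y dominates every X_n:
\sup_n X_n \ge \frac{1}{U} - 1, \qquad \mathbb{E}\Bigl[\frac{1}{U}\Bigr] = \infty.
```

Here the limit and the expectation cannot be interchanged, which is consistent with the theorem because no integrable dominating variable exists.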


And more

- Scheffé’s Lemma
- Fatou’s Lemma
- Uniform Integrability
- Fubini’s Theorem


Moment Generating Function and Characteristic Function

The moment generating function and the characteristic function characterize the distribution of a random variable.

- Moment Generating Function: $M_X(\theta) = E[\exp(\theta X)]$
- Characteristic Function: $\Phi_X(\theta) = E[\exp(i \theta X)]$
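Both transforms are expectations, so they can be approximated by sample averages. The sketch below assumes a standard normal X, for which the closed forms exp(theta^2 / 2) and exp(-theta^2 / 2) are known, and compares them with empirical versions. The helper names `empirical_mgf` and `empirical_cf` are hypothetical.

```python
import cmath
import math
import random

def empirical_mgf(sample, theta):
    """Sample average of exp(theta * x), an estimate of M_X(theta)."""
    return sum(math.exp(theta * x) for x in sample) / len(sample)

def empirical_cf(sample, theta):
    """Sample average of exp(i * theta * x), an estimate of Phi_X(theta)."""
    return sum(cmath.exp(1j * theta * x) for x in sample) / len(sample)

rng = random.Random(6)
sample = [rng.gauss(0.0, 1.0) for _ in range(50000)]
theta = 0.5
mgf_hat = empirical_mgf(sample, theta)
cf_hat = empirical_cf(sample, theta)
mgf_exact = math.exp(theta ** 2 / 2)   # MGF of N(0, 1)
cf_exact = math.exp(-theta ** 2 / 2)   # characteristic function of N(0, 1)
print(f"MGF: {mgf_hat:.4f} vs {mgf_exact:.4f}")
print(f"CF:  {cf_hat:.4f} vs {cf_exact:.4f}")
```

The characteristic function estimate is complex-valued; for a symmetric distribution such as N(0, 1) its imaginary part should be close to zero.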


Monte Carlo Method

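Computational algorithms that rely on repeated random sampling to compute their results.

Theoretical Bases
- Law of Large Numbers guarantees the convergence

$$\frac{1}{n} \#\{i \le n : X_i \in A\} \to P(X_1 \in A)$$

- Central Limit Theorem

$$\frac{1}{n} \#\{i \le n : X_i \in A\} - P(X_1 \in A) \overset{D}{\approx} \frac{\sigma}{\sqrt{n}} N(0, 1)$$

where $\sigma^2 = P(X_1 \in A)\,(1 - P(X_1 \in A))$.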
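A minimal sketch of the method: estimate P(X in A) by the fraction of draws that land in A, and report the CLT-based standard error. The example event (a uniform point in the unit square falling inside the quarter disc, with exact probability pi/4) and the name `monte_carlo_probability` are assumptions made for illustration.

```python
import math
import random

def monte_carlo_probability(indicator, draw, n):
    """Estimate P(X in A) by the fraction of n draws for which indicator(x) is true."""
    hits = sum(1 for _ in range(n) if indicator(draw()))
    p_hat = hits / n
    # sigma / sqrt(n), with sigma^2 = p (1 - p) estimated by plugging in p_hat
    std_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, std_error

rng = random.Random(7)

def draw():
    """A uniform point in the unit square."""
    return rng.random(), rng.random()

def in_quarter_disc(point):
    """The event A: the point lies inside the quarter unit disc."""
    return point[0] ** 2 + point[1] ** 2 <= 1.0

p_hat, se = monte_carlo_probability(in_quarter_disc, draw, 100000)
print(f"estimate={p_hat:.4f} +/- {1.96 * se:.4f}  exact={math.pi / 4:.4f}")
```

Halving the error requires four times as many samples, since the standard error scales like 1/sqrt(n).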


Challenges of Rare Event

Probability that a coin lands on its edge.

How many flips do we need to see at least one occurrence?
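One way to quantify the question, as a sketch: write p for the (unknown) probability that a single flip lands on its edge and assume independent flips. The symbols p, n, N and delta are introduced here for illustration.

```latex
% N = index of the first flip that lands on its edge; N is geometric
\mathbb{P}(N > n) = (1 - p)^n, \qquad \mathbb{E} N = \frac{1}{p}.
% flips needed to see at least one occurrence with probability at least 1 - \delta
(1 - p)^n \le \delta
\iff n \ge \frac{\log \delta}{\log(1 - p)} \approx \frac{\log(1/\delta)}{p}
\quad (p \text{ small}).
% relative error of the crude Monte Carlo estimator
% \hat{p}_n = \frac{1}{n} \#\{ i \le n : \text{flip } i \text{ lands on its edge} \}
\frac{\sqrt{\operatorname{var}(\hat{p}_n)}}{p} = \sqrt{\frac{1 - p}{n p}}.
```

The number of samples needed for a fixed relative error therefore grows like 1/p, which is what makes crude Monte Carlo impractical for rare events.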


Importance Sampling (Change of Measure)

We can express the expectation of a random variable as an expectation of another random variable.

e.g. Two continuous random variables $X$ and $Y$ have densities $f_X$ and $f_Y$ such that $f_Y(s) = 0$ implies $f_X(s) = 0$. Then,

$$E g(X) = \int g(s) f_X(s)\,ds = \int g(s) \frac{f_X(s)}{f_Y(s)} f_Y(s)\,ds = E[g(Y) L(Y)]$$

where $L(s) = \frac{f_X(s)}{f_Y(s)}$.

$L$ is called a likelihood ratio.
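The identity can be used to estimate a small tail probability. The sketch below assumes X ~ N(0, 1), the event {X > 4} and the sampling distribution Y ~ N(4, 1); these choices are illustrative and not from the slides. For two unit-variance normals the likelihood ratio simplifies to exp(-mu * y + mu^2 / 2), where mu is the mean of Y. The function name `importance_sampling_tail` is hypothetical.

```python
import math
import random

def importance_sampling_tail(threshold, n, rng):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling Y ~ N(threshold, 1)."""
    total = 0.0
    for _ in range(n):
        y = rng.gauss(threshold, 1.0)
        if y > threshold:  # g(y) = 1{y > threshold}
            # likelihood ratio f_X(y) / f_Y(y) for the two normal densities
            total += math.exp(-threshold * y + threshold ** 2 / 2.0)
    return total / n

rng = random.Random(8)
estimate = importance_sampling_tail(4.0, 100000, rng)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # exact normal tail probability
print(f"importance sampling={estimate:.3e}  exact={exact:.3e}")
```

Under Y about half of the draws land in the event, so the estimator is informative with far fewer samples than crude Monte Carlo, which would see the event only a few times in 100,000 draws.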


Probability

- Basic Probability: STATS 116
- Stochastic Processes: STATS 215, 217, 218, 219
- Theory of Probability: STATS 310ABC


Statistics

- Intro to Statistics: STATS 200
- Theory of Statistics: STATS 300ABC


Application

- Applied Statistics: STATS 191, 203, 208, 305, 315AB
- Stochastic Systems: MS&E 121, 321
- Stochastic Control: MS&E 322
- Stochastic Simulation: MS&E 223, 323, STATS 362
- Little bit of Everything: CME 308
- Econometrics, Finance, Bio and more: http://explorecourses.stanford.edu


Books

- Sheldon Ross (2009). Introduction to Probability Models. Academic Press; 10th edition
- John A. Rice (2006). Mathematical Statistics and Data Analysis. Duxbury Press; 3rd edition
- Larry Wasserman (2004). All of Statistics: A Concise Course in Statistical Inference. Springer, New York
