(probability & statistics) may 27, 2020 lesson 3...

Continuous Distributions &

Expectation, Variance, Moment…

May 27, 2020

Shuji Kijima

Dept. Informatics,

Graduate School of ISEE

Probability & Statistics

Lesson 3

0. Discrete distributions

review

terminology

Discrete distribution

a distribution on a countable set $\Omega \subseteq \mathbb{R}$ such that $\sum_{x \in \Omega} \Pr[X = x] = 1$ holds

Probability function

$f(x) = \Pr[X = x]$

(Cumulative) distribution function

$F(x) = \Pr[X \le x]$

(an important concept for continuous distributions; next week)

$X$ is called a "random variable".

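discrete uniform distribution

$\Omega = \{1, 2, \ldots, n\}$

$f(k) = \frac{1}{n} \quad (k \in \Omega)$

Ex. Roulette

$\Omega = \{0, 1, 2, \ldots, 36\}$

$\mathcal{F} = 2^{\Omega}$

$\Pr[X = x] = \frac{1}{37} \quad (x \in \Omega)$

[Figure: roulette; source: https://en.wikipedia.org/wiki/Roulette]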

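Bernoulli distribution (two-point distribution) $\mathrm{B}(1; p)$ $(0 \le p \le 1)$

$\Omega = \{0, 1\}$

$f(k) = \begin{cases} p & (k = 1) \\ 1 - p & (k = 0) \end{cases}$

A Bernoulli trial is a random variable that follows the Bernoulli distribution.

Ex. (biased) coin tossing

head ($X = 1$)

tail ($X = 0$)

[Illustration from Irasutoya]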

binomial distribution $\mathrm{B}(n; p)$ $(n \in \mathbb{Z}_{>0},\ 0 \le p \le 1)$

$\Omega = \{0, 1, 2, \ldots, n\}$

$f(k) = \binom{n}{k} p^k (1 - p)^{n-k} \quad (k \in \Omega)$

Ex. $n$ i.i.d. Bernoulli trials

Let $X_1, X_2, \ldots, X_n$ be the outcomes of i.i.d. Bernoulli trials $\mathrm{B}(1; p)$.

Let $X = X_1 + X_2 + \cdots + X_n$, i.e., the total number of heads.

Then $X$ follows the binomial distribution $\mathrm{B}(n; p)$.

[Illustration from Irasutoya]

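binomial distribution $\mathrm{B}(n; p)$

Recall: $\Omega = \{0, 1, 2, \ldots, n\}$, $f(k) = \binom{n}{k} p^k (1 - p)^{n-k}$ $(k \in \Omega)$.

For any $(\Omega, 2^{\Omega}, \Pr)$, $\Pr$ must satisfy Kolmogorov's axioms.

Axiom (i) is easy to check. (iii) is ok.

We will check axiom (ii): $\Pr[\Omega] = 1$.

$\Pr[\Omega] = \sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n-k} = (p + (1 - p))^n = 1.$

The second equality is by the binomial theorem:

$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$

(where $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$; proof by induction)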
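The check of axiom (ii) for the binomial distribution can also be reproduced numerically. The following is a minimal sketch; the helper name `binomial_pmf` and the values `n = 10`, `p = 0.3` are illustrative assumptions, not taken from the lecture.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability function f(k) = C(n, k) p^k (1 - p)^(n - k) of B(n; p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative parameters (assumed for this sketch).
n, p = 10, 0.3

# Axiom (ii): the probabilities over Omega = {0, ..., n} should sum to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
print(f"sum of f(k) for k = 0..{n}: {total:.12f}")
```

Up to floating-point rounding, the printed sum should agree with $(p + (1 - p))^n = 1$.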

geometric distribution $\mathrm{Ge}(p)$ $(0 < p < 1)$

$\Omega = \{0, 1, 2, \ldots\}$

$f(k) = (1 - p)^k p \quad (k \in \Omega)$

Ex.

Repeat i.i.d. Bernoulli trials $\mathrm{B}(1; p)$ until the first head.

Let $K$ denote the number of tails before the first head;

then $K$ follows the geometric distribution $\mathrm{Ge}(p)$.

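Check Kolmogorov (ii)

Recall: $\Omega = \{0, 1, 2, \ldots\}$, $f(k) = (1 - p)^k p$ $(k \in \Omega)$.

$\Pr[\Omega] = \sum_{k=0}^{\infty} (1 - p)^k p = p \sum_{k=0}^{\infty} (1 - p)^k = p \cdot \lim_{n \to \infty} \frac{1 - (1 - p)^{n+1}}{1 - (1 - p)} = p \cdot \frac{1}{p} = 1.$

Recall that $\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}$ holds for $x \ne 1$,

since $1 - x^{n+1} = (1 - x)(1 + x + x^2 + x^3 + \cdots + x^n)$.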
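The finite geometric sum used in this check can be compared with partial sums of the geometric probabilities. This is a small sketch; the function names and the value `p = 0.2` are assumptions chosen for illustration.

```python
def geometric_pmf(k: int, p: float) -> float:
    """Probability function f(k) = (1 - p)^k p of Ge(p)."""
    return (1 - p) ** k * p


def partial_sum(n: int, p: float) -> float:
    """Sum of f(k) for k = 0, ..., n."""
    return sum(geometric_pmf(k, p) for k in range(n + 1))


p = 0.2  # illustrative value (assumed)
for n in (0, 5, 20, 100):
    # Closed form from the finite geometric sum: p * (1 - (1-p)^(n+1)) / (1 - (1-p)).
    closed_form = 1 - (1 - p) ** (n + 1)
    print(n, partial_sum(n, p), closed_form)
```

As `n` grows, both columns should approach 1, which is the statement $\Pr[\Omega] = 1$.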

Poisson distribution $\mathrm{Po}(\lambda)$ $(\lambda > 0)$

$\Omega = \{0, 1, 2, \ldots\}$

$f(z) = \frac{e^{-\lambda} \lambda^z}{z!} \quad (z \in \Omega)$

Ex. Rare events

Consider rare events whose expected number of occurrences in a unit time is $\lambda$.

Let $X$ be the number of occurrences;

then $X$ is known to follow the Poisson distribution $\mathrm{Po}(\lambda)$.

More precisely, repeat $n$ i.i.d. Bernoulli trials $\mathrm{B}(1; p)$ with $p \ll 1$.

Let $\lambda = np$; then it is known that $\mathrm{B}(n; p) \simeq \mathrm{Po}(\lambda)$.

The Poisson distribution appears again later today (today's Exercise 2).
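The approximation $\mathrm{B}(n; p) \simeq \mathrm{Po}(\lambda)$ with $\lambda = np$ can be inspected numerically. The sketch below compares the two probability functions pointwise; the values `n = 1000` and `p = 0.003` and all function names are assumptions made only for this illustration.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability function of B(n; p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(z: int, lam: float) -> float:
    """Probability function f(z) = e^(-lam) lam^z / z! of Po(lam)."""
    return math.exp(-lam) * lam**z / math.factorial(z)


# Many trials, small success probability (illustrative values, assumed).
n, p = 1000, 0.003
lam = n * p

for k in range(8):
    print(k, round(binomial_pmf(k, n, p), 6), round(poisson_pmf(k, lam), 6))

# Largest pointwise gap over k = 0..20.
max_diff = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print("max |B - Po| for k <= 20:", max_diff)
```

The two columns should be close for every `k`; the gap is expected to shrink further as `p` decreases with `np` held fixed.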


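Check Kolmogorov (ii)

Recall: $\Omega = \{0, 1, 2, \ldots\}$, $f(z) = \frac{e^{-\lambda} \lambda^z}{z!}$ $(z \in \Omega)$.

$\Pr[\Omega] = \sum_{z=0}^{\infty} \frac{e^{-\lambda} \lambda^z}{z!} = e^{-\lambda} \sum_{z=0}^{\infty} \frac{1}{z!} \lambda^z = e^{-\lambda} e^{\lambda} = 1.$

Recall $e^x = \sum_{k=0}^{\infty} \frac{1}{k!} x^k$ (by definition; cf. Taylor expansion).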


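Discrete distribution (a distribution on a countable set $\Omega \subseteq \mathbb{R}$):

$\sum_{x \in \Omega} \Pr[X = x] = 1$ holds.

probability function

[Figure: step plot of the distribution function $F(x)$; horizontal axis $x = 1, 2, \ldots, 6$, vertical axis $P$ with levels $1/6, 2/6, 3/6, 4/6, 5/6, 1$]

Discrete Distribution Function $F\colon \Omega \to \mathbb{R}_{\ge 0}$

1. $F(-\infty) = 0$, $F(+\infty) = 1$

2. Monotone non-decreasing

3. Right continuous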

1. (univariate) continuous distributions

Continuous roulette

$\Omega = \{\theta \mid 0 \le \theta < 2\pi\}$

$\mathcal{F} = 2^{\Omega}$

$\Pr[X = \theta] = {?} \quad (\theta \in \Omega)$

$\Pr\left[X = \frac{\pi}{4}\right] = {?}$ Is it $0$ ???

(continuous) uniform distribution

$\Omega = [0, 2\pi)$

$\Pr\left[X = \frac{\pi}{4}\right] = 0$ ???

$\Pr\left[X \le \frac{\pi}{4}\right] = \frac{1}{8}$

The cumulative distribution function seems appropriate.
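The contrast between $\Pr[X = \pi/4]$ and $\Pr[X \le \pi/4]$ on the continuous roulette can be illustrated by simulation. This is only a sketch under assumptions: the seed, the number of trials, and the use of `random.uniform` as a stand-in for the roulette are choices made here, not part of the lecture.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible (assumed choice)
trials = 200_000

# Spin the continuous roulette: an angle drawn uniformly between 0 and 2*pi.
samples = [random.uniform(0.0, 2.0 * math.pi) for _ in range(trials)]

# Relative frequency of the event {X <= pi/4}; the exact probability is 1/8.
at_most = sum(1 for x in samples if x <= math.pi / 4) / trials

# Number of spins landing exactly on pi/4; a single point has probability 0.
exact_hits = sum(1 for x in samples if x == math.pi / 4)

print("estimate of Pr[X <= pi/4]:", at_most)
print("spins equal to pi/4:", exact_hits)
```

The first estimate should be near 0.125, while an exact hit on a single angle is not expected to occur in practice, which is why the distribution function $F(x) = \Pr[X \le x]$ is the natural object to work with.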

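continuous distribution (a distribution on an uncountable set $\Omega \subseteq \mathbb{R}$)

$F(x) = \Pr[X \le x]$ is differentiable (hence continuous).

probability density function

$f(x) = \frac{\mathrm{d}}{\mathrm{d}x} F(x)$

[Figure: continuous, increasing curve of the distribution function $F(x)$; horizontal axis $x$, vertical axis $P$ up to $1$]

Continuous Distribution Function $F\colon \mathbb{R} \to \mathbb{R}_{\ge 0}$

1. $F(-\infty) = 0$, $F(+\infty) = 1$

2. Monotone non-decreasing

3. Differentiable*

*in the effective domain.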

Uniform distribution $\mathrm{U}(a, b)$

$\Omega = [a, b]$

$f(x) = \frac{1}{b - a} \quad (a \le x \le b)$

$F(x) = \frac{x - a}{b - a} \quad (a \le x \le b)$

Ex. continuous roulette

$\Omega = (0, 2\pi]$

$\mathcal{F} = 2^{\Omega}$

$F(x) = \frac{x}{2\pi} \quad (x \in \Omega)$

$f(x) = \frac{1}{2\pi} \quad (x \in \Omega)$

[Figures: density function and distribution function of the uniform distribution; source: https://en.wikipedia.org/wiki/Uniform_distribution_(continuous)]
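The relation $f(x) = \frac{\mathrm{d}}{\mathrm{d}x} F(x)$ can be checked for the uniform distribution with a finite difference. The following sketch assumes the illustrative endpoints `a = 2`, `b = 5` and a central-difference step `h`; none of these names or values come from the slides.

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """Distribution function F(x) = (x - a) / (b - a) on [a, b], extended by 0 and 1."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density f(x) = 1 / (b - a) on [a, b], and 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def numerical_derivative(F, x: float, h: float = 1e-6) -> float:
    """Central difference approximation of dF/dx at x."""
    return (F(x + h) - F(x - h)) / (2.0 * h)


a, b = 2.0, 5.0  # illustrative endpoints (assumed)
for x in (2.5, 3.0, 4.9):
    slope = numerical_derivative(lambda t: uniform_cdf(t, a, b), x)
    print(x, slope, uniform_pdf(x, a, b))
```

At interior points the finite-difference slope should match the density $1/(b - a)$; at the endpoints $a$ and $b$ the distribution function has a kink, which is why differentiability is only required in the effective domain.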

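Normal distribution $\mathrm{N}(\mu, \sigma^2)$

$\Omega = (-\infty, \infty)$

$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right) \quad (-\infty < x < \infty)$

[Figure: density function; source: https://en.wikipedia.org/wiki/Normal_distribution]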

Exponential distribution $\mathrm{Ex}(\lambda)$ $(\lambda > 0)$

$\Omega = [0, \infty)$

$f(x) = \lambda e^{-\lambda x} \quad (x \ge 0)$

[Figure: density function; source: https://en.wikipedia.org/wiki/Exponential_distribution]


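Gamma distribution $\mathrm{G}(\alpha, \nu)$ $(\alpha > 0,\ \nu > 0)$

$\Omega = [0, \infty)$

$f(x) = \frac{1}{\Gamma(\nu)} \alpha^{\nu} x^{\nu - 1} e^{-\alpha x} \quad (x \ge 0)$

where

$\Gamma(\nu) = \int_{0}^{\infty} t^{\nu - 1} e^{-t}\,\mathrm{d}t$

Remark that

$\Gamma(1) = 1$

$\Gamma(\nu) = (\nu - 1)\,\Gamma(\nu - 1)$

$\Gamma(\nu) = (\nu - 1)! \quad (\nu = 1, 2, \ldots)$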
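The remarks on the Gamma function can be spot-checked with Python's standard `math.gamma`. This is a small sketch; the test points (integers 1 to 7 and the non-integer value 3.5) are arbitrary choices made for illustration.

```python
import math

# Gamma(nu) = (nu - 1)! for nu = 1, 2, ...
for nu in range(1, 8):
    print(nu, math.gamma(nu), math.factorial(nu - 1))

# Recurrence Gamma(nu) = (nu - 1) * Gamma(nu - 1), here at a non-integer point.
nu = 3.5
left = math.gamma(nu)
right = (nu - 1) * math.gamma(nu - 1)
print(left, right)
```

The two columns in the loop should agree, and `left` and `right` should coincide up to floating-point rounding.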

Some Distributions

Discrete distributions

(1) Bernoulli $\mathrm{B}(1, p)$

(2) Binomial $\mathrm{B}(n, p)$: # heads when tossing $n$ coins.

(3) Geometric $\mathrm{Ge}(p)$: # tails before a head.

(4) Poisson $\mathrm{Po}(\lambda)$

Continuous distributions

(1) Uniform $\mathrm{U}(a, b)$

(2) Exponential $\mathrm{Ex}(\lambda)$

(3) Normal $\mathrm{N}(\mu, \sigma^2)$

(4) Beta $\mathrm{Be}(\alpha, \beta)$

(5) Gamma G(,k)

2. Multivariate distr., and i.i.d.

Multivariate distribution

The distribution of random variables $X$ and $Y$ on $(\Omega, \mathcal{F}, P)$.

Ex1. two dice.

$\Omega = \{(1,1), (1,2), \ldots, (6,5), (6,6)\}$

$X$ = sum of casts

$Y$ = product of casts

multivariate discrete distribution

distr. fnc.: $F(x, y) := \Pr[(X, Y) \le (x, y)] = \Pr[X \le x,\ Y \le y]$

pmf: $f(x, y) := \Pr[(X, Y) = (x, y)] = \Pr[X = x,\ Y = y]$

multivariate continuous distribution

distr. fnc.: $F(x, y) := \Pr[(X, Y) \le (x, y)] = \Pr[X \le x,\ Y \le y]$

pdf: $f(x, y) := \frac{\partial^2}{\partial x\,\partial y} F(x, y)$

i.i.d. (independent and identically distributed)

$X$ and $Y$ are independent if $F_{XY}(x, y) = F_X(x)\,F_Y(y)$.

Prop. If $X, Y$ are independent, then $f_{XY}(x, y) = f_X(x)\,f_Y(y)$.

$X, Y$ are identically distributed if $f_X \equiv f_Y$.

$X, Y$ are independent and identically distributed (i.i.d.) if both conditions hold.
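For the two-dice example with $X$ = sum and $Y$ = product, the factorization criterion can be tested by exhaustive enumeration. The sketch below uses exact fractions; the variable names are assumptions, and fair, independently cast dice are assumed.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice (fairness is assumed here).
outcomes = list(product(range(1, 7), repeat=2))
prob = Fraction(1, len(outcomes))

joint = Counter()   # f_XY(x, y)
marg_x = Counter()  # f_X(x)
marg_y = Counter()  # f_Y(y)
for d1, d2 in outcomes:
    x, y = d1 + d2, d1 * d2  # X = sum of casts, Y = product of casts
    joint[(x, y)] += prob
    marg_x[x] += prob
    marg_y[y] += prob

# Independence would require f_XY(x, y) = f_X(x) f_Y(y) for every pair (x, y).
independent = all(
    joint[(x, y)] == marg_x[x] * marg_y[y] for x in marg_x for y in marg_y
)
print("f_XY(2, 1)      =", joint[(2, 1)])
print("f_X(2) * f_Y(1) =", marg_x[2] * marg_y[1])
print("independent:", independent)
```

Here the sum and the product are not independent: the pair $(X, Y) = (2, 1)$ occurs only for the outcome $(1, 1)$, so the joint probability does not factor into the product of the marginals.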

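Prop.

$F_{XY}(x, y) = F_X(x)\,F_Y(y) \;\Rightarrow\; f_{XY}(x, y) = f_X(x)\,f_Y(y)$.

Proof.

$f_{XY}(x, y) := \frac{\partial^2}{\partial x\,\partial y} F_{XY}(x, y)$

$= \frac{\partial^2}{\partial x\,\partial y} \bigl(F_X(x)\,F_Y(y)\bigr)$

$= \frac{\partial}{\partial x} \Bigl( \bigl(\tfrac{\partial}{\partial y} F_X(x)\bigr) F_Y(y) + F_X(x)\,\tfrac{\partial}{\partial y} F_Y(y) \Bigr)$

$= \frac{\partial}{\partial x} \bigl( 0 + F_X(x)\,f_Y(y) \bigr)$

$= \bigl(\tfrac{\partial}{\partial x} F_X(x)\bigr) f_Y(y) + F_X(x)\,\tfrac{\partial}{\partial x} f_Y(y)$

$= f_X(x)\,f_Y(y)$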

3.1. Expectation

Expectation of discrete random variable

The expectation of a discrete random variable $X$ is defined by

$\mathrm{E}[X] = \sum_{x \in \Omega} x \cdot f(x)$

only when the right-hand side converges absolutely,

i.e., $\sum_{x \in \Omega} |x| \cdot f(x) < \infty$ holds.

Otherwise, we say "the expectation does not exist."

Why absolute convergence?

Let $a_n = (-1)^n$ for $n = 0, 1, 2, \ldots$.

Q. What is $\sum_{n=0}^{\infty} a_n$?

A1. $\sum_{n=0}^{\infty} a_n = 1 + (-1) + 1 + (-1) + \cdots = \sum_{k=0}^{\infty} (a_{2k} + a_{2k+1}) = \sum_{k=0}^{\infty} (1 + (-1)) = 0$

A2. $\sum_{n=0}^{\infty} a_n = a_0 + \sum_{n=1}^{\infty} a_n = 1 + \sum_{k=0}^{\infty} (a_{2k+1} + a_{2k+2}) = 1 + \sum_{k=0}^{\infty} ((-1) + 1) = 1$

A3. $\sum_{n=0}^{\infty} a_n = a_1 + \sum_{k=0}^{\infty} (a_{2k} + a_{2k+3}) = -1 + \sum_{k=0}^{\infty} (1 + (-1)) = -1$

A4. $\sum_{n=0}^{\infty} a_n = a_0 + a_2 + \sum_{k=0}^{\infty} (a_{2k+1} + a_{2k+4}) = 1 + 1 + \sum_{k=0}^{\infty} ((-1) + 1) = 2$

A. $\sum_{n=0}^{\infty} a_n$ is not (well-)defined: the series does not converge absolutely, so regrouping or reordering its terms changes the value obtained.
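The four conflicting "answers" for $\sum_n (-1)^n$ can be reproduced with finite truncations. This is a sketch only; the truncation length `terms = 1000` and the variable names are assumptions made for illustration.

```python
from itertools import accumulate


def a(n: int) -> int:
    """The sequence a_n = (-1)^n."""
    return (-1) ** n


# Ordinary partial sums oscillate between 1 and 0, so the series has no limit.
partial_sums = list(accumulate(a(n) for n in range(8)))
print("partial sums:", partial_sums)

terms = 1000  # number of grouped pairs to add (assumed truncation)

# A1: group as (a_0 + a_1) + (a_2 + a_3) + ...
grouped_a1 = sum(a(2 * k) + a(2 * k + 1) for k in range(terms))
# A2: keep a_0, then group as (a_1 + a_2) + (a_3 + a_4) + ...
grouped_a2 = a(0) + sum(a(2 * k + 1) + a(2 * k + 2) for k in range(terms))
# A3: keep a_1, then pair a_{2k} with a_{2k+3}
grouped_a3 = a(1) + sum(a(2 * k) + a(2 * k + 3) for k in range(terms))
# A4: keep a_0 and a_2, then pair a_{2k+1} with a_{2k+4}
grouped_a4 = a(0) + a(2) + sum(a(2 * k + 1) + a(2 * k + 4) for k in range(terms))
print(grouped_a1, grouped_a2, grouped_a3, grouped_a4)

# The sum of absolute values grows without bound: no absolute convergence.
abs_sum = sum(abs(a(n)) for n in range(terms))
print("sum of |a_n| over the first", terms, "terms:", abs_sum)
```

Each grouping gives a different total regardless of how many pairs are added, while the sum of $|a_n|$ grows linearly with the number of terms.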


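Expectation

The expectation of a discrete random variable $X$ is defined by

$\mathrm{E}[X] = \sum_{x \in \Omega} x \cdot f(x)$

only when the right-hand side converges absolutely, i.e., $\sum_{x \in \Omega} |x| \cdot f(x) < \infty$ holds.

The expectation of a continuous random variable $X$ is defined by

$\mathrm{E}[X] = \int_{-\infty}^{\infty} x \cdot f(x)\,\mathrm{d}x$

only when the right-hand side converges absolutely, i.e., $\int_{-\infty}^{\infty} |x| \cdot f(x)\,\mathrm{d}x < \infty$ holds.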
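The integral definition of the expectation can be approximated numerically. The sketch below applies a midpoint rule to the exponential density $f(x) = \lambda e^{-\lambda x}$; the rate `lam = 2.0`, the truncation point, and the step count are assumptions, and the comparison value $1/\lambda$ is the well-known closed form rather than something derived on the slides.

```python
import math


def exponential_pdf(x: float, lam: float) -> float:
    """Density f(x) = lam * exp(-lam * x) for x >= 0, and 0 otherwise."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def expectation_midpoint(pdf, lower: float, upper: float, steps: int) -> float:
    """Midpoint-rule approximation of the integral of x * pdf(x) over [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        total += x * pdf(x)
    return total * h


lam = 2.0  # illustrative rate (assumed)
# The density vanishes for x < 0 and its tail beyond 40 / lam is negligible.
estimate = expectation_midpoint(lambda x: exponential_pdf(x, lam), 0.0, 40.0 / lam, 100_000)
print("numerical E[X]:", estimate, " closed form 1/lam:", 1.0 / lam)
```

Because the integrand is non-negative here, absolute convergence and ordinary convergence coincide, and the estimate should be close to $1/\lambda$.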


Compute expectations of distributions

*Ex 2.

Discrete

(*i) Bernoulli distribution $\mathrm{B}(1, p)$.

(*ii) Binomial distribution $\mathrm{B}(n, p)$.

(iii) Geometric distribution $\mathrm{Ge}(p)$.

(iv) Poisson distribution $\mathrm{Po}(\lambda)$.

Continuous

(v) Exponential distribution $\mathrm{Ex}(\alpha)$.

(vi) Normal distribution $\mathrm{N}(\mu, \sigma^2)$.

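Ex. Expectation of binomial distr.

Thm.

The expectation of $X \sim \mathrm{B}(n, p)$ is $np$.

Proof.

$\mathrm{E}[X] = \sum_{k=0}^{n} k \binom{n}{k} p^k (1 - p)^{n-k} = \sum_{k=0}^{n} k\,\frac{n!}{k!\,(n-k)!}\,p^k (1 - p)^{n-k}$

$= \sum_{k=1}^{n} k\,\frac{n!}{k!\,(n-k)!}\,p^k (1 - p)^{n-k}$

$= \sum_{k=1}^{n} \frac{n!}{(k-1)!\,(n-k)!}\,p^k (1 - p)^{n-k}$

$= \sum_{k=1}^{n} np\,\frac{(n-1)!}{(k-1)!\,(n-k)!}\,p^{k-1} (1 - p)^{n-k}$

$= np \sum_{k'=0}^{n-1} \binom{n-1}{k'} p^{k'} (1 - p)^{n-1-k'}$

$= np$

The last equality uses $\sum_{k'=0}^{n-1} \binom{n-1}{k'} p^{k'} (1 - p)^{n-1-k'} = (p + (1 - p))^{n-1} = 1$.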
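The identity $\mathrm{E}[X] = np$ for the binomial distribution can be confirmed by evaluating the defining sum directly. A minimal sketch follows; the parameters `n = 12`, `p = 0.35` and the function name are assumptions chosen for illustration.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability function of B(n; p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 12, 0.35  # illustrative parameters (assumed)

# E[X] = sum over k of k * f(k), computed term by term.
expectation = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print("sum of k f(k):", expectation, " n p:", n * p)
```

The two printed values should agree up to floating-point rounding.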

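Ex. Expectation of Geom. distr.

Thm.

The expectation of $X \sim \mathrm{Ge}(p)$ is $\frac{1 - p}{p}$.

Proof.

$\mathrm{E}[X] = 0 \cdot p + 1 (1 - p) p + 2 (1 - p)^2 p + 3 (1 - p)^3 p + \cdots$

$(1 - p)\,\mathrm{E}[X] = 0 (1 - p) p + 1 (1 - p)^2 p + 2 (1 - p)^3 p + \cdots$

Subtracting the second line from the first,

$p\,\mathrm{E}[X] = (1 - p) p + (1 - p)^2 p + (1 - p)^3 p + \cdots = \frac{(1 - p) p}{1 - (1 - p)} = 1 - p.$

Thus $\mathrm{E}[X] = \frac{1 - p}{p}$.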
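The value $\frac{1 - p}{p}$ can be compared with a truncated version of the defining series. This is a sketch under assumptions: the value `p = 0.25` and the cutoff of 2000 terms are illustrative choices.

```python
def geometric_pmf(k: int, p: float) -> float:
    """Probability function f(k) = (1 - p)^k p of Ge(p)."""
    return (1 - p) ** k * p


p = 0.25  # illustrative value (assumed)

# Truncated series for E[X]; the neglected tail is far below rounding error here.
truncated = sum(k * geometric_pmf(k, p) for k in range(2000))
closed_form = (1 - p) / p
print("truncated sum:", truncated, " (1 - p) / p:", closed_form)
```

All terms are non-negative, so the series converges absolutely and the truncated sum approaches the closed form from below.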

Properties of Expectations

Thm.

For an arbitrary constant $c$,

$\mathrm{E}[c] = c$

$\mathrm{E}[cX] = c \cdot \mathrm{E}[X]$

$\mathrm{E}[X + c] = \mathrm{E}[X] + c$

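Linearity of expectations (discrete random variables)

Thm. (linearity of expectation)

$\mathrm{E}\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} \mathrm{E}[X_i]$

Proof (for two variables; the general case follows by induction).

$\mathrm{E}[X + Y]$

$= \sum_x \sum_y (x + y) \Pr[X = x \cap Y = y]$

$= \sum_x \sum_y (x + y) f(x, y)$

$= \sum_x \sum_y x f(x, y) + \sum_x \sum_y y f(x, y)$

$= \sum_x x \sum_y f(x, y) + \sum_y y \sum_x f(x, y)$

$= \sum_x x f_X(x) + \sum_y y f_Y(y)$

$= \mathrm{E}[X] + \mathrm{E}[Y]$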
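Linearity does not require independence, which can be seen on the two-dice example with $X$ = sum and $Y$ = product of the casts. The sketch below computes the three expectations exactly by enumeration; fair dice and the variable names are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice (fairness is assumed here).
outcomes = list(product(range(1, 7), repeat=2))
prob = Fraction(1, len(outcomes))

# X = sum of casts, Y = product of casts: dependent random variables.
e_x = sum((d1 + d2) * prob for d1, d2 in outcomes)
e_y = sum((d1 * d2) * prob for d1, d2 in outcomes)
e_sum = sum(((d1 + d2) + (d1 * d2)) * prob for d1, d2 in outcomes)

print("E[X]     =", e_x)
print("E[Y]     =", e_y)
print("E[X + Y] =", e_sum)
print("linear:", e_sum == e_x + e_y)
```

Since exact fractions are used, the final comparison is an exact equality rather than a floating-point approximation.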

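Linearity of expectations (continuous random variables)

Thm. (linearity of expectation)

$\mathrm{E}\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} \mathrm{E}[X_i]$

Proof (for two variables; the general case follows by induction).

$\mathrm{E}[X + Y]$

$= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x + y) f(x, y)\,\mathrm{d}x\,\mathrm{d}y$

$= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x f(x, y)\,\mathrm{d}x\,\mathrm{d}y + \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} y f(x, y)\,\mathrm{d}x\,\mathrm{d}y$

$= \int_{-\infty}^{+\infty} x \left( \int_{-\infty}^{+\infty} f(x, y)\,\mathrm{d}y \right) \mathrm{d}x + \int_{-\infty}^{+\infty} y \left( \int_{-\infty}^{+\infty} f(x, y)\,\mathrm{d}x \right) \mathrm{d}y$

$= \int_{-\infty}^{+\infty} x f_X(x)\,\mathrm{d}x + \int_{-\infty}^{+\infty} y f_Y(y)\,\mathrm{d}y$

$= \mathrm{E}[X] + \mathrm{E}[Y]$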

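Application of linearity of expectation

Thm.

The expectation of $X \sim \mathrm{B}(n; p)$ is $np$.

Proof.

Suppose $X_1, \ldots, X_n$ are i.i.d. $\mathrm{B}(1; p)$;

then $Y := X_1 + \cdots + X_n$ follows $\mathrm{B}(n; p)$.

$\mathrm{E}[X_i] = 1 \cdot p + 0 \cdot (1 - p) = p$

$\mathrm{E}[Y] = \mathrm{E}\left[\sum_i X_i\right] = \sum_i \mathrm{E}[X_i] = \sum_i p = np$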
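The construction in this proof, a binomial variable as a sum of i.i.d. Bernoulli trials, can be simulated directly. The following is a sketch with assumed values: the seed, `n = 20`, `p = 0.3`, and the number of repetitions are illustrative choices.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible (assumed choice)
n, p, trials = 20, 0.3, 20_000  # illustrative parameters (assumed)


def bernoulli(p: float) -> int:
    """One Bernoulli trial B(1; p): returns 1 (head) with probability p, else 0."""
    return 1 if random.random() < p else 0


# Each entry is Y = X_1 + ... + X_n for one run of n independent trials.
totals = [sum(bernoulli(p) for _ in range(n)) for _ in range(trials)]
sample_mean = sum(totals) / trials
print("sample mean of Y:", sample_mean, " n p:", n * p)
```

The sample mean should fall close to $np$, with a deviation that shrinks as the number of repetitions grows.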
