Chapter 3: Special Distributions
Yang Zhenlin, [email protected], http://www.mysmu.edu/faculty/zlyang/
STAT306, Term II, 09/10
STAT151, Term I 2015-16 © Zhenlin Yang, SMU
Chapter Contents
Bernoulli Random Variable
Binomial Distribution
Negative Binomial Distribution
Geometric Distribution
Poisson Distribution
Uniform Distribution
Exponential Distribution
Gamma Distribution
Normal Distribution
Introduction
Many probability distributions have been invented to model the phenomena associated with certain social activities. Each of them is suitable for a special situation, has some unique properties including a special pmf or pdf form, and hence merits a separate study. For example,
The Binomial random variable counts the number of successes among n independent experimental trials, each with two possible outcomes: success or failure.
The Poisson random variable counts the number of ‘events’ occurring in a fixed time period.
The Exponential random variable can be used to model the lifetime of an electronic component.
The Normal random variable can be used to describe the height or weight distributions of human beings.
Bernoulli Trial
Consider an experiment with only two possible outcomes:
s = "Success" and f = "Failure".
A performance of such an experiment is called a Bernoulli trial.
The sample space of a Bernoulli trial is S = {s, f}.
Defining a variable X in such a way that
X(s) = 1 and X(f) = 0,
X is a r.v. taking only two possible values: 0 and 1. This r.v. is called a Bernoulli random variable.
Denoting π = P(X = 1), π is called the probability of success.
The pmf of a Bernoulli r.v., called the Bernoulli Distribution, is seen to be
p(x) = π^x (1 − π)^(1−x),  x = 0, 1.
The expectation: E[X] = 1×P(X=1) + 0×P(X=0) = π.
Bernoulli Trial
The variance: As E[X^2] = 1^2×P(X=1) + 0^2×P(X=0) = π,
Var(X) = E[X^2] − (E[X])^2 = π − π^2 = π(1 − π).
Example 3.1. If in a throw of a fair die the event of obtaining 4 or 6 is called a success, and the event of obtaining 1, 2, 3, or 5 is called a failure, then
X = 1 if 4 or 6 is obtained, 0 otherwise,
is a Bernoulli r.v. with parameter π = 1/3.
The MGF: M(t) = E[e^tX] = e^t(1)×P(X=1) + e^t(0)×P(X=0)
= πe^t + (1 − π).
From the MGF: M'(0) = π = E[X]; and M''(0) = π = E[X^2].
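The Bernoulli mean, variance, and MGF slope above can be verified with a few lines of code; this is only an illustrative sketch (the function names are ours, not part of the slides), using π = 1/3 from Example 3.1.

```python
# A minimal numeric check of the Bernoulli mean, variance, and MGF
# (pi = 1/3 as in Example 3.1; function names are illustrative).
import math

def bernoulli_pmf(x, p):
    """p(x) = p^x (1-p)^(1-x) for x in {0, 1}."""
    return p**x * (1 - p)**(1 - x)

p = 1/3
mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
var = sum(x**2 * bernoulli_pmf(x, p) for x in (0, 1)) - mean**2

def mgf(t, p):
    """M(t) = p*e^t + (1 - p)."""
    return p * math.exp(t) + (1 - p)

# M'(0), approximated by a central difference, should equal E[X] = p.
h = 1e-6
mgf_slope_at_0 = (mgf(h, p) - mgf(-h, p)) / (2 * h)

print(mean, var, mgf_slope_at_0)  # ≈ 1/3, 2/9, 1/3
```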
Binomial Distribution
If n Bernoulli trials with probability of success π are performed independently, then X, the number of successes in these n trials, is called a Binomial r.v. with parameters n and π, denoted by X ~ Bin(n, π). The pmf of the binomial r.v., called the Binomial Distribution, has the expression
p(x) = nCx π^x (1 − π)^(n−x),  x = 0, 1, 2, …, n.
The argument for deriving the binomial pmf is as follows:
In n Bernoulli trials, the probability of each particular sequence of “x successes” and “n−x failures” is π^x (1 − π)^(n−x).
There are in total nCx ways of obtaining x successes and n−x failures. Thus the desired probability of having (any) “x successes” and “n−x failures” is nCx π^x (1 − π)^(n−x).
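The binomial pmf is easy to compute with the standard library; the short sketch below (our own illustrative code, with n = 20, π = 0.2 chosen arbitrarily) checks that the pmf sums to one and that the mean equals nπ.

```python
# Sketch of the binomial pmf nCx * pi^x * (1-pi)^(n-x), stdlib only.
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.2
total = sum(binom_pmf(x, n, p) for x in range(n + 1))   # should be 1
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))  # should be n*pi

print(total, mean)  # ≈ 1.0 and 4.0
```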
Binomial Distribution
Plots of the Binomial Probability Mass Function
[Four panels: π = 0.2, n = 20; π = 0.5, n = 10; π = 0.5, n = 20; π = 0.8, n = 20.]
Binomial Distribution
Expectation: E[X] = nπ
Proof: Apply the Binomial Expansion formula,
(a + b)^n = Σ_{i=0}^{n} nCi a^i b^(n−i).
Binomial Distribution
Variance: Var(X) = nπ(1 − π)
Proof: Apply the Binomial Expansion formula to find E[X(X−1)].
Binomial Distribution
MGF: M(t) = [πe^t + (1 − π)]^n
Proof: M(t) = E[e^tX] = Σ_{x=0}^{n} e^tx nCx π^x (1 − π)^(n−x) = Σ_{x=0}^{n} nCx (πe^t)^x (1 − π)^(n−x) = [πe^t + (1 − π)]^n, by the Binomial Expansion formula.
Exercise: Using the MGF, show E[X] = nπ and Var(X) = nπ(1 − π).
Binomial Distribution
Example 3.2. Suppose that jury members decide independently and that each makes the correct decision with probability π. If the decision of the majority is final, what is the probability that a three-person jury makes a correct decision?
Solution: Let X be the number of persons who decide correctly in a three-person jury. Then X ~ Bin(3, π). Hence the probability that a three-person jury decides correctly is
P(X ≥ 2) = P(X = 2) + P(X = 3) = 3C2 π^2(1 − π) + 3C3 π^3(1 − π)^0
= 3π^2 − 2π^3.
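The closed form 3π^2 − 2π^3 from Example 3.2 can be checked against the binomial pmf directly; the sketch below picks π = 0.7 arbitrarily for the comparison.

```python
# Example 3.2 check: P(X >= 2) for X ~ Bin(3, pi) vs. 3*pi^2 - 2*pi^3.
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

pi = 0.7  # arbitrary choice for the check
direct = binom_pmf(2, 3, pi) + binom_pmf(3, 3, pi)
closed_form = 3 * pi**2 - 2 * pi**3
print(direct, closed_form)  # both ≈ 0.784
```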
Geometric Distribution
Consider a sequence of independent Bernoulli trials with probability of success π. If X represents the number of trials until the first success, then X is a geometric r.v. Its distribution is called the geometric distribution, denoted by X ~ GEO(π), which can easily be seen to be
p(x) = π(1 − π)^(x−1),  x = 1, 2, 3, …
The Mean and Variance:
E(X) = 1/π  and  Var(X) = (1 − π)/π^2.
Proof: Recall the geometric series and its 1st and 2nd derivatives:
1 + a + a^2 + a^3 + ⋯ + a^n + ⋯ = (1 − a)^(−1), for |a| < 1,
1 + 2a + 3a^2 + ⋯ + n a^(n−1) + ⋯ = (1 − a)^(−2),
2 + 3·2a + 4·3a^2 + ⋯ + n(n − 1)a^(n−2) + ⋯ = 2(1 − a)^(−3).
Geometric Distribution
The MGF:
M(t) = πe^t / [1 − (1 − π)e^t],  for t < −ln(1 − π).
Proof: Apply the geometric series.
Exercise: Use the MGF to derive the mean and variance of a geometric random variable.
Geometric Distribution
Example 3.3. A father asks his sons to cut the backyard lawn. Since he does not specify which of his three sons is to do the job, each boy tosses a coin to determine the odd person, who must then cut the lawn. In case all three get heads or all get tails, they continue tossing until they reach a decision. Let π be the probability of heads; then 1 − π is the probability of tails.
(a) Find the probability that they reach a decision in less than n rounds of tosses.
(b) If π = 1/2, what is the minimum number of rounds required to reach a decision with probability 0.95?
Solution: (a) The probability of reaching a decision on any round of coin tossing is
π1 = P{exactly two heads} + P{exactly two tails}
= 3C2 π^2(1 − π) + 3C2 π(1 − π)^2 = 3π(1 − π).
Let X be the number of rounds of tosses until they reach a decision; then X is a geometric r.v. with parameter π1. Thus,
P(X < n) = 1 − P(X ≥ n) = 1 − [1 − 3π(1 − π)]^(n−1).
Geometric Distribution
(b) We want to find the minimum n so that P(X ≤ n) ≥ 0.95.
CDF of the Geometric Distribution: F(x) = P{X ≤ x} = 1 − P{X > x}
= 1 − P{X ≥ x+1} = 1 − [π(1 − π)^x + π(1 − π)^(x+1) + π(1 − π)^(x+2) + ⋯]
= 1 − π(1 − π)^x [1 + (1 − π) + (1 − π)^2 + ⋯] = 1 − (1 − π)^x.
With π = 1/2, the parameter of X is 3π(1 − π) = 3/4, so
P(X ≤ n) = 1 − [1 − 3π(1 − π)]^n = 1 − (1 − 3/4)^n ≥ 0.95,
which gives n ≥ 2.16.
So n = 3.
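Example 3.3(b) can be reproduced as a short search over the geometric CDF; a minimal sketch:

```python
# Example 3.3(b): smallest n with P(X <= n) = 1 - (1 - p1)^n >= 0.95,
# where p1 = 3*pi*(1 - pi) is the per-round decision probability.
pi = 0.5
p1 = 3 * pi * (1 - pi)  # = 3/4

n = 1
while 1 - (1 - p1)**n < 0.95:
    n += 1

print(n, 1 - (1 - p1)**n)  # n = 3, P(X <= 3) = 0.984375
```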
Negative Binomial Distribution
If Y represents the number of Bernoulli trials until r successes, then Y is called a negative binomial r.v. with parameters r and π, denoted by Y ~ NB(r, π).
To obtain the pmf of Y, note that the event {Y = y} is equivalent to {r−1 successes in the first y−1 trials and a success in the last trial}; thus,
p(y) = P(Y = y) = (y−1)C(r−1) π^r (1 − π)^(y−r),  y = r, r+1, …,
which is called the negative binomial distribution.
The Mean, Variance and MGF:
E(Y) = r/π,  Var(Y) = r(1 − π)/π^2,  and  M(t) = {πe^t / [1 − (1 − π)e^t]}^r.
Note: The mean and variance of a negative binomial random variable are r times those of a geometric r.v., and its MGF is that of a geometric r.v. raised to the power r. Why?
Negative Binomial Distribution
Example 3.4. Team A plays team B in a seven-game world series. That is, the series is over when either team wins four games. For each game, P(A wins) = 0.6, and the games are assumed independent.
(a) What is the probability that the series ends in exactly six games?
(b) If the series ends in six games, what is the probability that A wins the series?
Solution: (a) Let X be the number of games until A wins four games, and Y be the number of games until B wins four games. Then X ~ NB(4, 0.6) and Y ~ NB(4, 0.4).
P(series ends in exactly 6 games) = P(X = 6) + P(Y = 6)
= 5C3 (0.6)^4(0.4)^2 + 5C3 (0.4)^4(0.6)^2
= 0.2074 + 0.0922 = 0.2996.
(b) Let A be the event that team A wins and D be the event that the series ends in six games. Then the desired probability is
P(A | D) = P(A ∩ D)/P(D) = P(X = 6) / [P(X = 6) + P(Y = 6)] = 0.2074/0.2996 = 0.6923.
Poisson Distribution
In many applications, the number of events occurring in a fixed "time interval" is of interest. For example,
The number of customers entering a bank on a given day.
The number of typographical errors in a book.
The number of traffic accidents in a given year.
The number of α-particles discharged in a fixed period of time from some radioactive material.
The number of defects on a piece of wire.
These quantities often satisfy the following conditions:
The numbers of events occurring in nonoverlapping intervals are independent.
The probability of exactly one event occurring in a sufficiently short interval of length h is approximately λh.
The probability of two or more events occurring in a sufficiently short interval is essentially zero.
Poisson Distribution
A suitable distribution for describing the above phenomena is the Poisson distribution. If the three conditions are satisfied, it can indeed be shown that X, the number of events occurring in a fixed or unit interval, follows a Poisson distribution with pmf:
p(x) = e^(−λ) λ^x / x!,  x = 0, 1, 2, …
As a pmf sums to one, we must have
Σ_{x=0}^∞ e^(−λ) λ^x / x! = 1,  or  Σ_{x=0}^∞ λ^x / x! = e^λ,  a useful formula!
The mean: E[X] = λ: the rate of Poisson events per unit.
The variance: Var(X) = λ.
The MGF: M(t) = e^(λ(e^t − 1)).
Poisson Distribution
Plots of the Poisson Probability Mass Function
[Three panels: Mean = 2, Mean = 5, Mean = 10.]
Note: i) The mean and variance of the Poisson are both λ.
ii) The parameter λ is usually called the rate of occurrence of Poisson events.
Poisson Distribution
Derivation of the mean, variance and MGF:
E[X] = Σ_{x=0}^∞ x·p(x) = Σ_{x=1}^∞ x e^(−λ) λ^x / x! = λ Σ_{x=1}^∞ e^(−λ) λ^(x−1) / (x−1)! = λ Σ_{y=0}^∞ e^(−λ) λ^y / y! = λ,  (y = x − 1)
E[X(X−1)] = Σ_{x=0}^∞ x(x−1) e^(−λ) λ^x / x! = λ^2 Σ_{x=2}^∞ e^(−λ) λ^(x−2) / (x−2)! = λ^2 Σ_{y=0}^∞ e^(−λ) λ^y / y! = λ^2,  (y = x − 2)
so that E[X^2] = λ^2 + λ, and Var(X) = E[X^2] − (E[X])^2 = λ.
The MGF: M(t) = E[e^tX] = Σ_{x=0}^∞ e^(tx) e^(−λ) λ^x / x! = e^(−λ) Σ_{x=0}^∞ (λe^t)^x / x! = e^(−λ) e^(λe^t) = e^(λ(e^t − 1)).
Use M(t) to show these?
Poisson Distribution
Example 3.5. Every day the average number of wrong numbers received by a certain mail-order house is one. What is the probability that this house receives
a) two wrong calls tomorrow,
b) at least one wrong call tomorrow?
Solution: Assuming that the house receives a lot of calls, X, the number of wrong calls received tomorrow, is approximately a Poisson r.v. with λ = E(X) = 1. Thus,
a) P(X = 2) = e^(−1) 1^2 / 2! = 1/(2e) ≈ 0.18,
b) P(X ≥ 1) = 1 − P(X = 0) = 1 − 1/e ≈ 0.63.
Note: If X, the number of Poisson events occurring in [0, 1], is Poi(λ), then Y, the number of Poisson events in [0, 2], is Poi(2λ), and so on!
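Example 3.5 takes only two lines to compute from the pmf; a minimal sketch:

```python
# Example 3.5 from the Poisson pmf with rate lam = 1.
from math import exp, factorial

lam = 1.0
p_two = exp(-lam) * lam**2 / factorial(2)  # a) exactly two wrong calls
p_at_least_one = 1 - exp(-lam)             # b) at least one wrong call
print(round(p_two, 2), round(p_at_least_one, 2))  # 0.18 0.63
```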
Uniform Distribution
A continuous r.v. X is said to be uniformly distributed over an interval (a, b), denoted by X ~ UN(a, b), if its pdf is given by
f(x) = 1/(b − a), if a ≤ x ≤ b; 0, otherwise.
[Plot: f(x) is flat at height 1/(b − a) between a and b.]
The CDF:
F(x) = 0, if x < a; (x − a)/(b − a), if a ≤ x < b; 1, if x ≥ b.
E[X] = (a + b)/2,  Var(X) = (b − a)^2/12,  M(t) = (e^(tb) − e^(ta)) / (t(b − a)).  M(0) = ?
Uniform Distribution
Example 3.6. Starting at 5:00 A.M., every half hour there is a flight from San Francisco airport to Los Angeles International airport. Suppose that none of these planes is completely sold out and that they always have room for passengers. A person who wants to fly to L.A. arrives at the airport at a random time between 8:45 A.M. and 9:45 A.M. Find the probability that she waits a) at most 10 minutes; b) at least 15 minutes.
Solution: Let the passenger arrive at the airport X minutes past 8:45. Then X ~ UN(0, 60).
a) P(wait at most 10 minutes) = P(arrive between 8:50 and 9:00 or between 9:20 and 9:30)
= P(5 < X < 15) + P(35 < X < 45) = ∫_5^15 (1/60) dx + ∫_35^45 (1/60) dx = 1/3.
b) P(wait at least 15 minutes) = P(arrive between 9:00 and 9:15 or between 9:30 and 9:45)
= P(15 < X < 30) + P(45 < X < 60) = ∫_15^30 (1/60) dx + ∫_45^60 (1/60) dx = 1/2.
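The same two probabilities follow from the UN(0, 60) CDF F(x) = x/60; a small sketch (the helper function is ours):

```python
# Example 3.6 via the uniform CDF F(x) = (x - a)/(b - a) on [a, b].
def unif_cdf(x, a=0.0, b=60.0):
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

p_a = (unif_cdf(15) - unif_cdf(5)) + (unif_cdf(45) - unif_cdf(35))
p_b = (unif_cdf(30) - unif_cdf(15)) + (unif_cdf(60) - unif_cdf(45))
print(p_a, p_b)  # 1/3 and 1/2
```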
Exponential Distribution
A continuous r.v. X is said to follow an exponential distribution with parameter θ, denoted by X ~ EXP(θ), if its pdf is given by
f(x) = (1/θ) e^(−x/θ), if x ≥ 0; 0, otherwise.
It is easy to see that the CDF of the exponential distribution is
F(x) = 0, if x < 0; 1 − e^(−x/θ), if x ≥ 0.
The mean and variance: E(X) = θ and Var(X) = θ^2.
The exponential distribution has a famous property, called the memoryless property: if the lifetime of an electronic component follows the exponential distribution, then the probability that a component lasts an additional t time units is the same as the probability of a new component lasting t time units.
Exponential Distribution
Example 3.7. Suppose that a certain solid-state component has a lifetime or failure time (in hours) X ~ EXP(100).
a) Find the probability that the component will last at least 100 hours given that it has been used for 50 hours.
b) Suppose that 10 such components are randomly selected. What is the probability that more than two of them last at least 50 hours?
c) Now, 100 components are randomly selected. Find the probability that at least 2 of them last more than 460 hours.
Solution: a) By the memoryless property,
P(X > 100 | X > 50) = P(X > 50) = exp(−50/100) = 0.6065.
b) Let Y be the number of components among the 10 selected that last more than 50 hours. Then Y ~ Bin(10, 0.6065). The desired probability is
P(Y > 2) = 1 − P(Y ≤ 2) = 0.9890.
c) Let W be the number of components among the 100 selected that last more than 460 hours. Then W ~ Bin(100, π), with π = P(X > 460) = exp(−460/100) = 0.01. The desired probability is P(W ≥ 2) = 0.2642.
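Parts (b) and (c) combine the exponential survival function P(X > x) = e^(−x/θ) with binomial tail sums; the sketch below follows the slides in rounding P(X > 460) to 0.01 before the part-(c) computation.

```python
# Example 3.7(b) and (c): exponential survival + binomial tails.
from math import comb, exp

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

p50 = exp(-50 / 100)  # P(X > 50) ≈ 0.6065
p_b = 1 - sum(binom_pmf(k, 10, p50) for k in range(3))   # P(Y > 2)

p460 = round(exp(-460 / 100), 2)  # ≈ 0.01, rounded as in the text
p_c = 1 - sum(binom_pmf(k, 100, p460) for k in range(2))  # P(W >= 2)

print(p_b, p_c)  # ≈ 0.9890 and ≈ 0.2642
```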
Exponential Distribution
Exponential and Poisson: The time between any two Poisson events is an exponential random variable with θ = 1/λ.
Let T be the time until the first Poisson event, and let the rate of occurrence of Poisson events be λ per time unit. Then the CDF of T is
F(t) = P(T ≤ t) = 1 − P(T > t)
= 1 − P(no Poisson events in [0, t])
= 1 − e^(−λt),
which is the exponential CDF; thus T is exponential with θ = 1/λ.
Gamma Distribution
A continuous r.v. X is said to follow a gamma distribution with parameters α > 0 and θ > 0, denoted by X ~ Gamma(α, θ), if its pdf is given by
f(x) = 1/(Γ(α) θ^α) x^(α−1) e^(−x/θ), if x > 0; 0, otherwise,
where Γ(·) is the gamma function defined as
Γ(t) = ∫_0^∞ y^(t−1) e^(−y) dy,  t > 0,
having the property Γ(α) = (α − 1)Γ(α − 1), and Γ(n) = (n − 1)! for a positive integer n.
As the pdf integrates to 1, we have
∫_0^∞ x^(α−1) e^(−x/θ) dx = Γ(α) θ^α.
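A crude numeric check that the gamma pdf integrates to one and has mean αθ, using `math.gamma` and a plain Riemann sum (α = 2.5, θ = 1.5 chosen arbitrarily; the step size and truncation point are rough but adequate here):

```python
# Numeric check of the Gamma(alpha, theta) pdf normalization and mean.
from math import gamma, exp

def gamma_pdf(x, alpha, theta):
    return x**(alpha - 1) * exp(-x / theta) / (gamma(alpha) * theta**alpha)

alpha, theta = 2.5, 1.5
dx = 0.001
grid = [i * dx for i in range(1, 60000)]  # integrate up to x = 60
total = sum(gamma_pdf(x, alpha, theta) * dx for x in grid)
mean = sum(x * gamma_pdf(x, alpha, theta) * dx for x in grid)
print(total, mean)  # ≈ 1.0 and ≈ alpha * theta = 3.75
```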
Gamma Distribution
Expectation: E[X] = αθ
Proof: Apply the gamma pdf formula.
Gamma Distribution
Variance: Var(X) = αθ^2
Proof: Apply the gamma pdf formula.
Gamma Distribution
The MGF: M(t) = (1 − θt)^(−α),  for t < 1/θ.
Proof: Apply the gamma pdf formula.
Exercise: Derive the mean and variance using M(t).
Gamma Distribution
More about the Gamma Distribution:
When α = 1, the gamma distribution becomes the exponential distribution.
When θ = 2 and α = r/2, where r is a positive integer, the gamma distribution becomes a chi-squared distribution with r degrees of freedom.
When α = k, a positive integer, the gamma random variable can be interpreted as the waiting time until the kth Poisson event.
(Relate this to the relationship between the exponential and the Poisson.)
Normal Distribution
A continuous r.v. X is called normal with mean μ and standard deviation σ, denoted by X ~ N(μ, σ^2), if its pdf has the form
f(x) = (1/(σ√(2π))) exp(−(x − μ)^2 / (2σ^2)),  −∞ < x < ∞,
where −∞ < μ < ∞ and σ > 0. The pdf f(x) is bell-shaped, unimodal, and symmetric around μ.
[Plots: normal pdfs for several values of μ and σ.]
Normal Distribution
If X ~ N(μ, σ^2), then a + bX ~ N(a + bμ, b^2σ^2).
It follows that Z = (X − μ)/σ ~ N(0, 1), called the Standard Normal Random Variable.
The pdf and CDF of Z are usually denoted respectively by
φ(z) = (1/√(2π)) exp(−z^2/2),  and  Φ(z) = ∫_{−∞}^{z} (1/√(2π)) exp(−t^2/2) dt.
There is no closed-form expression for Φ(z), but the percentage values are tabulated, or can be easily calculated using computer software.
Normal Distribution
Example 3.8. Let X ~ N(3, 9). Calculate a) P(2 < X < 5), b) P(X > 0), c) P(|X – 3| > 6).
Solution:
a) P(2 < X < 5) = P((2 − 3)/3 < Z < (5 − 3)/3) = Φ(2/3) − Φ(−1/3) = 0.3779.
b) P(X > 0) = P(Z > (0 − 3)/3) = P(Z > −1) = 1 − Φ(−1) = Φ(1) = 0.8413.
c) P(|X − 3| > 6) = P(|Z| > 6/3) = P(|Z| > 2) = 2[1 − Φ(2)] = 0.0456.
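The three probabilities in Example 3.8 can be computed "using computer software" as the slide says; the sketch below uses the standard library's `statistics.NormalDist`. (Its exact answers differ from the table-based 0.3779 and 0.0456 in the fourth decimal, since tables round z to two places.)

```python
# Example 3.8 recomputed with the standard normal CDF.
from statistics import NormalDist

Z = NormalDist()   # standard normal N(0, 1)
mu, sigma = 3, 3   # X ~ N(3, 9)

p_a = Z.cdf((5 - mu) / sigma) - Z.cdf((2 - mu) / sigma)  # P(2 < X < 5)
p_b = 1 - Z.cdf((0 - mu) / sigma)                        # P(X > 0)
p_c = 2 * (1 - Z.cdf(2))                                 # P(|X - 3| > 6)
print(p_a, p_b, p_c)  # ≈ 0.3781, 0.8413, 0.0455
```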
Normal Distribution
The Normal Approximation to the Binomial: When n, the number of Bernoulli trials, is large, it is difficult to calculate the binomial probabilities, especially in the old days. The normal approximation to the binomial probabilities was then introduced. The basic idea is that as n gets larger, the plot of the Binomial pmf gets closer to the plot of a normal pdf.
[Plots: Binomial pmfs for n = 50, π = 0.5 and n = 50, π = 0.2.]
Normal Distribution
As a rule of thumb: if n is such that nπ(1 − π) ≥ 10, then the distribution of a Bin(n, π) r.v. can be approximated by the distribution of a N[nπ, nπ(1 − π)] r.v.
Example 3.9. Flip a fair coin 40 times. Let X be the number of times that it lands heads. Find the probability that X ≤ 21. Use the normal approximation and then compare it to the exact solution.
Solution: Since X ~ Bin(40, 0.5) and 40×0.5×0.5 = 10, the distribution of X can be approximated by N(20, 10):
P(X ≤ 21) ≈ P(Z ≤ (21 − 20)/√(40×0.5×0.5)) = P(Z ≤ 0.3162) = 0.6241.
The exact answer is 0.6821. The approximation is not so good. Why?
Normal Distribution
The reason is that we are using a continuous distribution to approximate a discrete one, and a continuity correction is lacking.
Continuity Correction Factor. Let X ~ Bin(n, p) with np(1 − p) ≥ 10, and let Y ~ N[np, np(1 − p)]. Then we have
P(X ≤ i) ≈ P(Y ≤ i + 1/2),
P(X ≥ i) ≈ P(Y ≥ i − 1/2),
P(X = i) ≈ P(i − 1/2 ≤ Y ≤ i + 1/2).
How, in general, is a probability based on a discrete r.v. approximated by a probability based on a continuous one? Let X be a discrete r.v. with pmf p(x) taking integer values. We have
P(i ≤ X ≤ j) = Σ_{k=i}^{j} p(k) = sum of the areas of all rectangles from i to j,
where the kth rectangle has base 1, height p(k) and midpoint k.
Normal Distribution
[Figure: the pmf rectangles p(i), …, p(j) from i to j, with a continuous pdf f(x) overlaid.]
Now suppose the continuous pdf f(x) is a good approximation to p(x). Clearly,
P(i ≤ X ≤ j) ≈ the area under f(x) from i − 1/2 to j + 1/2 = ∫_{i−1/2}^{j+1/2} f(x) dx.
Normal Distribution
Similarly,
P(X = k) ≈ ∫_{k−1/2}^{k+1/2} f(x) dx,
P(X ≥ i) ≈ ∫_{i−1/2}^{∞} f(x) dx,
P(X ≤ j) ≈ ∫_{−∞}^{j+1/2} f(x) dx.
Example 3.9 (Cont'd). Now revisit Example 3.9 to approximate P(X ≤ 21) by applying the continuity correction factor:
P(X ≤ 21) ≈ P(Y ≤ 21 + 0.5) = P(Z ≤ (21.5 − 20)/√(40×0.5×0.5)) = P(Z ≤ 0.4743) = 0.6824,
compared to the exact answer 0.6821!
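The whole comparison in Example 3.9, exact binomial versus the normal approximation with and without the continuity correction, can be sketched as:

```python
# Exact Bin(40, 0.5) tail P(X <= 21) vs. its normal approximations.
from math import comb
from statistics import NormalDist

n, p = 40, 0.5
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(22))

Y = NormalDist(mu=n * p, sigma=(n * p * (1 - p)) ** 0.5)  # N(20, 10)
approx_raw = Y.cdf(21)    # no continuity correction
approx_cc = Y.cdf(21.5)   # with continuity correction
print(round(exact, 4), round(approx_raw, 4), round(approx_cc, 4))
# 0.6821 0.6241 0.6824
```

The corrected value 0.6824 lands within 0.0003 of the exact answer, while the uncorrected one misses by almost 0.06.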