
Page 1: Chapter 3: Special Distributions

zlyang@smu.edu.sg
http://www.mysmu.edu/faculty/zlyang/

Yang Zhenlin

Page 2: Chapter Contents

STAT306, Term II, 09/10
STAT151, Term I 2015-16 © Zhenlin Yang, SMU

Bernoulli Random Variable

Binomial Distribution

Negative Binomial Distribution

Geometric Distribution

Poisson Distribution

Uniform Distribution

Exponential Distribution

Gamma Distribution

Normal Distribution


Page 3: Introduction

Many probability distributions have been developed to model phenomena arising in natural and social activities. Each of them is suitable for a particular situation, has some unique properties, including a special pmf or pdf form, and hence merits a separate study. For example,

A binomial random variable counts the number of successes among n independent trials, each with two possible outcomes: success or failure.

A Poisson random variable counts the number of 'events' occurring in a fixed time period.

An exponential random variable can be used to model the lifetime of an electronic component.

A normal random variable can be used to describe the height or weight distribution of human beings, etc.

Page 4: Bernoulli Trial

Consider an experiment with only two possible outcomes:

s = "Success" and f = "Failure".

A performance of such an experiment is called a Bernoulli trial.

The sample space of a Bernoulli trial is S = {s, f}. Define a variable X in such a way that

X(s) = 1 and X(f) = 0;

then X is a r.v. taking only two possible values, 0 and 1. This r.v. is called a Bernoulli random variable.

Denote π = P(X = 1); π is called the probability of success.

The pmf of a Bernoulli r.v., called the Bernoulli Distribution, is seen to be

p(x) = π^x (1 − π)^{1−x}, x = 0, 1.

The expectation: E[X] = 1·P(X = 1) + 0·P(X = 0) = π.

Page 5: Bernoulli Trial

The variance: As E[X²] = 1²·P(X = 1) + 0²·P(X = 0) = π,

Var(X) = E[X²] − (E[X])² = π − π² = π(1 − π).

Example 3.1. If in a throw of a fair die the event of obtaining 4 or 6 is called a success, and the event of obtaining 1, 2, 3, or 5 is called a failure, then

X = 1 if 4 or 6 is obtained, and X = 0 otherwise,

is a Bernoulli r.v. with parameter π = 1/3.

The MGF: M(t) = E[e^{tX}] = e^{t·1}·P(X = 1) + e^{t·0}·P(X = 0) = πe^t + (1 − π).

From the MGF: M′(0) = π = E[X]; and M″(0) = π = E[X²].
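The moments and MGF above are easy to check numerically. The following Python sketch (my own illustration, not part of the slides) uses π = 1/3 from Example 3.1 and approximates M′(0) and M″(0) by finite differences:

```python
# Numerical check of the Bernoulli moments; pi = 1/3 matches Example 3.1.
import math

pi = 1 / 3
pmf = {1: pi, 0: 1 - pi}

mean = sum(x * p for x, p in pmf.items())       # E[X] = pi
second = sum(x**2 * p for x, p in pmf.items())  # E[X^2] = pi
var = second - mean**2                          # pi*(1 - pi)

def M(t):
    """Bernoulli MGF: pi*e^t + (1 - pi)."""
    return pi * math.exp(t) + (1 - pi)

# M'(0) and M''(0) via central finite differences
h = 1e-5
M1 = (M(h) - M(-h)) / (2 * h)
M2 = (M(h) - 2 * M(0) + M(-h)) / h**2
```

Both finite-difference values come out close to π, in agreement with M′(0) = E[X] and M″(0) = E[X²].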

Page 6: Binomial Distribution

If n Bernoulli trials with probability of success π are performed independently, then X, the number of successes in these n trials, is called a binomial r.v. with parameters n and π, denoted by X ~ Bin(n, π). The pmf of the binomial r.v., called the Binomial Distribution, has the expression

p(x) = nCx π^x (1 − π)^{n−x}, x = 0, 1, 2, …, n.

The argument for deriving the binomial pmf is as follows:

In n Bernoulli trials, the probability of each particular sequence of x successes and n−x failures is π^x (1 − π)^{n−x}.

There are nCx ways in total of obtaining x successes and n−x failures. Thus the desired probability of having (any) x successes and n−x failures is nCx π^x (1 − π)^{n−x}.
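As a quick sanity check of the pmf (an added illustration, not from the slides; n = 20 and π = 0.2 are arbitrary values), the probabilities should sum to one and the mean should equal nπ:

```python
# Binomial pmf nCx * pi^x * (1 - pi)^(n - x), built from math.comb.
from math import comb

def binom_pmf(x, n, pi):
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

n, pi = 20, 0.2
probs = [binom_pmf(x, n, pi) for x in range(n + 1)]
total = sum(probs)                                       # should be 1
mean = sum(x * p for x, p in zip(range(n + 1), probs))   # should be n*pi = 4
```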

Page 7: Binomial Distribution

Plots of Binomial Probability Mass Function

[Figure: binomial pmf plots for n = 20, π = 0.2; n = 20, π = 0.8; n = 10, π = 0.5; and n = 20, π = 0.5.]

Page 8: Binomial Distribution

Expectation: E[X] = nπ.

Proof: Apply the binomial expansion formula,

(a + b)^n = Σ_{i=0}^{n} nCi a^i b^{n−i}.

Page 9: Binomial Distribution

Variance: Var(X) = nπ(1 − π).

Proof: Apply the binomial expansion formula to find E[X(X − 1)].

Page 10: Binomial Distribution

MGF: M(t) = [πe^t + (1 − π)]^n.

Proof: M(t) = Σ_{x=0}^{n} e^{tx} nCx π^x (1 − π)^{n−x} = Σ_{x=0}^{n} nCx (πe^t)^x (1 − π)^{n−x} = [πe^t + (1 − π)]^n, by the binomial expansion formula.

Exercise: Using the MGF, show E[X] = nπ and Var(X) = nπ(1 − π).

Page 11: Binomial Distribution

Example 3.2. Suppose that jury members decide independently and that each makes the correct decision with probability π. If the decision of the majority is final, what is the probability that a three-person jury makes a correct decision?

Solution: Let X be the number of persons who decide correctly in a three-person jury. Then X ~ Bin(3, π). Hence the probability that a three-person jury decides correctly is

P(X ≥ 2) = P(X = 2) + P(X = 3) = 3C2 π²(1 − π) + 3C3 π³(1 − π)⁰ = 3π² − 2π³.
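The closed form 3π² − 2π³ can be double-checked by brute-force enumeration of all 2³ voting patterns (an added sketch; π = 0.7 is an arbitrary illustrative value):

```python
# Enumerate all 8 vote patterns of a 3-person jury and sum the
# probabilities of those with a correct majority.
from itertools import product

pi = 0.7
p_majority = 0.0
for votes in product([0, 1], repeat=3):        # 1 = correct decision
    p_pattern = 1.0
    for v in votes:
        p_pattern *= pi if v == 1 else (1 - pi)
    if sum(votes) >= 2:                        # majority correct
        p_majority += p_pattern

closed_form = 3 * pi**2 - 2 * pi**3            # formula from the slide
```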

Page 12: Geometric Distribution

Consider a sequence of independent Bernoulli trials with probability of success π. If X represents the number of trials until the first success, then X is a geometric r.v. Its distribution, called the geometric distribution and denoted by X ~ GEO(π), is easily seen to be

p(x) = π(1 − π)^{x−1}, x = 1, 2, 3, …

The Mean and Variance:

E(X) = 1/π and Var(X) = (1 − π)/π².

Proof: Recall the geometric series and its 1st and 2nd derivatives:

1 + a + a² + a³ + … + aⁿ + … = (1 − a)^{−1}, for |a| < 1,
1 + 2a + 3a² + … + n a^{n−1} + … = (1 − a)^{−2},
2 + 3·2a + 4·3a² + … + n(n − 1) a^{n−2} + … = 2(1 − a)^{−3}.

Page 13: Geometric Distribution

The MGF: M(t) = πe^t / [1 − (1 − π)e^t].

Proof: Apply the geometric series.

Exercise: Use the MGF to derive the mean and variance of a geometric random variable.

Page 14: Geometric Distribution

Example 3.3. A father asks his sons to cut the backyard lawn. Since he does not specify which of his three sons is to do the job, each boy tosses a coin to determine the odd person, who must then cut the lawn. If all three get heads or all three get tails, they continue tossing until they reach a decision. Let π be the probability of heads; then 1 − π is the probability of tails.

(a) Find the probability that they reach a decision in less than n rounds of tosses.

(b) If π = 1/2, what is the minimum number of rounds required to reach a decision with probability 0.95?


Solution: (a) Let α be the probability of reaching a decision on any round of coin tossing:

α = P{exactly two heads} + P{exactly two tails} = 3C2 π²(1 − π) + 3C2 π(1 − π)² = 3π(1 − π).

Let X be the number of rounds of tosses until they reach a decision; then X is a geometric r.v. with parameter α = 3π(1 − π). Thus,

P(X < n) = 1 − P(X ≥ n) = 1 − [1 − 3π(1 − π)]^{n−1}.

Page 15: Geometric Distribution

(b) We want to find the minimum n so that P(X ≤ n) ≥ 0.95.

CDF of the Geometric Distribution:

F(x) = P{X ≤ x} = 1 − P{X > x} = 1 − P{X ≥ x + 1}
= 1 − π[(1 − π)^x + (1 − π)^{x+1} + (1 − π)^{x+2} + …]
= 1 − π(1 − π)^x [1 + (1 − π) + (1 − π)² + …] = 1 − (1 − π)^x.

Here X ~ GEO(3π(1 − π)) with π = 1/2, so 3π(1 − π) = 3/4 and

P(X ≤ n) = 1 − [1 − 3π(1 − π)]^n = 1 − (1 − 3/4)^n ≥ 0.95,

which gives n ≥ 2.16. So n = 3.
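The search for the minimum n in part (b) can also be done directly (an added sketch of the same computation):

```python
# Example 3.3(b): per-round decision probability alpha = 3*pi*(1-pi),
# then find the smallest n with P(X <= n) = 1 - (1 - alpha)^n >= 0.95.
pi = 0.5
alpha = 3 * pi * (1 - pi)          # 3/4

n = 1
while 1 - (1 - alpha)**n < 0.95:
    n += 1
# n is now the minimum number of rounds
```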

Page 16: Negative Binomial Distribution

If Y represents the number of Bernoulli trials until r successes, then Y is called a negative binomial r.v. with parameters r and π, denoted by Y ~ NB(r, π).

To obtain the pmf of Y, note that the event {Y = y} is equivalent to {r − 1 successes in the first y − 1 trials and a success in the last trial}; thus,

p(y) = P(Y = y) = (y−1)C(r−1) π^r (1 − π)^{y−r}, y = r, r + 1, …,

which is called the negative binomial distribution.

The mean, variance and MGF:

E(Y) = r/π, Var(Y) = r(1 − π)/π², and M(t) = {πe^t / [1 − (1 − π)e^t]}^r.

Note: The mean and variance of a negative binomial random variable are r times those of a geometric r.v., and its MGF is that of a geometric r.v. raised to the power r. Why?

Page 17: Negative Binomial Distribution

Example 3.4. Team A plays team B in a seven-game world series. That is, the series is over when either team wins four games. For each game, P(A wins) = 0.6, and the games are assumed independent.

(a) What is the probability that the series ends in exactly six games?

(b) If the series ends in six games, what is the probability that A wins the series?

Solution: (a) Let X be the number of games until A wins four games, and Y the number of games until B wins four games. Then X ~ NB(4, 0.6) and Y ~ NB(4, 0.4).

P(series ends in exactly 6 games) = P(X = 6) + P(Y = 6) = 5C3 (0.6)⁴(0.4)² + 5C3 (0.4)⁴(0.6)² = 0.2074 + 0.0922 = 0.2996.

(b) Let A be the event that team A wins and D the event that the series ends in six games. Then the desired probability is

P(A | D) = P(A ∩ D)/P(D) = P(X = 6)/[P(X = 6) + P(Y = 6)] = 0.2074/0.2996 = 0.6923.
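Example 3.4 can be reproduced with the negative binomial pmf (an added sketch; `nb_pmf` is a helper name introduced here, not from the slides):

```python
# Negative binomial pmf: (y-1)C(r-1) * pi^r * (1-pi)^(y-r).
from math import comb

def nb_pmf(y, r, pi):
    """P(Y = y) for Y ~ NB(r, pi)."""
    return comb(y - 1, r - 1) * pi**r * (1 - pi)**(y - r)

p_A6 = nb_pmf(6, 4, 0.6)           # A wins in exactly 6 games
p_B6 = nb_pmf(6, 4, 0.4)           # B wins in exactly 6 games
p_ends_in_6 = p_A6 + p_B6          # part (a)
p_A_given_6 = p_A6 / p_ends_in_6   # part (b)
```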

Page 18: Poisson Distribution

In many applications, the number of events occurring in a fixed "time interval" is of interest. For example:

The number of customers entering a bank on a given day.
The number of typographical errors in a book.
The number of traffic accidents in a given year.
The number of α-particles discharged in a fixed period of time from some radioactive material.
The number of defects on a piece of wire.

These quantities often satisfy the following conditions:

The numbers of events occurring in nonoverlapping intervals are independent.
The probability of exactly one event occurring in a sufficiently short interval of length h is approximately λh.
The probability of two or more events occurring in a sufficiently short interval is essentially zero.

Page 19: Poisson Distribution

A suitable distribution for describing the above phenomena is the Poisson distribution. If the three conditions are satisfied, it can indeed be shown that the number of events occurring in a fixed or unit interval, X, follows a Poisson distribution with pmf:

p(x) = e^{−λ} λ^x / x!, x = 0, 1, 2, …

As a pmf sums to one, we must have

Σ_{x=0}^{∞} e^{−λ} λ^x / x! = 1, or Σ_{x=0}^{∞} λ^x / x! = e^{λ} (a useful formula!)

The mean: E[X] = λ, the rate of Poisson events per unit.

The variance: Var(X) = λ.

The MGF: M(t) = e^{λ(e^t − 1)}.

Page 20: Poisson Distribution

Plots of Poisson Probability Mass Function

[Figure: Poisson pmf plots for mean λ = 2, 5, and 10.]

Note: (i) The mean and variance of the Poisson distribution are both λ.
(ii) The parameter λ is usually called the rate of occurrence of Poisson events.

Page 21: Poisson Distribution

Derivation of the mean, variance and MGF:

E[X] = Σ_{x=0}^{∞} x p(x) = Σ_{x=1}^{∞} x e^{−λ} λ^x / x! = λ Σ_{x=1}^{∞} e^{−λ} λ^{x−1} / (x−1)! = λ Σ_{y=0}^{∞} e^{−λ} λ^y / y! = λ, with y = x − 1.

E[X(X−1)] = Σ_{x=0}^{∞} x(x−1) e^{−λ} λ^x / x! = λ² Σ_{x=2}^{∞} e^{−λ} λ^{x−2} / (x−2)! = λ² Σ_{y=0}^{∞} e^{−λ} λ^y / y! = λ², with y = x − 2.

Hence E[X²] = λ² + λ, and Var(X) = E[X²] − (E[X])² = λ.

M(t) = E[e^{tX}] = Σ_{x=0}^{∞} e^{tx} e^{−λ} λ^x / x! = e^{−λ} Σ_{x=0}^{∞} (λe^t)^x / x! = e^{−λ} e^{λe^t} = e^{λ(e^t − 1)}.

Use M(t) to show these?

Page 22: Poisson Distribution

Example 3.5. Every day the average number of wrong numbers received by a certain mail-order house is one. What is the probability that this house receives (a) two wrong calls tomorrow, (b) at least one wrong call tomorrow?

Solution: Assuming that the house receives a lot of calls, X, the number of wrong calls received tomorrow, is approximately a Poisson r.v. with λ = E(X) = 1. Thus,

(a) P(X = 2) = e^{−1} 1²/2! ≈ 0.18;

(b) P(X ≥ 1) = 1 − P(X = 0) = 1 − 1/e ≈ 0.63.

If X, the number of Poisson events occurring in [0, 1], is Poi(λ), then Y, the number of Poisson events in [0, 2], is Poi(2λ), and so on!
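Example 3.5 can be verified in a few lines (an added sketch using the Poisson pmf above):

```python
# Poisson pmf p(x) = exp(-lam) * lam**x / x! applied to Example 3.5.
from math import exp, factorial

def pois_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

lam = 1.0
p_two = pois_pmf(2, lam)                   # part (a), about 0.18
p_at_least_one = 1 - pois_pmf(0, lam)      # part (b), 1 - 1/e, about 0.63
```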

Page 23: Uniform Distribution

A continuous r.v. X is said to be uniformly distributed over an interval (a, b), denoted by X ~ UN(a, b), if its pdf is given by

f(x) = 1/(b − a) if a < x < b, and 0 otherwise.

[Figure: the pdf is flat at height 1/(b − a) over the interval (a, b).]

The CDF:

F(x) = 0 if x < a; (x − a)/(b − a) if a ≤ x < b; 1 if x ≥ b.

The mean and variance:

E[X] = (a + b)/2 and Var(X) = (b − a)²/12.

The MGF:

M(t) = (e^{tb} − e^{ta}) / [t(b − a)], t ≠ 0. M(0) = ?

Page 24: Uniform Distribution

Example 3.6. Starting at 5:00 A.M., every half hour there is a flight from San Francisco airport to Los Angeles International airport. Suppose that none of these planes is completely sold out and that they always have room for passengers. A person who wants to fly to L.A. arrives at the airport at a random time between 8:45 A.M. and 9:45 A.M. Find the probability that she waits (a) at most 10 minutes; (b) at least 15 minutes.

Solution: Let the passenger arrive at the airport X minutes past 8:45. Then X ~ UN(0, 60), and flights leave at 9:00 and 9:30, i.e., at X = 15 and X = 45.

(a) P(wait at most 10 minutes) = P(arrive between 8:50 and 9:00, or between 9:20 and 9:30)
= P(5 < X < 15) + P(35 < X < 45) = ∫_{5}^{15} (1/60) dx + ∫_{35}^{45} (1/60) dx = 1/3.

(b) P(wait at least 15 minutes) = P(arrive between 9:00 and 9:15, or between 9:30 and 9:45)
= P(15 < X < 30) + P(45 < X < 60) = ∫_{15}^{30} (1/60) dx + ∫_{45}^{60} (1/60) dx = 1/2.

Page 25: Exponential Distribution

A continuous r.v. X is said to follow an exponential distribution with parameter λ, denoted by X ~ EXP(λ), if its pdf is given by

f(x) = (1/λ) e^{−x/λ} for x > 0, and 0 otherwise.

It is easy to see that the CDF of the exponential distribution is

F(x) = 0 if x < 0, and 1 − e^{−x/λ} if x ≥ 0.

The mean and variance: E(X) = λ and Var(X) = λ².

The exponential distribution has a famous property, called the memoryless property: if the lifetime of an electronic component follows the exponential distribution, then the probability that a component lasts an additional t time units is the same as the probability of a new component lasting t time units.

Page 26: Exponential Distribution

Example 3.7. Suppose that a certain solid-state component has a lifetime or failure time (in hours) X ~ EXP(100).

(a) Find the probability that the component will last at least 100 hours given that it has been used for 50 hours.

(b) Suppose that 10 such components are randomly selected. What is the probability that more than two of them last at least 50 hours?

(c) Now, 100 components are randomly selected. Find the probability that at least 2 of them last more than 460 hours.

Solution: (a) By the memoryless property, P(X > 100 | X > 50) = P(X > 50) = exp(−50/100) = 0.6065.

(b) Let Y be the number of components among the 10 selected that last more than 50 hours. Then Y ~ Bin(10, 0.6065). The desired probability is

P(Y > 2) = 1 − P(Y ≤ 2) = 0.9890.

(c) Let W be the number of components among the 100 selected that last more than 460 hours. Then W ~ Bin(100, π), with π = P(X > 460) = exp(−460/100) ≈ 0.01. The desired probability is P(W ≥ 2) = 0.2642.
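The three answers in Example 3.7 can be reproduced as follows (an added sketch; `binom_cdf` is a helper introduced here, and in part (c) the success probability is rounded to 0.01, as on the slide):

```python
# Example 3.7: memoryless property plus binomial tail probabilities.
from math import exp, comb

def binom_cdf(k, n, p):
    """P(Y <= k) for Y ~ Bin(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

# (a) memoryless: P(X > 100 | X > 50) = P(X > 50)
p_a = exp(-50 / 100)                       # about 0.6065

# (b) Y ~ Bin(10, p_a); want P(Y > 2)
p_b = 1 - binom_cdf(2, 10, p_a)            # about 0.9890

# (c) W ~ Bin(100, 0.01); want P(W >= 2); 0.01 rounds exp(-4.6) as the
# slide does
p460 = round(exp(-460 / 100), 2)
p_c = 1 - binom_cdf(1, 100, p460)          # about 0.2642
```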

Page 27: Exponential Distribution

Exponential and Poisson: The time between any two Poisson events is an exponential random variable with mean 1/λ, where λ is the Poisson rate.

Let T be the time until the first Poisson event, and let the rate of occurrence of Poisson events be λ per time unit. Then the CDF of T is

F(t) = P(T ≤ t) = 1 − P(T > t) = 1 − P(no Poisson events in [0, t]) = 1 − e^{−λt},

which is the exponential CDF; thus T is exponential with mean 1/λ.

Page 28: Gamma Distribution

A continuous r.v. X is said to follow a gamma distribution with parameters α > 0 and β > 0, denoted by X ~ Gamma(α, β), if its pdf is given by

f(x) = x^{α−1} e^{−x/β} / [Γ(α) β^α] for x > 0, and 0 otherwise,

where Γ(·) is the gamma function defined as

Γ(t) = ∫_{0}^{∞} y^{t−1} e^{−y} dy, t > 0,

having the property Γ(α) = (α − 1)Γ(α − 1), and Γ(n) = (n − 1)! for a positive integer n.

As the pdf integrates to 1, we have

∫_{0}^{∞} x^{α−1} e^{−x/β} dx = Γ(α) β^α.

Page 29: Gamma Distribution

Expectation: E[X] = αβ.

Proof: Apply the gamma pdf formula.

Page 30: Gamma Distribution

Variance: Var(X) = αβ².

Proof: Apply the gamma pdf formula.

Page 31: Gamma Distribution

The MGF: M(t) = (1 − βt)^{−α}, t < 1/β.

Proof: Apply the gamma pdf formula.

Exercise: Derive the mean and variance using M(t).
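As a numerical check of the gamma facts above (an added sketch, not from the slides; α = 2.5 and β = 1.5 are arbitrary illustrative values), a simple midpoint rule shows the pdf integrates to one and has mean αβ:

```python
# Midpoint-rule check that the Gamma(alpha, beta) pdf integrates to 1
# and has mean alpha*beta. The interval [0, 60] captures essentially all
# of the mass for these parameter values.
from math import gamma, exp

alpha, beta = 2.5, 1.5

def gamma_pdf(x):
    return x**(alpha - 1) * exp(-x / beta) / (gamma(alpha) * beta**alpha)

n, upper = 100000, 60.0
h = upper / n
total = 0.0
mean = 0.0
for i in range(n):
    x = (i + 0.5) * h          # midpoint of the i-th subinterval
    fx = gamma_pdf(x)
    total += fx
    mean += x * fx
total *= h                      # should be close to 1
mean *= h                       # should be close to alpha*beta = 3.75
```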

Page 32: Gamma Distribution

More about the Gamma Distribution:

When α = 1, the gamma distribution becomes the exponential distribution.

When β = 2 and α = r/2, where r is a positive integer, the gamma distribution becomes a chi-squared distribution with r degrees of freedom.

When α = k, a positive integer, the gamma random variable can be interpreted as the waiting time until the kth Poisson event. (Relate this to the relationship between the exponential and the Poisson.)

Page 33: Normal Distribution

A continuous r.v. X is called normal with mean μ and standard deviation σ, denoted by X ~ N(μ, σ²), if its pdf has the form

f(x) = [1/(σ√(2π))] exp[−(x − μ)²/(2σ²)], −∞ < x < ∞,

where −∞ < μ < ∞ and σ > 0. The pdf f(x) is bell-shaped, unimodal, and symmetric around μ.

[Figure: normal pdf curves for several values of μ and σ.]

Page 34: Normal Distribution

If X ~ N(μ, σ²), then a + bX ~ N(a + bμ, b²σ²).

It follows that Z = (X − μ)/σ ~ N(0, 1), called the Standard Normal Random Variable.

The pdf and CDF of Z are usually denoted, respectively, by

φ(z) = (1/√(2π)) exp(−z²/2), −∞ < z < ∞, and

Φ(z) = ∫_{−∞}^{z} (1/√(2π)) exp(−t²/2) dt.

There is no closed-form expression for Φ(z), but the percentage points are tabulated, or can easily be calculated using computer software.
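For instance, Φ(z) can be computed from the error function via Φ(z) = [1 + erf(z/√2)]/2 (an added sketch illustrating the "computer software" remark):

```python
# Standard normal CDF from the error function.
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF: (1 + erf(z / sqrt(2))) / 2."""
    return (1 + erf(z / sqrt(2))) / 2

# A few familiar values:
p0 = Phi(0)          # 0.5 by symmetry
p1 = Phi(1.0)        # about 0.8413
p196 = Phi(1.96)     # about 0.975
```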

Page 35: Normal Distribution

Example 3.8. Let X ~ N(3, 9). Calculate (a) P(2 < X < 5), (b) P(X > 0), (c) P(|X − 3| > 6).

Solution:

(a) P(2 < X < 5) = P((2 − 3)/3 < Z < (5 − 3)/3) = Φ(2/3) − Φ(−1/3) = 0.3779.

(b) P(X > 0) = P(Z > (0 − 3)/3) = P(Z > −1) = 1 − Φ(−1) = Φ(1) = 0.8413.

(c) P(|X − 3| > 6) = P(|Z| > 6/3) = P(|Z| > 2) = 2Φ(−2) = 0.0456.
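Example 3.8 can be checked numerically by standardizing with μ = 3, σ = 3 and evaluating Φ through the error function (an added sketch):

```python
# Example 3.8: standardize and evaluate the standard normal CDF.
from math import erf, sqrt

def Phi(z):
    return (1 + erf(z / sqrt(2))) / 2

mu, sigma = 3.0, 3.0
p_a = Phi((5 - mu) / sigma) - Phi((2 - mu) / sigma)   # about 0.3779
p_b = 1 - Phi((0 - mu) / sigma)                       # about 0.8413
p_c = 2 * Phi(-6 / sigma)                             # about 0.0456
```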

Page 36: Normal Distribution

The Normal Approximation to the Binomial. When n, the number of Bernoulli trials, is large, it is difficult to calculate the binomial probabilities directly, especially in the old days. The normal approximation to the binomial probabilities was therefore introduced. The basic idea is that as n gets larger, the plot of the binomial pmf gets closer to the plot of a normal pdf.

[Figure: binomial pmf plots for n = 50, π = 0.5 and n = 50, π = 0.2, each close in shape to a normal pdf.]

Page 37: Normal Distribution

As a rule of thumb: if n is such that nπ(1 − π) ≥ 10, then the distribution of a Bin(n, π) r.v. can be approximated by the distribution of a N[nπ, nπ(1 − π)] r.v.

Example 3.9. Flip a fair coin 40 times. Let X be the number of times that it lands heads. Find the probability that X ≤ 21. Use the normal approximation and then compare it to the exact solution.

Solution: Since X ~ Bin(40, 0.5) and 40 × 0.5 × 0.5 = 10, the distribution of X can be approximated by N(20, 10):

P(X ≤ 21) ≈ P(Z ≤ (21 − 20)/√(40 × 0.5 × 0.5)) = Φ(0.3162) = 0.6241.

The exact answer is 0.6821. The approximation is not so good. Why?
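The exact and approximate answers quoted above can be reproduced as follows (an added sketch):

```python
# Example 3.9: exact Bin(40, 0.5) tail vs. the plain normal
# approximation (no continuity correction).
from math import comb, erf, sqrt

def Phi(z):
    return (1 + erf(z / sqrt(2))) / 2

n, p = 40, 0.5
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(22))

mu, sd = n * p, sqrt(n * p * (1 - p))
approx = Phi((21 - mu) / sd)        # noticeably below the exact value
```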

Page 38: Normal Distribution

The reason is that we are using a continuous distribution to approximate a discrete one, and a continuity correction is lacking.

Continuity Correction Factor. Let X ~ Bin(n, p) with np(1 − p) ≥ 10, and let Y ~ N[np, np(1 − p)]. Then we have

P(X ≤ i) ≈ P(Y ≤ i + 1/2),

P(X ≥ i) ≈ P(Y ≥ i − 1/2),

P(X = i) ≈ P(i − 1/2 ≤ Y ≤ i + 1/2).

How, in general, can a probability based on a discrete r.v. be approximated by a probability based on a continuous one? Let X be a discrete r.v. with pmf p(x), taking integer values. We have

P(i ≤ X ≤ j) = Σ_{k=i}^{j} p(k) = sum of the areas of all rectangles from i to j,

where the kth rectangle has base 1, height p(k), and midpoint k.

Page 39: Normal Distribution

[Figure: the rectangles of the pmf p(x) from i to j, overlaid with an approximating pdf f(x).]

Now suppose the continuous pdf f(x) is a good approximation to p(x). Clearly,

P(i ≤ X ≤ j) ≈ the area under f(x) from i − 1/2 to j + 1/2 = ∫_{i−1/2}^{j+1/2} f(x) dx.

Page 40: Normal Distribution

Similarly,

P(X = k) ≈ ∫_{k−1/2}^{k+1/2} f(x) dx,

P(X ≥ i) ≈ ∫_{i−1/2}^{∞} f(x) dx,

P(X ≤ j) ≈ ∫_{−∞}^{j+1/2} f(x) dx.

Example 3.9 (Cont'd). Now revisit Example 3.9 to approximate P(X ≤ 21) by applying the continuity correction factor:

P(X ≤ 21) ≈ P(Y ≤ 21 + 0.5) = P(Z ≤ (21.5 − 20)/√(40 × 0.5 × 0.5)) = P(Z ≤ 0.4743) = 0.6824,

compared to the exact answer 0.6821!
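The continuity-corrected approximation can be verified numerically (an added sketch):

```python
# Example 3.9 revisited: approximate P(X <= 21) by P(Y <= 21.5) and
# compare with the exact binomial answer.
from math import comb, erf, sqrt

def Phi(z):
    return (1 + erf(z / sqrt(2))) / 2

n, p = 40, 0.5
mu, sd = n * p, sqrt(n * p * (1 - p))

corrected = Phi((21.5 - mu) / sd)   # about 0.6824
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(22))
```

With the half-unit correction, the normal approximation agrees with the exact answer to three decimal places.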