
Upload: justina-lang

Post on 11-Jan-2016


Page 1: Standard Statistical Distributions Most elementary statistical books provide a survey of commonly used statistical distributions. The reason we study these

Standard Statistical Distributions

Most elementary statistical books provide a survey of commonly used statistical distributions. We study these distributions because:

- They provide a comprehensive range of distributions for modelling practical applications.
- Their mathematical properties are known.
- They are described in terms of a few parameters, which have natural interpretations.

1. Bernoulli Distribution.

This is used to model a trial which gives rise to two outcomes: success/failure, male/female, 0/1. Let X denote the outcome, let p be the probability that the outcome is one, and let q = 1 - p be the probability that the outcome is zero. Then

$E[X] = p\,(1) + (1 - p)\,(0) = p$
$\mathrm{VAR}[X] = p\,(1)^2 + (1 - p)\,(0)^2 - E[X]^2 = p\,(1 - p).$

2. Binomial Distribution.

Suppose that we are interested in the number of successes X in n independent repetitions of a Bernoulli trial, where the probability of success in an individual trial is p. Then

$\mathrm{Prob}\{X = k\} = \binom{n}{k}\, p^k (1 - p)^{n - k}, \quad k = 0, 1, \ldots, n$
$E[X] = n p$
$\mathrm{VAR}[X] = n p\,(1 - p).$

This is the appropriate distribution to use in modelling the number of boys in a family of n = 4 children, the number of defective components in a batch of n = 10 components, and so on.
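As a small check of the binomial formulas, the sketch below evaluates the probability function for n = 4 and p = 0.2 and recovers the mean and variance by direct summation. The function name `binomial_pmf` is chosen here for illustration only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Prob{X = k} for the number of successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 4, 0.2
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Moments computed directly from the probability function.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum(k**2 * pk for k, pk in enumerate(pmf)) - mean**2

print("probabilities:", [round(pk, 4) for pk in pmf])
print("mean:", mean, "expected:", n * p)
print("variance:", variance, "expected:", n * p * (1 - p))
```

The computed moments should agree with n p and n p (1 - p) up to floating-point rounding.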

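[Figures: probability functions of the Bernoulli distribution (outcomes 0 and 1, with probabilities 1 - p and p) and of the binomial distribution with n = 4, p = 0.2.]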


3. Poisson Distribution.

The Poisson distribution arises as a limiting case of the binomial distribution, where $n \to \infty$ and $p \to 0$ in such a way that $n p \to \lambda$ (a constant). Its density is

$\mathrm{Prob}\{X = k\} = \exp(-\lambda)\, \lambda^k / k!, \quad k = 0, 1, 2, \ldots$

Note that exp (x) stands for e to the power of x, where e is approximately 2.71828.

$E[X] = \lambda$
$\mathrm{VAR}[X] = \lambda$

The Poisson distribution is used to model the number of occurrences of a certain phenomenon in a fixed period of time or space, as in the number of

- particles emitted by a radioactive source in a fixed direction and period of time
- telephone calls received at a switchboard during a given period
- defects in a fixed length of cloth or paper
- people arriving in a queue in a fixed interval of time
- accidents that occur on a fixed stretch of road in a specified time interval.
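The limiting relationship between the binomial and Poisson distributions can be illustrated numerically. The sketch below fixes n p = 2 (an arbitrary value chosen for illustration) and measures the largest pointwise gap between the two probability functions as n grows; the helper names are hypothetical.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)


lam = 2.0  # the constant that n * p is held equal to


def max_gap(n):
    """Largest |binomial - Poisson| over k = 0..10 when p = lam / n."""
    p = lam / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))


gaps = {n: max_gap(n) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(f"n = {n:4d}, p = {lam / n:.4f}, max gap = {gap:.6f}")
```

The gap should shrink as n increases, which is the sense in which the Poisson distribution approximates a binomial with many trials and a small success probability.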

4. Geometric Distribution.

This arises as the “time”, or number of steps k, to the first success in a series of independent Bernoulli trials. The density is

$\mathrm{Prob}\{X = k\} = p\,(1 - p)^{k - 1}, \quad k = 1, 2, \ldots$
$E[X] = 1 / p$
$\mathrm{VAR}[X] = (1 - p) / p^2$
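The "time to first success" description can be simulated directly. The following sketch repeats Bernoulli trials until the first success and compares the sample mean and variance with 1/p and (1 - p)/p^2. The seed, the sample size and the value p = 0.25 are arbitrary assumptions made for the illustration.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(7)  # fixed seed so the run is reproducible


def steps_to_first_success(p):
    """Count Bernoulli trials up to and including the first success."""
    k = 1
    while rng.random() >= p:  # a trial succeeds with probability p
        k += 1
    return k


p = 0.25
draws = [steps_to_first_success(p) for _ in range(20_000)]

sample_mean = fmean(draws)
sample_var = pvariance(draws)
print("sample mean:", sample_mean, "theory:", 1 / p)
print("sample variance:", sample_var, "theory:", (1 - p) / p**2)
```

The sample values only approximate the theoretical ones, with the discrepancy shrinking as more draws are taken.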

[Figures: two plots of Prob against X for the distributions on this page.]


5. Negative Binomial Distribution

This is used to model the number of failures k that occur before the r th success in a series of independent Bernoulli trials. The density is

$\mathrm{Prob}\{X = k\} = \binom{r + k - 1}{k}\, p^r (1 - p)^k, \quad k = 0, 1, 2, \ldots$

Note
$E[X] = r\,(1 - p) / p$
$\mathrm{VAR}[X] = r\,(1 - p) / p^2.$
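A quick numerical check of the negative binomial formulas: the sketch below sums the probability function over a long range of k (truncating the infinite sum at 1000 terms, which is an approximation) for the illustrative values r = 3 and p = 0.4. It also checks that with r = 1 the number of failures k has the same probability as a geometric "time to first success" of k + 1 steps.

```python
from math import comb, isclose


def negative_binomial_pmf(k, r, p):
    """Prob{k failures occur before the r-th success}."""
    return comb(r + k - 1, k) * p**r * (1 - p) ** k


r, p = 3, 0.4
ks = range(1000)  # truncation of the infinite range; the tail is negligible here
pmf = [negative_binomial_pmf(k, r, p) for k in ks]

total = sum(pmf)
mean = sum(k * pk for k, pk in zip(ks, pmf))
variance = sum(k**2 * pk for k, pk in zip(ks, pmf)) - mean**2

print("total probability:", total)
print("mean:", mean, "theory:", r * (1 - p) / p)
print("variance:", variance, "theory:", r * (1 - p) / p**2)

# With r = 1, k failures before the first success means success at step k + 1.
geometric_match = all(
    isclose(negative_binomial_pmf(k, 1, p), p * (1 - p) ** k) for k in range(20)
)
print("r = 1 matches the geometric distribution:", geometric_match)
```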

6. Hypergeometric Distribution

Consider a population of M items, of which W are deemed to be successes. Let X be the number of successes that occur in a sample of size n, drawn without replacement from the population. The density is

$\mathrm{Prob}\{X = k\} = \binom{W}{k} \binom{M - W}{n - k} \Big/ \binom{M}{n}, \quad k = 0, 1, 2, \ldots, n$

Then
$E[X] = n W / M$
$\mathrm{VAR}[X] = n W (M - W)(M - n) / \{M^2 (M - 1)\}$
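The hypergeometric formulas can be checked by direct enumeration. The sketch below uses a hypothetical population of M = 20 items containing W = 7 successes and a sample of size n = 5; these numbers are assumptions chosen only for the example.

```python
from math import comb

M, W, n = 20, 7, 5  # population size, successes in population, sample size


def hypergeometric_pmf(k):
    """Prob{k successes in a sample of n drawn without replacement}."""
    return comb(W, k) * comb(M - W, n - k) / comb(M, n)


pmf = [hypergeometric_pmf(k) for k in range(n + 1)]
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum(k**2 * pk for k, pk in enumerate(pmf)) - mean**2

expected_mean = n * W / M
expected_var = n * W * (M - W) * (M - n) / (M**2 * (M - 1))

print("mean:", mean, "formula:", expected_mean)
print("variance:", variance, "formula:", expected_var)
```

The factor (M - n) / (M - 1) in the variance is what distinguishes sampling without replacement from the binomial case.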

7. Uniform Distribution

A random variable X has a uniform distribution on the interval [a, b], if X has density

$f(x) = 1 / (b - a)$ for $a < x < b$
$f(x) = 0$, otherwise.

Then
$E[X] = (a + b) / 2$
$\mathrm{VAR}[X] = (b - a)^2 / 12$

Uniformly distributed random numbers occur frequently in simulation models. However, computer-based algorithms, such as linear congruential generators, can only approximate this distribution, so great care is needed in interpreting the output of simulation models.
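To make the remark about linear congruential methods concrete, here is a minimal sketch of one such generator. The multiplier 16807 and modulus 2^31 - 1 are a well-known historical choice and are used here as an assumption; the source does not name any particular generator. The output takes only finitely many values in (0, 1), so it approximates the uniform distribution on [0, 1] rather than reproducing it.

```python
from statistics import fmean, pvariance

MODULUS = 2**31 - 1   # assumed modulus (a Mersenne prime)
MULTIPLIER = 16807    # assumed multiplier


def lcg_uniforms(seed, count):
    """Return `count` pseudo-random numbers in (0, 1) from a multiplicative LCG."""
    state = seed
    values = []
    for _ in range(count):
        state = (MULTIPLIER * state) % MODULUS  # next integer state
        values.append(state / MODULUS)          # scale into (0, 1)
    return values


u = lcg_uniforms(seed=1, count=10_000)
sample_mean = fmean(u)
sample_var = pvariance(u)

# Compare with E[X] = (a + b) / 2 and VAR[X] = (b - a)^2 / 12 for a = 0, b = 1.
print("sample mean:", sample_mean, "theory:", 0.5)
print("sample variance:", sample_var, "theory:", 1 / 12)
```

Matching the first two moments is a very weak test; it says nothing about serial correlation, which is one reason simulation output needs careful interpretation.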

[Figure: density of the uniform distribution on [a, b], with constant height 1 / (b - a).]


If X is a continuous random variable, then the probability that X takes a value in the range [a, b] is the area under the frequency function f(x) between these points:

$\mathrm{Prob}\{a < X < b\} = F(b) - F(a) = \int_a^b f(x)\,dx.$

In practical work, these integrals are evaluated by looking up entries in statistical tables.
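Where tables are not to hand, the same area can be approximated numerically. The sketch below uses a made-up density f(x) = 2x on (0, 1), whose distribution function is F(x) = x^2, and compares a midpoint-rule integral with F(b) - F(a). The density and the helper names are assumptions introduced purely for illustration.

```python
def integrate(f, a, b, steps=1000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def f(x):
    """Hypothetical density: 2x on (0, 1), zero elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0


def F(x):
    """Distribution function belonging to the density above."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x * x


a, b = 0.2, 0.7
area = integrate(f, a, b)
difference = F(b) - F(a)
print("area under f:", area)
print("F(b) - F(a):", difference)
```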

9. Gaussian or Normal Distribution

A random variable X has a normal distribution with mean $\mu$ and standard deviation $\sigma$ if it has density

$f(x) = \dfrac{1}{\sigma \sqrt{2\pi}} \exp\left\{ -\dfrac{(x - \mu)^2}{2 \sigma^2} \right\}, \quad -\infty < x < \infty$

$E[X] = \mu$
$\mathrm{VAR}[X] = \sigma^2$

As described below, the normal distribution arises naturally as the limiting distribution of the average of a set of independent, identically distributed random variables with finite variances. It plays a central role in sampling theory and is a good approximation to a large class of empirical distributions. For this reason, a default assumption in many empirical studies is that the distribution of each observation is approximately normal. Therefore, statistical tables of the normal distribution are of great importance in analysing practical data sets. X is said to be a standardised normal variable if $\mu = 0$ and $\sigma = 1$.
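A table of the standardised normal distribution is enough for any normal variable, because (X - mu) / sigma is standardised whenever X has mean mu and standard deviation sigma. The sketch below plays the role of such a table using the error function from Python's standard library; the example values mu = 100 and sigma = 15 are arbitrary assumptions.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """Prob{Z <= z} for a standardised normal variable Z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_prob(a, b, mu, sigma):
    """Prob{a < X < b} for X normal with mean mu and standard deviation sigma."""
    # Standardise both end points, then use the standard normal "table".
    return standard_normal_cdf((b - mu) / sigma) - standard_normal_cdf((a - mu) / sigma)


central = normal_prob(-1.96, 1.96, 0.0, 1.0)
shifted = normal_prob(100 - 1.96 * 15, 100 + 1.96 * 15, 100.0, 15.0)

print("Prob{-1.96 < Z < 1.96}:", round(central, 4))
print("same interval in units of sigma around mu = 100:", round(shifted, 4))
```

Both probabilities should be close to 0.95, since the second interval is the first one expressed on the scale of X.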

[Figure: density f(x) of the normal distribution plotted against X.]


10. Gamma Distribution

The Gamma distribution arises in queueing theory as the time to the arrival of the n th customer in a single server queue, where the average arrival rate is $\lambda$. The frequency function is

$f(x) = \dfrac{\lambda\,(\lambda x)^{n - 1} \exp(-\lambda x)}{(n - 1)!}, \quad x \ge 0,\ \lambda > 0,\ n = 1, 2, \ldots$
$f(x) = 0$, otherwise.

$E[X] = n / \lambda$
$\mathrm{VAR}[X] = n / \lambda^2$

11. Exponential Distribution

This is a special case of the Gamma distribution with n = 1 and so is used to model the interarrival time of customers, or the time to the arrival of the first customer, in a simple queue. The frequency function is

$f(x) = \lambda \exp(-\lambda x), \quad x \ge 0,\ \lambda > 0$
$f(x) = 0$, otherwise.
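The queueing interpretation suggests a simple simulation: the arrival time of the n th customer is the sum of n independent exponential interarrival times. The sketch below, with the assumed values n = 3 and an arrival rate of 2 per unit time, compares the simulated mean and variance with n / rate and n / rate^2. The seed and sample size are arbitrary.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(12345)  # fixed seed for reproducibility

n = 3        # which customer's arrival time we observe
rate = 2.0   # average arrival rate (lambda in the formulas)


def nth_arrival_time():
    """Sum of n independent exponential interarrival times."""
    return sum(rng.expovariate(rate) for _ in range(n))


times = [nth_arrival_time() for _ in range(20_000)]
mean_time = fmean(times)
var_time = pvariance(times)

print("sample mean:", mean_time, "theory:", n / rate)
print("sample variance:", var_time, "theory:", n / rate**2)
```

Setting n = 1 in the same code gives draws from the exponential distribution itself.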

12. Chi-Square Distribution

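A random variable X has a Chi-square distribution with n degrees of freedom (where n is a positive integer) if it has a Gamma distribution with n / 2 in place of n (so that half-integer values are allowed) and $\lambda = 1/2$. Its frequency function is

$f(x) = \dfrac{x^{n/2 - 1} \exp(-x/2)}{2^{n/2}\,\Gamma(n/2)}, \quad x \ge 0$
$f(x) = 0$, otherwise,

where $\Gamma$ denotes the Gamma function, with $\Gamma(k) = (k - 1)!$ for a positive integer k.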

[Figure: density $\chi^2_n(x)$ of the chi-square distribution plotted against X.]


Chi-square Distribution (continued)

The chi-square distribution arises in two important applications:

- If $X_1, X_2, \ldots, X_n$ is a sequence of independently distributed standardised normal random variables, then the sum of squares $X_1^2 + X_2^2 + \cdots + X_n^2$ has a chi-square distribution with n degrees of freedom.
- If $x_1, x_2, \ldots, x_n$ is a random sample from a normal distribution with mean $\mu$ and variance $\sigma^2$, and we let $\bar{x} = \sum x_i / n$ and $S^2 = \sum (x_i - \bar{x})^2 / \sigma^2$, then $S^2$ has a chi-square distribution with n - 1 degrees of freedom, and the random variables $S^2$ and $\bar{x}$ are independent.
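The first of these applications is easy to simulate. The sketch below sums the squares of n = 5 standardised normal draws many times and compares the sample mean and variance with n and 2n, which are the standard mean and variance of a chi-square variable with n degrees of freedom. The seed, n and the number of repetitions are arbitrary assumptions.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(99)  # fixed seed for reproducibility
n = 5                    # degrees of freedom


def sum_of_squares(df):
    """Sum of squares of df independent standardised normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


draws = [sum_of_squares(n) for _ in range(20_000)]
sample_mean = fmean(draws)
sample_var = pvariance(draws)

print("sample mean:", sample_mean, "theory:", n)
print("sample variance:", sample_var, "theory:", 2 * n)
```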

13. Beta Distribution.

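A random variable X has a Beta distribution with parameters $\alpha > 0$ and $\beta > 0$ if it has frequency function

$f(x) = \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1}, \quad 0 < x < 1$
$f(x) = 0$, otherwise.

$E[X] = \dfrac{\alpha}{\alpha + \beta}$
$\mathrm{VAR}[X] = \dfrac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$

Here $\Gamma$ denotes the Gamma function. If n is an integer, $\Gamma(n) = (n - 1)!$ with $\Gamma(1) = 1$, and $\Gamma(n + 1/2) = (n - 1/2)(n - 3/2) \cdots (1/2)\,\Gamma(1/2)$ with $\Gamma(1/2) = \sqrt{\pi}$.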
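The Gamma function identities and the Beta moments can be checked numerically. The sketch below uses `math.gamma` from Python's standard library and a midpoint-rule integral of the standard Beta density, Gamma(alpha + beta) / (Gamma(alpha) Gamma(beta)) x^(alpha - 1) (1 - x)^(beta - 1) on (0, 1), for the illustrative parameters alpha = 2 and beta = 3. The helper names are hypothetical.

```python
from math import factorial, gamma, isclose, pi, sqrt

# Gamma function identities for integer and half-integer arguments.
integer_identity = all(isclose(gamma(n), factorial(n - 1)) for n in range(1, 10))
half_value = gamma(0.5)
five_halves = gamma(2.5)  # should equal (3/2) * (1/2) * Gamma(1/2)


def beta_pdf(x, alpha, beta):
    coeff = gamma(alpha + beta) / (gamma(alpha) * gamma(beta))
    return coeff * x ** (alpha - 1) * (1 - x) ** (beta - 1)


def midpoint(f, a, b, steps=20_000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


alpha, beta = 2.0, 3.0
total = midpoint(lambda x: beta_pdf(x, alpha, beta), 0.0, 1.0)
mean = midpoint(lambda x: x * beta_pdf(x, alpha, beta), 0.0, 1.0)
second = midpoint(lambda x: x * x * beta_pdf(x, alpha, beta), 0.0, 1.0)
variance = second - mean**2

print("Gamma(n) = (n - 1)! for n = 1..9:", integer_identity)
print("Gamma(1/2):", half_value, "sqrt(pi):", sqrt(pi))
print("mean:", mean, "formula:", alpha / (alpha + beta))
print("variance:", variance,
      "formula:", alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1)))
```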


14. Student’s t Distribution

A random variable X has a t distribution with n degrees of freedom ($t_n$) if it has density

$f(x) = \dfrac{\Gamma\big((n + 1)/2\big)}{\sqrt{n \pi}\;\Gamma(n/2)} \left(1 + \dfrac{x^2}{n}\right)^{-(n + 1)/2}, \quad -\infty < x < \infty$

The t distribution is symmetrical about the origin, with

$E[X] = 0$
$\mathrm{VAR}[X] = n / (n - 2)$ (for n > 2).

For small values of n, the $t_n$ distribution is very flat. As n is increased the density assumes a bell shape. For values of $n \ge 25$, the $t_n$ distribution is practically indistinguishable from the standard normal curve.

- If X and Y are independent random variables, X has a standard normal distribution and Y has a $\chi^2_n$ distribution, then $X / \sqrt{Y / n}$ has a $t_n$ distribution.
- If $x_1, x_2, \ldots, x_n$ is a random sample from a normal distribution with mean $\mu$ and variance $\sigma^2$, and if we define $s^2 = \frac{1}{n - 1} \sum (x_i - \bar{x})^2$, then $(\bar{x} - \mu) / (s / \sqrt{n})$ has a $t_{n-1}$ distribution.
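The first construction can be sketched as a simulation: divide a standardised normal draw by the square root of an independent chi-square draw over its degrees of freedom. The code below does this for n = 10 and compares the sample variance with n / (n - 2). The chi-square variable is itself built as a sum of squared standardised normals; the seed, n and the sample size are assumptions made for the example.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

rng = random.Random(2024)  # fixed seed for reproducibility
n = 10                     # degrees of freedom


def chi_square(df):
    """Chi-square draw built as a sum of df squared standardised normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


def t_variate(df):
    """X / sqrt(Y / df) with X standardised normal and Y chi-square, independent."""
    return rng.gauss(0.0, 1.0) / sqrt(chi_square(df) / df)


draws = [t_variate(n) for _ in range(20_000)]
sample_mean = fmean(draws)
sample_var = pvariance(draws)

print("sample mean:", sample_mean, "theory:", 0.0)
print("sample variance:", sample_var, "theory:", n / (n - 2))
```

The variance exceeds 1 because the t distribution has heavier tails than the standard normal; the excess shrinks as n grows.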


15. F Distribution

A random variable X has an F distribution with m and n degrees of freedom if it has density

$f(x) = \dfrac{\Gamma\big((m + n)/2\big)\, m^{m/2}\, n^{n/2}\, x^{m/2 - 1}}{\Gamma(m/2)\,\Gamma(n/2)\,(n + m x)^{(m + n)/2}}, \quad x > 0$
$f(x) = 0$, otherwise.

Note
$E[X] = n / (n - 2)$ if n > 2
$\mathrm{VAR}[X] = \dfrac{2 n^2 (m + n - 2)}{m\,(n - 4)(n - 2)^2}$ if n > 4

- If X and Y are independent random variables, X has a $\chi^2_m$ and Y a $\chi^2_n$ distribution, then $(X / m) / (Y / n)$ has an $F_{m,n}$ distribution.
- One consequence of this is that the F distribution represents the distribution of the ratio of certain independent quadratic forms which can be constructed from random samples drawn from normal distributions: if $x_1, x_2, \ldots, x_m$ ($m \ge 2$) is a random sample from a normal distribution with mean $\mu_1$ and variance $\sigma_1^2$, and if $y_1, y_2, \ldots, y_n$ ($n \ge 2$) is an independent random sample from a normal distribution with mean $\mu_2$ and variance $\sigma_2^2$, then, provided $\sigma_1^2 = \sigma_2^2$,

$\dfrac{\sum (x_i - \bar{x})^2 / (m - 1)}{\sum (y_i - \bar{y})^2 / (n - 1)}$ has an $F_{m-1,\,n-1}$ distribution.

If the two variances differ, the numerator and denominator must first be divided by $\sigma_1^2$ and $\sigma_2^2$ respectively.
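As a sanity check on the F density, the sketch below integrates the standard form Gamma((m + n)/2) m^(m/2) n^(n/2) x^(m/2 - 1) / [Gamma(m/2) Gamma(n/2) (n + m x)^((m + n)/2)] numerically for the illustrative values m = 5 and n = 12. It confirms that the total probability is close to 1 and that the mean is close to n / (n - 2). The integration range is cut off at x = 200, which is an approximation justified by the rapidly decaying tail for these parameters.

```python
from math import gamma

m, n = 5, 12  # degrees of freedom (illustrative values)

# Normalising constant of the F density, computed once.
COEFF = gamma((m + n) / 2) * m ** (m / 2) * n ** (n / 2) / (gamma(m / 2) * gamma(n / 2))


def f_pdf(x):
    """Density of the F distribution with m and n degrees of freedom, x > 0."""
    return COEFF * x ** (m / 2 - 1) / (n + m * x) ** ((m + n) / 2)


def midpoint(f, a, b, steps):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


upper = 200.0  # truncation point for the infinite range
total = midpoint(f_pdf, 0.0, upper, 200_000)
mean = midpoint(lambda x: x * f_pdf(x), 0.0, upper, 200_000)

print("total probability:", total)
print("mean:", mean, "theory:", n / (n - 2))
```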