Chapter 2 Statistical Background

Post on 21-Dec-2015



2.3 Random Variables and Probability Distributions

• A variable X is said to be a random variable (rv) if for every real number a there exists a probability P(X ≦ a) that X takes on a value less than or equal to a

• Thus P(X = x) is the probability that the random variable X takes the value x.

• P(x1 ≦ X ≦ x2) is the probability that the random variable X takes values between x1 and x2, both inclusive.

• A formula giving the probabilities for different values of the random variable X is called a probability distribution in the case of discrete random variables.

• For continuous random variables it is called a probability density function (abbreviated p.d.f.). This is usually denoted by f(x).

• In general, for a continuous random variable, the occurrence of any exact value of X may be regarded as having a zero probability.

• Hence probabilities are discussed in terms of some ranges.

• These probabilities are obtained by integrating f(x) over the desired range.

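• For instance, if we want Prob(a ≦ X ≦ b), this is given by

$$\text{Prob}(a \le X \le b) = \int_a^b f(x)\,dx$$

• The cumulative distribution function is

$$F(c) = \text{Prob}(X \le c) = \int_{-\infty}^{c} f(x)\,dx$$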
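As a rough illustration of obtaining Prob(a ≦ X ≦ b) by integrating a density over a range, the sketch below integrates a standard normal p.d.f. numerically with the midpoint rule. The choice of density, the function names `normal_pdf` and `prob_between`, and the number of steps are assumptions made for this example only.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density, used here only as an example of a p.d.f. f(x)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def prob_between(f, a, b, steps=10_000):
    """Approximate Prob(a <= X <= b) by integrating f over [a, b] (midpoint rule)."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Probability that a standard normal variable lies within one unit of zero.
p = prob_between(normal_pdf, -1.0, 1.0)
print(round(p, 4))  # prints 0.6827
```

Any other density can be passed as `f`; the midpoint rule is just one simple choice of numerical integrator.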

2.4 The Normal Probability Distribution and Related Distributions

• There are some probability distributions for which the probabilities have been tabulated and which are considered suitable descriptions for a wide variety of phenomena.

• These are the normal distribution and the χ², t, and F distributions.

• There is also a question of whether the normal distribution is an appropriate one to use to describe economic variables.

• However, even if the variables are not normally distributed, one can consider transformations of the variables so that the transformed variables are normally distributed.

The Normal Distribution (an example)

• The normal distribution is a bell-shaped distribution which is used most extensively in statistical applications in a wide variety of fields.

• Its probability density function is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2\sigma^2}(x-\mu)^2\right], \qquad -\infty < x < +\infty$$

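• If $x_1 \sim N(\mu_1, \sigma_1^2)$ and $x_2 \sim N(\mu_2, \sigma_2^2)$ and the correlation between $x_1$ and $x_2$ is $\rho$, then

$$a_1 x_1 + a_2 x_2 \sim N(a_1\mu_1 + a_2\mu_2,\; a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + 2\rho a_1 a_2 \sigma_1\sigma_2)$$

In particular,

$$x_1 + x_2 \sim N(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2 + 2\rho\sigma_1\sigma_2)$$

and

$$x_1 - x_2 \sim N(\mu_1 - \mu_2,\; \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)$$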
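The mean and variance of a linear combination of two correlated normal variables can be computed directly from the rule for a1·x1 + a2·x2. The following is a small sketch; the helper name `linear_combination_params` and the numerical values are made up for illustration.

```python
def linear_combination_params(a1, a2, mu1, mu2, sigma1, sigma2, rho):
    """Mean and variance of a1*x1 + a2*x2 for normal x1, x2 with correlation rho."""
    mean = a1 * mu1 + a2 * mu2
    variance = (a1 ** 2 * sigma1 ** 2 + a2 ** 2 * sigma2 ** 2
                + 2.0 * rho * a1 * a2 * sigma1 * sigma2)
    return mean, variance

# Hypothetical inputs: x1 ~ N(2, 1), x2 ~ N(3, 4), correlation 0.5.
mean_sum, var_sum = linear_combination_params(1, 1, 2.0, 3.0, 1.0, 2.0, 0.5)
mean_diff, var_diff = linear_combination_params(1, -1, 2.0, 3.0, 1.0, 2.0, 0.5)
print(mean_sum, var_sum)    # sum:        mean 5.0,  variance 7.0
print(mean_diff, var_diff)  # difference: mean -1.0, variance 3.0
```

Setting a1 = a2 = 1 gives the sum and a1 = 1, a2 = -1 gives the difference, so the only change between the two cases is the sign of the covariance term.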

χ² Distribution

• If x1, x2, …, xn are independent normal variables with mean zero and variance 1, that is, $x_i \sim IN(0, 1)$, $i = 1, 2, \ldots, n$, then

$$Z = \sum_i x_i^2$$

is said to have the χ²-distribution with degrees of freedom (d.f.) n, and we will write this as $Z \sim \chi^2_n$.

• The subscript n denotes the d.f.

• The $\chi^2_n$-distribution is the distribution of the sum of squares of n independent standard normal variables.

• If $x_i \sim IN(0, \sigma^2)$, then Z should be defined as

$$Z = \sum_i \frac{x_i^2}{\sigma^2}$$

• The χ²-distribution also has an “additive property,” although it is different from the property of the normal distribution and is much more restrictive.

• The property is: If $Z_1 \sim \chi^2_n$ and $Z_2 \sim \chi^2_m$ and $Z_1$ and $Z_2$ are independent, then

$$Z_1 + Z_2 \sim \chi^2_{n+m}$$
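A quick simulation can make the additive property of the χ²-distribution plausible. The sketch below builds χ² draws as sums of squared standard normal draws and checks that Z1 + Z2 has roughly the mean (n + m) and variance 2(n + m) of a χ² variable with n + m d.f. The seed, the degrees of freedom, and the number of replications are arbitrary choices for this example.

```python
import random
import statistics

def chi_square_draw(df, rng):
    """One chi-square draw: the sum of squares of df independent N(0, 1) draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(0)        # fixed seed so the run is repeatable
n, m, reps = 3, 5, 20_000     # illustrative degrees of freedom and sample count

# Z1 ~ chi-square(n) and Z2 ~ chi-square(m), drawn independently, then added.
z_sum = [chi_square_draw(n, rng) + chi_square_draw(m, rng) for _ in range(reps)]

mean_z = statistics.fmean(z_sum)
var_z = statistics.variance(z_sum)
print(mean_z)  # should be close to n + m = 8
print(var_z)   # should be close to 2 * (n + m) = 16
```

Matching the first two moments does not prove the distributional result, but it is a cheap sanity check on the construction.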

t-Distribution

• If $x \sim N(0, 1)$ and $y \sim \chi^2_n$ and x and y are independent, $Z = x / \sqrt{y/n}$ has a t-distribution with d.f. n.

• We write this as $Z \sim t_n$.

• The subscript n again denotes the d.f.

• Thus the t-distribution is the distribution of a standard normal variable divided by the square root of an independent averaged χ² variable (a χ² variable divided by its degrees of freedom).

• The t-distribution is a symmetric probability distribution like the normal distribution but is flatter than the normal and has longer tails.

• As the d.f. n approaches infinity, the t-distribution approaches the normal distribution.

F-Distribution

• If $y_1 \sim \chi^2_{n_1}$ and $y_2 \sim \chi^2_{n_2}$ and $y_1$ and $y_2$ are independent, $Z = (y_1/n_1)/(y_2/n_2)$ has the F-distribution with d.f. $n_1$ and $n_2$.

• We write this as $Z \sim F_{n_1, n_2}$.

• The first subscript, n1, refers to the d.f. of the numerator, and the second subscript, n2, refers to the d.f. of the denominator.

• The F-distribution is thus the distribution of the ratio of two independent averaged χ² variables.
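The t- and F-distributions can both be simulated directly from their definitions in terms of normal and χ² variables. The sketch below does this and compares a few sample moments with standard results (variance n/(n−2) for t with n > 2 d.f., mean n2/(n2−2) for F with n2 > 2). The helper names, seed, and degrees of freedom are illustrative assumptions.

```python
import math
import random
import statistics

def chi_square_draw(df, rng):
    """One chi-square draw: the sum of squares of df independent N(0, 1) draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def t_draw(n, rng):
    """Standard normal divided by the square root of an averaged chi-square."""
    x = rng.gauss(0.0, 1.0)
    y = chi_square_draw(n, rng)
    return x / math.sqrt(y / n)

def f_draw(n1, n2, rng):
    """Ratio of two independent averaged chi-square variables."""
    return (chi_square_draw(n1, rng) / n1) / (chi_square_draw(n2, rng) / n2)

rng = random.Random(1)   # fixed seed so the run is repeatable
reps = 20_000
t_sample = [t_draw(10, rng) for _ in range(reps)]
f_sample = [f_draw(4, 20, rng) for _ in range(reps)]

t_mean = statistics.fmean(t_sample)
t_var = statistics.variance(t_sample)
f_mean = statistics.fmean(f_sample)
print(t_mean)  # near 0, since the t-distribution is symmetric
print(t_var)   # near 10 / 8 = 1.25, larger than 1 because of the longer tails
print(f_mean)  # near 20 / 18, about 1.11
```

The sample variance above 1 for the t draws is one visible sign of the longer tails relative to the standard normal.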

2.5 Classical Statistical Inference

• Statistical inference is the area that describes the procedures by which we use the observed data to draw conclusions about the population from which the data came or about the process by which the data were generated.

• Broadly speaking, statistical inference can be classified under two headings:

– Classical inference

– Bayesian inference.

• In Bayesian inference we combine sample information with prior information.

• Suppose that we draw a random sample y1, y2, …, yn of size n from a normal population with mean μ and variance σ² (assumed known), and we want to make inferences about μ.

• In classical inference we take the sample mean $\bar{y}$ as our estimate of μ. Its variance is $\sigma^2/n$.

• The inverse of this variance is known as the sample precision. Thus the sample precision is $n/\sigma^2$.

• In Bayesian inference we have prior information on μ.

• This is expressed in terms of a probability distribution known as the prior distribution.

• Suppose that the prior distribution is normal with mean $\mu_0$ and variance $\sigma_0^2$, that is, precision $1/\sigma_0^2$.

• We now combine this with the sample information to obtain what is known as the posterior distribution of μ.

• This distribution can be shown to be normal.

• Its mean is a weighted average of the sample mean $\bar{y}$ and the prior mean $\mu_0$, weighted by the sample precision and prior precision, respectively. Thus

$$\mu\,(\text{Bayesian}) = \frac{w_1 \bar{y} + w_2 \mu_0}{w_1 + w_2}$$

where

$$w_1 = \frac{n}{\sigma^2} = \text{sample precision}, \qquad w_2 = \frac{1}{\sigma_0^2} = \text{prior precision}$$

• Also, the precision (or inverse of the variance) of the posterior distribution of μ is $w_1 + w_2$, that is, the sum of sample precision and prior precision.

• For instance, if the sample mean is 20 with variance 4 and the prior mean is 10 with variance 2, we have

$$\text{posterior mean} = \frac{\frac{1}{4}(20) + \frac{1}{2}(10)}{\frac{1}{4} + \frac{1}{2}} = \frac{10}{3/4} = 13.33$$

$$\text{posterior variance} = \left(\frac{1}{4} + \frac{1}{2}\right)^{-1} = \frac{4}{3} = 1.33$$

• The posterior mean will lie between the sample mean and the prior mean.

• The posterior variance will be less than both the sample and prior variance.
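The precision-weighted combination of sample and prior information is easy to reproduce in a few lines. The sketch below uses the numbers from this example (sample mean 20 with variance 4, prior mean 10 with variance 2); the function name `posterior_normal` is a label chosen for this illustration.

```python
def posterior_normal(sample_mean, sample_var, prior_mean, prior_var):
    """Posterior mean and variance for a normal mean with a normal prior.

    sample_var is the variance of the sample mean (sigma^2 / n) and
    prior_var is the variance of the prior distribution.
    """
    w1 = 1.0 / sample_var   # sample precision
    w2 = 1.0 / prior_var    # prior precision
    post_mean = (w1 * sample_mean + w2 * prior_mean) / (w1 + w2)
    post_var = 1.0 / (w1 + w2)   # posterior precision is w1 + w2
    return post_mean, post_var

post_mean, post_var = posterior_normal(20.0, 4.0, 10.0, 2.0)
print(round(post_mean, 2), round(post_var, 2))  # prints 13.33 1.33
```

Because the prior here is more precise than the sample, the posterior mean sits closer to 10 than to 20.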

• 1. Point estimation.

• 2. Interval estimation.

• 3. Testing of hypotheses.

2.6 Properties of Estimators

• There are some desirable properties of estimators that are often mentioned in the book.

• These are:

– 1. Unbiasedness.

– 2. Efficiency.

– 3. Consistency.

• The first two are small-sample properties. The third is a large-sample property.

Unbiasedness

• An estimator g is said to be unbiased for θ if $E(g) = \theta$, that is, the mean of the sampling distribution of g is equal to θ.

• What this says is that if we calculate g for each sample and repeat this process infinitely many times, the average of all these estimates will be equal to θ.

• If $E(g) \neq \theta$, then g is said to be biased and we refer to $E(g) - \theta$ as the bias.

• Unbiasedness is a desirable property but not at all costs.

• Suppose that we have two estimators g1 and g2, and g1 can assume values far away from θ and yet have its mean equal to θ, whereas g2 always ranges close to θ but has its mean slightly away from θ.

• Then we might prefer g2 to g1 because it has smaller variance even though it is biased.

• If the variance of the estimator is large, we can have some unlucky samples where our estimate is far from the true value.

• Thus the second property we want our estimators to have is a small variance.

• One criterion that is often suggested is the mean-squared error (MSE), which is defined by

$$\text{MSE} = (\text{bias})^2 + \text{variance}$$
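The trade-off between bias and variance can be seen in a small simulation. The sketch below compares two estimators of the variance of a normal population: g1 divides the sum of squared deviations by n − 1 (unbiased) and g2 divides by n + 1 (biased, but with smaller variance). The choice of these two estimators, the sample size, and the seed are assumptions made for illustration.

```python
import random
import statistics

def summarize(estimates, theta):
    """Empirical bias, variance, and mean-squared error of a list of estimates."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - theta
    variance = statistics.pvariance(estimates, mu=mean_est)
    mse = statistics.fmean([(g - theta) ** 2 for g in estimates])
    return bias, variance, mse

rng = random.Random(42)          # fixed seed so the run is repeatable
theta, n, reps = 1.0, 5, 20_000  # true variance, sample size, replications

g1, g2 = [], []
for _ in range(reps):
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ybar = sum(y) / n
    ss = sum((v - ybar) ** 2 for v in y)
    g1.append(ss / (n - 1))   # unbiased estimator of the variance
    g2.append(ss / (n + 1))   # biased estimator with smaller spread

bias1, var1, mse1 = summarize(g1, theta)
bias2, var2, mse2 = summarize(g2, theta)
print(bias1, var1, mse1)
print(bias2, var2, mse2)   # larger bias in absolute value, but smaller MSE
```

In each row the printed MSE equals the squared bias plus the variance, and the biased estimator g2 ends up with the smaller MSE in this setup.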

Efficiency

• The property of efficiency is concerned with the variance of estimators.

• Obviously, it is a relative concept and we have to confine ourselves to a particular class.

• If g is an unbiased estimator and it has the minimum variance in the class of unbiased estimators, g is said to be an efficient estimator.

• We say that g is an MVUE (a minimum-variance unbiased estimator).

• If we confine ourselves to linear estimators, that is, $g = c_1 y_1 + c_2 y_2 + \cdots + c_n y_n$, where the c’s are constants which we choose so that g is unbiased and has minimum variance, g is called a BLUE (a best linear unbiased estimator).
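To see how the choice of the constants c affects a linear estimator, consider the special case (an assumption for this sketch, not stated above) where the y's are independent with a common mean μ and common variance σ². Then g is unbiased for μ when the c's sum to 1, and its variance is σ² times the sum of the squared c's. The helper names and weights below are illustrative.

```python
def is_unbiased(c, tol=1e-12):
    """Under the common-mean assumption, g is unbiased when the weights sum to 1."""
    return abs(sum(c) - 1.0) < tol

def linear_estimator_variance(c, sigma2):
    """Variance of g = sum(c_i * y_i) for independent y_i with variance sigma2."""
    return sigma2 * sum(ci ** 2 for ci in c)

sigma2 = 1.0
equal_weights = [0.25, 0.25, 0.25, 0.25]   # the sample mean for n = 4
unequal_weights = [0.4, 0.3, 0.2, 0.1]     # another unbiased linear estimator

var_equal = linear_estimator_variance(equal_weights, sigma2)
var_unequal = linear_estimator_variance(unequal_weights, sigma2)
print(is_unbiased(equal_weights), var_equal)      # True, variance 0.25
print(is_unbiased(unequal_weights), var_unequal)  # True, variance about 0.30
```

Both sets of weights give an unbiased estimator, but the equal weights give the smaller variance, which is why the sample mean is the BLUE of μ in this simple setting.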

Consistency

• Often it is not possible to find estimators that have desirable small-sample properties such as unbiasedness and efficiency.

• In such cases, it is customary to look at desirable properties in large samples.

• These are called asymptotic properties.

– Three such properties often mentioned are consistency, asymptotic unbiasedness, and asymptotic efficiency.

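• Suppose that $\hat{\theta}_n$ is the estimator of θ based on a sample of size n.

• Then the sequence of estimators $\hat{\theta}_n$ is called a consistent sequence if for any arbitrarily small positive numbers ε and δ there is a sample size $n_0$ such that

$$\text{Prob}\left[\,|\hat{\theta}_n - \theta| < \varepsilon\,\right] > 1 - \delta \quad \text{for all } n > n_0$$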

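• That is, by increasing the sample size n the estimator $\hat{\theta}_n$ can be made to lie arbitrarily close to the true value of θ with probability arbitrarily close to 1.

• This statement is also written as

$$\lim_{n \to \infty} P\left(|\hat{\theta}_n - \theta| < \varepsilon\right) = 1$$

and more briefly we write it as

$$\hat{\theta}_n \xrightarrow{\;p\;} \theta \quad \text{or} \quad \operatorname{plim} \hat{\theta}_n = \theta$$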

• A sufficient condition for $\hat{\theta}_n$ to be consistent is that the bias and variance should both tend to zero as the sample size increases.

• This condition is often useful to check in practice, but it should be noted that the condition is not necessary.

• An estimator can be consistent even if the bias does not tend to zero.

• An example: unbiased and consistent estimate
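Consistency can be illustrated by simulation using the sample mean of normal data, which is both unbiased and consistent for the population mean. The sketch below estimates, for several sample sizes, the probability that the sample mean lies within ε of θ. The values of θ and ε, the seed, and the helper name `coverage` are arbitrary choices for this example.

```python
import random

def coverage(n, theta, eps, reps, rng):
    """Fraction of replications in which the sample mean is within eps of theta."""
    hits = 0
    for _ in range(reps):
        ybar = sum(rng.gauss(theta, 1.0) for _ in range(n)) / n
        if abs(ybar - theta) < eps:
            hits += 1
    return hits / reps

rng = random.Random(7)   # fixed seed so the run is repeatable
results = {n: coverage(n, theta=2.0, eps=0.2, reps=1000, rng=rng)
           for n in (10, 100, 1000)}
for n, share in results.items():
    print(n, share)   # the share rises toward 1 as n grows
```

For a fixed ε, the estimated probability climbs toward 1 as n increases, which is what the definition of a consistent sequence requires.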