binomial and normal distributions - faculty.chicagobooth.edu 3.pdf · a sum of two random variables...


Binomial and normal distributions

Business Statistics 41000

Fall 2015


Topics

1. Sums of random variables

2. Binomial distribution

3. Normal distribution

4. Vignettes


Topic: sums of random variables

Sums of random variables are important for two reasons:

1. Because we often care about aggregates and totals (sales, revenue, employees, etc.).

2. Because averages are basically sums, and probabilities are basically averages (of dummy variables), when we go to estimate probabilities, we will end up using sums of random variables a lot.

This second point is the topic of the next lecture. For now, we focus on the direct case.

A sum of two random variables

Suppose X is a random variable denoting the profit from one wager and Y is a random variable denoting the profit from another wager.

If we want to consider our total profit, we may consider the random variable that is the sum of the two wagers, S = X + Y.

To determine the distribution of S, we must first know the joint distribution of (X, Y).

A sum of two random variables

Suppose that (X ,Y ) has the following joint distribution:

          -$200    $100    $200
$0         0       1/9     3/9
$100       1/9     2/9     2/9

So S can take the values {−200, −100, 100, 200, 300}.

Notice that there are two ways that S can be $200.

A sum of two random variables

We can directly determine the distribution of S as:

Event                         s        P(S = s)
-$200 + $0                    -$200    0
-$200 + $100                  -$100    1/9
$100 + $0                     $100     1/9
$100 + $100 or $200 + $0      $200     2/9 + 3/9 = 5/9
$200 + $100                   $300     2/9

When determining the distribution of sums of random variables, we lose information about individual values and aggregate the probability of events giving the same sum.
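The aggregation step can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the names `joint` and `sum_distribution` are assumptions made for the sketch, not part of the slides.

```python
from collections import defaultdict
from fractions import Fraction

# Joint probabilities from the table, keyed by (column value, row value).
# Which wager is called X and which Y does not affect the sum.
joint = {
    (-200, 0): Fraction(0), (100, 0): Fraction(1, 9), (200, 0): Fraction(3, 9),
    (-200, 100): Fraction(1, 9), (100, 100): Fraction(2, 9), (200, 100): Fraction(2, 9),
}

def sum_distribution(joint_probs):
    """Aggregate joint probabilities over all pairs that give the same sum."""
    dist = defaultdict(Fraction)
    for (a, b), prob in joint_probs.items():
        dist[a + b] += prob
    return dict(dist)

dist_s = sum_distribution(joint)
for s in sorted(dist_s):
    print(s, dist_s[s])
```

Exact fractions are used so that the two ways of reaching $200 add up to exactly 5/9.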


Topic: binomial distribution

A binomial random variable can be constructed as the sum of independent Bernoulli random variables.

Familiarity with the binomial distribution eases many practical probability calculations.

See OpenIntro sections 3.4 and 3.6.4.

Sums of Bernoulli RVs

When rolling two dice, what is the probability of rolling two ones?

By independence we can calculate this probability as

$$P(1, 1) = \frac{1}{6}\left(\frac{1}{6}\right) = \frac{1}{36}.$$

Now with three dice, what is the probability of rolling exactly two 1’s?

Sums of Bernoulli RVs (cont’d)

The event A = “rolling a one” can be described as a Bernoulli random variable with p = 1/6.

We can denote the three independent rolls by writing

$$X_i \overset{iid}{\sim} \text{Bernoulli}(p), \quad i = 1, 2, 3.$$

The notation iid is shorthand for “independent and identically distributed”.

Determining the probability of rolling exactly two 1’s can be done by considering the random variable Y = X1 + X2 + X3 and asking for P(Y = 2).

Sums of Bernoulli random variables (cont’d)

Consider the distribution of Y = X1 + X2 + X3.

Event                y    P(Y = y)
000                  0    $(1-p)^3$
001 or 100 or 010    1    $(1-p)(1-p)p + p(1-p)(1-p) + (1-p)p(1-p)$
011 or 110 or 101    2    $(1-p)p^2 + p^2(1-p) + p(1-p)p$
111                  3    $p^3$

Remember that for this example p = 1/6.

Sums of Bernoulli random variables (cont’d)

Determining the probability of a certain number of successes requires knowing 1) the probability of each individual success and 2) the number of ways that number of successes can arise.

Event                y    P(Y = y)
000                  0    $(1-p)^3$
001 or 100 or 010    1    $3(1-p)^2 p$
011 or 110 or 101    2    $3(1-p)p^2$
111                  3    $p^3$

We find that $P(Y = 2) = 3p^2(1-p) = 3(1/36)(5/6) = 5/(6 \cdot 12) = 5/72$.
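As a check on the counting argument, the outcomes can be enumerated by brute force. The sketch below is illustrative; the helper name `prob_exactly` is hypothetical.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 6)  # probability that a single die shows a one

def prob_exactly(k, n, p):
    """Sum the probabilities of every 0/1 outcome string with exactly k ones."""
    total = Fraction(0)
    for outcome in product((0, 1), repeat=n):
        if sum(outcome) == k:
            total += p ** k * (1 - p) ** (n - k)
    return total

print(prob_exactly(2, 3, p))  # 5/72
```

Each qualifying string (011, 110, 101) contributes the same probability, which is why the table can collapse them into a single term multiplied by 3.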


Sums of Bernoulli random variables (cont’d)

What if we had four rolls, and the probability of success was 1/3?

0000  1000  0100  1100
0010  1010  0110  1110
0001  1001  0101  1101
0011  1011  0111  1111

Sums of Bernoulli random variables (cont’d)

Summing up the probabilities for each of the values of Y , we find:

y    P(Y = y)
0    $(1-p)^4$
1    $4(1-p)^3 p$
2    $6(1-p)^2 p^2$
3    $4(1-p)p^3$
4    $p^4$

Substituting p = 1/3 we can now find P(Y = y) for any y = 0, 1, 2, 3, 4.

Definition: N choose y

The number of ways we can arrange y successes among N trials can be calculated efficiently by a computer. We denote this number with a special expression.

N choose y

The notation

$$\binom{N}{y} = \frac{N!}{(N-y)!\,y!}$$

designates the number of ways that y items can be assigned to N possible positions.

This notation can be used to summarize the entries in the previous tables for various values of N and y.

Definition: Binomial distribution

Binomial distribution

A random variable Y has a binomial distribution with parameters N and p if its probability distribution function is of the form:

$$p(y) = \binom{N}{y} p^y (1-p)^{N-y}$$

for integer values of y between 0 and N.
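The definition translates directly into code. The following is a minimal sketch using Python's standard `math.comb` for "N choose y"; the function name `binomial_pmf` is an assumption made for the illustration.

```python
from math import comb

def binomial_pmf(y, n, p):
    """P(Y = y) for Y ~ Bin(n, p): comb(n, y) * p**y * (1 - p)**(n - y)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)

# Four trials with success probability 1/3, as in the earlier four-roll example.
pmf = [binomial_pmf(y, 4, 1 / 3) for y in range(5)]
print([round(v, 4) for v in pmf])
```

The five values should sum to one, which is a quick sanity check on any hand-built table of this kind.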


Example: drunk batter

What is the probability that our alcoholic major-leaguer gets more than 2 hits in a game in which he has 5 at bats?

Let X = “number of hits”. We model X as a binomial random variable with parameters N = 5 and p = 0.316.

x    P(X = x)
0    $(1-p)^5$
1    $5(1-p)^4 p$
2    $10(1-p)^3 p^2$
3    $10(1-p)^2 p^3$
4    $5(1-p)p^4$
5    $p^5$

Substituting p = 0.316 we calculate P(X > 2) = 0.185.
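The substitution can be reproduced numerically. This is a short sketch under the slide's model (N = 5, p = 0.316); the variable names are illustrative.

```python
from math import comb

n, p = 5, 0.316
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# "More than 2 hits" means x = 3, 4 or 5.
p_more_than_two = sum(pmf[3:])
print(round(p_more_than_two, 3))
```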


Example: winning a best-of-seven play-off

Assume that the Chicago Bulls have probability 0.4 of beating the Miami Heat in any given game and that the outcomes of individual games are independent.

What is the probability that the Bulls win a seven game series against the Heat?

Example: winning a best-of-seven play-off (cont’d)

Consider the number of games won by the Bulls over a full seven games against the Heat. We model this as a binomial random variable Y with parameters N = 7 and p = 0.4, which we express with the notation

Y ∼ Bin(7, 0.4).

The symbol “∼” is read “distributed as”. “Bin” is short for “binomial”. The numbers which follow are the values of the two binomial parameters, the number of independent Bernoulli trials (N) and the probability of success at each trial (p).

Example: winning a best-of-seven play-off (cont’d)

Although we never see all seven games played (because the series stops as soon as one team wins four games) we note that in this expanded event space

- any event with at least four Bulls wins corresponds to an observable Bulls series win,

- any event corresponding to an observed Bulls series win has at least four total Bulls wins.

Example: winning a best-of-seven play-off (cont’d)

For example, the observable sequence 011011 (where a 1 stands for a Bulls win) has two possible completions, 0110110 or 0110111. Any hypothetical games played beyond the series-ending fourth win can only increase the total number of wins tallied by Y.

Conversely, the sequence 1010111 is an event corresponding to Y = 5 and we can associate it with the observable subsequence 101011, a Bulls series win in six games.

Example: winning a best-of-seven play-off (cont’d)

Therefore, the events corresponding to “Bulls win the series” are precisely those corresponding to Y ≥ 4.

We may conclude that the probability of a series win for the Bulls is

P(Y ≥ 4) = P(Y = 4) + P(Y = 5) + P(Y = 6) + P(Y = 7)

= 0.29.

Example: winning a best-of-seven play-off (cont’d)

We can arrive at this answer without reference to the binomial random variable Y if we are willing to do our own counting.

$$P(\text{Bulls series win}) = p^4 + \binom{4}{3} p^4 (1-p) + \binom{5}{3} p^4 (1-p)^2 + \binom{6}{3} p^4 (1-p)^3$$

$$= p^4 + \binom{4}{1} p^4 (1-p) + \binom{5}{2} p^4 (1-p)^2 + \binom{6}{3} p^4 (1-p)^3$$

$$= 0.29.$$

This calculation explicitly accounts for the fact that Bulls series wins necessarily conclude with a Bulls game win.
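Both routes to the series probability can be compared numerically. The sketch below assumes the slide's p = 0.4; the names `binomial_route` and `direct_route` are hypothetical labels for the two calculations.

```python
from math import comb

p = 0.4  # probability the Bulls win any single game

# Route 1: Y ~ Bin(7, p), and a series win means Y >= 4.
binomial_route = sum(comb(7, y) * p**y * (1 - p) ** (7 - y) for y in range(4, 8))

# Route 2: the series ends on the fourth Bulls win after k losses (k = 0..3);
# the first 3 + k games contain exactly 3 Bulls wins in any order.
direct_route = sum(comb(3 + k, 3) * p**4 * (1 - p) ** k for k in range(4))

print(round(binomial_route, 4), round(direct_route, 4))
```

The two numbers should agree up to floating-point rounding, which is the content of the equivalence argued on the preceding slides.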


Example: double lottery winners

In 1971, Jane Adams won the lottery twice in one year! If you read of a double winner in your daily newspaper, how surprised should you be?

To answer this question we need to make some assumptions. Consider 40 state lotteries. Assume that each one has a 1 in 18 million chance of winning. Assume that each one has 1 million people that play it daily (say, 250 times a year), and that each one buys 5 tickets.

Given these conditions, what is the probability that in one calendar year there is at least one double winner?

Example: double lottery winners (cont’d)

Let Xi be the random variable denoting how many winning tickets person i has:

$$X_i \sim \text{Binomial}(5(250),\; p = (1/18) \times 10^{-6}).$$

Now let Yi be the dummy variable for the event Xi > 1, which is the event that person i is a double (or more) winner:

Yi ∼ Bernoulli(q).

We can compute $q = 1 - \Pr(X_i = 0) - \Pr(X_i = 1) = 2.4 \times 10^{-9}$.

Example: double lottery winners (cont’d)

To account for the million people playing the lottery in each of 40 states, we consider $Z = \sum_{i=1}^{N} Y_i$, which is another binomial random variable:

$$Z \sim \text{Binomial}(N = 4 \times 10^{7},\; q).$$

Finally, the probability that Z > 0 can be found as

$$1 - P(Z = 0) = 1 - (1 - q)^N \approx 1/11.$$
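The two-stage calculation can be sketched as follows. The inputs are the slide's assumptions (5 tickets on each of 250 days, a 1 in 18 million chance per ticket, one million players in each of 40 lotteries); the variable names and the choice to sum the upper tail directly are illustrative decisions made for numerical stability.

```python
from math import comb, expm1, log1p

tickets = 5 * 250          # tickets per person per year
p = 1 / 18e6               # chance that a single ticket wins
players = 40 * 10**6       # one million players in each of 40 lotteries

# q = P(X_i >= 2). Terms beyond a handful of wins are negligible, and
# summing the tail directly avoids subtracting numbers very close to 1.
q = sum(comb(tickets, k) * p**k * (1 - p) ** (tickets - k) for k in range(2, 11))

# P(Z > 0) = 1 - (1 - q)^N, computed stably via log1p and expm1.
p_at_least_one = -expm1(players * log1p(-q))
print(q, p_at_least_one)
```

Computing 1 − P(X = 0) − P(X = 1) in floating point would also work here, but it loses several digits because the subtracted terms are nearly 1.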

Not so rare!


Example: rural vs. urban hospitals

About as many boys as girls are born in hospitals. In a small Country Hospital only a few babies are born every week. In the urban center, many babies are born every week at City General. Say that a normal week is one where between 45% and 55% of the babies are female. An unusual week is one where more than 55% are girls or more than 55% are boys.

Which of the following is true?

- Unusual weeks occur equally often at Country Hospital and at City General.

- Unusual weeks are more common at Country Hospital than at City General.

- Unusual weeks are less common at Country Hospital than at City General.

Example: rural vs. urban hospital (cont’d)

We can model the births in the two hospitals as two independent random variables. Let X = “number of baby girls born at Country Hospital” and Y = “number of baby girls born at City General”.

X ∼ Binomial(N1, p)

Y ∼ Binomial(N2, p)

Assume that p = 0.5. The key difference is that N1 is much smaller than N2. To illustrate, assume that N1 = 20 and N2 = 500.

Example: rural vs. urban hospital (cont’d)

During a usual week at the rural hospital between 0.45N1 = 0.45(20) = 9 and 0.55N1 = 0.55(20) = 11 baby girls are born.

The probability of a usual week is P(9 ≤ X ≤ 11) ≈ 0.50, so the probability of an unusual week is

1 − P(9 ≤ X ≤ 11) = P(X < 9) + P(X > 11) ≈ 0.5.

Note: satisfying the condition X < 9 is the same as not satisfying the condition X ≥ 9; strict versus non-strict inequalities make a difference.

Example: rural vs. urban hospital (cont’d)

[Figure: Country Hospital. Horizontal axis: births (0 to 20); vertical axis: probability.]

Example: rural vs. urban hospital (cont’d)

In a usual week at the city hospital between 0.45N2 = 0.45(500) = 225 and 0.55N2 = 0.55(500) = 275 baby girls are born.

Then the probability of a usual week is P(225 ≤ Y ≤ 275) = 0.978, so the probability of an unusual week is

1 − P(225 ≤ Y ≤ 275) = P(Y < 225) + P(Y > 275) = 0.022.
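The two hospital probabilities can be computed exactly by summing the binomial mass function. This sketch uses the illustrative sizes from the slides (20 and 500 births, p = 0.5); the helper name `prob_between` is hypothetical.

```python
from math import comb

def prob_between(lo, hi, n, p=0.5):
    """P(lo <= X <= hi) for X ~ Bin(n, p), summing the pmf exactly."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))

country_usual = prob_between(9, 11, 20)
city_usual = prob_between(225, 275, 500)

# Probability of an unusual week at each hospital.
print(round(1 - country_usual, 3), round(1 - city_usual, 3))
```

The comparison shows why unusual weeks are far more common at the smaller hospital: the same 45% to 55% band covers a much larger share of the probability when the number of births is large.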


Example: rural vs. urban hospital (cont’d)

[Figure: City General. Horizontal axis: births (200 to 290); vertical axis: probability.]

Variance of a sum of independent random variables

A useful fact:

Variance of linear combinations of independent random variables

A weighted sum/difference of independent random variables $Y = \sum_{i=1}^{m} a_i X_i$ has variance

$$V(Y) = \sum_{i=1}^{m} a_i^2 V(X_i).$$

How can this be used to derive the expression for the variance of a binomial random variable?

Variance of binomial random variable

Variance of a binomial random variable

A binomial random variable X with parameters N and p has variance

V(X ) = Np(1− p).
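One way to arrive at this formula is sketched below. It assumes X is written as a sum of N independent Bernoulli(p) variables $X_1, \dots, X_N$ and applies the variance rule for independent sums with every weight equal to 1.

```latex
\begin{aligned}
V(X_i) &= E(X_i^2) - E(X_i)^2 = p - p^2 = p(1-p), \\
V(X) &= V\!\left(\sum_{i=1}^{N} X_i\right) = \sum_{i=1}^{N} 1^2 \, V(X_i) = N p (1-p).
\end{aligned}
```

The first line uses the fact that a 0/1 variable satisfies $X_i^2 = X_i$, so $E(X_i^2) = p$.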


Variance of a proportion

By dividing through by the total number of babies born each week we can consider the proportion of girl babies. Define the random variables

$$P_1 = \frac{X}{N_1} \quad \text{and} \quad P_2 = \frac{Y}{N_2}.$$

Then it follows that

$$V(P_1) = \frac{V(X)}{N_1^2} = \frac{N_1 p(1-p)}{N_1^2} = p(1-p)/N_1$$

and

$$V(P_2) = \frac{V(Y)}{N_2^2} = \frac{N_2 p(1-p)}{N_2^2} = p(1-p)/N_2.$$

Law of Large Numbers

An arithmetical average of random variables is itself a random variable.

As more and more individual random variables are averaged up, the variance decreases but the mean stays the same.

As a result, the distribution of the averaged random variable becomes more and more concentrated around its expected value.
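The shrinking spread can be tabulated for a sample proportion. The sketch below assumes p = 0.7 and the sample sizes 10, 20, 50, 150 and 300; the function name `proportion_sd` is hypothetical.

```python
from math import sqrt

def proportion_sd(n, p):
    """Standard deviation of the sample proportion X / n for X ~ Bin(n, p)."""
    return sqrt(p * (1 - p) / n)

p = 0.7
for n in (10, 20, 50, 150, 300):
    # The mean of X / n is p for every n; only the spread changes.
    print(n, p, round(proportion_sd(n, p), 4))
```

The standard deviation falls like 1 / sqrt(n), so quadrupling the sample size halves the spread around p.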

Law of Large Numbers

[Figures: distribution of the sample proportion for p = 0.7 with N = 10, N = 20, N = 50, N = 150 and N = 300, one plot per sample size.]

Example: Schlitz Super Bowl taste test


Bell curve approximation to binomial

The binomial distributions can be approximated by a smooth density function for large N.

[Figure: Normal approximation for binomial distribution with N = 20, p = 0.5. Horizontal axis: x; vertical axis: probability mass / density.]

[Figure: Normal approximation for binomial distribution with N = 60, p = 0.1. Horizontal axis: x; vertical axis: probability mass / density.]

[Figure: Normal approximation for binomial distribution with N = 500, p = 0.8. Horizontal axis: x; vertical axis: probability mass / density.]

What are some reasons that very small p or small N lead to bad approximations?

Central limit theorem

The normal distribution can be “justified” via its relationship to the binomial distribution. Roughly: if a random outcome is the combined result of many individual random events, its distribution will follow a normal curve.

The quincunx or Galton box is a device which physically simulates such a scenario using ball bearings and pins stuck in a board.

PLAY VIDEO

The CLT can be stated more precisely, but the practical impact is just this: random variables which arise as sums of many other random variables (not necessarily normally distributed) tend to be normally distributed.

Normal distributions

The normal family of densities has two parameters, typically denoted µ and σ^2, which govern the location and scale, respectively.

[Figure: Gaussian densities for various location parameters. Horizontal axis: x; vertical axis: f(x).]

Normal distributions (cont’d)

I will use the terms normal distribution, normal density and normal random variable more or less interchangeably.

[Figure: Mean-zero Gaussian densities with differing scale parameters. Horizontal axis: x; vertical axis: f(x).]

The normal distribution is also called the Gaussian distribution or the bell curve.

Normal means and variances

Mean and variance of a normal random variable

A normal random variable X, with parameters µ and σ^2, is denoted

X ∼ N(µ, σ^2).

The mean and variance of X are

E(X) = µ,

V(X) = σ^2.

The density function is symmetric and unimodal, so the median and mode of X are also given by the location parameter µ. The standard deviation of X is given by σ.

Normal approximation to binomial

The binomial distributions can be approximated by a normal distribution.

Normal approximation to the binomial

A Bin(N, p) distribution can be approximated by a N(Np, Np(1 − p)) distribution for N “large enough”.

Notice that this just “matches” the mean and variance of the two distributions.
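The quality of the matched normal can be checked on a small case. The sketch below compares the exact binomial probability of 9 to 11 successes in 20 trials with the normal value; the continuity correction (widening the range to 8.5 through 11.5) is an added assumption of this sketch, not something the slides state.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mean, var):
    """CDF of N(mean, var) via the error function."""
    return 0.5 * (1 + erf((x - mean) / sqrt(2 * var)))

n, p = 20, 0.5
mean, var = n * p, n * p * (1 - p)   # matched mean Np and variance Np(1 - p)

exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(9, 12))
# Continuity correction (assumption): treat {9, 10, 11} as the interval [8.5, 11.5].
approx = normal_cdf(11.5, mean, var) - normal_cdf(8.5, mean, var)
print(round(exact, 4), round(approx, 4))
```

Trying the same comparison with a very small p or a small N is a direct way to see where the approximation starts to break down.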


Linear transformation of normal RVs

We can add a fixed number to a normal random variable and/or multiply it by a fixed number and get a new normal random variable. This sort of operation is called a linear transformation.

Linear transformation of normal random variables

If X ∼ N(µ, σ^2) and Y = a + bX for fixed numbers a and b, then Y ∼ N(a + bµ, b^2σ^2).

For example, if X ∼ N(1, 2) and Y = 3 − 5X, then Y ∼ N(−2, 50).

Standard normal RV

Standard normal

A standard normal random variable is one with mean 0 and variance 1. It is often denoted by the letter Z:

Z ∼ N(0, 1).

We can write any normal random variable as a linear transformation of a standard normal RV. For normal random variable X ∼ N(µ, σ^2), we can write

X = µ + σZ.

The “empirical rule”

It is convenient to characterize where the “bulk” of the probability mass of a normal distribution resides by providing an interval, in terms of standard deviations, about the mean.

[Figure: N(µ, σ) density, with 68% of the probability between µ − σ and µ + σ. Horizontal axis: x, from µ − 4σ to µ + 4σ; vertical axis: density.]

The “empirical rule” (cont’d)

The widespread application of the normal distribution has led this to be dubbed the empirical rule.

[Figure: N(µ, σ) density, with 95% of the probability between µ − 2σ and µ + 2σ. Horizontal axis: x, from µ − 4σ to µ + 4σ; vertical axis: density.]

The “empirical rule” (cont’d)

It is, for obvious reasons, sometimes called the 68-95-99.7 rule.

[Figure: N(µ, σ) density, with 99.7% of the probability between µ − 3σ and µ + 3σ. Horizontal axis: x, from µ − 4σ to µ + 4σ; vertical axis: density.]

The “empirical rule” (cont’d)

To revisit some earlier examples:

- 68% of Chicago daily highs in the winter season are between 19 and 48 degrees.

- 95% of NBA players are between 6ft and 7ft 2in.

- In 99.7% of weeks, the proportion of baby girls born at City General is between 0.4985 and 0.5015.
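The three percentages behind the rule can be recovered from the standard normal distribution. A minimal sketch, using the identity P(|Z| ≤ k) = erf(k / sqrt(2)); the function name `within_k_sd` is hypothetical.

```python
from math import erf, sqrt

def within_k_sd(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal X."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * within_k_sd(k), 1))
```

Because any normal variable can be written as µ + σZ, the same three probabilities apply regardless of the particular mean and standard deviation.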


Sums of normal random variables

Weighted sums of normal random variables are also normally distributed.

For example if

X1 ∼ N(5, 20) and X2 ∼ N(1, 0.5)

then for Y = 0.1X1 + 0.9X2

Y ∼ N(m, v).

where m = 0.1(5) + 0.9(1) = 1.4 and v = 0.1^2(20) + 0.9^2(0.5) = 0.605.

Linear combinations of normal RVs

Linear combinations of independent normal random variables

For i = 1, . . . , n, let

$$X_i \sim N(\mu_i, \sigma_i^2), \quad \text{independently}.$$

Define $Y = \sum_{i=1}^{n} a_i X_i$ for weights a1, a2, . . . , an. Then

Y ∼ N(m, v)

where

$$m = \sum_{i=1}^{n} a_i \mu_i \quad \text{and} \quad v = \sum_{i=1}^{n} a_i^2 \sigma_i^2.$$

Example: two-stock portfolio

Consider two stocks, A and B, with annual returns (in percent of investment) distributed according to normal distributions

XA ∼ N(5, 20) and XB ∼ N(1, 0.5).

What fraction of our investment should we put into stock A, with the remainder put in stock B?

Example: two-stock portfolio (cont’d)

For a given fraction α, the total return on our portfolio is

Y = αXA + (1− α)XB

with distribution

Y ∼ N(m, v).

where m = 5α + (1 − α) and v = 20α^2 + 0.5(1 − α)^2.

Example: two-stock portfolio (cont’d)

Suppose we want to find α so that P(Y ≤ 0) is as small as possible.

[Figure: Two-stock portfolio. Densities of Stock A, Stock B and several portfolios. Horizontal axis: percent return; vertical axis: density.]

The blue distributions correspond to varying values of α.

Example: two-stock portfolio (cont’d)

We can plot the probability of a loss as a function of α.

[Figure: Probability of a loss. Horizontal axis: α, from 0 to 1; vertical axis: probability.]

We see that this probability is minimized when α = 11% approximately. This is the LLN at work!
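The minimizing fraction can be located with a simple grid search. The sketch below assumes independent returns with the stated means and variances (5 and 20 for stock A, 1 and 0.5 for stock B); the function name `loss_probability` and the 0.01 grid spacing are illustrative choices.

```python
from math import erf, sqrt

def loss_probability(alpha):
    """P(Y <= 0) for Y = alpha*XA + (1 - alpha)*XB with independent returns."""
    m = 5 * alpha + (1 - alpha)
    v = 20 * alpha**2 + 0.5 * (1 - alpha) ** 2
    return 0.5 * (1 + erf((0 - m) / sqrt(2 * v)))

grid = [i / 100 for i in range(101)]          # alpha = 0.00, 0.01, ..., 1.00
best_alpha = min(grid, key=loss_probability)
print(best_alpha, round(loss_probability(best_alpha), 4))
```

A finer grid or a calculus argument would refine the answer, but the coarse search is enough to see that a small allocation to the riskier stock lowers the chance of a loss relative to holding either stock alone.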


Variance of a sum of correlated random variables

For correlated (dependent) random variables, we have a modified formula:

Variance of linear combinations of two correlated random variables

A weighted sum/difference of random variables $Y = a_1 X_1 + a_2 X_2$ has variance

$$V(Y) = a_1^2 V(X_1) + a_2^2 V(X_2) + 2 a_1 a_2 \, \text{Cov}(X_1, X_2).$$

There is a homework problem that asks you to find the variance of portfolios of stocks, as in the example above, for stocks which are related to one another (in a common industry, for example).

Vignettes

1. Differential dispersion

2. Average number of sex partners

3. Mean reversion

Vignette: a difference in dispersion

In this vignette we observe how selection (in the sense of evolution, or hiring, or admissions) can turn higher variability into over-representation. The analysis uses the ideas of random variables, distribution functions, and conditional probability.

For more background, read the article “Sex Ed” from the February 2005 issue of the New Republic (available at the course home page).


A difference in dispersion

Consider two groups of college graduates with “employee fitness scores” following the distributions shown below.

Distribution of Capabilities (probability of each score), Groups A and B:

Score     -5     -4     -3     -2     -1     0      1      2      3      4      5
Group A   0.043  0.051  0.064  0.085  0.128  0.256  0.128  0.085  0.064  0.051  0.043
Group B   0.003  0.008  0.023  0.063  0.171  0.464  0.171  0.063  0.023  0.008  0.003

These distributions have the same mean, the same median, and the samemode. But they differ in their dispersion, or variability.


A difference in dispersion (cont’d)

Let X denote the random variable recording the scores and let A and B denote membership in the respective groups.

[Figure: the Group A and Group B score distributions, repeated from the previous slide.]

V(X | A) = 5.87 and V(X | B) = 1.666.

The corresponding standard deviations are σ(X | A) = 2.42 and σ(X | B) = 1.29.
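These variances can be reproduced from the bar heights. The sketch below uses the probabilities as printed on the charts (rounded to three decimals, so Group A's values sum to 0.998 rather than exactly 1); the function and variable names are illustrative. Both distributions are symmetric about 0, so both means are 0.

```python
scores = list(range(-5, 6))
# Probabilities as printed on the bar charts (rounded to three decimals).
probs_a = [0.043, 0.051, 0.064, 0.085, 0.128, 0.256, 0.128, 0.085, 0.064, 0.051, 0.043]
probs_b = [0.003, 0.008, 0.023, 0.063, 0.171, 0.464, 0.171, 0.063, 0.023, 0.008, 0.003]


def mean_and_variance(xs, ps):
    """Mean and variance of a discrete distribution given values and probabilities."""
    mean = sum(x * p for x, p in zip(xs, ps))
    var = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))
    return mean, var


for name, ps in [("A", probs_a), ("B", probs_b)]:
    _, var = mean_and_variance(scores, ps)
    print(f"Group {name}: variance = {var:.3f}, sd = {var ** 0.5:.2f}")
```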


A difference in dispersion (cont’d)

But now consider only elite jobs, for which it is necessary that fitness score X ≥ 4.

[Figure: the Group A and Group B score distributions, repeated from the previous slides.]

We can use Bayes’ rule to calculate P(A | X ≥ 4) and P(B | X ≥ 4).


A difference in dispersion (cont’d)

If we assume a priori that P(A) = P(B) = 1/2, we find

P(A | X ≥ 4) = P(X ≥ 4 | A) P(A) / [P(X ≥ 4 | A) P(A) + P(X ≥ 4 | B) P(B)]

             = 0.094(0.5) / [0.094(0.5) + 0.012(0.5)]

             = 0.89.

Why don’t we need to calculate P(B | X ≥ 4) separately?
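The Bayes' rule calculation can be written as a small function. This is a minimal sketch: the function name and argument names are illustrative, and the tail probabilities 0.094 and 0.012 are taken directly from the calculation above.

```python
def posterior(prior_a, like_a, like_b):
    """P(A | evidence) for two exhaustive groups A and B, via Bayes' rule."""
    prior_b = 1.0 - prior_a
    evidence = like_a * prior_a + like_b * prior_b  # law of total probability
    return like_a * prior_a / evidence


# Tail probabilities P(X >= 4 | group) as used in the calculation above.
p_a_given_elite = posterior(prior_a=0.5, like_a=0.094, like_b=0.012)
p_b_given_elite = 1.0 - p_a_given_elite  # only two groups, so the posteriors sum to one

print(round(p_a_given_elite, 2), round(p_b_given_elite, 2))
```

Because every elite candidate belongs to exactly one of the two groups, the second posterior is just the complement of the first.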


Larry Summers and women-in-science

“Summers’s critics have repeatedly mangled his suggestion that innate differences might be one cause of gender disparities ... into the claim that they must be the only cause. And they have converted his suggestion that the statistical distributions of men’s and women’s abilities are not identical to the claim that all men are talented and all women are not–as if someone heard that women typically live longer than men and concluded that every woman lives longer than every man. . . .

In many traits, men show greater variance than women, and are disproportionately found at both the low and high ends of the distribution. Boys are more likely to be learning disabled or retarded but also more likely to reach the top percentiles in assessments of mathematical ability, even though boys and girls are similar in the bulk of the bell curve. . . .”

Steven Pinker in The New Republic


Example: gender and aptitudes revisited

Assume that job “aptitude” can be represented as a continuous random variable and that the distribution of scores differs by gender.

[Figure: “Aptitude distribution”. Densities of score (horizontal axis, from -6 to 6) for women and men, with two vertical dashed lines; vertical axis: density.]

For women, 93.7% of the scores are between the vertical dashed lines, whereas only 68.6% of the men’s scores fall in this range.


Example: gender and aptitudes revisited (cont’d)

The corresponding CDFs reveal the same difference.

[Figure: “Cumulative distribution function”. F(x) (vertical axis, from 0 to 1) against score (horizontal axis, from -6 to 6).]

These distributions are meant to be illustrative rather than factual.


Sex partners vignette: which average?

Here is a torn-from-the-headlines example of why it pays to know a little probability.

“Everyone knows men are promiscuous by nature... Surveys bear this out. In study after study and in country after country, men report more, often many more, sexual partners than women...

But there is just one problem, mathematicians say. It is logically impossible for heterosexual men to have more partners on average than heterosexual women. Those survey results cannot be true.”


A sex-partners statistical model

Question: is it possible for men to have more sex partners, on average, than women?

To answer this question, we will consider a “toy” probability model for Homo sapiens mating behavior. Each entry below is the probability that the given pair partners.

           John   Lenny   Romeo
Sally      0.07   0.06    0.05
Chastity   0.5    0.5     0.5
Maude      0.05   0.04    0.09

Let’s call it the “summer camp” model.


A sex-partners random variable

The quantity of interest is the number of sex partners. In our model, this will be a number between 0 and 3.

For each individual we can compute the distribution of this random variable. We will denote individuals by their first initial. A red initial means they partnered, a black initial means they did not.

We will assume independence. This means, for example, that Sally hooking up with Romeo makes it neither more nor less likely that she will hook up with Lenny.


Sally’s sex-partner distribution

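Distribution of X_s (Sally’s number of partners):

Event                                       x   P(X_s = x)
JLR (none partnered)                        0   (1-0.07)(1-0.06)(1-0.05)
JLR or JLR or JLR (exactly one partnered)   1   (0.07)(1-0.06)(1-0.05) + (1-0.07)(0.06)(1-0.05) + (1-0.07)(1-0.06)(0.05)
JLR or JLR or JLR (exactly two partnered)   2   (0.07)(0.06)(1-0.05) + (1-0.07)(0.06)(0.05) + (0.07)(1-0.06)(0.05)
JLR (all three partnered)                   3   (0.07)(0.06)(0.05)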

Can you see the probability laws in action here?


Sally’s sex-partner distribution

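Distribution of X_s (Sally’s number of partners):

Event                                       x   p_s(x) = P(X_s = x)
JLR (none partnered)                        0   0.83
JLR or JLR or JLR (exactly one partnered)   1   0.16
JLR or JLR or JLR (exactly two partnered)   2   0.01
JLR (all three partnered)                   3   0.0002

Here is what it looks like after the calculation (rounded a bit). We can do similarly for each individual.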
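The same calculation can be automated by enumerating all eight partnered/not-partnered outcomes, multiplying probabilities within an outcome (independence) and adding the outcomes that give the same count (disjoint events). A minimal sketch, with an illustrative function name:

```python
from itertools import product


def partner_count_distribution(probs):
    """Distribution of the number of partnerships, given independent
    partnering probabilities (one per potential partner)."""
    dist = [0.0] * (len(probs) + 1)
    # Enumerate every yes/no outcome for each potential partner.
    for outcome in product([0, 1], repeat=len(probs)):
        weight = 1.0
        for happened, p in zip(outcome, probs):
            weight *= p if happened else (1 - p)  # multiply by independence
        dist[sum(outcome)] += weight  # add disjoint events with the same count
    return dist


sally = partner_count_distribution([0.07, 0.06, 0.05])  # John, Lenny, Romeo
print([round(p, 4) for p in sally])
```

Passing another person's row (or column) of the model table gives that person's distribution.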


Sally’s sex-partners distribution

Here is a picture of Sally’s sex partner distribution.

[Figure: “Distribution of sex partners for Sally”. Probability of 0, 1, 2, 3 partners: 0.8305, 0.1592, 0.0101, 0.0002.]

The mean is 0(0.83) + 1(0.16) + 2(0.01) + 3(0.0002) = 0.18. What is the mode? What is the median?


Female sex-partner distribution

To get the distribution for all females, we sum over the individual women. We apply the law of total probability using all three conditional distributions:

p_female(x) = p_s(x) P(Sally) + p_c(x) P(Chastity) + p_m(x) P(Maude).

We assume that the women are selected at random with equal probability P(Maude) = P(Chastity) = P(Sally) = 1/3.


Female sex-partner distribution

At the end we get a distribution like this.

[Figure: “Distribution of sex partners for females”. Probability of 0, 1, 2, 3 partners: 0.5951, 0.2315, 0.1315, 0.0418.]

The mean is 0.62, the mode is 0, and the median is 0.


Male sex-partner distribution

We can do the same thing for the males, and we get this.

[Figure: “Distribution of sex partners for males”. Probability of 0, 1, 2, 3 partners: 0.4417, 0.4983, 0.0583, 0.0017.]

The mean is 0.62, the mode is 1, and the median is 1.
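Both population distributions, and their summaries, follow mechanically from the “summer camp” table. The sketch below assumes independence and equal selection probabilities of 1/3, as in the slides; the function names are illustrative, and the median is taken to be the smallest value whose cumulative probability reaches 1/2.

```python
from itertools import product

# Rows: Sally, Chastity, Maude. Columns: John, Lenny, Romeo.
PAIR_PROBS = [
    [0.07, 0.06, 0.05],
    [0.50, 0.50, 0.50],
    [0.05, 0.04, 0.09],
]


def count_distribution(probs):
    """Distribution of the number of partners for one person (independent pairings)."""
    dist = [0.0] * (len(probs) + 1)
    for outcome in product([0, 1], repeat=len(probs)):
        weight = 1.0
        for happened, p in zip(outcome, probs):
            weight *= p if happened else (1 - p)
        dist[sum(outcome)] += weight
    return dist


def mixture(dists):
    """Equal-weight mixture: pick one person at random (law of total probability)."""
    return [sum(col) / len(dists) for col in zip(*dists)]


def summarize(dist):
    """Return (mean rounded to 2 decimals, mode, median) of a distribution on 0, 1, 2, ..."""
    mean = sum(x * p for x, p in enumerate(dist))
    mode = max(range(len(dist)), key=lambda x: dist[x])
    # Median: smallest x whose cumulative probability reaches 1/2.
    total, median = 0.0, None
    for x, p in enumerate(dist):
        total += p
        if total >= 0.5:
            median = x
            break
    return round(mean, 2), mode, median


female = mixture([count_distribution(row) for row in PAIR_PROBS])
male = mixture([count_distribution(list(col)) for col in zip(*PAIR_PROBS)])

print("female:", [round(p, 4) for p in female], summarize(female))
print("male:  ", [round(p, 4) for p in male], summarize(male))
```

The means must agree because each is the sum of all nine table entries divided by three; the mode and median are free to differ.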


Sex-partners vignette recap

The narrow lesson is that it pays to be specific about which measure of central tendency you’re talking about!

The more general lesson is that using probability models and a little bit of algebra can help us see a situation more clearly.

This example uses the concepts of random variable, independence, conditional distribution, mean, median...and others.


Idea: statistical “null” hypotheses

The hypothesis that events are independent often makes a nice contrast to other explanations, namely that random events are somehow related.

This vantage point allows us to judge if those other explanations fit the facts any better than the uninteresting “null” explanation that events are independent.


Vignette: making better pilots

Flight instructors have a policy of berating pilots who make bad landings. They notice that good landings met with praise mostly result in subsequently less-good landings, while bad landings met with harsh criticism mostly result in subsequently improved landings.

Is their causal reasoning necessarily valid?

To stress-test their judgment that “criticism works” we consider the evidence in light of the null hypothesis that subsequent landings are in fact independent of one another, regardless of criticism or praise.


Example: making better pilots (cont’d)

Contrary to the assumptions of the instructors, consider each landing as independent of subsequent landings (irrespective of feedback).

Assume that landings can be classified into three types: bad, adequate, or good. Further assume the following probabilities:

Event      Probability
bad        p_b
adequate   p_a
good       p_g

Remember that p_b + p_a + p_g = 1.


Example: making better pilots (cont’d)

Assume that the policy of criticism is judged to work when a bad landing is followed by a not-bad landing. Then

P(criticism seems to work) = P(not bad_2 | bad_1) = P(not bad_2) = p_a + p_g

by independence.

Conversely, the policy of praise appears to work when a good landing is followed by another good landing. So

P(praise seems to work) = P(good_2 | good_1) = P(good_2) = p_g.

Praise always appears to work less often than criticism!
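A short simulation makes the point concrete. The landing probabilities below are hypothetical (any values summing to one would do), and every landing is drawn independently, so feedback has no effect by construction. Even so, criticism “works” at a rate near p_a + p_g while praise “works” at a rate near p_g.

```python
import random

# Hypothetical landing probabilities; any values summing to one would do.
P_BAD, P_ADEQUATE, P_GOOD = 0.2, 0.5, 0.3

rng = random.Random(41000)  # fixed seed so the run is reproducible
landings = rng.choices(
    ["bad", "adequate", "good"],
    weights=[P_BAD, P_ADEQUATE, P_GOOD],
    k=200_000,
)

after_bad = after_good = improved_after_bad = good_after_good = 0
# Look at consecutive pairs of independent landings.
for first, second in zip(landings, landings[1:]):
    if first == "bad":
        after_bad += 1
        improved_after_bad += second != "bad"
    elif first == "good":
        after_good += 1
        good_after_good += second == "good"

criticism_rate = improved_after_bad / after_bad
praise_rate = good_after_good / after_good
print(f"criticism seems to work: {criticism_rate:.3f} (theory {P_ADEQUATE + P_GOOD:.3f})")
print(f"praise seems to work:    {praise_rate:.3f} (theory {P_GOOD:.3f})")
```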


Remark: null and alternative hypotheses

The previous example shows that the evidence can appear to favor criticism over praise even if criticism and praise are totally irrelevant.

Does this mean that criticism does not work?

No, it just means that the observed facts are not compelling evidence that criticism works, because they are entirely consistent with the null hypothesis that landing quality is independent of previous landings and feedback.

In cases like this we say we “fail to reject the null hypothesis”. We’ll revisit this terminology a couple weeks from now.


Example: making better pilots (continuous version)

What if we want to take pilot skill into account?

We will model this situation using normal random variables and see if the same conclusions (that praise appears to hurt performance and criticism seems to boost it) could arise by chance.


Example: making better pilots (continuous version, cont’d)

Assume that each pilot has a certain ability level, call it A. Each individual landing score arises as a combination of this ability and certain random fluctuations, call them ε. The landing score at time t can be expressed as

S_t = A + ε_t.

Assuming that the ε_t are iid N(0, σ^2), then

S_t ∼ N(A, σ^2).


Example: making better pilots (continuous version, cont’d)

Denote an average landing score as M. Consider a pilot with A > M. When he makes an exceptional landing, because ε_1 > 2σ, he is unlikely to best it on his next landing.

[Figure: “Distribution of landing scores”. Density of S_2, with M, A, and A + ε_1 marked in increasing order on the horizontal axis.]

For this reason, praise is unlikely to work even though landings are independent of one another.


Example: making better pilots (continuous version, cont’d)

For a poor pilot with A < M a similar argument holds. When he makes a very poor landing, because ε_1 < −2σ, he is unlikely to do worse on his next landing.

[Figure: “Distribution of landing scores”. Density of S_2, with A + ε_1, A, and M marked in increasing order on the horizontal axis.]

For this reason, criticism is likely to “work” even though landings are independent.
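The two claims can be quantified under the model S_t = A + ε_t. As a sketch, fix the first fluctuation at exactly ε_1 = ±2σ (the boundary case; more extreme landings only strengthen the effect). Then the next landing is better exactly when ε_2 > ε_1, so the ability A drops out and only the standard normal CDF is needed. The function name is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # distribution of eps_t / sigma under the model


def prob_next_is_better(z1):
    """P(S_2 > S_1) when the first landing had eps_1 = z1 * sigma.

    Since S_t = A + eps_t, this is P(eps_2 > eps_1); the ability A cancels.
    """
    return 1.0 - std_normal.cdf(z1)


print(f"after eps_1 = +2 sigma: P(better next time) = {prob_next_is_better(2.0):.4f}")
print(f"after eps_1 = -2 sigma: P(better next time) = {prob_next_is_better(-2.0):.4f}")
```

Under these assumptions an exceptional landing is followed by a better one only about 2% of the time, and a very poor landing is followed by a better one about 98% of the time, with no feedback effect in the model at all.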


Idea: mean reversion

The previous example illustrates an idea known as mean reversion.

This name refers to the fact that subsequent observations tend to be “pulled back” towards the overall mean even if the events are independent of one another.

Mean reversion describes a probabilistic fact, not a physical process.

What might the flight instructors have done (as an experiment) to really get to the bottom of their question?
