continuous distributions. continuous random variables are numerical variables whose values fall...

Continuous Distributions

Upload: hilary-berry

Post on 28-Dec-2015


TRANSCRIPT

Page 1: Continuous Distributions. Continuous random variables Are numerical variables whose values fall within a range or interval Are measurements Can be described

Continuous Distributions

Page 2

Continuous random variables

• Are numerical variables whose values fall within a range or interval
• Are measurements
• Can be described by density curves

Page 3

Density curves

A density curve:
• Is always on or above the horizontal axis
• Has an area exactly equal to one underneath it
• Often describes an overall distribution
• Describes what proportions of the observations fall within each range of values

Page 4

Unusual density curves

• Can be any shape
• Are generic continuous distributions
• Probabilities are calculated by finding the area under the curve

Page 5

[Figure: density curve; horizontal axis marked 1 to 5, vertical axis marked .25 and .5]

P(X < 2) = ½(2)(.25) = .25

How do you find the area of a triangle?

Page 6

[Figure: density curve; horizontal axis marked 1 to 5, vertical axis marked .25 and .5]

P(X = 2) = 0

P(X < 2) = .25

What is the area of a line segment?

Page 7

In continuous distributions, P(X < 2) & P(X ≤ 2) are the same answer.

Hmmmm… Is this different than discrete distributions?

Page 8

[Figure: density curve; horizontal axis marked 1 to 5, vertical axis marked .25 and .5]

P(X > 3) = .5(.375 + .5)(1) = .4375

P(1 < X < 3) = .5(.125 + .375)(2) = .5

Shape is a trapezoid. How long are the bases?

Area = ½(b1 + b2)h

For P(X > 3): b1 = .5, b2 = .375, h = 1
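The areas on the last few slides can be checked with a short sketch. The figure itself did not survive, so the density used here, f(x) = x/8 on [0, 4], is an assumption inferred from the heights quoted in the calculations (.125, .25, .375, .5); the function names are illustrative only.

```python
def density(x):
    """Assumed density (inferred, not stated in the slides): f(x) = x/8 on [0, 4]."""
    return x / 8 if 0 <= x <= 4 else 0.0


def trapezoid_area(lo, hi):
    """Area under the straight-line density between lo and hi: (b1 + b2) * h / 2.

    Only valid while lo and hi both lie in [0, 4], where the density is linear.
    A triangle is the special case where one base is 0.
    """
    return 0.5 * (density(lo) + density(hi)) * (hi - lo)


p_less_than_2 = trapezoid_area(0, 2)   # triangle: one base is 0
p_more_than_3 = trapezoid_area(3, 4)   # trapezoid with bases .375 and .5
p_between_1_3 = trapezoid_area(1, 3)   # trapezoid with bases .125 and .375
print(p_less_than_2, p_more_than_3, p_between_1_3)
```

Under that assumed density, the three results agree with the slide answers .25, .4375 and .5.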

Page 9

[Figure: density curve; horizontal axis marked 1 to 4, vertical axis marked 0.25 and 0.50]

P(X > 1) = .5(2)(.25) + (2)(.25) = .25 + .5 = .75

Page 10

[Figure: density curve; horizontal axis marked 1 to 4, vertical axis marked 0.25 and 0.50]

P(0.5 < X < 1.5) = (.5)(.25) + .5(.25 + .375)(.5) = .125 + .15625 = .28125

Page 11

Special Continuous Distributions

Page 12

Uniform Distribution

• Is a continuous distribution that is evenly (or uniformly) distributed
• Has a density curve in the shape of a rectangle
• Probabilities are calculated by finding the area under the curve

μ_X = (a + b)/2

σ²_X = (b − a)²/12

Where: a & b are the endpoints of the uniform distribution

How do you find the area of a rectangle?

Page 13

The Citrus Sugar Company packs sugar in bags labeled 5 pounds. However, the packaging isn’t perfect and the actual weights are uniformly distributed with a mean of 4.98 pounds and a range of .12 pounds.

a) Construct the uniform distribution above.

What shape does a uniform distribution have?

How long is this rectangle?

What is the height of this rectangle?

[Figure: rectangle from 4.92 to 5.04, centered at 4.98, with height 1/.12]

Page 14

• What is the probability that a randomly selected bag will weigh more than 4.97 pounds?

[Figure: rectangle from 4.92 to 5.04 with height 1/.12; the region above 4.97 is shaded]

What is the length of the shaded region?

P(X > 4.97) = .07(1/.12) = .5833

Page 15

• Find the probability that a randomly selected bag weighs between 4.93 and 5.03 pounds.

[Figure: rectangle from 4.92 to 5.04 with height 1/.12; the region from 4.93 to 5.03 is shaded]

What is the length of the shaded region?

P(4.93 < X < 5.03) = .1(1/.12) = .8333
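The two sugar-bag probabilities are both "shaded length times height". A minimal sketch of that rectangle-area calculation follows; the helper name `uniform_prob` is hypothetical, and the endpoints 4.92 and 5.04 come from the mean and range given in the problem.

```python
def uniform_prob(lo, hi, a, b):
    """P(lo < X < hi) for X uniform on [a, b]: shaded width times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)          # clip the interval to the rectangle
    return max(hi - lo, 0.0) / (b - a)


a, b = 4.92, 5.04                            # mean 4.98, range .12
p_over_497 = uniform_prob(4.97, b, a, b)
p_between = uniform_prob(4.93, 5.03, a, b)
print(round(p_over_497, 4), round(p_between, 4))
```

Clipping to [a, b] means a bound outside the rectangle contributes no extra area.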

Page 16

The time it takes for students to drive to school is evenly distributed with a minimum of 5 minutes and a range of 35 minutes.

a) Draw the distribution

Where should the rectangle end? 40

What is the height of the rectangle? 1/35

[Figure: rectangle from 5 to 40 with height 1/35]

Page 17

b) What is the probability that it takes less than 20 minutes to drive to school?

[Figure: rectangle from 5 to 40 with height 1/35; the region below 20 is shaded]

P(X < 20) = (15)(1/35) = .4286

Page 18

c) What are the mean and standard deviation of this distribution?

μ = (5 + 40)/2 = 22.5

σ² = (40 − 5)²/12 = 102.083

σ = 10.104
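As a quick check of the arithmetic above, here is a small sketch that applies the uniform mean and variance formulas to the drive-time example. The function name `uniform_mean_sd` is illustrative, not from the slides.

```python
from math import sqrt


def uniform_mean_sd(a, b):
    """Mean (a + b)/2 and standard deviation sqrt((b - a)^2 / 12) of a uniform on [a, b]."""
    mean = (a + b) / 2
    variance = (b - a) ** 2 / 12
    return mean, sqrt(variance)


drive_mean, drive_sd = uniform_mean_sd(5, 40)
print(drive_mean, round(drive_sd, 3))
```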

Page 19

Normal Distributions

• Symmetrical bell-shaped (unimodal) density curve
• Above the horizontal axis
• N(μ, σ)
• The transition points occur at μ ± σ
• Probability is calculated by finding the area under the curve
• As σ increases, the curve flattens & spreads out
• As σ decreases, the curve gets taller and thinner

How is this done mathematically?

Page 20

Normal distributions occur frequently.

• Length of newborn child
• Height
• Weight
• ACT or SAT scores
• Intelligence
• Number of typing errors
• Chemical processes

Page 21

[Figure: two normal curves, labeled A and B]

Do these two normal curves have the same mean? If so, what is it? Yes; the mean is 6.

Which normal curve has a standard deviation of 3? B

Which normal curve has a standard deviation of 1? A

Page 22

Empirical Rule

• Approximately 68% of the observations fall within σ of μ
• Approximately 95% of the observations fall within 2σ of μ
• Approximately 99.7% of the observations fall within 3σ of μ

Page 23

Suppose that the height of male students at AHS is normally distributed with a mean of 71 inches and standard deviation of 2.5 inches. What is the probability that the height of a randomly selected male student is more than 73.5 inches?

[Figure: normal curve centered at 71 with the middle 68% marked]

73.5 is one standard deviation above the mean, so 1 − .68 = .32 lies outside μ ± σ, and half of that is in the upper tail:

P(X > 73.5) = .32/2 = 0.16
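The Empirical Rule answer of 0.16 is a rounded figure. A minimal sketch using Python's standard-library `statistics.NormalDist` (a tool choice made here, not something the slides use) shows the more precise upper-tail area for the same height distribution.

```python
from statistics import NormalDist

heights = NormalDist(mu=71, sigma=2.5)       # male student heights from the problem
p_taller = 1 - heights.cdf(73.5)             # area above one standard deviation
within_one_sd = heights.cdf(73.5) - heights.cdf(68.5)
print(round(p_taller, 4), round(within_one_sd, 4))
```

The exact tail is a little under .16 because the "68%" in the rule is itself rounded.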

Page 24

Standard Normal Density Curves

Always has μ = 0 & σ = 1

To standardize:

z = (x − μ)/σ

Must have this memorized!

Page 25

Strategies for finding probabilities or proportions in normal distributions

1. State the probability statement
2. Draw a picture
3. Calculate the z-score
4. Look up the probability (proportion) in the table

Page 26

The lifetime of a certain type of battery is normally distributed with a mean of 200 hours and a standard deviation of 15 hours. What proportion of these batteries can be expected to last less than 220 hours?

Write the probability statement: P(X < 220)

Draw & shade the curve

Calculate z-score: z = (220 − 200)/15 = 1.33

Look up z-score in table: P(X < 220) = .9082

Page 27

The lifetime of a certain type of battery is normally distributed with a mean of 200 hours and a standard deviation of 15 hours. What proportion of these batteries can be expected to last more than 220 hours?

z = (220 − 200)/15 = 1.33

P(X > 220) = 1 − .9082 = .0918

Page 28

The lifetime of a certain type of battery is normally distributed with a mean of 200 hours and a standard deviation of 15 hours. How long must a battery last to be in the top 5%?

P(X > ?) = .05

[Figure: normal curve with area .95 to the left of x and .05 to the right]

Look up 0.95 in the table to find the z-score: 1.645

1.645 = (x − 200)/15

x = 224.675
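The three battery questions can be reproduced without a table. This is a sketch using the standard-library `statistics.NormalDist`; it does not round z to two decimals the way a table lookup does, so the results differ slightly from the slide values in the third or fourth decimal place.

```python
from statistics import NormalDist

battery = NormalDist(mu=200, sigma=15)       # lifetime in hours

z_220 = (220 - battery.mean) / battery.stdev # standardize: z = (x - mu) / sigma
p_less = battery.cdf(220)                    # P(X < 220)
p_more = 1 - p_less                          # P(X > 220)
top_5_cutoff = battery.inv_cdf(0.95)         # x with 95% of the area below it

print(round(z_220, 2), round(p_less, 4), round(p_more, 4), round(top_5_cutoff, 2))
```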

Page 29

The heights of the female students at AHS are normally distributed with a mean of 65 inches. What is the standard deviation of this distribution if 18.5% of the female students are shorter than 63 inches?

P(X < 63) = .185

What is the z-score for the 63? −0.9

−.9 = (63 − 65)/σ

−.9σ = −2

σ = 2.22
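Working backward from a proportion to σ can be sketched as below. It uses `statistics.NormalDist().inv_cdf` in place of the table lookup (an assumption of this example, not the slide's method), so the z-score is not rounded to −0.9 and σ comes out marginally larger than 2.22.

```python
from statistics import NormalDist

mean_height = 65
x, proportion_below = 63, 0.185

z = NormalDist().inv_cdf(proportion_below)   # z-score with 18.5% of the area below it
sigma = (x - mean_height) / z                # rearranged from z = (x - mu) / sigma
print(round(z, 3), round(sigma, 3))
```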

Page 30

The heights of female teachers at AHS are normally distributed with mean of 65.5 inches and standard deviation of 2.25 inches. The heights of male teachers are normally distributed with mean of 70 inches and standard deviation of 2.5 inches.

• Describe the distribution of differences of heights (male – female) of teachers.

Normal distribution with μ = 70 − 65.5 = 4.5 & σ = √(2.5² + 2.25²) = 3.3634 (the variances add when the two heights are independent)

Page 31

• What is the probability that a randomly selected male teacher is shorter than a randomly selected female teacher?

[Figure: normal curve centered at 4.5 with the area below 0 shaded]

z = (0 − 4.5)/3.3634 = −1.34

P(X < 0) = .0901
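The teacher-height comparison combines two steps: build the distribution of the difference, then find the area below 0. A minimal sketch follows, assuming the two heights are independent (which the variance-addition step requires); the variable names are illustrative.

```python
from math import hypot
from statistics import NormalDist

male_mu, male_sd = 70, 2.5
female_mu, female_sd = 65.5, 2.25

diff_mu = male_mu - female_mu                # means subtract
diff_sd = hypot(male_sd, female_sd)          # sqrt(2.5^2 + 2.25^2): variances add
difference = NormalDist(diff_mu, diff_sd)

p_male_shorter = difference.cdf(0)           # P(male - female < 0)
print(diff_mu, round(diff_sd, 4), round(p_male_shorter, 4))
```

The result is close to the table answer of .0901; the small gap comes from rounding z to −1.34 before the lookup.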

Page 32

Will my calculator do any of this normal stuff?

• Normalpdf – use for graphing ONLY
• Normalcdf – will find probability of area from lower bound to upper bound
• Invnorm (inverse normal) – will find z-score for probability
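For readers without the calculator, the three commands have rough counterparts in Python's standard library. The wrappers below are hypothetical helpers named after the calculator functions; they are a sketch built on `statistics.NormalDist`, not the calculator's own implementation.

```python
from statistics import NormalDist


def normalpdf(x, mu=0.0, sigma=1.0):
    """Height of the normal density curve at x (useful for graphing only)."""
    return NormalDist(mu, sigma).pdf(x)


def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve from lower to upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


def invnorm(area, mu=0.0, sigma=1.0):
    """Value with the given area to its left (a z-score when mu=0, sigma=1)."""
    return NormalDist(mu, sigma).inv_cdf(area)


print(round(normalcdf(-1, 1), 4), round(invnorm(0.95), 3))
```

For an open-ended tail, pass `float("-inf")` or `float("inf")` as the bound.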

Page 33

Ways to Assess Normality

• Use graphs (dotplots, boxplots, or histograms)
• Use the Empirical Rule
• Normal probability (quantile) plot

Page 34

Normal Probability (Quantile) plots

• The observation (x) is plotted against known normal z-scores
• If the points on the quantile plot lie close to a straight line, then the data are approximately normally distributed
• Deviations from a straight line on the quantile plot indicate nonnormal data
• Points far away from the overall pattern of the plot indicate outliers
• Vertical stacks of points (repeated observations of the same number) are called granularity

Page 35

To construct a normal probability plot, you can use quantities called normal scores. The values of the normal scores depend on the sample size n. The normal scores when n = 10 are below:

-1.539 -1.001 -0.656 -0.376 -0.123 0.123 0.376 0.656 1.001 1.539

Think of selecting sample after sample of size 10 from a standard normal distribution. Then -1.539 is the average of the smallest observation from each sample & so on . . .

Suppose we have the following observations of widths of contact windows in integrated circuit chips:

3.21 2.49 2.94 4.38 4.02 3.62 3.30 2.85 3.34 3.81

Sketch a scatterplot by pairing the smallest normal score with the smallest observation from the data set & so on

[Figure: scatterplot; horizontal axis "Widths of Contact Windows" (1 to 5), vertical axis "Normal Scores" (−1 to 1)]

What should happen if our data set is normally distributed?
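The pairing step can be sketched in a few lines: sort the observations and match them, smallest to smallest, with the normal scores listed above. Using the correlation coefficient as a numeric stand-in for "how straight is the plot" is an addition made for this example, not something the slides prescribe.

```python
from math import sqrt

normal_scores = [-1.539, -1.001, -0.656, -0.376, -0.123,
                 0.123, 0.376, 0.656, 1.001, 1.539]
widths = [3.21, 2.49, 2.94, 4.38, 4.02, 3.62, 3.30, 2.85, 3.34, 3.81]

# Pair the smallest observation with the smallest normal score, and so on.
pairs = list(zip(sorted(widths), normal_scores))


def correlation(points):
    """Pearson correlation of (x, y) points; values near 1 suggest a straight line."""
    xs, ys = zip(*points)
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = correlation(pairs)
for width, score in pairs:
    print(f"{width:5.2f}  {score:7.3f}")
print("correlation:", round(r, 3))
```

Plotting the printed pairs gives the scatterplot the slide asks for; a correlation close to 1 is consistent with a roughly linear plot.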

Page 36

Are these approximately normally distributed?

50 48 54 47 51 52 46 53 52 51 48 48 54 55 57 45 53 50 47 49 50 56 53 52

Both the histogram & boxplot are approximately symmetrical, so these data are approximately normal.

The normal probability plot is approximately linear, so these data are approximately normal.

Page 37

Normal Approximation to the Binomial

Before widespread use of technology, binomial probability calculations were very tedious. Let’s see how statisticians estimated these calculations in the past!

Page 38

Premature babies are those born more than 3 weeks early. Newsweek (May 16, 1988) reported that 10% of the live births in the U.S. are premature. Suppose that 250 live births are randomly selected and that the number X of the “preemies” is determined. What is the probability that there are between 15 and 30 preemies, inclusive?

1) Find this probability using the binomial distribution.

P(15 ≤ X ≤ 30) = binomialcdf(250,.1,30) – binomialcdf(250,.1,14) = .866

2) What are the mean and standard deviation of the above distribution?

μ = np = 25 & σ = √(np(1 − p)) = 4.743
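The binomial calculation above can be reproduced without a calculator by summing the individual binomial probabilities. This is a sketch using only `math.comb`; the helper name `binom_pmf` is illustrative.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 250, 0.1

# P(15 <= X <= 30), inclusive at both ends, matches cdf(30) - cdf(14).
p_15_to_30 = sum(binom_pmf(k, n, p) for k in range(15, 31))

binom_mean = n * p
binom_sd = sqrt(n * p * (1 - p))
print(round(p_15_to_30, 3), binom_mean, round(binom_sd, 3))
```

Note that `range(15, 31)` includes 15 and 30, which is why the calculator version subtracts the cumulative probability at 14 rather than 15.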

Page 39

3) If we were to graph a histogram for the above binomial distribution, what shape do you think it will have?

Since the probability is only 10%, we would expect the histogram to be strongly skewed right.

Let’s graph this distribution:

• Put the numbers 1-45 in L1
• In L2, use binomialpdf to find the probabilities.

4) What do you notice about the shape?

Overlay a normal curve on your histogram:

• In Y1 = normalpdf(X, μ, σ)

Page 40

Normal distributions can be used to estimate probabilities for binomial distributions when:

1) the probability of success is close to .5
or
2) n is sufficiently large

Rule: n is large enough if np ≥ 10 & n(1 – p) ≥ 10

Why 10?

Page 41

Normal distributions extend infinitely in both directions; however, binomial distributions are between 0 and n. If we use a normal distribution to estimate a binomial distribution, we must cut off the tails of the normal distribution. This is OK if the mean of the normal distribution (for which we use the mean of the binomial) is at least three standard deviations (3σ) from 0 and from n.

Page 42

We require: μ − 3σ > 0

Or: μ > 3σ

As binomial: np > 3√(np(1 − p))

Square: n²p² > 9np(1 − p)

Simplify: np > 9(1 − p)

Since (1 − p) < 1, it is enough that: np > 9

And, from the matching requirement μ + 3σ < n with p < 1: n(1 − p) > 9

Therefore, we say that np should be at least 10 and n(1 – p) should be at least 10.
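The link between the "at least 10" rule and the three-standard-deviation requirement can be checked numerically. The sketch below defines two hypothetical helpers, one for each condition, and tries them on the preemie example (n = 250, p = .1) and on a deliberately small case.

```python
from math import sqrt


def meets_rule_of_10(n, p, threshold=10):
    """Rule of thumb: np and n(1 - p) are both at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


def three_sigma_inside(n, p):
    """Direct check: the mean is more than 3 standard deviations from both 0 and n."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return mu - 3 * sigma > 0 and mu + 3 * sigma < n


for n, p in [(250, 0.1), (20, 0.1)]:
    print(n, p, meets_rule_of_10(n, p), three_sigma_inside(n, p))
```

Any case that passes the rule of 10 also passes the three-sigma check, since np ≥ 10 > 9 > 9(1 − p); the reverse need not hold.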

Page 43

Normal distributions can be used to estimate probabilities for binomial distributions when:

1) the probability of success is close to .5
or
2) n is sufficiently large

Rule: n is large enough if np ≥ 10 & n(1 – p) ≥ 10

Since a continuous distribution is used to estimate the probabilities of a discrete distribution, a continuity correction is used to make the discrete values similar to continuous values (± .5 to discrete values).

Why?

Think about how discrete histograms are made. Each bar is centered over the discrete values. The bar for “1” actually goes from 0.5 to 1.5 & the bar for “2” goes from 1.5 to 2.5. Therefore, by adding or subtracting .5 from the discrete values, you find the actual width of the bars that you need to estimate with the normal curve.

Page 44

(Back to our example) Since P(preemie) = .1 which is not close to .5, is n large enough?

np = 250(.1) = 25 & n(1 − p) = 250(.9) = 225

Yes, OK to use normal to approximate binomial

5) Use a normal distribution with the binomial mean and standard deviation above to estimate the probability that between 15 & 30 preemies, inclusive, are born in the 250 randomly selected babies.

Binomial written as Normal (w/cont. correction):

P(15 ≤ X ≤ 30) becomes P(14.5 < X < 30.5) = Normalcdf(14.5,30.5,25,4.743) = .8635

6) How does the answer in question 5 compare to the answer in question 1 (Binomial answer = 0.866)?
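The normal estimate with the continuity correction can be sketched as follows, using the standard-library `statistics.NormalDist` in place of the calculator's Normalcdf. The bounds 14.5 and 30.5 are the discrete endpoints 15 and 30 widened by .5 on each side.

```python
from math import sqrt
from statistics import NormalDist

n, p = 250, 0.1
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))   # N(25, 4.743)

low, high = 15, 30                                   # inclusive discrete bounds
with_correction = approx.cdf(high + 0.5) - approx.cdf(low - 0.5)
without_correction = approx.cdf(high) - approx.cdf(low)

print(round(with_correction, 4), round(without_correction, 4))
```

Leaving out the correction drops half a bar at each end, so the uncorrected estimate is noticeably further from the binomial answer of .866.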