chapter 6 continuous random variables statistics for science (env) 1

70
Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Upload: amanda-lee

Post on 29-Dec-2015

231 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Chapter 6

Continuous Random Variables

Statistics for Science(ENV)

1

Page 2: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Continuous Random Variables

6.1 Continuous Probability Distributions6.2 The Uniform Distribution6.3 The Normal Probability Distribution6.4 Approximating the Binomial Distribution by

Using the Normal Distribution

Page 3: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Continuous Probability Distributions

• A continuous random variable may assume any numerical value in one or more intervals

• Use a continuous probability distribution to assign probabilities to intervals of values

Page 4: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Probability density function (PDF)• PDF P(x) has no trivial meaning before integration• But after integration, it become probability

P(a<X<b) = ab p(x) dx

(Area under the graph from a to b)• As a result, P( X=a ) = a

a p(x) dx = 0• = E(X) = -

+ x p(x) dx.• 2 = Var(X) = E((X- )2)

= -+ (x- )2 p(x) dx.

Page 5: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Properties of Probability density function (PDF)

• Properties of P(x): P(x) is a continuous function such that

1. P(x) 0 for all x2. The total area under the curve of P(x) is equal to 1

• Essential point: An area under a PDF is a probability

Page 6: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Area and Probability• The blue area under the curve f(x) from

x = a to x = b is the probability that x could take any value in the range a to b– Symbolized as P(a x b)– Or as P(a < x < b), because each of the

interval endpoints has a probability of 0

Page 7: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Distribution Shapes

• Symmetrical and rectangular– The uniform distribution (Section 6.2)

• Symmetrical and bell-shaped– The normal distribution (Section 6.3)

• Skewed– Skewed either left or right (Section 6.5)

Page 8: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Uniform Distribution• If c and d are numbers on the real line (c < d), the

probability curve describing the uniform distribution is

• The probability that x is any value between the values a and b (a < b) is

• Note: The number ordering is c < a < b < d

dx c cd=x

otherwise0

for1

f

cd

abbxaP

Page 9: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Uniform Distribution Continued

• The mean X and standard deviation X of a uniform random variable x are

• These are the parameters of the uniform distribution with endpoints c and d (c < d)

12

2cd

dc

X

X

Page 10: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Uniform Probability Curve

Page 11: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Notes on the Uniform Distribution

• The uniform distribution is symmetrical– Symmetrical about its center X

– X is the median

• The uniform distribution is rectangular– For endpoints c and d (c < d) the width of the distribution

is d – c and the height is1/(d – c)

– The area under the entire uniform distribution is 1• Because width height = (d – c) [1/(d – c)] = 1• So P(c x d) = 1

Page 12: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.1: Uniform Waiting Time #1

• Suppose you appear at an elevator entry at a random time.

• The amount of time, X, that you need to wait for the elevator is uniformly distributed between zero and four minutes– So c = 0 and d = 4, and

– and

x =x

otherwise0

40for4

1

04

1

f

404

ababbxaP

Page 13: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.1: Uniform Waiting Time #2

• What is the probability that you have to wait at least 2.5 minutes for an elevator?– The probability is the area under the uniform

distribution in the interval [2.5, 4] minutes• The probability is the area of a rectangle with height ¼

and base 4 – 2.5 = 1.5

– P(x ≥ 2.5) = P(2.5 ≤ x ≤ 4) = ¼ 1.5 = 0.375

Page 14: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.1: Uniform Waiting Time #3

• What is the probability that you have to wait less than one minutes for an elevator?– P(x ≤ 1) = P(0 ≤ x ≤ 1) = ¼ 1 = 0.25

Page 15: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.1: Uniform Waiting Time #4

• Expect to wait the mean time µX=(0 + 4)/2 = 2• With a standard deviation σX of

– The mean plus/minus one standard deviation is

µX ± σX = 2 ± 1.1547 = [0.8453, 1.1547] minutes• The probability that the waiting time is within one

standard deviation of the mean time is

P(0.8453 ≤ x ≤ 3.1547)=¼ (3.1547–0.8453) = 0.57735

minutes 1.154712

04

X

Page 16: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Assignment: waiting for the green light• Cars arrive a traffic light randomly in time. The

traffic light timing is set so that the green light lasts for 80 seconds, and follows by 40 seconds of red light.

• What is the probability that a car does not have to wait in front of this traffic light?

• What is the probability that a car has to wait less than 20 seconds?

• What is the average waiting time?

16

Page 17: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Normal Probability Distribution

• The normal probability distribution is defined by the equation

for all values x on the real number line– is the mean and is the standard deviation – = 3.14159… and e = 2.71828 is the base of natural

logarithms

2

1=)f(

2

2

1

eπσ

x

x

Page 18: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Normal Probability Distribution Continued

• The normal curve is symmetrical about its mean – The mean is in the middle under the curve– So is also the median

• It is tallest over its mean • The area under the entire normal curve is 1– The area under either half of the curve is 0.5

Page 19: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Normal Probability Distribution Continued

Page 20: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

- 5

0 . 4

0 . 3

0 . 2

0 . 1

. 0

x

f(

x

r a l i t r b u i o n : = 0 , = 1

Mean, median, and mode are equal

Theoretically, curve extends toinfinity

a

Characteristics of a Normal Distribution

Normal curve is symmetrical

Page 21: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Properties of the Normal Distribution

• There are an infinite number of normal curves– The shape of any individual normal curve depends on

its specific mean and standard deviation The standard normalstandard normal N(0,1)N(0,1) distribution is a

normal distribution with a mean of 0 and a standard deviation of 1.

• P(z) ~ exp(-0.5*z2) also called the z z distribution.distribution.

x

z

Page 22: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Properties of the Normal Distribution Continued

• The tails of the normal extend to infinity in both directions– The tails get closer to the horizontal axis but never

touch it

• The area under the normal curve to the right of the mean equals the area under the normal to the left of the mean– The area under each half is 0.5

Page 23: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Position and Shape of the Normal Curve

(a) The mean positions the peak of the normal curve over the real axis (b) The variance 2 measures the width or spread of the normal curve

Page 24: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Relationship with other distributions

• The binomial distribution B(n,  p) is approximately normal N(np, np(1 − p)) for large n and for p not too close to zero or one.

• The Poisson(λ) distribution is approximately normal N(λ, λ) for large values of λ.

Page 25: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Normal Probabilities• Suppose X is a normally distributed random

variable with mean and standard deviation

• The probability that X could take any value in the range between two given values a and b (a < b) is P(a ≤ X ≤ b)

Page 26: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Normal Probabilities Continued

P(a ≤ X ≤ b) is the area colored in blue under the normal curve and between the values x = a and x = b

Page 27: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Three Important Areas under the Normal Curve

1. P(µ – σ ≤ X ≤ µ + σ) = 0.6826– So 68.26% of all possible observed values of x are within

(plus or minus) one standard deviation of µ2. P(µ – 2σ ≤ X ≤ µ + 2σ) = 0.9544– So 95.44% of all possible observed values of x are within

(plus or minus) two standard deviations of µ3. P(µ – 3σ ≤ X ≤ µ + 3σ) = 0.9973– So 99.73% of all possible observed values of x are within

(plus or minus) three standard deviations of µ

Page 28: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Three Important Percentages

Page 29: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Standard Normal Distribution

• If x is normally distributed with mean and standard deviation , then the random variable z

is normally distributed with mean 0 and standard deviation 1

• This normal is called the standard normal distribution

x

z

Page 30: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Standard Normal Distribution Continued

– The algebraic sign on z indicates on which side of is x

– z is positive if x > (x is to the right of on the number line)

– z is negative if x < (x is to the left of on the number line)

•z measures the number of standard deviations that x is from the mean m

Page 31: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Standard Normal Table• The standard normal table is a table that lists

the area under the standard normal curve to the right of the mean (z=0) up to the z value of interest– See Table 6.1, Table A.3 in Appendix A, and the

table on the back of the front cover• This table is so important that it is repeated 3 times in

the textbook!• Always look at the accompanying figure for guidance

on how to use the table

Page 32: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Standard Normal Table Continued

• The values of z (accurate to the nearest tenth) in the table range from 0.00 to 3.09 in increments of 0.01– z accurate to tenths are listed in the far left

column– The hundredths digit of z is listed across the top of

the table• The areas under the normal curve to the right

of the mean up to any value of z are given in the body of the table

Page 33: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

A Standard Normal Table

Page 34: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Standard Normal Table Example

• Find P(0 ≤ z ≤ 1)– Find the area listed in the table corresponding to a z value

of 1.00– Starting from the top of the far left column, go down to

“1.0”– Read across the row z = 1.0 until under the column headed

by “.00”– The area is in the cell that is the intersection of this row

with this column– As listed in the table, the area is 0.3413, so

• P(0 ≤ z ≤ 1) = 0.3413

Page 35: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Areas under the Standard Normal Curve

Page 36: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Calculating P(-2.53 ≤ z ≤ 2.53)

• First, find P(0 ≤ z ≤ 2.53)– Go to the table of areas under the standard normal curve– Go down left-most column for z = 2.5– Go across the row 2.5 to the column headed by .03– The area to the right of the mean up to a value of z = 2.53

is the value contained in the cell that is the intersection of the 2.5 row and the .03 column

– The table value for the area is 0.4943

Continued

Page 37: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Calculating P(-2.53 ≤ z ≤ 2.53) Continued

• From last slide, P(0 ≤ z ≤ 2.53)=0.4943• By symmetry of the normal curve, this is also

the area to the left of the mean down to a value of z = –2.53

• Then P(-2.53 ≤ z ≤ 2.53) = 0.4943 + 0.4943 = 0.9886

Page 38: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Calculating P(z -1)• An example of finding the area under the

standard normal curve to the right of a negative z value

• Shown is finding the under the standard normal for z ≥ -1

Page 39: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Calculating P(z 1)

Page 40: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Finding Normal Probabilities

1. Formulate the problem in terms of x values2. Calculate the corresponding z values, and restate

the problem in terms of these z values3. Find the required areas under the standard normal

curve by using the table

Note: It is always useful to draw a picture showing the required areas before using the normal table

Page 41: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example: scores of IQ test • Most of the IQ tests are standardized with a mean

score of 100 and a standard deviation of 15. If scores on an IQ test are normally distributed, what is the probability that a person who takes the test will score between 90 and 110?

• Solution: We want to know the probability that the test score falls between 90 and 110. Since the mean is 100 and the sd is 15, we need to find the probability between z>-0.67 and z<0.67 from table. The answer is 2X0.2486 = 0.4972.

Page 42: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example: scores of IQ test 2

• With the same assumption as last example, what is the probability that a person with IQ score above 130?

• Solution:We have z= (130-100)/15 = 2. From the table, P(z>2) = 0.5-0.4772 = 2.28%

Page 43: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Cumulative Normal Curve

• The cumulative normal curve is shown below:

The cumulative normal table gives the shaded area

Page 44: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Cumulative area under the standard normal curve

Page 45: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

The Cumulative Normal Table

• Useful for finding the probabilities of threshold values like P(z ≤ a) or P(z ≥ b)– Find P(z ≤ 1)• Find directly from cumulative normal table that P(z ≤ 1)

= 0.8413

– Find P(z ≥ 1)• Find directly from cumulative normal table that P(z ≤ 1)

= 0.8413• Because areas under the normal sum to 1

P(z ≥ 1) = 1 – P(z ≤ 1) = 1 – 0.8413 = 0.1587

Page 46: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Cumulative area under the standard normal curve

Page 47: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

• Albert Einstein, the famous physicist of last century, was considered to "only" have an IQ of about 160. What is the probability that a person with IQ score above 160?

• Solution:We have z= (160-100)/15 = 4. From table 6.1, P(z<4) = 0.99997 P(z>4) = 1-0.99997 = 0.00003or lest than 3 in 100,000 persons

Cumulative area under the standard normal curve: example

Page 48: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example: probability of getting an “A”• Only the top 10% students in this class can

get the “A” grade. If the final scores of the class are normally distributed, what is the minimum z-value one should have in order to get an “A”?

• Solution: We need to find z such that the cumulative area under the standard normal curve is 0.9. From table 6.1, we get z>1.28.

Page 49: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.2: The Car Mileage Case • Want the probability that the mileage of a randomly selected

midsize car will be between 32 and 35 mpg– Let x be the random variable of mileage of midsize cars– Want P(32 ≤ x ≤ 35 mpg)

• Given:– x is normally distributed– µ = 33 mpg– σ = 0.7 mpg

• Procedure (from previous slide):1. Formulate the problem in terms of x (as above)2. Restate in terms of corresponding z values (see next slide)3. Find the indicated area under the standard normal curve using the z

table (see the slide after that)

Page 50: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.2: The Car Mileage Case2

• For x = 32 mpg, the corresponding z value is

• So the mileage of 32 mpg is 1.43 standard deviations below (to the left of) the mean

• For x = 35 mpg, the corresponding z value is

• Then P(32 ≤ x ≤ 35 mpg) = P(-1.43 ≤ z ≤ 2.86)

86207

2

70

3335.

..

xz

43107

1

70

3332.

..

xz

Page 51: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.2: The Car Mileage Case3

• Want: the area under the normal curve between 32 and 35 mpg

• Will find: the area under the standard normal curve between -1.43 and 2.86

Continued

Page 52: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.2: The Car Mileage Case4

• Break this into two pieces, around the mean– The area to the left of µ between -1.43 and 0

• By symmetry, this is the same as the area between 0 and 1.43• From the standard normal table, this area is 0.4236

– The area to the right of µ between 0 and 2.86• From the standard normal table, this area is 0.4979

– The total area of both pieces is 0.4236 + 0.4979 = 0.9215– Then, P(-1.43 ≤ z ≤ 2.86) = 0.9215

• Returning to x, P(32 ≤ x ≤ 35 mpg) = 0.9215

Page 53: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Illustrating the Results of Example 6.2

Page 54: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Finding Z Points on a Standard Normal Curve

Page 55: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.5: Stocking to prevent out of inventory

• A discount store sells packs of 50 DVD’s• Want to know how many packs to stock so that there

is only a 5 percent chance of stocking out during the week– Let x be the random variable of weekly demand– Let st be the number of packs so there is only a 5%

probability that weekly demand will exceed st– Want the value of st so that P(x ≥ st) = 0.05

• Given x is normally distributed, µ = 100 packs, and σ = 10 packs

Page 56: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.5 #2

Page 57: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.5 #3

• In panel (a), st is located on the horizontal axis under the right-tail of the normal curve having mean µ = 100 and standard deviation σ = 10

• The z value corresponding to st is

• In panel (b), the right-tail area is 0.05, so the area under the standard normal curve to the right of the mean z = 0 is 0.5 – 0.05 = 0.45

10

100

stst

z

Page 58: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.5 #4

• Use the standard normal table to find the z value associated with a table entry of 0.45– But do not find 0.45; instead find values that bracket 0.45– For a table entry of 0.4495, z = 1.64– For a table entry of 0.4505, z = 1.65– For an area of 0.45, use the z value midway between these– So z = 1.645

645110

100.

st

Page 59: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.5 #5

• Combining this result with the result from a prior slide:

• Solving for st gives st = 100 + (1.645 10) = 116.45– Rounding up, 117 packs should be stocked so that

the probability of running out will not be more than 5 percent

645110

100.

st

Page 60: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Finding a Tolerance IntervalFinding a tolerance interval [ k] that contains 99% of the measurements in a normal population

Page 61: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example:• Given a population with mean height of 170

cm and sd=15 cm• What is the interval in height which contains

99% of the population?

61

Page 62: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Approximating the Binomial Distribution by Using the Normal Distribution

• The figure below shows several binomial distributions

• Can see that as n gets larger and as p gets closer to 0.5, the graph of the binomial distribution tends to have the symmetrical, bell-shaped, form of the normal curve

Page 63: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

63

Page 64: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Normal Approximation to the Binomial Continued

• Generalize observation from last slide for large p

• Suppose x is a binomial random variable, where n is the number of trials, each having a probability of success p– Then the probability of failure is 1 – p

• If n and p are such that np 5 and n(1–p) 5, then x is approximately normal with

pnpnp 1 and

Page 65: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.8: Normal Approximation to a Binomial #1

• Suppose there is a binomial variable X with n = 50 and p = 0.5

• Want the probability of X = 23 successes in the n = 50 trials– Want P(x = 23)– Approximating using the normal curve with

53553500500501

2550050

...pnp

.np

Page 66: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.8: Normal Approximation to a Binomial #2

• With continuity correction, find the normal probability P(22.5 ≤ x ≤ 23.5)

Page 67: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Example 6.8: Normal Approximation to a Binomial #3

• For x = 22.5, the corresponding z value is

• For x = 23.5, the corresponding z value is

• Then P(22.5 ≤ x ≤ 23.5) = P(-0.71 ≤ z ≤ -0.42) = 0.2611 – 0.1628 = 0.0983

• Therefore the estimate of the binomial probability P(x = 23) is 0.0983

71053553

25522.

.

.xz

42053553

25523.

.

.xz

Page 68: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

68

Example:Suppose that you are taking a multiple-choice test where each question has 4 possible answers. If there are 48 questions on the test, what is the probability that you would get exactly 14 questions correct by simply guessing at the answers?

If you are just guessing, then the probability of getting a question correct isp = 1/4, and the probability of guessing wrong is q = 3/4. With a sample of n = 48questions, this example meets the criteria for using the normal approximation to the binomial:pn = 0.25*(48) = 12qn = 0.75*(48) = 36 (both greater than 10)Therefore, the distribution showing the number correct out of 48 questions will benormal with parameters:mean = pn = 12; sd = sqrt(npq) = sqrt(9) = 3We want the proportion of this distribution that corresponds to X = 14, that is, the portion bounded by X = 13.5 and X = 14.5.

Page 69: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

69

Notice that the score X = 14 corresponds to an interval bounded by X = 13.5 and X = 14.5.

The binomial distribution (normal approximation) for the number of answersguessed correctly (X) on a 48-question multiple-choice test. The shaded portion corresponds to the probability of guessing exactly 14 questions.

Page 70: Chapter 6 Continuous Random Variables Statistics for Science (ENV) 1

Summary

6.1 Continuous Probability Distributions6.2 The Uniform Distribution6.3 The Normal Probability Distribution6.4 Approximating the Binomial Distribution

by using the Normal Distribution