1 random variables… a random variable is a symbol that represents the outcome of an experiment....

Upload: june-jennings

Post on 12-Jan-2016

Page 1: 1 Random Variables… A random variable is a symbol that represents the outcome of an experiment. Alternatively, the value of a random variable can be either

1

Random Variables…

• A random variable is a symbol that represents the outcome of an experiment.

• The value of a random variable can be either numerical [the kind we will concentrate on] or categorical. Examples:

• “the number of heads when flipping a coin 10 times”
• “the time it takes a doctor to complete an operation”
• “the number of infections last week at a hospital”

Two Types of Random Variables…

• Discrete Random Variable
– one that takes on a countable number of values
– E.g. the sum shown on a roll of two dice: 2, 3, 4, …, 12
– [normally “count” type data]

• Continuous Random Variable
– one whose values are not discrete, not countable
– E.g. time (30.1 minutes? 30.10000001 minutes?)
– [normally measurement type data]

• Analogy: Integers are discrete, while real numbers are continuous.

Probability Distributions…

• A probability distribution is a table, formula, or graph that describes the values of a random variable and the probability associated with these values.

• Since we’re describing a random variable (which can be discrete or continuous) we have two types of probability distributions:

– Discrete Probability Distribution, and
– Continuous Probability Distribution

Probability Notation…

• An upper-case letter will represent the name of the random variable, usually X.

• Its lower-case counterpart will represent the value of the random variable.

• The probability that the random variable X will equal x is written P(X = x) or, more simply, P(x). For example:
• P(X = 6) = 0.57
• or more simply
• P(6) = 0.57

Discrete Probability Distributions…

• The probabilities of the values of a discrete random variable may be derived by means of probability tools such as tree diagrams or by applying one of the definitions of probability, so long as these two conditions apply:

1. 0 ≤ P(x) ≤ 1 for every value x
2. The probabilities of all values x sum to 1

Example: X = # Classes Missed

X    P(X)    Cumulative
0    0.40    0.40
1    0.30    0.70
2    0.20    0.90
3    0.05    0.95
4    0.05    1.00

Population Mean (Expected Value)

• The population mean is the weighted average of all of its values. The weights are the probabilities.

• This parameter is also called the expected value of X and is represented by E(X).

X    P(X)    X*P(X)
0    0.40    0.00
1    0.30    0.30
2    0.20    0.40
3    0.05    0.15
4    0.05    0.20

μ = E(X) = sum of X*P(X) = 1.05
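The weighted-average calculation can be sketched in a few lines of Python. This is only an illustration: the dictionary copies the "classes missed" probabilities from the slide, and the helper names `is_valid` and `expected_value` are made up for this example.

```python
# Distribution of X = number of classes missed (values copied from the slide).
dist = {0: 0.40, 1: 0.30, 2: 0.20, 3: 0.05, 4: 0.05}

def is_valid(dist, tol=1e-9):
    """Check that every probability is in [0, 1] and that they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    return in_range and abs(sum(dist.values()) - 1.0) < tol

def expected_value(dist):
    """Weighted average of the values, using the probabilities as weights."""
    return sum(x * p for x, p in dist.items())

mu = expected_value(dist)
print(is_valid(dist), round(mu, 4))
```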

Population Variance…

• The population variance is calculated similarly. It is the weighted average of the squared deviations from the mean.

X    P(X)    X*P(X)    (X - μ)^2    (X - μ)^2 * P(X)
0    0.40    0.00      1.1025       0.4410
1    0.30    0.30      0.0025       0.0008
2    0.20    0.40      0.9025       0.1805
3    0.05    0.15      3.8025       0.1901
4    0.05    0.20      8.7025       0.4351

μ = 1.05    σ^2 = 1.2475    σ = 1.1169
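As a rough check of the table, the same variance can be computed directly from the definition. The sketch below reuses the slide's "classes missed" probabilities; the function names are illustrative only.

```python
import math

# Distribution of X = number of classes missed (values copied from the slide).
dist = {0: 0.40, 1: 0.30, 2: 0.20, 3: 0.05, 4: 0.05}

def expected_value(dist):
    """Population mean: probability-weighted average of the values."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Population variance: weighted average of squared deviations from the mean."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

var_x = variance(dist)
sd_x = math.sqrt(var_x)
print(round(var_x, 4), round(sd_x, 4))
```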

Laws of Expected Value…NOT IN TEXT

1. E(c) = c
• The expected value of a constant (c) is just the value of the constant.

2. E(X + c) = E(X) + c

3. E(cX) = cE(X)

4. E(c1X1 + c2X2 + c3X3 + c4X4 + c5X5) = c1E(X1) + c2E(X2) + c3E(X3) + c4E(X4) + c5E(X5)

• Example: what is the expected (mean) weight of a surgical pack containing 5 components [maybe we could weigh the pack to determine if one of the components is missing].

• Law 4 holds whether or not the random variables are independent. Independence matters for the corresponding law of variance.

Laws of Variance…NOT IN TEXT

1. V(c) = 0
• The variance of a constant (c) is zero.

2. V(X + c) = V(X)
• Adding a constant to a random variable does not change its variance (the constant contributes no variance, per 1 above).

3. V(cX) = c^2 V(X)
• The variance of a constant times a random variable is the constant squared times the variance of the random variable.

4. V(c1X1 + c2X2 + c3X3 + c4X4 + c5X5) = c1^2 V(X1) + c2^2 V(X2) + c3^2 V(X3) + c4^2 V(X4) + c5^2 V(X5)
• True when the random variables are independent.
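One way to convince yourself of the laws for a linear combination is to enumerate a small case exactly. The sketch below assumes two independent discrete random variables: X uses the "classes missed" probabilities from the earlier slide, while Y (a single fair coin flip) and the coefficients 2 and 3 are invented purely for illustration.

```python
from itertools import product

def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def var(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def linear_combo(dist_x, dist_y, c1, c2):
    """Distribution of c1*X + c2*Y, assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        value = c1 * x + c2 * y
        # Independence lets us multiply the two marginal probabilities.
        out[value] = out.get(value, 0.0) + px * py
    return out

X = {0: 0.40, 1: 0.30, 2: 0.20, 3: 0.05, 4: 0.05}  # from the slides
Y = {0: 0.5, 1: 0.5}                               # hypothetical fair coin

W = linear_combo(X, Y, 2, 3)
mean_direct, mean_by_law = mean(W), 2 * mean(X) + 3 * mean(Y)
var_direct, var_by_law = var(W), 2**2 * var(X) + 3**2 * var(Y)
print(mean_direct, mean_by_law, var_direct, var_by_law)
```

The direct and law-based numbers should agree up to floating-point rounding. If X and Y were dependent, the mean law would still hold but the variance law would need a covariance term.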

Example: Convert Celsius to Fahrenheit

• Patient temperature (Celsius) data collected over the last 5 years resulted in

• μc = 100 and σc^2 = 0.4

• Your boss wants these numbers in Fahrenheit
• F = (9/5)*C + 32

• μF = (9/5)*μc + 32 = {(9/5)*100} + 32 = 180 + 32 = 212

• σF^2 = (9/5)^2 * σc^2 = (3.24)*(0.4) = 1.296 [adding 32 does not change the variance]

• σF = SQRT(1.296) = 1.138
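The conversion is an application of the laws E(aX + b) = aE(X) + b and V(aX + b) = a^2 V(X). A minimal sketch, using the slide's Celsius numbers and a helper name (`linear_transform_stats`) that is invented for this example:

```python
import math

def linear_transform_stats(mean, variance, a, b):
    """Mean and variance of a*X + b, given the mean and variance of X."""
    new_mean = a * mean + b          # the shift b moves the mean
    new_variance = a ** 2 * variance # the shift b does not affect the variance
    return new_mean, new_variance

# Celsius summary values from the slide: mean 100, variance 0.4.
mu_f, var_f = linear_transform_stats(100, 0.4, 9 / 5, 32)
sd_f = math.sqrt(var_f)
print(round(mu_f, 3), round(var_f, 3), round(sd_f, 3))
```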

HOMEWORK

Exercises:

4.2.1

4.2.3

Extra Problem: A study of the weights of 12-year-old children resulted in a mean weight of 112 lbs. and a variance of 16 lbs^2. After the study was finished, someone noticed that the scales were not zeroed out and every recorded weight was 2 lbs. heavier than the child’s actual weight [100 lbs. should have been 98 lbs.]. What are the mean and variance of the actual weights of the children?

HW Problem: Function of Random Variables

• Administrators at a local hospital decided it would be more efficient to assign a given task to one nurse rather than have this task performed by several nurses. Studies have shown that the average time to complete this task is 15 minutes with a standard deviation of 2 minutes.

• The head nurse calculated the number of tasks a nurse could expect to complete in an 8-hour shift to be 32 [15 min/task = 4 tasks/hour = 32 tasks per 8-hour shift]. Since nurses have no break time, the head nurse assigned Nurse Wilson 30 of these tasks for tomorrow.

• *What is the expected time to complete all 30 tasks?
• *What is the std. dev. of the time to complete all 30 tasks? Note: will have to calculate variance first.
• *Do you think Nurse Wilson will be able to complete all 30 tasks in her 8-hour shift?

Flip a coin 5 times!

• Your Professor told you that if he flipped a coin 5 times and observed any tails, the class could go home early. However, if he observed all 5 heads, the class had to stay 15 minutes after class.

• Describe this experiment [tell me everything you know about this game]

Binomial Distribution…

• The binomial distribution is the probability distribution that results from doing a “binomial experiment”. Binomial experiments have the following properties:

• “A NATURAL DISTRIBUTION”

1. Fixed number of trials, represented as n.

2. Each trial has two possible outcomes, a “success” and a “failure”.

3. P(success)=p (and thus: P(failure)=1–p), for all trials.

4. The trials are independent, which means that the outcome of one trial does not affect the outcomes of any other trials.

Binomial Random Variable…

• Experiment: flip a fair coin 5 times…

• Random Variable: X = # heads

• Want to know: P(# heads = 5) = P(5) = ???

– 1) Fixed number of trials: n = 5
– 2) Each trial has two possible outcomes {heads (success), tails (failure)}
– 3) P(head) = 0.50; P(tail) = 1 – 0.50 = 0.50
– 4) The trials are independent (i.e. the outcome of heads on the first flip will have no impact on subsequent coin flips).

• Hence flipping a coin 5 times is a binomial experiment since all conditions were met.

Binomial Random Variable…

• The binomial random variable counts the number of successes in n trials of the binomial experiment. It can take on the values 0, 1, 2, …, n. Thus, it is a discrete random variable.

• In the old days we had to use the binomial formula (or binomial tables) below, but now we can calculate using Excel statistical functions.

• The Binomial Distribution [formula]:

$P(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}$ for x = 0, 1, 2, …, n

• Coin flip problem: X = # heads, n = 5, p = 0.5, (1 – p) = 0.5

• P(You stay late) = P(X = 5) = ?
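For readers who prefer code to tables, the binomial formula can be sketched in Python as below. The function name `binomial_pmf` is illustrative, and `math.comb` requires Python 3.8 or later; the inputs are the coin-flip values from the slide.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Coin flip problem from the slide: n = 5 flips, p = 0.5, all heads means x = 5.
p_stay_late = binomial_pmf(5, 5, 0.5)

# Sanity check: the probabilities over x = 0..n should add up to 1.
total = sum(binomial_pmf(x, 5, 0.5) for x in range(6))
print(p_stay_late, total)
```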

Binomial Tables
• Tables in Text

• How do you get individual probabilities?

TABLE B Cumulative Binomial Probability Distribution (n = 5)

x\p   …   0.5      …
0     …   0.0312   …
1     …   0.1875   …
2     …   0.5000   …
3     …   0.8125   …
4     …   0.9688   …
5     …   1.0000   …

Individual Probabilities (n = 5):

x\p   …   0.5      …
0     …   0.0312   …
1     …   0.1563   …
2     …   0.3125   …
3     …   0.3125   …
4     …   0.1563   …
5     …   0.0312   …
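One way to answer "how do you get individual probabilities?" is to subtract consecutive cumulative entries, since P(X = x) = P(X ≤ x) – P(X ≤ x – 1). The sketch below applies that idea to the n = 5, p = 0.5 column of the cumulative table; the helper name is made up for illustration.

```python
# Cumulative probabilities P(X <= x) for x = 0..5, copied from Table B (n = 5, p = 0.5).
cumulative = [0.0312, 0.1875, 0.5000, 0.8125, 0.9688, 1.0000]

def individual_from_cumulative(cum):
    """Recover P(X = x) by differencing consecutive cumulative probabilities."""
    probs = [cum[0]]  # P(X = 0) equals P(X <= 0)
    for prev, curr in zip(cum, cum[1:]):
        probs.append(curr - prev)
    return probs

individual = [round(p, 4) for p in individual_from_cumulative(cumulative)]
print(individual)
```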

=BINOMDIST() Excel Function…

• There is a binomial distribution function in Excel that can also be used to calculate these probabilities. For example:

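• What is the probability that you get exactly two answers correct when guessing on a 10-question multiple choice test with 5 options for each question?

=BINOMDIST(# successes, # trials, P(success), cumulative)

• # successes = 2
• # trials = 10
• P(success) = 1/5 = 0.2
• cumulative (i.e. P(X ≤ x)?) = FALSE

P(X = 2) = .3020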
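If Excel is not at hand, the same calculation can be approximated in Python. The function below is a hypothetical stand-in that mirrors the argument order shown on the slide; it is not Excel's implementation, only a sketch built from the binomial formula.

```python
import math

def binomdist(number_s, trials, probability_s, cumulative):
    """Rough Python analogue of the spreadsheet call: P(X = x) or P(X <= x)."""
    def pmf(x):
        return math.comb(trials, x) * probability_s ** x * (1 - probability_s) ** (trials - x)

    if cumulative:
        # Cumulative mode adds up the individual probabilities from 0 through number_s.
        return sum(pmf(x) for x in range(number_s + 1))
    return pmf(number_s)

# Guessing on a 10-question test with 5 options each: p = 1/5.
p_exactly_two = binomdist(2, 10, 0.2, False)
p_at_most_two = binomdist(2, 10, 0.2, True)
print(round(p_exactly_two, 4), round(p_at_most_two, 4))
```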

Binomial Distribution…

• Statisticians have developed general formulas for the mean, variance, and standard deviation of a binomial random variable. They are:

μ = n*p
σ^2 = n*p*(1-p)
σ = SQRT[n*p*(1-p)]

• These are the true parameter values for this random variable.
• For our coin flipping example:
• μ = n*p = 5*(0.5) = 2.5
• σ^2 = n*p*(1-p) = 5*(0.5)*(0.5) = 1.25
• σ = SQRT[n*p*(1-p)] = SQRT[5*(0.5)*(0.5)] = SQRT[1.25] = 1.118

WORK IN CLASS: Binomial

• Given a binomial random variable with n = 15 and p = .25, find the following probabilities

–P(X = 5) =

–P( X < 5) =

–P(3 < X < 5) =

HW: Binomial

• Exercises: 4.3.1 (also calculate the variance and standard deviation for exercise 4.3.1), 4.3.2, Review Question 15

Poisson Distribution…

• Named for Siméon Poisson, the Poisson distribution is a discrete probability distribution that describes the number of events (a.k.a. “things that occur”) within a specific time period or region of space [“a sample unit”]. For example:
• The number of cars arriving at a service station in 1 hour. (The interval of time is 1 hour.)
• The number of flaws in one square foot of cloth. (The specific region is one square foot of cloth.)
• The number of accidents in 1 day on a particular stretch of highway. (The interval is defined by both time, 1 day, and space, the particular stretch of highway.)
• The number of infections at a hospital in one week.
• The number of critters in a bottle of coke.

NOTE: these random variables MAY or MAY NOT be Poisson random variables. We have ways to test the data to see if Poisson would be an appropriate distribution to use for that example.

The Poisson Experiment…

• Like a binomial experiment, a Poisson experiment has four defining characteristic properties:

1. The number of successes that occur in any interval [sample unit] is independent of the number of successes that occur in any other interval [sample unit].

2. The probability of a success in an interval is the same for all equal-size intervals.

3. The probability of a success is proportional to the size of the interval.

4. The probability of more than one success in an interval approaches 0 as the interval becomes smaller.

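Poisson Probability Distribution…

• The probability that a Poisson random variable assumes a value of x is given by:

$P(x) = \frac{e^{-\mu}\mu^{x}}{x!}$ for x = 0, 1, 2, …

• where μ is the mean number of successes in the interval and e is the base of the natural logarithm.
• This text will use λ instead of μ, so equivalently:

$P(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}$ for x = 0, 1, 2, …

• FYI: μ = σ^2 = λ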
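The Poisson formula and the fact that the mean and variance both equal λ can be checked numerically. In this sketch the value λ = 6 and the cutoff of 100 terms are arbitrary choices for illustration; the tail beyond the cutoff is negligible for a mean this small.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 6.0            # illustrative mean, not tied to any data set
xs = range(0, 100)   # truncated support; the remaining tail is negligible here
probs = [poisson_pmf(x, lam) for x in xs]

total = sum(probs)
mean = sum(x * p for x, p in zip(xs, probs))
variance = sum((x - mean) ** 2 * p for x, p in zip(xs, probs))
print(round(total, 6), round(mean, 6), round(variance, 6))
```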

Use Poisson to Approximate Binomial Probabilities

• We often work problems which are “actually” binomial but we use the Poisson.

• Condition: If the binomial sample size is very large and the probability of success is very small, it is easier to use the Poisson [normally n*p < 5].

• Example: The probability of an infection in a hospital is known to be 0.0012. You sample the last 1000 patients that stayed in your hospital and wish to calculate the probability that fewer than 6 will have an infection [mean of binomial: μ = n*p = 1000*(.0012) = 1.2]

• Try to work this problem using the binomial!

Using Excel to Calculate Poisson Probabilities

• Example: x = 5, λ = 6, P(X ≤ 5) = .4457 [a cumulative probability, e.g. =POISSON(5, 6, TRUE)]

Cumulative Poisson Distribution Tables

• You can find Poisson tables in the back of your text for select values of λ.

• Verify Excel answer using Poisson Tables

TABLE C Cumulative Poisson Probability Distribution

x\λ        …   6       …
0          …   0.002   …
1          …   0.017   …
2          …   0.062   …
3          …   0.151   …
4          …   0.285   …
5          …   0.446   …
6          …   0.606   …
…          …   …       …
infinity   …   1.000   …
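The λ = 6 column of the cumulative table can be reproduced by summing the Poisson formula from 0 up to x. The sketch below is only a cross-check; the function name `poisson_cdf` is illustrative.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson random variable with mean lam."""
    return sum(math.exp(-lam) * lam ** x / math.factorial(x) for x in range(k + 1))

# Rebuild the lambda = 6 column for x = 0..6, rounded to three decimals like the table.
column = [round(poisson_cdf(k, 6), 3) for k in range(7)]
print(column)
```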

Back to the Poisson Approximation to the Binomial.

• Example: The probability of an infection in a hospital is known to be 0.0012. You sample the last 1000 patients that stayed in your hospital and wish to calculate the probability that fewer than 6 will have an infection [mean of binomial: μ = n*p = 1000*(.0012) = 1.2]

• Use Poisson Tables to approximate this probability (λ = 1.2)

• Answer: P(X < 6) = P(X ≤ 5) = ???

• If you actually had 6 infections for the last 1000 patients who stayed in your hospital, what would you tell the chief hospital administrator?

• If you only sampled 500 patients, what would λ now be equal to?
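To see how close the approximation is in this example, the exact binomial probability and the Poisson approximation can be computed side by side. This is a sketch using only the numbers stated in the example (n = 1000, p = 0.0012); the function names are invented for illustration.

```python
import math

def binom_cdf(k, n, p):
    """Exact P(X <= k) for a binomial random variable."""
    return sum(math.comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson random variable with mean lam."""
    return sum(math.exp(-lam) * lam ** x / math.factorial(x) for x in range(k + 1))

n, p = 1000, 0.0012
exact = binom_cdf(5, n, p)       # "fewer than 6" means X <= 5
approx = poisson_cdf(5, n * p)   # Poisson with lambda = n*p = 1.2
print(f"binomial: {exact:.5f}  Poisson: {approx:.5f}")
```

With a computer the exact binomial sum is no longer painful, so the approximation mostly matters when only tables are available.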

Students Work in Class: Poisson

• The number of infections [X] in a hospital each week has been shown to follow a Poisson distribution with mean 3.0 infections per week. Calculate the following probabilities.

• P(X = 0) =

• P(X < 8) =

• P(X > 9) =

• If you found 9 infections next week, what would you say??

Homework for Students: Poisson

Extra Problem:
• With infections running wild in many hospitals, the chief administrator of Local Hospital decided to find out how Local Hospital stacks up against the national norm [the national norm states that the average number of bacteria per square yard of surface area should be no more than 9 bacteria/square yard]. The number of bacteria per square yard is assumed to be a Poisson random variable.

– *If you go into the hospital, randomly sample one square yard of surface area, and count the number of bacteria found, calculate the probability of finding 19 or fewer bacteria.
– *If you actually found 15 bacteria, what would you conclude about the state of the hospital?
– *In order to continuously monitor the state of the hospital, it was decided to randomly sample one square foot of surface area each day to ensure that the hospital is being cleaned properly [it takes too much time to sample 1 square yard]. If you do this, what would the mean of the Poisson be in this case?

TEXTBOOK EXERCISES: 4.4.3 and 4.4.5, Review Question 17

Continuous Probability Distributions

• A function f(x) is called a continuous probability distribution (over the range a ≤ x ≤ b) if it meets the following requirements:

1) f(x) ≥ 0 for all x between a and b, and

2) The total area under the curve between a and b is 1.0

[Figure: curve f(x) plotted against x between a and b, with area = 1 under the curve]

The Normal Distribution…

• The normal distribution is the most important of all probability distributions. The probability density function of a normal random variable is given by:

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$ for −∞ < x < ∞

• It looks like this:
• Bell shaped,
• Symmetrical around the mean μ
• Two Parameters: μ = mean and σ = std. dev.

Some Facts About The Normal Distribution…

• The area (probability) within ± 1σ of the mean is ~ .68 (68%)
• The area (probability) within ± 2σ of the mean is ~ .95 (95%)
• The area (probability) within ± 3σ of the mean is ~ .997 (99.7%)
• The area (probability) to the right or left of the mean is exactly .5 (50%)

• These facts allow us to use one set of Normal Tables to calculate all normal probabilities, provided we know how many standard deviations a given value of the random variable is away from the mean. This is called a Z-Score: Z = (X – μ)/σ
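These areas can be checked without tables by using the error function, since for a normal random variable the probability of falling within k standard deviations of the mean is erf(k/√2). A minimal sketch (the function name is illustrative):

```python
import math

def area_within_k_sigma(k):
    """Probability that a normal random variable lies within k std. devs. of its mean."""
    return math.erf(k / math.sqrt(2))

areas = {k: area_within_k_sigma(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(k, round(area, 4))
```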

The Standard Normal Distribution

• The Z-Score is itself a random variable. It follows “the standard normal distribution”, whose mean (μ) is equal to 0 and standard deviation (σ) is equal to 1. From this standard normal distribution we can calculate any normal probability, even when the mean and std. dev. are something other than 0 and 1.

Calculating Normal Probabilities…

• …mean of 50 minutes and a standard deviation of 10 minutes…

• P(45 < X < 60) ?
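One way to work this kind of problem is to convert each endpoint to a Z-Score and take the difference of the standard normal cumulative probabilities. The sketch below assumes X is normal with the stated mean of 50 and standard deviation of 10; the helper names are invented, and the standard normal CDF is built from `math.erf` rather than read from a table.

```python
import math

def standard_normal_cdf(z):
    """P(Z < z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def normal_interval_prob(low, high, mu, sigma):
    """P(low < X < high) for a normal random variable with mean mu and std. dev. sigma."""
    z_low = (low - mu) / sigma    # Z-Score of the lower endpoint
    z_high = (high - mu) / sigma  # Z-Score of the upper endpoint
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)

prob = normal_interval_prob(45, 60, 50, 10)
print(round(prob, 4))
```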

Calculating Normal Probabilities…

• Standard Normal Distribution (Table D)
• P(Z < 1.3) = 0.9032

• Use “Table D: Normal Curve Areas” in text
• NOTE: True for any normal distribution when Z-Score is 1.3

Calculating Normal Probabilities…

• Work following EXERCISES in class:

• 4.6.1, 4.6.2, 4.6.11, 4.6.13, 4.7.1

• HW for students to work:

4.6.3, 4.6.5, 4.6.7, 4.6.9, 4.7.3, 4.7.5, REVIEW QUESTION 23