AP Statistics Chapter 6
Random Variables
2
What is a random variable?
• A random variable is a variable whose value is a numerical outcome of a random phenomenon.
EXAMPLE
If we toss four coins, how would we record the results?
We could record it as a string of tails and heads like “HTTH” or “HTHH”.
This is not a random variable because it has no numerical value to work with.
Instead, we may elect to record the number of heads in the four tosses.
This would make our sample space 0, 1, 2, 3, 4 … all numerical outcomes.
3
Discrete vs. Continuous Variables
• A discrete random variable has a countable number of possible values.
• A continuous random variable can take any possible value over an interval.
EXAMPLES
Discrete: The number of heads in four coin tosses.
Continuous: A number generated by a spinner that covers the numbers between 0 and 1.
4
Discrete Random Variables
• The probability distribution of a discrete variable lists the values and their probabilities.
• The probabilities must satisfy two requirements:
  • Every probability is between 0 and 1.
  • p1 + p2 + … + pk = 1
• Find the probability of any event by adding the individual probabilities that make up that event.
Value X X1 X2 X3 … Xk
P(X) p1 p2 p3 … pk
5
EXAMPLE
Determine the probability distribution of the discrete random variable X that counts the number of heads in four coin tosses.
We can do this if we make two reasonable assumptions:
1. The coin is balanced, so each toss is equally likely to give an H or T.
2. The coin has no memory, so each toss is independent.
Since each outcome is equally likely, what is the probability of each combination?
6
Continued…
The number X represents the number of heads in four tosses. These values are NOT equally likely.
Use this information to complete your probability distribution.
What is the probability of getting 2 or more heads?
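The distribution can be checked by listing all 16 equally likely outcomes. The sketch below is only an illustration: the variable names are invented here, and it relies on the two assumptions stated above (a balanced coin and independent tosses).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 2^4 = 16 equally likely outcomes of four fair, independent tosses.
outcomes = list(product("HT", repeat=4))

# Count how many outcomes produce each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability distribution of X = number of heads in four tosses.
distribution = {x: Fraction(counts[x], len(outcomes)) for x in range(5)}

# P(X >= 2) is the sum of the probabilities for X = 2, 3, and 4.
p_two_or_more = sum(distribution[x] for x in (2, 3, 4))

for x, p in distribution.items():
    print(x, p)
print("P(X >= 2) =", p_two_or_more)
```

Each single outcome has probability 1/16, and the event "2 or more heads" collects 11 of those 16 outcomes.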
7
Means and Variances
• The mean of a set of observations is $\bar{x}$.
• The mean of a random variable X is also an average of the possible values of X.
• This average must take into account that some values of X may occur more frequently than others.
• We can handle this adjustment by multiplying each outcome by its probability.
Value X X1 X2 X3 … Xk
P(X) p1 p2 p3 … pk
$\mu_X = x_1 p_1 + x_2 p_2 + \dots + x_k p_k$
$\mu_X = \sum x_i p_i$
8
EXAMPLE
According to Benford’s Law, the distribution of the first digit V in a set of legitimate business records is:
First Digit V: 1 2 3 4 5 6 7 8 9
P(V): 0.301 0.176 0.125 0.097 0.079 0.067 0.058 0.051 0.046
Use this information to compute the expected value of a randomly selected first digit. (expected value = mean)
The mean of V is:
$\mu_V = 1(0.301) + 2(0.176) + 3(0.125) + 4(0.097) + 5(0.079) + 6(0.067) + 7(0.058) + 8(0.051) + 9(0.046)$
$\mu_V = 3.441$
9
EXAMPLE Continued…
• While the mean of 3.441 is not a possible outcome of V, it still gives us an idea of where the values are centered.
• If each of the nine digits were equally likely, we would have a uniform distribution.
• What would the mean be in this case?
• Notice how this compares to the distribution of Benford’s Law.
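The two means can be compared with a short calculation. This is a minimal sketch; the helper name `mean` and the dictionary layout are choices made here, and the uniform case assumes each of the digits 1 through 9 has probability 1/9.

```python
# First-digit probabilities under Benford's Law, copied from the table above.
benford = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}

# Uniform comparison: every digit 1..9 is equally likely.
uniform = {digit: 1 / 9 for digit in range(1, 10)}


def mean(dist):
    """Expected value: each value weighted by its probability."""
    return sum(x * p for x, p in dist.items())


benford_mean = mean(benford)
uniform_mean = mean(uniform)
print(round(benford_mean, 3), round(uniform_mean, 3))
```

The uniform mean sits at the middle digit, 5, while Benford's Law pulls the mean down toward the smaller digits.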
10
Variance
• In a set of discrete values, the variance is based on how much each value “varies” from the expected amount.
• In the case of a random variable’s distribution, we must account for the differences in frequency among outcomes.
• …and the standard deviation is the square root of the variance.
Value X X1 X2 X3 … Xk
P(X) p1 p2 p3 … pk
$\sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \dots + (x_k - \mu_X)^2 p_k$
$\sigma_X^2 = \sum (x_i - \mu_X)^2 p_i$
11
EXAMPLE
Gain Communications sells aircraft communication units to both military and civilian markets.
Gain uses probability estimates to forecast sales for the upcoming year.
The military division of the company estimates its sales as follows:
Units Sold (X) 1000 3000 5000 10,000
P(X) 0.1 0.3 0.4 0.2
Calculate the expected number of sales and the standard deviation.
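The mean and variance formulas can be applied directly to the military sales table. The following is a sketch with variable names chosen here for illustration; it is one way to organize the arithmetic, not the only one.

```python
from math import sqrt

# Military sales estimates from the table above: units sold -> probability.
military = {1000: 0.1, 3000: 0.3, 5000: 0.4, 10000: 0.2}

# Mean: weight each outcome by its probability.
mu = sum(x * p for x, p in military.items())

# Variance: probability-weighted squared deviations from the mean.
variance = sum((x - mu) ** 2 * p for x, p in military.items())

# Standard deviation: square root of the variance.
sigma = sqrt(variance)

print(round(mu), round(variance), round(sigma, 2))
```

The expected sales come to 5000 units, with a variance of 7,800,000 and a standard deviation of roughly 2793 units.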
HOMEWORK
Complete the problems: pg. 353 (#1–16). This assignment is due at the start of the next class session.
13
Continuous Random Variables
• As mentioned before, continuous random variables deal with an infinite number of possible outcomes over a pre-determined interval.
• Since there are infinitely many possibilities, the probability of any individual value is zero.
• Suppose we wanted to make a probability distribution for an event like $0.3 \le x \le 0.7$.
• What would be the theoretical probability assigned to 0.47?
$P(X = 0.47) = 0$
14
Density Curves
• In order to assign probabilities to events we can use density curves to describe a distribution.
• The horizontal axis of the density curve will represent all of the occurrences and its height over each occurrence will represent its frequency.
• The area under the curve over an interval will represent the probability of an event within that interval occurring.
• The total area under the curve will equal 1.
15
EXAMPLE
Let’s revisit the spinner that generates a random number between 0 and 1.
What would be the probability of generating a number X between 0.3 and 0.7?
$P(0.3 \le X \le 0.7)$
16
EXAMPLE Continued
• Since each number on the spinner has an equal chance of being generated, we will call this a uniform distribution.
• The area under the curve is 1. Since this is uniform, the curve will be rectangular in shape, with a width of 1 and a height of 1.
• The probability of getting a value between 0.3 and 0.7 will be the area between those two values: a width of 0.4 times a height of 1.
$P(0.3 \le X \le 0.7) = 0.4$
17
Taking it further…
• With the same example in mind, what would be the following:
$P(X \le 0.5) = 0.5$
$P(X > 0.8) = 0.2$
$P(X \le 0.5 \text{ or } X > 0.8) = 0.7$
Is there a difference between P(X > 0.8) and P(X ≥ 0.8)?
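For a uniform density, each probability is just the area of a rectangle. The function below is a hypothetical helper written for this illustration (its name and parameters are not from the slides); it assumes X is uniform on the interval from a to b.

```python
def uniform_prob(low, high, a=0.0, b=1.0):
    """P(low <= X <= high) for X uniform on [a, b].

    The density is a rectangle of height 1 / (b - a), so the probability
    is the width of the overlap with [a, b] times that height.
    """
    low, high = max(low, a), min(high, b)
    return max(high - low, 0.0) / (b - a)


p_middle = uniform_prob(0.3, 0.7)   # area between 0.3 and 0.7
p_low = uniform_prob(0.0, 0.5)      # P(X <= 0.5)
p_high = uniform_prob(0.8, 1.0)     # P(X > 0.8)
p_either = p_low + p_high           # the two intervals do not overlap, so add
p_point = uniform_prob(0.8, 0.8)    # a single point has zero width

print(p_middle, p_low, p_high, p_either, p_point)
```

Because a single point has zero width, including or excluding the endpoint 0.8 does not change the area, so P(X > 0.8) and P(X ≥ 0.8) are equal for a continuous random variable.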
18
The Normal Distribution
• We have discussed a density curve in prior chapters. It was the NORMAL CURVE.
• The normal distribution is considered a probability distribution.
• Recall that N(μ, σ) is our shorthand way of referring to the normal distribution having a mean of μ and a standard deviation of σ.
• To standardize our values and use our normal distribution table, we must use a z-score.
$Z = \frac{X - \mu}{\sigma}$
19
EXAMPLE
An opinion poll asks an SRS of 1500 American adults what the biggest issue facing schools is.
Based on the sample data, 30% of the adults said drugs. We will learn how to analyze this later, but for now, we will say that this is an estimate of the population with a distribution mean of 0.3 and a standard deviation of 0.0118.
In other words… $\hat{p} \sim N(0.3, 0.0118)$
What is the probability that the result differs from the truth by more than two percentage points?
In other words… $P(\hat{p} < 0.28 \text{ or } \hat{p} > 0.32)$
Hint: Start off by “standardizing” the data.
20
EXAMPLE Continued…
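$P(\hat{p} < 0.28) = P\left(Z < \frac{0.28 - 0.3}{0.0118}\right) = P(Z < -1.69) = 0.0455$
$P(\hat{p} > 0.32) = P\left(Z > \frac{0.32 - 0.3}{0.0118}\right) = P(Z > 1.69) = 0.0455$
$P(\hat{p} < 0.28 \text{ or } \hat{p} > 0.32) = 0.0910$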
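The table lookup can be cross-checked numerically. This sketch builds a normal CDF from the standard library's error function; the function name `normal_cdf` is an assumption made here. Because it does not round z to two decimal places the way a table does, its answer should land near, but not exactly on, 0.0910.

```python
from math import erf, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma), computed from the error function."""
    z = (x - mu) / sigma  # standardize first
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


mu, sigma = 0.3, 0.0118

p_below = normal_cdf(0.28, mu, sigma)         # P(p-hat < 0.28)
p_above = 1.0 - normal_cdf(0.32, mu, sigma)   # P(p-hat > 0.32)
p_outside = p_below + p_above                 # the two tails do not overlap

print(round(p_below, 4), round(p_above, 4), round(p_outside, 4))
```

The two tails are equal by symmetry, which is why the slide simply doubles one tail area.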
HOMEWORK
Complete the problems: pg. 355 (#17–30). This assignment is due at the start of the next class session.
22
Rules for Means
• If a fixed number is added to or subtracted from every value of a random variable X, then the mean of X increases or decreases by that same amount.
• If every value of a random variable X is multiplied by a fixed number, then the mean of X is multiplied by that same number.
• In other words,
$\mu_{a + bX} = a + b\mu_X$
23
Rules for Means
• If we have two random variables, X and Y, then the sum of those two variables will have a mean that is equal to the sum of their individual means.
• In other words,
$\mu_{X + Y} = \mu_X + \mu_Y$
24
EXAMPLE
Gain Communications sells aircraft communication units to both military and civilian markets.
Gain uses probability estimates to forecast sales for the upcoming year.
The military division of the company estimates its sales as follows:
Units Sold (X) 1000 3000 5000 10,000
P(X) 0.1 0.3 0.4 0.2
The civilian division of the company estimates its sales as follows:
Units Sold (Y) 300 500 750
P(Y) 0.4 0.5 0.1
Compute the mean sales of each.
25
EXAMPLE
• Gain makes a profit of $2000 on each military unit and $3500 on each civilian unit that is sold.
• The mean military sales profit is:
$\mu_{2000X} = \$2000(5000) = \$10{,}000{,}000$
• The mean civilian sales profit is:
$\mu_{3500Y} = \$3500(445) = \$1{,}557{,}500$
• The total profit, Z, is the sum of all sales profits.
$Z = \$2000X + \$3500Y$
• The mean value of Z would be:
$\mu_Z = 2000\mu_X + 3500\mu_Y = \$11{,}557{,}500$
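The rules for means can be traced through the Gain example in a few lines. This is an illustrative sketch; the names `military`, `civilian`, and `mean` are chosen here and the profit figures come from the slide above.

```python
# Sales estimates from the two tables: units sold -> probability.
military = {1000: 0.1, 3000: 0.3, 5000: 0.4, 10000: 0.2}
civilian = {300: 0.4, 500: 0.5, 750: 0.1}


def mean(dist):
    """Expected value: each value weighted by its probability."""
    return sum(x * p for x, p in dist.items())


mu_x = mean(military)   # mean military units sold
mu_y = mean(civilian)   # mean civilian units sold

# Multiplying a variable by a constant multiplies its mean by that constant,
# and the mean of a sum is the sum of the means.
mu_military_profit = 2000 * mu_x
mu_civilian_profit = 3500 * mu_y
mu_z = mu_military_profit + mu_civilian_profit

print(round(mu_x), round(mu_y), round(mu_z))
```

Note that the rule for the mean of a sum holds whether or not the two divisions' sales are independent.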
26
Rules for Variance
• We can apply similar rules to the variances of random variables.
• In order to do this, we must know whether the two random variables are independent of one another.
• Independence would mean that there is a correlation of ZERO between them.
• If there is a correlation between them, we must account for that correlation when we try to combine variances.
• It should also be noted that we are working with variances here and not standard deviations.
27
Rules for Variance
• If X is a random variable and a and b are fixed numbers, then:
$\sigma_{a + bX}^2 = b^2 \sigma_X^2$
• Notice that addition to X does not affect the variation. Only multiplication does.
• If X and Y are random variables with complete independence (no correlation):
$\sigma_{X + Y}^2 = \sigma_X^2 + \sigma_Y^2$
$\sigma_{X - Y}^2 = \sigma_X^2 + \sigma_Y^2$
28
EXAMPLE
A college uses SAT scores as one criterion for admission. Experience has shown that the distribution of SAT scores among its entire population of applicants is:
SAT Math Score (X) μX = 625 σX = 90
SAT Verbal Score (Y) μY = 590 σY = 100
What are the mean and standard deviation of the total score X + Y among students applying to this college?
NOTE: This is based on the assumption that the scores are independent, which many may argue they are not.
$\mu_{X + Y} = 625 + 590 = 1215$
$\sigma_{X + Y} = \sqrt{90^2 + 100^2} = \sqrt{18100} = 134.54$
29
EXAMPLE
• A large auto dealership keeps track of sales and lease agreements made during each hour of the day. Let X = the number of cars sold, and let Y = the number of cars leased during the first hour of a randomly selected Friday.
• Based on previous records, the distributions of X and Y are:
Sold X 0 1 2 3
p 0.3 0.4 0.2 0.1
Leased Y 0 1 2
p 0.4 0.5 0.1
30
CONTINUED…
• Find the mean and standard deviation of both X and Y.
$\mu_X = 1.1$, $\sigma_X = 0.943$
$\mu_Y = 0.7$, $\sigma_Y = 0.64$
• Now let’s define the total number of deals as T. (T = X + Y)
• Find and interpret the mean of T.
$\mu_T = \mu_X + \mu_Y = 1.8$
• Now compute the standard deviation of T.
31
CONTINUED
• Remember that you must deal with variances instead of standard deviations.
$\sigma_T^2 = 0.943^2 + 0.64^2 = 1.2988$
$\sigma_T = 1.14$
• The dealership’s manager receives a $500 bonus for each car sold and a $300 bonus for each car leased. Find the mean and standard deviation of the manager’s total bonus.
$\mu_B = 500(1.1) + 300(0.7) = \$760$
$\sigma_B^2 = 500^2(0.943)^2 + 300^2(0.64)^2$
$\sigma_B = \$509.09$
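The dealership example can be reproduced from the two probability tables. This is a sketch with helper names (`mean`, `variance`) introduced here, and it assumes, as the slides do, that X and Y are independent. It carries the unrounded variances through the calculation, so the bonus standard deviation may differ from the slide's $509.09 by a few tenths of a dollar.

```python
from math import sqrt

# Distributions from the tables above: count -> probability.
sold = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}
leased = {0: 0.4, 1: 0.5, 2: 0.1}


def mean(dist):
    """Expected value: each value weighted by its probability."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Probability-weighted squared deviations from the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


mu_x, var_x = mean(sold), variance(sold)
mu_y, var_y = mean(leased), variance(leased)

# Total deals T = X + Y: means add; variances add under independence.
mu_t = mu_x + mu_y
sigma_t = sqrt(var_x + var_y)

# Bonus B = 500X + 300Y: constants scale means directly and variances by their squares.
mu_b = 500 * mu_x + 300 * mu_y
sigma_b = sqrt(500 ** 2 * var_x + 300 ** 2 * var_y)

print(round(mu_t, 2), round(sigma_t, 2), round(mu_b, 2), round(sigma_b, 2))
```

Adding the standard deviations directly (0.943 + 0.64) would overstate the spread of T, which is why the variances are combined first.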
32
Check Your Understanding
Complete the Check Your Understanding problem on the top of pg. 372. We will discuss the answers in a moment.
HOMEWORK
Complete the problems: pg. 378 (#37–51). This assignment is due at the start of the next class session.
34
The Binomial Setting
• We have a binomial situation when the following things are in place:
1. Each observation will fall into one of two categories, usually considered “success” or “failure”.
2. There is a fixed number of observations, “n”.
3. All of the n observations are independent.
4. The probability of success is the same for each observation.
35
Binomial Distributions
• In a binomial setting, the random variable X is equal to the number of successes.
• The probability distribution of X in this case is considered a binomial distribution.
• The parameters of the distribution are n and p.
  • n represents the number of observations
  • p is the probability of success on any observation.
• As an abbreviation, we say that X is B(n, p).
36
EXAMPLE
Blood type is a trait that is passed through heredity. If both parents carry the genes for both O and A blood types, there is a probability of 0.25 of having a child with Type O blood.
If these parents have 5 children, how many children would have Type O blood?
This is a binomial distribution B(5, 0.25).
Deal 10 cards and let X be the count of the number of red cards.
This would not be a binomial distribution because the cards are dealt without replacement, so the observations are not independent.
37
Computing Binomial Probabilities
• If X has the binomial distribution with n observations, having a probability of p for success on each, then the possible values of X are 0, 1, 2, …, n. If k is any of these values,
$P(X = k) = \frac{n!}{k!(n - k)!} p^k (1 - p)^{n - k}$
This formula can be applied to find the probability of k successes in the situation described.
38
EXAMPLE
A quality engineer selects an SRS of 10 switches from a large shipment for detailed inspection. Unknown to the engineer, 10% of the switches in the shipment fail to meet the specifications. What is the probability that exactly 1 of the ten switches in the sample will fail inspection?
This is a distribution defined as B(10, 0.1).
In this situation, k = 1.
$P(X = 1) = \frac{10!}{1!(10 - 1)!} (0.1)^1 (0.9)^9 = 0.3874$
What would be the probability of the engineer finding 1 or fewer defective switches?
$P(X \le 1) = 0.7361$
39
EXAMPLE 2
Each child of a particular pair of parents has a probability 0.25 of having type O blood. If they have 5 children, what is the probability that exactly 3 of the children have type O blood?
$P(X = 3) = \frac{5!}{3!(5 - 3)!} (0.25)^3 (0.75)^2 = 10 (0.25)^3 (0.75)^2 = 0.08789$
There is about an 8.8% chance that this could happen!
What is the probability that MORE THAN 3 of the children have type O blood?
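Both binomial examples follow the same formula, so one small function covers them. This is a sketch; `binom_pmf` is a name invented here, and it uses `math.comb` from the standard library for the n!/(k!(n − k)!) term.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Switch inspection, B(10, 0.1).
p_one_switch = binom_pmf(1, 10, 0.1)
p_at_most_one = binom_pmf(0, 10, 0.1) + binom_pmf(1, 10, 0.1)

# Type O blood, B(5, 0.25).
p_three_type_o = binom_pmf(3, 5, 0.25)
p_more_than_three = binom_pmf(4, 5, 0.25) + binom_pmf(5, 5, 0.25)

print(round(p_one_switch, 4), round(p_at_most_one, 4))
print(round(p_three_type_o, 5), round(p_more_than_three, 6))
```

"More than 3" means 4 or 5 children, so that probability is the sum of two terms and comes out to about 1.6%.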
HOMEWORK
Complete the problems: pg. 403 (#69–80). This assignment is due at the start of the next class session.
41
Geometric Probability
• We have a geometric setting when the following characteristics are in place:
1. Each observation will fall into one of two categories, usually considered “success” or “failure”.
2. The probability of success is the same for each observation.
3. All of the observations are independent.
4. The variable of interest, X, is the number of trials required to obtain the first success.
42
EXAMPLE
If we are rolling a single die, and we want to roll a “5”, then how many rolls would it take to get a five for the first time?
GEOMETRIC DISTRIBUTION
If we are rolling a die four times, and we want to count the number of fives that we roll …
BINOMIAL DISTRIBUTION
43
Calculating Geometric Probabilities
• If each trial has a probability p of success and a probability q = 1 − p of failure, the possible values of X are 1, 2, 3, …
• If n is any of these values, the probability that the first success occurs on the nth trial is:
$P(X = n) = q^{n - 1} p$
• What would be the probability that it would take 3 rolls to get our first five? 6 rolls?
44
Using the TI-84: pdf
• Just as with binomial probabilities, we can use the calculator to quickly compute geometric probabilities.
• The geometpdf function will quickly compute the probability for a set number of trials being required to achieve first success.
• To compute the probability that it would take five rolls to roll a “5” for the first time, we would use: geometpdf(1/6, 5) = 0.0804
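The same calculation can be done without a calculator. The function below is a hypothetical Python stand-in for the probability that geometpdf reports; its name and argument order are choices made here, not part of the TI-84.

```python
def geometric_pmf(n, p):
    """P(X = n): the first success occurs on trial n, i.e. q^(n - 1) * p."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (1 - p) ** (n - 1) * p


p_five = 1 / 6  # chance of rolling a "5" on any single roll of a fair die

p_third_roll = geometric_pmf(3, p_five)   # first five on roll 3
p_fifth_roll = geometric_pmf(5, p_five)   # first five on roll 5
p_sixth_roll = geometric_pmf(6, p_five)   # first five on roll 6

print(round(p_third_roll, 4), round(p_fifth_roll, 4), round(p_sixth_roll, 4))
```

The probabilities shrink as n grows because each extra roll requires one more failure before the success.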
45
The Geometric Distribution
• The geometric probability distribution also has a mean and standard deviation.
• The mean, or expected value, of a geometric random variable is:
$\mu_X = \frac{1}{p}$
• The standard deviation of a geometric random variable is:
$\sigma_X = \sqrt{\frac{q}{p^2}}$
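Applied to the die example, these formulas give the typical wait for the first "5". This sketch uses the standard results for a geometric random variable, a mean of 1/p and a standard deviation of √q / p; the variable names are chosen here for illustration.

```python
from math import sqrt

p = 1 / 6   # probability of rolling a "5" on one roll
q = 1 - p   # probability of not rolling a "5"

mean_rolls = 1 / p        # expected number of rolls until the first five
sd_rolls = sqrt(q) / p    # same as sqrt(q / p**2)

print(round(mean_rolls, 3), round(sd_rolls, 3))
```

On average it takes 6 rolls to see the first five, with a standard deviation of about 5.5 rolls, so long waits are not unusual.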
HOMEWORK
Complete the problems 8.37–8.46. This assignment is due at the start of the next class session.