discrete random variables & binomial distributioncathy/math3339/lecture/lecture5...discrete...
Post on 15-Jul-2020
17 Views
Preview:
TRANSCRIPT
Discrete Random Variables & Binomial DistributionSec 4.1 - 4.6
Cathy Poliak, Ph.D.cathy@math.uh.eduOffice in Fleming 11c
Department of MathematicsUniversity of Houston
Lecture 5 - 3339
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 1 / 42
Outline
1 Random Variables
2 Probability Distribution
3 Expected Value
4 Standard Deviation
5 Chebyshev’s inequality
6 Rules for Means and Variances
7 Bernoulli Random Variables
8 Binomial Distribution
9 Cumulative Distributions
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 2 / 42
Review Questions
For each random variable, determine if it is:
a. Discrete b. Continuous
1. The number of cars passing a busy intersection between 4:30 PMand 6:30 PM.
2. The weight of a fire fighter.
3. The amount of soda in a can of Pepsi.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 3 / 42
Random Variables
A random variable is a variable whose value is a numericaloutcome of a random phenomenon.Examples
I X = the sum of two dice, X = 2,3,4, . . . ,12.I X = the number of customers that order a muffin in a coffee shop
between 7:00 am and 9:00 am, X = 0,1, . . ..I X = weight of a box of Lucky Charms, X ≥ 0.
Random variables can either be discrete or continuous.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 4 / 42
Discrete random variables
Discrete random variables has either a finite number of valuesor a countable number of values, where countable refers to thefact that there might be infinitely many values, but they result froma counting process.Example of discrete random variable:
I The sum of two dice.I The number of customers who order a muffin in a coffee shop
between 7:00 am and 9:00 am.
The possible values for a discrete random variable has "gaps"between each value.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 5 / 42
Continuous random variables
Continuous random variables are random variables that canassume values corresponding to any of the points contained inone or more intervals.Example of continuous random variable:
I The weight of a box of Lucky Charms.
We want to know how both of these types of random variables aredistributed.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 6 / 42
Probability Distribution
The probability distribution of a random variable X tells us whatvalues X can take and how to assign probabilities to those values. Thisis the "ideal" distribution for a random variable. Requirements for aprobability distribution:
1. The sum of all the probabilities equal 1.2. The probabilities are between 0 and 1, including 0 and 1.
We will determine the three characteristics of a probability distribution:shape, center, and spread.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 7 / 42
Discrete Probability Distribution
The probability distribution of a discrete random variable X liststhe possible values of X and their probabilities. This is called aprobability mass function, pmf or frequency function.A probability distribution table of X consists of all possiblevalues of a discrete random variable with their correspondingprobabilities.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 8 / 42
Example
Suppose a family has 3 children. What is the sample space of allpossible ways for gender?
Now suppose we want to the probability of the number of girls in afamily of 3 children, X would be:
What is the probability that in a family of 3 children, they are all girls?
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 9 / 42
Family of three
Given the probability distribution table of X for the number of girls in afamily of 3 children. Find P(X > 2), P(X < 1) and P(1 < X ≤ 3)
X = number of girls 0 1 2 3P(X) 0.125 0.375 0.375 0.125
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 10 / 42
Probability Distribution
Suppose you want to play a carnival game that costs 8 dollars eachtime you play. If you win, you get $100. The probability of winning is
3100 . Give the probability distribution, pmf, of the amount that you, theplayer, stand to gain.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 11 / 42
Determining Center of a Discrete Random Variable
Suppose that X is a discrete random variable whose distribution is
Values of X x1 x2 x3 · · · xkProbability p1 p2 p3 · · · pk
To find the mean of the random variable X , multiply each possiblevalue by its probability, then add all the products:
µX = E [X ] = x1p1 + x2p2 + x3p3 + · · ·+ xkpk .
This is also called the expected value E [X ].Note: The list here is not a list of observations but a list of all possibleoutcomes. So we are finding µ, the population mean not x̄ , the samplemean.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 12 / 42
Determine the Expected Value
The following is a probability distribution.
X 0 1 2 3P(X ) 0.7 0.1 0.05 0.15
Find the expected value (mean) of this probability distribution.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 13 / 42
Example 2
Let X=The number of traffic accidents daily in a small city. Thefollowing table is the probability distribution for X .
X Probability0 0.101 0.202 0.453 0.154 0.055 0.05
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 14 / 42
Determining the spread of a Discrete Random Variable
The variance of a discrete random variable X is
σ2X = (x1 − µX )2p1 + (x2 − µX )2p2 + · · ·+ (xk − µX )2pk
=k∑
i=1
(xi − µX )2pi
The standard deviation of X is the square root of the variance
σX =√σ2
X .
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 15 / 42
Easier Calculation for Variance and StandardDeviation from a Discrete Probability Distribution
VAR(X) = σ2x = E(X 2)− [E(X )]2
Where E(X 2) = x21 p1 + x2
2 p2 + x23 p3 + . . .+ x2
n pn.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 16 / 42
Determine the Standard Deviation
The following is a probability distribution.
X 0 1 2 3P(X ) 0.7 0.1 0.05 0.15
Find the standard deviation of this probability distribution.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 17 / 42
Finding Expected Value and Standard Deviation in R
> x=c(0,1,2,3)> px=c(.7,.1,.05,.15)> ex = sum(x*px)> ex[1] 0.65> ex2 = sum(x^2*px)> ex2[1] 1.65> vx = ex2-ex^2> vx[1] 1.2275> sx = sqrt(vx)> sx[1] 1.107926
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 18 / 42
Determine Standard Deviation of Number of Accidents
Determine the standard deviation of the number of accidents on agiven day. The expected value is E [X ] = 2.
X 0 1 2 3 4 5P(X ) 0.10 0.20 0.45 0.15 0.05 0.05
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 19 / 42
Chebyshev’s Inequality
Chebyshev’s inequality, places a universal restriction on theprobabilities of deviations of random variables from their means.
If X is a random variable with mean µ and standard deviation σand if k is a postivie constant, then
P(|X − µ| > kσ) ≤ 1k2
That is: For any random variable, if we are k standard deviationsaway from the mean, then no more than 1
k2 ∗ 100% is beyond kstandard deviations from the mean. Or at least (1− 1
k2 ) ∗ 100% iswithin k standard deviations from the mean.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 20 / 42
Chebyshev’s Inequality Application
Let X=The number of traffic accidents daily in a small city. Thefollowing table is the probability distribution for X . With, µ = 2 andσ = 1.18.
X Probability0 0.101 0.202 0.453 0.154 0.055 0.05
Confirm Chebyshev’s Inequality for k = 3.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 21 / 42
Example
Suppose that in the small city on a given day there was rain. So therewould be at least one accident, the probability distribution of thenumber of accidents will be:
Number of accidents 1 2 3 4 5 6Probability 0.10 0.20 0.45 0.15 0.05 0.05
Compute the mean number of accidents.
Compute the variance of the number of accidents.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 22 / 42
Rule for Means and Variances
If X is any random variable and a is a fixed numbers that is addedto all of the values of the random variable thenthe mean increases by that number:
E [a + X ] = a + E [X ].
the variance remains the same:
σ2a+X = σ2
X .
In the example of a rainy day, we added 1 accident to each value,1 + X and the probabilities remained the same.
E [1 + X ] = 1 + E [X ] = 1 + 2 = 3
σ21+X = σ2
X = 1.4
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 23 / 42
Rule for Means and Variances
If X is any random variable and b is a fixed numbers that ismultiplied to all of the values of the random variable thenthe mean is changed by that multiplier:
E [bX ] = bE [X ].
the variance is also changed by the square of the multiplier:
σ2bX = b2σ2
X .
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 24 / 42
Rule Means and Variances
Suppose X is a random variable with mean E [X ] and variance σ2X , and
we define W as a new variable such that W = a + bX , where a and bare real numbers. We can find the mean and variance of W by:
E [W ] = E [a + bX ] = a + bE [X ]
σ2W = Var [W ] = Var [a + bX ] = b2Var [X ]
σW = SD[W ] =√
Var [W ] =√
b2Var [X ] = |b|(SD[X ])
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 25 / 42
Example
Suppose you have a distribution X , with mean = 22 and standarddeviation = 3. Define a new random variable Y = 4X + 1.
1. Find the variance of X .
2. Find the mean of Y .
3. Find the variance of Y .
4. Find the standard deviation of Y .Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 26 / 42
Introduction Question
Wallen Accounting Services specializes in tax preparation forindividual tax returns. Data collected from past records reveals that 9%of the returns prepared by Wallen have been selected for audit by theInternal Revenue Service (IRS).
1. What is the probability that a customer of Wallen will be selectedfor audit?
a. 0.09 b. 0.91 c. 1 d. 0
2. What is the probability that a customer is not selected for audit?a. 0.09 b. 0.91 c. 1 d. 0
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 27 / 42
Bernoulli Random Variable
A Bernoulli random variable has only two possible values, usuallydesignated as 1 and 0.
Suppose we look at the i th customer of Wallen Accountingservices. Let X be the random variable that indicates that thecustomer is selected for audit. Thus X = 1 is the customer isselected for audit, X = 0 if not selected for audit. Selected foraudit is the "success" and not selected for audit is "failure."
The probability of success is p and the probability of failure is1− p. e.g. a coin is flipped (heads or tails), someone is eitheraudited or not audited, p = 0.09, 1 - p = 0.91.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 28 / 42
Probability Function for Bernoulli Variable
f (x) = P(x) =
p, if x = 11− p, if x = 00, if x 6= 0,1
A compact way of writing this is:
f (x) = P(x) = px (1− p)1−x
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 29 / 42
The Bernoulli Process
The Bernoulli process must possess the following properties:1. The experiment consists of n repeated trials.
2. Each trial in an outcome that may be classified as a success orfailure.
3. The probability of success, denoted by p, remains constant fromtrial to trial.
4. The repeated trials are independent.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 30 / 42
Audit Example
Today, Wallen has six new customers. Assume the chances of thesesix customers being audited are independent. This is a sequence ofBernoulli trials. We are interested in calculation the probability ofobtaining a certain number of people being selected for audit. Let Xiindicate the i th customer being selected for audit.Let Y = X1 + X2 + X3 + X4 + X5 + X6. What does Y represent?
What is the probability that Y = 0?
What is the probability that Y = 1?
What is the probability that Y = 2?
What is the probability that Y = n where n = 0,1,2,3,4,5,6?Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 31 / 42
Binomial Probability Distribution
The distribution of the count X of successes in the Binomialsetting has a Binomial probability distribution.
Where the parameters for a binomial probability distribution is:I n the number of observationsI p is the probability of a success on any one observation
The possible values of X are the whole numbers from 0 to n.
As an abbreviation we say, X ∼ B(n,p).
Binomial probabilities are calculated with the following formula:
P(X = k) =n Ckpk (1− p)n−k
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 32 / 42
Stopping at an intersection
Suppose that only 25% of all drivers come to a complete stop at anintersection with a stop sign when not other cars are visible. What isthe probability that of the 20 randomly chosen drivers,
1. Exactly 6 will come to a complete stop?
2. No one will come to complete stop?
3. At least one will come to a complete stop?
4. At most 6 will come to a complete stop?
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 33 / 42
Using R
To find P(X =k) use dbinom(k,n,p).
From previous example P(X = 6), k = 6, n = 20, p = 0..25> dbinom(6,20,.25)[1] 0.1686093
To find P(X ≤ k) use pbinom(k,n,p).
From previous example P(X ≤ 6)
> pbinom(6,20,0.25)[1] 0.7857819
What is the probability that less than 6 will come to a completestop? What is the probability of at least 6?
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 34 / 42
Example #2
A fair coin is flipped 30 times.1. What is the probability that the coin comes up heads exactly 12
times? P(X = 12), n = 30, p = 0.5> dbinom(12,30,0.5)[1] 0.08055309
2. What is the probability that the coin comes up heads less than 12times? P(X < 12) = P(X ≤ 11)
> pbinom(11,30,0.5)[1] 0.1002442
3. What is the probability that the coin comes up heads more than 12times? P(X > 12) = 1− P(X ≤ 12)
> 1-pbinom(12,30,0.5)[1] 0.8192027
4. What is the probability that the coin comes up heads between 9and 13 times, inclusive? P(9 ≤ X ≤ 13) = P(X ≤ 13)− P(X ≤ 8)
> pbinom(13,30,0.5)-pbinom(8,30,.5)[1] 0.28427
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 35 / 42
Mean and Variance of a Binomial Distribution
If a count X has the Binomial distribution with number of observationsn and probability of success p, the mean and variance of X are
µX = E [X ] = np
σ2X = Var [X ] = np(1− p)
Then the standard deviation is the square root of the variance.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 36 / 42
Example #3
Suppose it is known that 80% of the people exposed to the flu virus willcontract the flu. Out of a family of five exposed to the virus, what is theprobability that:
1. No one will contract the flu?
2. All will contract the flu?
3. Exactly two will get the flu?
4. At least two will get the flu?
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 37 / 42
Example Continued
Suppose it is known that 80% of the people exposed to the flu virus willcontract the flu. Suppose we have a family of five that were exposed tothe flu.
1. Find the mean
2. Find the variance of this distribution.
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 38 / 42
Cumulative Distributions
Any quantitative random variable X has a cumulative distributionfunction defined as
FX (x) = P(X ≤ x)
for all real numbers x .
For discrete random variables the relationship between the probabilityfunction, pmf, and the cumulative distribution function is
FX (x) =∑xi≤x
fX (xi)
where x1, x2, . . . are the values of X .
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 39 / 42
Example of Cumulative Distribution
What is the probability that at most 2 people will contract the flu?
P(X ≤ 2) = F (2) = P(X = 0) + P(X = 1) + P(X = 2)
= sum(dbinom(0 : 2,5,0.8))
or in RStudio: F (2) = pbinom(2,5,0.8).
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 40 / 42
Cumulative Distribution Function Properties
Any cumulative distribution function F has the following properties:1. F is a non-decreasing function defined on the set of all real
numbers.
2. F is right-continuous. That is, for each a,F (a) = F (a+) = limx→a+ F (x).
3. limx→−∞ F (x) = 0; limx→+∞ F (X ) = 1.
4. P(a < X ≤ b) = Fx (b)− Fx (a) for all real a and b, a < b.
5. P(X > a) = 1− Fx (a).
6. P(X < b) = Fx (b−) = limx→b− Fx (x).
7. P(a < X < b) = Fx (b−)− Fx (a).
8. P(X = b) = Fx (b)− Fx (b−).Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 41 / 42
Plot of Cumulative Distribution
0 1 2 3 4 5
0.0
0.2
0.4
0.6
0.8
1.0
x
flu.p
rob
Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c (Department of Mathematics University of Houston )Sec 4.1 - 4.6 Lecture 5 - 3339 42 / 42
top related