Time is precious, but we do not know yet how precious it really is. We will only know when we are no longer able to take advantage of it…
Proverbs 21:5 The plans of the diligent lead to profit as surely as haste leads to poverty By J. K. Kiingati Page 1
STA 2200 PROBABILITY AND STATISTICS II
Purpose By the end of the course the student should be able to solve problems involving
probability distributions of a discrete or a continuous random variable.
Objectives
By the end of this course the student should be able to;
(1) Define the probability mass, density and distribution functions, and use these to determine the expectation, variance, percentiles and mode of a given distribution.
(2) Appreciate the form of the probability mass functions for the binomial, geometric, hyper-geometric and Poisson distributions, and of the probability density functions for the uniform, exponential, gamma, beta and normal distributions, and their applications.
(3) Apply the moment generating function and transformation of variable techniques.
(4) Apply the principles of statistical inference to one-sample problems.
DESCRIPTION
Random variables: discrete and continuous, probability mass, density and distribution
functions, expectation, variance, percentiles and mode. Moments and moment generating
function. Moment generating function and transformation Change of variable technique for
univariate distribution. Probability distributions: hyper-geometric, binomial, Poisson, uniform,
normal, beta and gamma. Statistical inference including one sample normal and t tests.
Pre-Requisites: STA 2100 Probability and Statistics I, SMA 2104 Mathematics for Science
Course Text Books
1) RV Hogg, JW McKean & AT Craig Introduction to Mathematical Statistics, 6th ed., Prentice Hall, 2003 ISBN 0-13-177698-3
2) J Crawshaw & J Chambers A Concise Course in A-Level statistics, with worked examples, 3rd ed. Stanley Thornes, 1994 ISBN 0-534- 42362-0
Course Journals:
1) Journal of Applied Statistics (J. Appl. Stat.) [0266-4763; 1360-0532] 2) Statistics (Statistics) [0233-1888]
Further Reference Text Books And Journals:
a) HJ Larson Introduction to Probability Theory and Statistical Inference(Probability and Mathematical Statistics) 3rd ed., Wiley, 1982
b) Uppal, S. M. , Odhiambo, R. O. & Humphreys, H. M. Introduction to Probability and Statistics. JKUAT Press, 2005
c) I Miller & M Miller John E Freund’s Mathematical Statistics with Applications, 7th ed., Pearsons Education, Prentice Hall, New Jersey, 2003 ISBN: 0131246461
d) Statistical Science (Stat. Sci.) [0883-4237] e) Journal of Mathematical Sciences f) The Annals of Applied Probability
http://www.jstor.org/journals/10505164.html
1. RANDOM VARIABLES
1.1 Introduction
In applications of probability, we are often interested in a number associated with the outcome of a random experiment. Such a quantity, whose value is determined by the outcome of a random experiment, is called a random variable. It can also be defined as any quantity or attribute whose value varies from one unit of the population to another.
A discrete random variable is a function whose range is finite or countably infinite, i.e. it can only assume values in a finite or countably infinite set. A continuous random variable is one that can take any value in an interval of real numbers. (There are uncountably many real numbers in an interval of positive length.)
1.2 Discrete Random Variables and Probability Mass Function
Consider the experiment of flipping a fair coin three times and noting the number of tails that appear. Let X = number of tails that appear in 3 flips of a fair coin; X is a discrete random variable.
There are 8 possible outcomes of the experiment. The sample space S and the corresponding values taken by the random variable X are:
S:  TTT  TTH  THT  THH  HTT  HTH  HHT  HHH
X:   3    2    2    1    2    1    1    0
Now, what are the possible values that X takes on, and what are the probabilities of X taking a particular value?
From the above we see that the possible values of X are the 4 values X = 0, 1, 2, 3, i.e. the sample space is a disjoint union of the 4 events {X = j} for j = 0, 1, 2, 3.
Specifically, in our example:
{X = 3} = {TTT}
{X = 2} = {TTH, THT, HTT}
{X = 1} = {THH, HTH, HHT}
{X = 0} = {HHH}
Since for a fair coin we assume that each element of the sample space is equally likely (with probability 1/8), we can find the probabilities for the various values of X. These make up the probability distribution of X, or the probability mass function (pmf). It can be summarized in the following table, which lists each possible value beside its probability.
x          0     1     2     3
P(X = x)  1/8   3/8   3/8   1/8
Note: The probability that X takes on the value x, i.e. $P(X = x)$, is defined as the sum of the probabilities of all points in S that are assigned the value x.
We can say that this pmf places mass 3/8 on the value X = 2.
The “masses” (or probabilities) for a pmf should be between 0 and 1.
The total mass (i.e. total probability) must add up to 1.
Definition: The probability mass function of a discrete random variable is a graph, table, or formula that specifies the probability associated with each possible value the random variable can take. The mass function $P(X = x)$ (or just $p(x)$) has the following properties:
$0 \le p(x) \le 1$ and $\sum_{\text{all } x} p(x) = 1$
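The coin-flip pmf above can be reproduced by brute-force enumeration. The following is a minimal sketch; the variable names are illustrative and not part of the original notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 8 equally likely outcomes of three flips of a fair coin.
outcomes = list(product("HT", repeat=3))

# X = number of tails in each outcome; count how many outcomes give each value.
tail_counts = Counter(outcome.count("T") for outcome in outcomes)

# P(X = x) = (number of outcomes with x tails) / (total number of outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(tail_counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")

# Both defining properties of a pmf hold.
assert all(0 <= p <= 1 for p in pmf.values())
assert sum(pmf.values()) == 1
```

Exact fractions are used instead of floats so that the total mass can be compared with 1 without rounding error.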
More generally, let X have the following properties:
i) It is a discrete variable that can only assume the values $x_1, x_2, \ldots, x_n$.
ii) The probabilities associated with these values are $P(X = x_1) = p_1$, $P(X = x_2) = p_2$, ..., $P(X = x_n) = p_n$.
Then X is a discrete random variable if
$0 \le p_i \le 1$ and $\sum_{i=1}^{n} p_i = 1$
Remark: We denote random variables with capital letters while realized or particular values
are denoted by lower case letters.
Example 1
Two tetrahedral dice are rolled together once and the sum of the scores facing down was noted.
Find the pmf of the random variable ‘the sum of the scores facing down.’
Solution
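The possible sums are shown in the table below, where the row and column headings are the scores on the two dice.
+   1  2  3  4
1   2  3  4  5
2   3  4  5  6
3   4  5  6  7
4   5  6  7  8
Each of the 16 cells is equally likely, and X = 2, 3, 4, 5, 6, 7, 8.
Therefore the pmf of X is given by the table below.
x          2     3     4     5     6     7     8
P(X = x)  1/16  1/8   3/16  1/4   3/16  1/8   1/16
This can also be written as a function:
$P(X = x) = \begin{cases} \frac{x-1}{16} & \text{for } x = 2, 3, 4, 5 \\ \frac{9-x}{16} & \text{for } x = 6, 7, 8 \end{cases}$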
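As a check on the tetrahedral-dice example, the pmf of the sum can be tabulated by counting score pairs and compared with the closed form (x - 1)/16 for x = 2, ..., 5 and (9 - x)/16 for x = 6, 7, 8. This is a sketch only; `pmf_formula` is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4]  # scores on a tetrahedral die

# Count how many of the 16 equally likely (first, second) pairs give each sum.
sum_counts = Counter(a + b for a, b in product(faces, repeat=2))
pmf = {x: Fraction(n, len(faces) ** 2) for x, n in sorted(sum_counts.items())}

def pmf_formula(x):
    """Closed form: (x - 1)/16 for x = 2..5 and (9 - x)/16 for x = 6..8."""
    return Fraction(x - 1, 16) if x <= 5 else Fraction(9 - x, 16)

for x, p in pmf.items():
    print(x, p, p == pmf_formula(x))
```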
Example 2
The pmf of a discrete random variable W is given by the table below
W -3 -2 -1 0 1
P(W=w) 0.1 0.25 0.3 0.15 d
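Find the value of the constant d, $P(-3 \le W < 0)$, $P(W > -1)$ and $P(-1 < W < 1)$.
Solution
$\sum_{\text{all } w} P(W = w) = 1 \Rightarrow 0.1 + 0.25 + 0.3 + 0.15 + d = 1 \Rightarrow d = 0.2$
$P(-3 \le W < 0) = P(W = -3) + P(W = -2) + P(W = -1) = 0.65$
$P(W > -1) = P(W = 0) + P(W = 1) = 0.15 + 0.2 = 0.35$
$P(-1 < W < 1) = P(W = 0) = 0.15$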
Example 3
A discrete random variable Y has a pmf given by the table below
Y 0 1 2 3 4
P(Y=y) c 2c 5c 10c 17c
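Find the value of the constant c, hence compute $P(1 \le Y < 3)$.
Solution
$\sum_{\text{all } y} P(Y = y) = 1 \Rightarrow c(1 + 2 + 5 + 10 + 17) = 35c = 1 \Rightarrow c = \frac{1}{35}$
$P(1 \le Y < 3) = P(Y = 1) + P(Y = 2) = \frac{2}{35} + \frac{5}{35} = \frac{1}{5}$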
Exercise 1.1
1. A die is loaded such that the probability of a face showing up is proportional to the face number. Determine the probability of each sample point.
2. Roll a fair die and let X be the square of the score that show up. Write down the
probability distribution of X hence compute 51XP and 03X3 P 3. Let X be the random variable the number of fours observed when two dice are rolled
together once. Show that X is a discrete random variable.
4. The pmf of a discrete random variable X is given by 6,5,4,3,2,1for)( xkxxXP
Find the value of the constant k, 4XP and 6X3 P
5. A fair coin is flipped until a head appears. Let N represent the number of tosses required to realize a head. Find the pmf of N.
6. A discrete random variable Y has a pmf given by .....,2,1,0for)(43 ycyYP
x
Find the value of the constant c and 3X P
7. Verify that $f(x) = \frac{2x}{k(k+1)}$ for $x = 1, 2, \ldots, k$ can serve as a pmf of a random variable X.
8. For each of the following, determine c so that the function can serve as a pmf of a random variable X.
a) $f(x) = cx$ for $x = 1, 2, 3, 4, 5$
b) $f(x) = cx^2$ for $x = 0, 1, 2, \ldots, k$
c) $f(x) = c\left(\frac{1}{6}\right)^x$ for $x = 0, 1, 2, 3, \ldots$
d) $f(x) = c\,2^{-x}$ for $x = 0, 1, 2, \ldots$
9. A coin is loaded so that heads is three times as likely as the tails. For 3 independent tosses of the coin find the pmf of the total number of heads realized and the probability of
realizing at most 2 heads.
1.3 Continuous Random Variables and Probability Density Function
A continuous random variable can assume any value in an interval on the real line or in a collection of intervals. The sample space is uncountable. For instance, suppose an experiment involves observing the arrival of cars along a highway during a certain period of time on a particular day. Let T denote the time that elapses before the first arrival; then T is a continuous random variable that assumes values in the interval $[0, \infty)$.
To illustrate the concept of a probability density function, consider a histogram with unequal class widths. Frequency density is plotted on the vertical axis, where
frequency density = frequency / class width.
One can also draw a relative frequency histogram, in which the area of a bar corresponds to the proportion of the data falling into the corresponding interval. The relative frequency density plotted on its vertical axis is defined as
relative frequency density = frequency density / total frequency.
E.g. the relative frequency histogram below shows the distribution of ages of the UK population.
[Figure: Relative frequency histogram to show the age distribution in the UK. Horizontal axis: age (0 to 100); vertical axis: relative frequency density (0 to 0.02).]
The area of each bar corresponds to the proportion of the population with ages in that
interval. The total area of all the bars is 1.
The distribution of the ages can be modelled by a curve. This curve is called a probability
density function.
Definition: A probability density function (or p.d.f.) is a curve that models the shape of the distribution corresponding to a continuous random variable. The function has several important properties.
If f(x) is the p.d.f. corresponding to a continuous random variable X and f(x) is defined for $-\infty < x < \infty$, then the following properties must hold:
i) $f(x) \ge 0$ for all x, i.e. the graph of the p.d.f. never dips below the x-axis.
ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$, i.e. the total area under a p.d.f. is 1.
iii) $P(a \le X \le b) = \int_a^b f(x)\,dx$, i.e. probabilities correspond to areas under the curve.
Example 1: Sketch the graph of each of the following functions. Decide in each case whether it could be the equation of a probability density function:
(a) $f(x) = \begin{cases} x^{-3} & x \ge 1 \\ 0 & \text{elsewhere} \end{cases}$
(b) $f(x) = \begin{cases} \frac{1}{32}(24x - 3x^2 - 36) & 2 \le x \le 6 \\ 0 & \text{elsewhere} \end{cases}$
(c) $f(x) = \begin{cases} x^2 - 4x + 3 & 0 \le x \le 4 \\ 0 & \text{elsewhere} \end{cases}$
Solution
(a) The function is non-negative everywhere. For f to represent a p.d.f., we need to check that $\int_{\text{all } x} f(x)\,dx = 1$. But:
$\int_{\text{all } x} f(x)\,dx = \int_1^{\infty} x^{-3}\,dx = \left[-\frac{1}{2x^2}\right]_1^{\infty} = 0 - \left(-\frac{1}{2}\right) = \frac{1}{2}$
So f(x) could not represent a probability density function.
(b) The function is non-negative everywhere. If f represents a p.d.f., then $\int_{\text{all } x} f(x)\,dx = 1$:
$\int_{\text{all } x} f(x)\,dx = \frac{1}{32}\int_2^6 (24x - 3x^2 - 36)\,dx = \frac{1}{32}\left[12x^2 - x^3 - 36x\right]_2^6 = \frac{1}{32}\left(0 - (-32)\right) = 1$
So f(x) could represent a probability density function.
(c) The function is clearly negative for some values of x (for example, $f(2) = -1$). Consequently f(x) cannot represent a probability density function.
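The two p.d.f. conditions (non-negativity and total area 1) can also be checked numerically. The sketch below assumes the three candidates are $x^{-3}$ on $[1, \infty)$, $(24x - 3x^2 - 36)/32$ on $[2, 6]$ and $x^2 - 4x + 3$ on $[0, 4]$; the helper names and the midpoint rule are illustrative choices, not part of the notes.

```python
def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def grid_minimum(f, a, b, n=1_000):
    """Smallest value of f on an evenly spaced grid of n + 1 points in [a, b]."""
    return min(f(a + i * (b - a) / n) for i in range(n + 1))

def f_a(x):  # candidate (a): x^(-3) on [1, infinity)
    return x ** -3

def f_b(x):  # candidate (b): (24x - 3x^2 - 36)/32 on [2, 6]
    return (24 * x - 3 * x ** 2 - 36) / 32

def f_c(x):  # candidate (c): x^2 - 4x + 3 on [0, 4]
    return x ** 2 - 4 * x + 3

# The infinite range in (a) is truncated at 1000; the neglected tail is about 5e-7.
area_a = midpoint_integral(f_a, 1, 1_000)
area_b = midpoint_integral(f_b, 2, 6)
min_b = grid_minimum(f_b, 2, 6)
min_c = grid_minimum(f_c, 0, 4)

print(f"(a) area ~ {area_a:.4f}  -> not a pdf (area is not 1)")
print(f"(b) area ~ {area_b:.4f}, min ~ {min_b:.4f}  -> could be a pdf")
print(f"(c) min ~ {min_c:.4f}  -> not a pdf (takes negative values)")
```

A grid minimum can only show that a function goes negative; it cannot prove non-negativity, so for (b) the factorisation $-3(x-2)(x-6)/32$ remains the real argument.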
Remark: A crucial property is that, for any real number x, we have $P(X = x) = 0$ (implying there is no difference between $P(X \le x)$ and $P(X < x)$); that is, the probability of the random variable assuming any one particular value is zero. Instead, we talk about the probability of the random variable assuming a value within a given interval, which is defined to be the area under the graph of the probability density function between $x = a$ and $x = b$.
Example 2
The time X, in hours, between computer failures is a continuous random variable with density
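$f(x) = \begin{cases} \lambda e^{-0.01x} & \text{for } x \ge 0 \\ 0 & \text{elsewhere} \end{cases}$
Find $\lambda$, hence compute $P(50 \le X \le 150)$ and $P(X \le 100)$.
Solution
$f(x) \ge 0$ for all x in $0 \le x < \infty$. Thus $\int_0^{\infty} \lambda e^{-0.01x}\,dx = \left[-100\lambda e^{-0.01x}\right]_0^{\infty} = 100\lambda = 1 \Rightarrow \lambda = 0.01$.
Now $P(50 \le X \le 150) = \int_{50}^{150} 0.01 e^{-0.01x}\,dx = \left[-e^{-0.01x}\right]_{50}^{150} = e^{-0.5} - e^{-1.5} = 0.3834005$ and
$P(X \le 100) = \int_0^{100} 0.01 e^{-0.01x}\,dx = \left[-e^{-0.01x}\right]_0^{100} = 1 - e^{-1} = 0.6321206$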
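The two probabilities in the computer-failure example follow from the cdf $1 - e^{-0.01x}$ of the density $0.01e^{-0.01x}$. A short sketch, with `RATE` and `cdf` as illustrative names:

```python
import math

RATE = 0.01  # normalising constant found in the solution (per hour)

def cdf(x):
    """P(X <= x) for the density RATE * exp(-RATE * x) on x >= 0."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

p_between = cdf(150) - cdf(50)   # P(50 <= X <= 150) = e^-0.5 - e^-1.5
p_within_100 = cdf(100)          # P(X <= 100) = 1 - e^-1

print(round(p_between, 7), round(p_within_100, 7))
```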
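Example 3
A continuous random variable X is defined by the probability density function
$f(x) = \begin{cases} k(x - 1) & 1 \le x \le 3 \\ k(5 - x)(x - 2) & 3 \le x \le 5 \\ 0 & \text{elsewhere} \end{cases}$
a) Sketch the probability density function.
b) Find the value of the constant k.
c) Find P(X > 2).
Solution
To find k, we can use the property that $\int_{\text{all } x} f(x)\,dx = 1$. Note that $(5 - x)(x - 2) = -x^2 + 7x - 10$.
$\int_{\text{all } x} f(x)\,dx = k\int_1^3 (x - 1)\,dx + k\int_3^5 (-x^2 + 7x - 10)\,dx = k\left[\tfrac{1}{2}x^2 - x\right]_1^3 + k\left[-\tfrac{1}{3}x^3 + \tfrac{7}{2}x^2 - 10x\right]_3^5$
$= k\left(\tfrac{3}{2} + \tfrac{1}{2}\right) + k\left(-\tfrac{25}{6} + \tfrac{15}{2}\right) = \tfrac{16}{3}k = 1 \Rightarrow k = \tfrac{3}{16}$
$P(X > 2) = 1 - P(X \le 2) = 1 - \tfrac{3}{16}\int_1^2 (x - 1)\,dx = 1 - \tfrac{3}{16}\left[\tfrac{1}{2}x^2 - x\right]_1^2 = 1 - \tfrac{3}{16}\left(0 + \tfrac{1}{2}\right) = \tfrac{29}{32}$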
Example 4
A continuous random variable X has a probability density function given by
$f(x) = \begin{cases} 0.25 & 0 \le x < 2 \\ 0.5x + c & 2 \le x \le 3 \\ 0 & \text{elsewhere} \end{cases}$
Find c, hence compute $P(1 \le X \le 2.5)$.
Solution
$\int_{\text{all } x} f(x)\,dx = 1 \Rightarrow \int_0^2 \tfrac{1}{4}\,dx + \int_2^3 \left(\tfrac{1}{2}x + c\right)dx = \tfrac{1}{2} + \tfrac{5}{4} + c = 1 \Rightarrow c = -\tfrac{3}{4}$
$P(1 \le X \le 2.5) = P(1 \le X \le 2) + P(2 \le X \le 2.5) = \int_1^2 \tfrac{1}{4}\,dx + \int_2^{2.5} \left(\tfrac{1}{2}x - \tfrac{3}{4}\right)dx = \left[\tfrac{1}{4}x\right]_1^2 + \left[\tfrac{1}{4}x^2 - \tfrac{3}{4}x\right]_2^{2.5} = \tfrac{1}{4} + \tfrac{3}{16} = \tfrac{7}{16}$
Exercise 1.2
1) Suppose that the random variable X has p.d.f. given by
elsewhere
xcx
,0
10, f(x) Find the
value of the constant c hence determine m so that 21mX P
2) Let X be a continuous random variable with pdf
elsewhere
xkx
,0
30, f(x)
5 Find the value
of the constant k hence compute 3X1 P
3) A continuous random variable Y has the pdf given by
elsewhere
xyk
,0
74),1( f(y) Find
the value of the constant k hence compute 5YP and 6Y5 P 4) The continuous random variable X has probability density function
otherwise,0
20,)1( )f(
2 xxkx Find the value of the constant k k hence compute 5.1XP
5) A continuous random variable R is defined by the probability density function
otherwise,0
20),5( )f(
rrkr
a) Sketch the p.d.f of R
b) Find k hence compute P(1 ≤ R ≤ 3).
6) The life, T hours, of an electrical component is modelled by the probability density
function
otherwise,0
1000T, )f(
001.0 tket
a) Sketch the probability density function.
b) Find the value of the constant k.
c) Find P(1500 ≤ T ≤ 2000).
7) A continuous r.v Y has probability density function
otherwise,0
11),1( )f(
2 xykx Find
the value of the constant k hence compute 5.0YP and 0.5)Y5.0( P
8) Let X be a continuous r.v witha pdf
otherwise,0
10,)1(x)(f
xxkx. Calculate
4
1
2
1 XP
9) The pdf of a random variable X is given by
42,
20,x)(f
3
1
6
1
x
x
Sketch the graph of f(x) hence find )31( xP
10) A continuous random variable X has the pdf given by
elsewhere
xk
xxk
x
,0
04
02,)1(
)f(3
4
2
Find
the value of the constant k hence compute 1XP and 1X1 P
11) A continuous random variable X has the pdf given by
elsewhere
xxk
xkx
,0
424
20,
f(x) Find
the value of the constant k hence compute 3XP and 3X1 P 12) The length of time in minutes a customer queue in a post office is a random variable T with pdf
elsewhere
ttct
,0
90,81)f(
2
a) Find the value of the constant c and the cdf of T
b) Find the probability that a customer will wait for more than 3 minutes
c) A customer has been queuing for 3 minutes, find the probability that this customers will be
queuing for at least 7 minutes
d) Three customers are selected at random, find the probability that exactly 2 of them had to
queue for longer than 3 min
1.4 Distribution Function of a Random Variable
Definition: For any random variable X, we define the cumulative distribution function (CDF), F(x), as
$F(x) = P(X \le x) = \begin{cases} \sum_{t \le x} P(X = t) & \text{if X is discrete} \\ \int_{-\infty}^{x} f(t)\,dt & \text{if X is continuous} \end{cases}$ for every x.
Here t is introduced as a dummy variable to facilitate summation or integration.
Properties of any cumulative distribution function
$\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$.
F(x) is a non-decreasing function.
F(x) is a right-continuous function of x. In other words, $\lim_{t \to x^+} F(t) = F(x)$.
Reminder: If the cdf of a continuous random variable X is F(x) and the pdf is f(x), then differentiate F(x) to get f(x), and integrate f(x) to get F(x).
Theorem: For any random variable X and real values a < b, $P(a < X \le b) = F(b) - F(a)$.
Example 1
Let X be a discrete random variable with pmf given by
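$f(x) = \begin{cases} \frac{1}{20}(x + 1) & \text{for } x = 1, 2, 3, 4, 5 \\ 0 & \text{elsewhere} \end{cases}$
Determine the cdf of X, hence compute $P(X > 3)$.
Solution
$F(x) = \sum_{t=1}^{x} f(t) = \frac{1}{20}\sum_{t=1}^{x} (t + 1) = \frac{1}{20}\left[2 + 3 + 4 + \cdots + (x + 1)\right] = \frac{1}{20} \cdot \frac{x}{2}\left[4 + (x - 1)\right] = \frac{x(x + 3)}{40}$
(Recall that for an AP, $S_n = \frac{n}{2}\left[2a + (n - 1)d\right]$.)
$F(x) = \begin{cases} 0 & \text{for } x < 1 \\ \frac{x(x + 3)}{40} & \text{for } x = 1, 2, 3, 4, 5 \\ 1 & \text{for } x > 5 \end{cases}$
$P(X > 3) = 1 - P(X \le 3) = 1 - F(3) = 1 - \frac{3 \times 6}{40} = \frac{11}{20}$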
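For a discrete variable the cdf is a running total of the pmf. The sketch below rebuilds the cdf for $f(x) = (x+1)/20$, $x = 1, \ldots, 5$, and compares it with the closed form $x(x+3)/40$; the names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

support = [1, 2, 3, 4, 5]
pmf = [Fraction(x + 1, 20) for x in support]   # f(x) = (x + 1)/20

# F(x) at each support point is the running total of the pmf.
cdf = dict(zip(support, accumulate(pmf)))

def cdf_closed_form(x):
    """Closed form obtained from the arithmetic-progression sum."""
    return Fraction(x * (x + 3), 40)

p_greater_than_3 = 1 - cdf[3]   # P(X > 3) = 1 - F(3)

for x in support:
    print(x, cdf[x], cdf[x] == cdf_closed_form(x))
print("P(X > 3) =", p_greater_than_3)
```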
Example 2
Suppose X is a continuous random variable whose pdf f(x) is given by
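$f(x) = \begin{cases} \frac{1}{2}x & 0 \le x \le 2 \\ 0 & \text{elsewhere} \end{cases}$
Obtain the cdf of X, hence compute $P\left(X > \frac{2}{3}\right)$.
Solution
$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x \tfrac{1}{2}t\,dt = \left[\tfrac{1}{4}t^2\right]_0^x = \tfrac{1}{4}x^2$
thus
$F(x) = \begin{cases} 0 & x < 0 \\ \frac{x^2}{4} & 0 \le x \le 2 \\ 1 & x > 2 \end{cases}$
$P\left(X > \tfrac{2}{3}\right) = 1 - P\left(X \le \tfrac{2}{3}\right) = 1 - F\left(\tfrac{2}{3}\right) = 1 - \tfrac{1}{4}\left(\tfrac{2}{3}\right)^2 = \tfrac{8}{9}$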
Example 3
A continuous random variable X has a probability density function given by
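$f(x) = \begin{cases} 0.25 & 0 \le x < 2 \\ 0.5x + c & 2 \le x \le 3 \\ 0 & \text{elsewhere} \end{cases}$
Find the cdf of X, hence compute $P(1.5 \le X \le 2.5)$.
Solution
Since the total area under f must be 1, $\tfrac{1}{2} + \tfrac{5}{4} + c = 1$, so $c = -\tfrac{3}{4}$.
For $0 \le x < 2$: $F(x) = \int_0^x \tfrac{1}{4}\,dt = \tfrac{x}{4}$.
For $2 \le x \le 3$: $F(x) = \int \left(\tfrac{1}{2}t - \tfrac{3}{4}\right)dt = \tfrac{x^2}{4} - \tfrac{3x}{4} + k$.
The two pieces must give the same value of F(2) (this is the reason for introducing k): $F(2) = \tfrac{1}{2} \Rightarrow 1 - \tfrac{3}{2} + k = \tfrac{1}{2} \Rightarrow k = 1$.
$F(x) = \begin{cases} 0 & \text{for } x < 0 \\ \frac{x}{4} & \text{for } 0 \le x < 2 \\ \frac{x^2}{4} - \frac{3x}{4} + 1 & \text{for } 2 \le x \le 3 \\ 1 & \text{for } x > 3 \end{cases}$
$P(1.5 \le X \le 2.5) = F(2.5) - F(1.5) = \left(\tfrac{2.5^2}{4} - \tfrac{3(2.5)}{4} + 1\right) - \tfrac{1.5}{4} = 0.6875 - 0.375 = 0.3125$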
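A piecewise cdf translates directly into a function with one branch per interval. The sketch below encodes the cdf of this example, taking $F(x) = x/4$ on $[0, 2)$ and $F(x) = x^2/4 - 3x/4 + 1$ on $[2, 3]$, and checks that the pieces join at x = 2.

```python
def cdf(x):
    """Piecewise cdf: x/4 on [0, 2), x^2/4 - 3x/4 + 1 on [2, 3]."""
    if x < 0:
        return 0.0
    if x < 2:
        return x / 4
    if x <= 3:
        return x ** 2 / 4 - 3 * x / 4 + 1
    return 1.0

# The two pieces must agree at x = 2 and the cdf must reach 1 at x = 3.
print(cdf(2), cdf(3))

# P(1.5 <= X <= 2.5) = F(2.5) - F(1.5)
probability = cdf(2.5) - cdf(1.5)
print(probability)
```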
Exercise 1.3
1. The CDF of a discrete random variable X is given by $F(x) = \frac{x^3 + k}{40}$ for $x = 1, 2, 3$.
a) Show that k = 13. b) Find the pmf of X.
2. The pdf of a continuous random variable X is given by
elsewhere
xx
xC
,0
40, )f( Find the
value of the constant C, the cdf of X and 1XP
3. The pdf of a random variable X is given by $g(x) = \begin{cases} kx(1 - x) & 0 \le x \le 1 \\ 0 & \text{elsewhere} \end{cases}$ Find the value of the constant k, the cdf of X and the value of x such that $G(x) = \frac{1}{2}$.
4. A continuous random variable x has cumulative distribution function
4,1
42,
2),0
)F(3
1
6
1
x
xx
x
x Find the
probability density function f(x) of X.
5. A random variable X has p.d.f. $g(x) = \begin{cases} \frac{1}{6}(1 + x^3) & 0 \le x \le 2 \\ 0 & \text{elsewhere} \end{cases}$ Find the c.d.f. of X, hence obtain P(X < 1).
6. A random variable X has pdf
otherwise
kxx
x
x
,0
1
10,
)f(21
21
where k is a
constant. Sketch a graph of f(x) hence
find the value of k and the cdf of x
hence compute )5.15.0( xP
7. A continuous random variable X has pdf f(x) given by
elsewhere
xx
x
xx
x
,0
104,30
10
43,2.0
30,45
)f(
2
Sketch f(x). Also find the cdf of x and
hence compute )8( Xp
8. If the cdf of a random variable Y is given by 3for9
1 )F(2
Yy
y and
3for0 F(x) Y , find 5XP , 8XP and the pdf of X
9. A random variable X has a cdf $F(x) = 2x^2 - x^4$ for $0 \le x \le 1$. Compute $P\left(\frac{1}{4} \le X \le \frac{3}{4}\right)$ and also find the pdf of X.
10. Find the cdf of a random variable X whose pdf is given by:
a) $f(x) = \begin{cases} \frac{1}{3} & 0 \le x \le 1 \\ \frac{1}{3} & 2 \le x \le 4 \\ 0 & \text{elsewhere} \end{cases}$
b) $f(x) = \begin{cases} \frac{x}{2} & 0 \le x \le 1 \\ \frac{1}{2} & 1 < x \le 2 \\ \frac{3 - x}{2} & 2 < x \le 3 \\ 0 & \text{elsewhere} \end{cases}$
11. The continuous random variable X has probability density function f(x) given by
$f(x) = \begin{cases} k(x^2 - 2x + 2) & 0 < x \le 3 \\ 3k & 3 < x \le 4 \\ 0 & \text{elsewhere} \end{cases}$
where k is a constant. Show that $k = \frac{1}{9}$.
a) Find the cumulative distribution function F(x) and the mean of X.
b) Show that the median of X lies between x = 2.6 and x = 2.7.
12. The lifetime, X, in tens of hours, of a battery has a cumulative distribution function F(x) given by
$F(x) = \begin{cases} 0 & x < 1 \\ \frac{4}{9}(x^2 + 2x - 3) & 1 \le x \le 1.5 \\ 1 & x > 1.5 \end{cases}$
a) Find the median of X, giving your answer to 3 significant figures.
b) Find, in full, the probability density function of the random variable X.
c) Find P(X > 1.2).
13. The continuous random variable T represents the time in hours that students spend on homework. The cumulative distribution function of T is
$F(t) = \begin{cases} 0 & t < 0 \\ k(2t^3 - t^4) & 0 \le t \le 1.5 \\ 1 & t > 1.5 \end{cases}$
where k is a positive constant.
a) Show that $k = \frac{16}{27}$, and hence find the probability density function f(t) of T.
b) Find the proportion of students who spend more than 1 hour on homework.
c) Show that E(T) = 0.9 and F(E(T)) = 0.4752.
d) A student is selected at random. Given that the student spent more than the mean amount of time on homework, find the probability that this student spent more than 1 hour on homework.
1.5 Derived Random Variables
Given the pdf of a random variable, say X, we can obtain the distribution of a second random variable, say Y, provided that we know some functional relationship between X and Y, say $Y = u(X)$, e.g. $Y = 2X + 3$, $Y = X^3$, $Y = cX$, etc.
For a 1-1 relationship between X and Y, e.g. $Y = 2X + 3$, f(x) and g(y) yield exactly the same probabilities; only the random variable and the set of values it can assume change.
Example 1
Given the pmf of a random variable X as
$f(x) = \begin{cases} \frac{x}{6} & \text{for } x = 1, 2, 3 \\ 0 & \text{elsewhere} \end{cases}$
find the pmf of $Y = X^2$.
Solution
The only values of Y with non-zero probabilities are Y = 1, Y = 4 and Y = 9. Now
$P(Y = 1) = P(X^2 = 1) = P(X = 1) = \frac{1}{6}$, $P(Y = 4) = P(X^2 = 4) = P(X = 2) = \frac{1}{3}$ and $P(Y = 9) = P(X^2 = 9) = P(X = 3) = \frac{1}{2}$
In some cases several values of X will give rise to the same value of Y. The procedure is the same as above, but it is necessary to add the probabilities associated with all the values x that give the same value y.
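Example 2
Given the pmf of a r.v. X as
$f(x) = \begin{cases} \frac{x + 1}{15} & \text{for } x = 0, 1, 2, 3, 4 \\ 0 & \text{elsewhere} \end{cases}$
find the pmf of $Y = (X - 2)^2$.
Solution
x  0  1  2  3  4
y  4  1  0  1  4
$P(Y = 0) = P(X = 2) = \frac{3}{15} = \frac{1}{5}$
$P(Y = 1) = P(X = 1) + P(X = 3) = \frac{2}{15} + \frac{4}{15} = \frac{2}{5}$
$P(Y = 4) = P(X = 0) + P(X = 4) = \frac{1}{15} + \frac{5}{15} = \frac{2}{5}$
Therefore the pmf of Y can be written as
y          0    1    4
P(Y = y)  1/5  2/5  2/5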
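The "add the probabilities of all x that map to the same y" rule is easy to mechanise. The sketch below applies it to $f(x) = (x+1)/15$, $x = 0, \ldots, 4$, with $Y = (X-2)^2$; `transform_pmf` is a hypothetical helper, not something defined in the notes.

```python
from collections import defaultdict
from fractions import Fraction

# pmf of X: f(x) = (x + 1)/15 for x = 0, 1, 2, 3, 4
pmf_x = {x: Fraction(x + 1, 15) for x in range(5)}

def transform_pmf(pmf, u):
    """pmf of Y = u(X): sum P(X = x) over every x with u(x) = y."""
    pmf_y = defaultdict(Fraction)   # missing keys start at Fraction(0)
    for x, p in pmf.items():
        pmf_y[u(x)] += p
    return dict(sorted(pmf_y.items()))

pmf_y = transform_pmf(pmf_x, lambda x: (x - 2) ** 2)

for y, p in pmf_y.items():
    print(f"P(Y = {y}) = {p}")
```

The same helper handles a one-to-one map such as $Y = X^2$ on positive values, where no two x values collide and the probabilities simply carry over.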
Exercise 1.4
1. Suppose the pmf of a r.v X is given by
elsewhere
xx
,0
6,5,4,3,2,1for )f(
61
, Obtain the pmf of
22 Y X and 3 Z X
2. Let the pmf of a r.v X be given by
elsewhere
xx
x
,0
3,2,1,0for18
1
)f(
2
, determine the pmf of
1 Y 2 X
3. Suppose the pmf of a r.v X is given by
elsewhere
xx
x
,0
4,3,2,1,0for )f(
10, Obtain the pmf of
2 Y X
4. Let the pmf of a r.v X be given by
elsewhere
xx
x
,0
....,3,2,1for )f( 2
1
, determine the pmf of
even is x if0
odd is x if1 Y
5. Suppose the r.v X has a pmf given by
elsewhere
xx
x
,0
....,3,2,1,0for )f( 6
5
6
1
, Obtain the pmf
of 1 Y X
6. Let X be a discrete r.v with pmf as tabulated below. Find the pmf of 2)2( Y X
X -2 -1 0 1 2
P(X=x)
1.6 Change of Variable Technique
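Let X be a r.v. with pdf f(x) and let Y be a one-to-one (monotonic) function of X. Then the pdf of Y is $g(y) = f(x)\left|\frac{dx}{dy}\right|$, where x is expressed in terms of y.
NB: If F(x) is the cdf of a continuous r.v. X, then the r.v. $Y = F(X)$ has a uniform distribution over $[0, 1]$.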
Example 1
A continuous r.v X has a pdf given by
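$f(x) = \begin{cases} 5x^4 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$
Determine the pdf of $Y = X^3$.
Solution
$Y = X^3 \Rightarrow x = y^{1/3} \Rightarrow \frac{dx}{dy} = \frac{1}{3}y^{-2/3}$
$g(y) = f(x)\frac{dx}{dy} = 5\left(y^{1/3}\right)^4 \cdot \frac{1}{3}y^{-2/3} = \begin{cases} \frac{5}{3}y^{2/3} & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$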
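Example 2
A r.v. X has pdf
$f(x) = \begin{cases} 24x^2 & 0 \le x \le \frac{1}{2} \\ 0 & \text{otherwise} \end{cases}$
Determine the pdf of $Y = 8X^3$.
Solution
$Y = 8X^3 \Rightarrow x = \frac{1}{2}y^{1/3} \Rightarrow \frac{dx}{dy} = \frac{1}{6}y^{-2/3}$
$g(y) = f(x)\frac{dx}{dy} = 24\left(\frac{1}{2}y^{1/3}\right)^2 \cdot \frac{1}{6}y^{-2/3} = \begin{cases} 1 & 0 \le y \le 1 \\ 0 & \text{elsewhere} \end{cases}$
NB: $Y = 8X^3$ is the cdf of X, so Y is uniformly distributed over $[0, 1]$.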
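A change-of-variable result can be sanity-checked by differentiating the cdf of Y numerically. The sketch assumes the two examples above are $f(x) = 5x^4$ on $[0, 1]$ with $Y = X^3$, and $f(x) = 24x^2$ on $[0, 1/2]$ with $Y = 8X^3$; the function names are illustrative.

```python
def central_difference(F, y, h=1e-6):
    """Numerical derivative of F at y."""
    return (F(y + h) - F(y - h)) / (2 * h)

# Example 1: X has cdf x^5 on [0, 1] and Y = X^3, so P(Y <= y) = (y^(1/3))^5.
def cdf_y1(y):
    return y ** (5 / 3)

def g1(y):  # density from the change-of-variable formula
    return (5 / 3) * y ** (2 / 3)

# Example 2: X has cdf 8x^3 on [0, 1/2] and Y = 8X^3,
# so P(Y <= y) = 8 * (y^(1/3) / 2)^3, which simplifies to y.
def cdf_y2(y):
    return 8 * (y ** (1 / 3) / 2) ** 3

def g2(y):  # uniform density on [0, 1]
    return 1.0

for y in (0.2, 0.5, 0.8):
    print(y, central_difference(cdf_y1, y), g1(y),
          central_difference(cdf_y2, y), g2(y))
```

Agreement between the numerical derivative and the formula at a few interior points is a quick check, not a proof.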
Exercise 1.5
1. For a r.v X with
otherwise
xxx
0
10 ,5)f(
4
, determine the pdf of xln2Y and its range.
2. A r.v X has pdf
otherwise
xex
x
0
0 ,)f( determine the pdf of 4Y X
4
1
8
1
8
1
4
1
4
1
3. Let X be a continuous random variable with density function 2for )f( 21 xxx
and
otherwisex 0)f( . Determine the density function of 1
1Y
X
.
4. The probability density function of X is given by
otherwise
Xxx
0
)4(
2
)f(2 Obtain
the probability density function of )(tan2
1 xY
5. Which transformation will change a r.v X with pdf is as below to a uniform R.V Y whose
range is 10 x a)
otherwise
xex
x
0
0 ,2)f(
2
b)
otherwise
xx
0
53 3),-(x)f(
21
6. Suppose that X has probability density function
elsewhere
xxx
,0
1forln )f(
If 2XU , what
is the probability density function f(u) of U?
1.7 Expectation and Variance of a Random Variable
1.7.1 Expected Values
One of the most important things we'd like to know about a random variable is: what value does it take on average? What is the average price of a computer? What is the average value of the number that shows when a die is rolled? The value is found as the average of all possible values, weighted by how often they occur (i.e. by their probabilities).
Definition: Let X be a random variable with probability distribution $p(X = x)$. Then the expected value of X, denoted $\mu$ or $E(X)$, is given by
$E(X) = \begin{cases} \sum_{\text{all } x} x\,p(X = x) & \text{if X is discrete} \\ \int_{-\infty}^{\infty} x\,p(X = x)\,dx & \text{if X is continuous} \end{cases}$
Theorem: Let X be a r.v. with probability distribution $p(X = x)$ and let g(x) be a real-valued function of X. Then the expected value of g(x) is given by
$E[g(x)] = \begin{cases} \sum_{\text{all } x} g(x)\,p(X = x) & \text{if X is discrete} \\ \int_{-\infty}^{\infty} g(x)\,p(X = x)\,dx & \text{if X is continuous} \end{cases}$
Theorem: Let X be a r.v. with probability distribution $p(X = x)$. Then
(i) $E(c) = c$, where c is an arbitrary constant.
(ii) $E(aX + b) = a\mu + b$, where a and b are arbitrary constants.
(iii) $E[k\,g(x)] = k\,E[g(x)]$, where g(x) is a function of X.
(iv) $E[a\,g_1(x) + b\,g_2(x)] = a\,E[g_1(x)] + b\,E[g_2(x)]$, and in general $E\left[\sum_{i=1}^{n} c_i\,g_i(x)\right] = \sum_{i=1}^{n} c_i\,E[g_i(x)]$, where the $g_i(x)$ are functions of X. This property of expectation is called the linearity property.
Proof
We will sketch the proof using a continuous random variable, since the proof for a discrete random variable is similar and was also discussed in Probability and Statistics I.
(i) $E(c) = \int_{\text{all } x} c\,P(X = x)\,dx = c\int_{\text{all } x} P(X = x)\,dx = c(1) = c$
(ii) $E(aX + b) = \int_{\text{all } x} (ax + b)\,P(X = x)\,dx = a\int_{\text{all } x} x\,P(X = x)\,dx + b\int_{\text{all } x} P(X = x)\,dx = a\mu + b$
(iii) $E[k\,g(x)] = \int_{\text{all } x} k\,g(x)\,P(X = x)\,dx = k\int_{\text{all } x} g(x)\,P(X = x)\,dx = k\,E[g(x)]$
(iv) $E[a\,g_1(x) + b\,g_2(x)] = \int_{\text{all } x} [a\,g_1(x) + b\,g_2(x)]\,P(X = x)\,dx = \int_{\text{all } x} a\,g_1(x)\,P(X = x)\,dx + \int_{\text{all } x} b\,g_2(x)\,P(X = x)\,dx = E[a\,g_1(x)] + E[b\,g_2(x)]$, i.e. $a\,E[g_1(x)] + b\,E[g_2(x)]$ from part (iii).
1.7.2 Variance and Standard Deviation
Definition: Let X be a r.v. with mean $\mu = E(X)$. The variance of X, denoted Var(X) or $\sigma^2$, is given by $Var(X) = E[(X - \mu)^2]$. The units of variance are square units. The quantity that has the correct units is the standard deviation, denoted $\sigma$. It is the positive square root of Var(X):
$\sigma = \sqrt{Var(X)} = \sqrt{E[(X - \mu)^2]}$
Theorem: $Var(X) = E(X^2) - \mu^2$
Proof:
$Var(X) = E[(X - \mu)^2] = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2$
Theorem: $Var(aX + b) = a^2\,Var(X)$
Proof:
Recall that $E(aX + b) = a\mu + b$, therefore
$Var(aX + b) = E[(aX + b - a\mu - b)^2] = E[a^2(X - \mu)^2] = a^2 E[(X - \mu)^2] = a^2\,Var(X)$
Remark
(i) The expected value of X always lies between the smallest and largest values of X.
(ii) In computations, bear in mind that variance cannot be negative!
Example 1
Given a probability distribution of X as below, find the mean and standard deviation of X.
X 0 1 2 3
P(X=x) 1/8 1/4 3/8 1/4
Solution
x               0    1    2    3    total
p(X = x)       1/8  1/4  3/8  1/4    1
x p(X = x)      0   1/4  3/4  3/4   7/4
x^2 p(X = x)    0   1/4  3/2  9/4    4
$E(X) = \sum_{x=0}^{3} x\,p(X = x) = 1.75$ and
standard deviation $\sigma = \sqrt{E(X^2) - \mu^2} = \sqrt{4 - 1.75^2} = 0.968246$
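The tabular calculation (sum of x p, sum of x² p, then $\sigma = \sqrt{E(X^2) - \mu^2}$) can be written out in a few lines. A minimal sketch for the distribution with probabilities 1/8, 1/4, 3/8, 1/4 on 0, 1, 2, 3; variable names are illustrative.

```python
import math
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(1, 4), 2: Fraction(3, 8), 3: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())                # E(X)
second_moment = sum(x ** 2 * p for x, p in pmf.items())  # E(X^2)
variance = second_moment - mean ** 2                     # Var(X) = E(X^2) - mu^2
std_dev = math.sqrt(variance)

print(mean, second_moment, variance, round(std_dev, 6))
```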
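Example 2
The probability distribution of a r.v. X is as shown below. Find the mean and standard deviation of: a) X b) $Y = 12X + 6$.
X         0    1    2
P(X = x) 1/6  1/2  1/3
Solution
x               0    1    2    total
p(X = x)       1/6  1/2  1/3    1
x p(X = x)      0   1/2  2/3   7/6
x^2 p(X = x)    0   1/2  4/3   11/6
$E(X) = \sum_{x=0}^{2} x\,p(X = x) = \frac{7}{6}$ and $E(X^2) = \sum_{x=0}^{2} x^2\,p(X = x) = \frac{11}{6}$
$Var(X) = E(X^2) - \mu^2 = \frac{11}{6} - \left(\frac{7}{6}\right)^2 = \frac{17}{36}$, so the standard deviation of X is $\sigma = \frac{\sqrt{17}}{6} \approx 0.6872$.
Now $E(Y) = 12E(X) + 6 = 12\left(\frac{7}{6}\right) + 6 = 20$
$Var(Y) = Var(12X + 6) = 12^2\,Var(X) = 144 \times \frac{17}{36} = 68$, so the standard deviation of Y is $\sqrt{68} = 12 \times 0.6872 \approx 8.2462$.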
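The rules $E(aX + b) = aE(X) + b$ and $Var(aX + b) = a^2 Var(X)$ can be checked by computing the moments of $Y = 12X + 6$ directly from its own pmf. A sketch for X with probabilities 1/6, 1/2, 1/3 on 0, 1, 2; `mean` and `variance` are illustrative helpers.

```python
from fractions import Fraction

pmf_x = {0: Fraction(1, 6), 1: Fraction(1, 2), 2: Fraction(1, 3)}

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum(v * v * p for v, p in pmf.items()) - mu ** 2

# Y = 12X + 6 is one-to-one, so each value is relabelled and keeps its probability.
pmf_y = {12 * x + 6: p for x, p in pmf_x.items()}

print("E(X) =", mean(pmf_x), " Var(X) =", variance(pmf_x))
print("E(Y) =", mean(pmf_y), " Var(Y) =", variance(pmf_y))
print("144 * Var(X) =", 144 * variance(pmf_x))
```

Working from the pmf of Y gives Var(Y) = 68, which matches $144 \times \frac{17}{36}$.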
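Example 3
A continuous r.v. X has a pdf given by
$f(x) = \begin{cases} \frac{1}{2}x & 0 \le x \le 2 \\ 0 & \text{elsewhere} \end{cases}$
Find the mean and standard deviation of X.
Solution
$E(X) = \int_0^2 x\,f(x)\,dx = \int_0^2 \tfrac{1}{2}x^2\,dx = \left[\tfrac{1}{6}x^3\right]_0^2 = \tfrac{4}{3}$ and $E(X^2) = \int_0^2 x^2 f(x)\,dx = \int_0^2 \tfrac{1}{2}x^3\,dx = \left[\tfrac{1}{8}x^4\right]_0^2 = 2$
Standard deviation $\sigma = \sqrt{E(X^2) - \mu^2} = \sqrt{2 - \left(\tfrac{4}{3}\right)^2} = \tfrac{\sqrt{2}}{3} \approx 0.4714$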
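For a polynomial pdf the moments can be computed exactly by integrating term by term. The sketch below does this for $f(x) = x/2$ on $[0, 2]$; `integrate_poly` is a hypothetical helper that takes coefficients in increasing powers of x.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[k] * x**k)."""
    def antiderivative(x):
        return sum(Fraction(c) * Fraction(x) ** (k + 1) / (k + 1)
                   for k, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)

pdf = [0, Fraction(1, 2)]   # f(x) = 0 + (1/2) x on [0, 2]

# Multiplying a polynomial by x shifts its coefficient list by one place.
total_area = integrate_poly(pdf, 0, 2)              # should be 1
mean = integrate_poly([0] + pdf, 0, 2)              # E(X)   = integral of x f(x)
second_moment = integrate_poly([0, 0] + pdf, 0, 2)  # E(X^2) = integral of x^2 f(x)
variance = second_moment - mean ** 2

print(total_area, mean, second_moment, variance)
```

The standard deviation is then the square root of the variance 2/9, i.e. about 0.4714.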
Exercise 1.6
1. Suppose X has a probability mass function given by the table below
X 2 3 4 5 6
P(X=x) 0.01 0.25 0.4 0.3 0.04
Find the mean and variance of; X
2. Suppose X has a probability mass function given by the table below
X 11 12 13 14 15
P(X=x) 0.4 0.2 0.2 0.1 0.1
Find the mean and variance of; X
3. Let X be a random variable with P(X = 1) = 0.2, P(X = 2) = 0.3, and P(X = 3) = 0.5. What is the expected value and standard deviation of; a) X b) 105 XY ?
4. A random variable W has the probability distribution shown below,
W 0 1 2 3
P(W=w) 2d 0.3 d 0.1
Find the values of the constant d hence determine the mean and variance of W. Also find
the mean and variance of 2510 XY
5. A random variable X has the probability distribution shown below,
X 1 2 3 4 5
P(X=x) 7c 5c 4c 3c c
Find the values of the constant c hence determine the mean and variance of X.
6. The random variable Z has the probability distribution shown below.
Z          2    3    5    7    11
P(Z = z)  1/6  1/3  1/4   x    y
If $E(Z) = 4\frac{2}{3}$, find the values of x and y, hence determine the variance of Z.
7. A discrete random variable M has the probability distribution $f(m) = \begin{cases} \frac{m}{36} & m = 1, 2, 3, \ldots, 8 \\ 0 & \text{elsewhere} \end{cases}$ Find the mean and variance of M.
8. For a discrete random variable Y the probability distribution is $f(y) = \begin{cases} \frac{5 - y}{10} & y = 1, 2, 3, 4 \\ 0 & \text{elsewhere} \end{cases}$ Calculate E(Y) and Var(Y).
9. Suppose X has a pmf given by $f(x) = \begin{cases} kx & \text{for } x = 1, 2, 3, 4 \\ 0 & \text{elsewhere} \end{cases}$ Find the value of the constant k, hence obtain the mean and variance of X.
10. A discrete random variable X has a probability function
elsewhere
xxk
xkx
xXP
,0
8,2
6,4,2,
)(
a) Show that 181k hence find )(XE and )(
2XE
b) Calculate X43var , give your answer to 3 significant figures.
11 A biased die with six faces is rolled. The discrete random variable X represents the score
of the uppermost face. The probability distribution of X is shown in the table below
X 1 2 3 4 5 6
P(X=x) a a a b b 0.3
a) Given that E(X) = 4.2, find the values of a and b.
b) Show that 4.202 XE c) Find x35var
12 A discrete random variable X has probability function
$P(X = x) = \begin{cases} k(1 - x)^2 & x = -1, 0, 1, 2 \\ 0 & \text{elsewhere} \end{cases}$
a) Show that $k = \frac{1}{6}$
b) Find E(X)
c) Show that $E(X^2) = \frac{4}{3}$
d) Find x31var 13 A team of 3 is to be chosen from 4 girl and 6 boys. If X is the number of girls in the
team, find the probability distribution of X hence determine the mean and variance of X
14 A fair six sided die has; ‘1’ on one face, ‘2’ on two of its faces and ‘3’ on the remaining three faces. The die is rolled twice. If T is the total score write down the probability
distribution of T hence determine; a) the probability that T is more than 4. b) the mean
and variance of T
15. The pdf of a continuous r.v. R is given by
$f(r) = \begin{cases} kr, & 0 \le r \le 4 \\ 0, & \text{elsewhere} \end{cases}$
Determine k, hence compute $P(1 \le R \le 2)$, $E(R)$ and $\mathrm{Var}(R)$.
16. A continuous r.v. M has the pdf given by
$f(m) = \begin{cases} k(1 - m)^{10}, & 0 \le m \le 1 \\ 0, & \text{elsewhere} \end{cases}$
Find the value of the constant k, the mean and the variance of M.
17. A continuous r.v. X has the pdf given by
$f(x) = \begin{cases} k(1 - x), & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases}$
Find the value of the constant k. Also find the mean and the variance of X.
18. The lifetime of new bus engines, T years, has continuous pdf
$f(t) = \begin{cases} \dfrac{d}{t^2}, & \text{if } t \ge 1 \\ 0, & \text{if } t < 1 \end{cases}$
Find the value of the constant d, hence determine the mean and standard deviation of T.
19. An archer shoots an arrow at a target. The distance of the arrow from the centre of the
target is a random variable X whose p.d.f. is given by
$f(x) = \begin{cases} k(3 + 2x - x^2), & \text{if } 0 \le x \le 3 \\ 0, & \text{if } x > 3 \end{cases}$
Find the value of the constant k. Also find the mean and standard deviation of X.
20. The random variable Y has probability density function f(y) given by
$f(y) = \begin{cases} ky(a - y), & 0 \le y \le 3 \\ 0, & \text{elsewhere} \end{cases}$
where k and a are positive constants.
a) Explain why a ≥ 3 and then show that $k = \dfrac{2}{9(a - 2)}$.
b) Given that E(Y) = 1.75, show that a = 4 and write down the value of k.
c) For these values of a and k, sketch the probability density function.
d) Write down the mode of Y.
21. A continuous random variable X has the following pdf
$f(x) = \begin{cases} a + bx, & 0 \le x \le 5 \\ 0, & \text{otherwise} \end{cases}$
where a and b are constants. Show that $10a + 25b = 2$.
a) Given $E(X) = \frac{35}{12}$, find a second equation in a and b, hence find the values of a and b.
b) Find the median of X.
22. The queuing time X minutes of a customer at a till of a supermarket has a pdf
$f(x) = \begin{cases} \frac{3}{32}\, x(k - x), & 0 \le x \le k \\ 0, & \text{otherwise} \end{cases}$
a) Show that $k = 4$. Also find $E(X)$ and $\mathrm{Var}(X)$.
b) Find the probability that a randomly chosen customer's queuing time will differ from the mean
by at least half a minute.
23. The probability density function f(x) can be written in the following form.
$f(x) = \begin{cases} ax, & 0 \le x < 2 \\ b - ax, & 2 \le x \le 4 \\ 0, & \text{otherwise} \end{cases}$
a) Find the values of the constants a and b.
b) Show that σ, the standard deviation of X, is 0.816 to 3 decimal places.
c) Find the lower quartile of X.
d) State, giving a reason, whether P(2 − σ < X < 2 + σ) is more or less than 0.5.
24. A continuous r.v. X has the pdf given by
$f(x) = \begin{cases} k(1 + x), & -1 \le x \le 0 \\ 2k(1 - x), & 0 < x \le 1 \\ 0, & \text{elsewhere} \end{cases}$
Find the value of the constant k. Also find the mean and the variance of X.
25. A continuous r.v. X has the pdf given by
$f(x) = \begin{cases} e^{-x}, & x \ge 0 \\ 0, & \text{elsewhere} \end{cases}$
Find the mean and standard deviation of: a) X b) $Y = e^{\frac{3}{4}X}$
1.8 Mode, Median, Quartiles and Percentiles
Other measures commonly used to summarize random variables are the mode and the median.
Mode
Mode is the value of x that maximizes the pdf. When the maximum occurs at an interior point where f is differentiable, it is a value of x for which $f'(x) = 0$.
Suppose the p.d.f of a random variable X is defined by the function f(x) for a ≤ x ≤ b.
The mode of X is an x value that produces the largest value for f(x) in the interval a ≤ x ≤ b.
A sketch of the probability density function can be very helpful when determining the mode.
For the 3rd figure, mode can be found using differentiation.
Median
The median, m, of a random variable X is defined to be the value such that “half of the
distribution lies to the left of m and half to the right”. More formally, m should
satisfy $F_X(m) = P(X \le m) = 0.5$, where $F_X$ is the cumulative distribution function of X.
Note: If there is a value m such that the graph of y = f(x) is symmetric about x = m, then both
the expected value and the median of X are equal to m.
The lower quartile $Q_1$ and the upper quartile $Q_3$ are similarly defined by
$F_X(Q_1) = 0.25$ and $F_X(Q_3) = 0.75$
Thus, the probability that X lies between $Q_1$ and $Q_3$ is $0.75 - 0.25 = 0.5$, so the quartiles give
an estimate of how spread-out the distribution is. More generally, we define the nth percentile
of X to be the value $x_n$ such that $F_X(x_n) = \frac{n}{100} = 0.01n$, that is, the probability that X is
smaller than $x_n$ is n%.
Example A random variable X has the pdf given by
$f(x) = \begin{cases} 2x, & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases}$
State the mode, hence find the lower, middle and upper quartiles.
Solution
On the interval $0 \le x \le 1$, f(x) is maximized at x = 1, so the mode is 1.
On the interval $0 \le x \le 1$, the cdf of X is given by $F(x) = x^2$, thus
a) At the lower quartile $Q_1$: $F(Q_1) = Q_1^2 = 0.25 \Rightarrow Q_1 = 0.5$
b) At the median m: $F(m) = m^2 = 0.5 \Rightarrow m = \sqrt{0.5} = \frac{1}{\sqrt{2}}$
c) At the upper quartile $Q_3$: $F(Q_3) = Q_3^2 = 0.75 \Rightarrow Q_3 = \sqrt{0.75} = \frac{\sqrt{3}}{2}$
Qn Find the 64th percentile of the pdf in the above example.
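The quartiles in the example can also be checked numerically. The following is a minimal Python sketch, assuming the example's cdf F(x) = x² on [0, 1]; the helper names `cdf` and `quantile` are illustrative and do not come from the notes. It solves F(x) = p by bisection, an approach that should work for any continuous, increasing cdf.

```python
import math


def cdf(x: float) -> float:
    """CDF of the example pdf f(x) = 2x on [0, 1], i.e. F(x) = x^2."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x * x


def quantile(p: float, lo: float = 0.0, hi: float = 1.0, tol: float = 1e-12) -> float:
    """Solve F(x) = p on [lo, hi] by bisection (F is increasing there)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


q1 = quantile(0.25)      # lower quartile
median = quantile(0.50)  # middle quartile
q3 = quantile(0.75)      # upper quartile
print(round(q1, 4), round(median, 4), round(q3, 4))
```

The same `quantile` helper can be called with any other probability, for example p = 0.01n for the nth percentile.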
Exercise 1.7
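1) A continuous r.v. X has the pdf given by
$f(x) = \begin{cases} k(1 - x), & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases}$
Find the mode.
2) A random variable X has p.d.f.
$f(x) = \begin{cases} x^2(2 - x), & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$
Find the mode of X.
3) Let X be a continuous random variable with density function
$f(x) = \begin{cases} \frac{1}{2} e^{-x/2}, & x \ge 0 \\ 0, & \text{elsewhere} \end{cases}$
Determine the 25th percentile of the distribution of X.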
4) A random variable X is defined by the cumulative distribution function
$F(x) = \begin{cases} 0, & x < 2 \\ k(x^2 + x - 6), & 2 \le x \le 5 \\ 1, & x > 5 \end{cases}$
a) Find the value of the constant k, hence work out P(3 ≤ X ≤ 4).
b) Obtain and sketch the probability density function.
c) State the mode. Also find the median value and the interquartile range.
5) A random variable X has p.d.f. f(x), where
otherwise,0
42)2(
211)32(
f(x)21
2
xxk
xxxk
a) Find the C.D.F of X and verify that the lower quartile is at x = 2.
b) Obtain the mode and the median value of X.
2. PROBABILITY DISTRIBUTION
2.1 Discrete Distribution
The discrete distributions discussed in this topic include the Bernoulli,
binomial, Poisson, geometric and hyper-geometric distributions.
Definition: A Bernoulli trial is a random experiment in which there are only two possible
outcomes - success and failure. Eg
Tossing a coin and considering heads as success and tails as failure.
Checking items from a production line: success = not defective, failure = defective.
Phoning a call centre: success = operator free; failure = no operator free.
A Bernoulli random variable X takes the values 0 and 1, with $P(X = 1) = p$ and $P(X = 0) = 1 - p$.
Definition: A r.v. X is said to have a Bernoulli distribution if its pmf is given by
$P(X = x) = \begin{cases} p^x (1 - p)^{1 - x}, & x = 0, 1 \\ 0, & \text{otherwise} \end{cases}$
We abbreviate this as $X \sim B(p)$, i.e. p is the only parameter here. It can be easily checked that
the mean and variance of a Bernoulli random variable are $\mu = p$ and $\sigma^2 = p(1 - p)$.
2.1.1 Binomial Distribution
Consider a sequence of n independent Bernoulli trials, with each trial having two possible
outcomes, success or failure. Let p be the probability of a success for any single trial. Let X
denote the number of successes in n trials. The random variable X is said to have a binomial
distribution and has probability mass function
$P(X = x) = {}^{n}C_{x}\, p^x (1 - p)^{n - x}$ for $x = 0, 1, 2, \ldots, n$
We abbreviate this as $X \sim \mathrm{Bin}(n, p)$, read as “X follows a binomial distribution with
parameters n and p”. ${}^{n}C_{x}$ counts the number of outcomes that include exactly x successes
and $n - x$ failures.
The mean and variance of a binomial random variable are respectively given by
$\mu = np$ and $\sigma^2 = np(1 - p)$
Let’s check that if X has a binomial distribution, then $\sum_{x=0}^{n} P(X = x) = 1$. We will
need the binomial expansion:
$(p + q)^n = \sum_{x=0}^{n} {}^{n}C_{x}\, p^x q^{n - x}$, therefore $\sum_{x=0}^{n} {}^{n}C_{x}\, p^x (1 - p)^{n - x} = (p + 1 - p)^n = 1^n = 1$
Example 1
A biased coin is tossed 6 times. The probability of heads on any toss is 0.3. Let X denote the
number of heads that come up. Calculate: (i) $P(X = 2)$ (ii) $P(X = 3)$ (iii) $P(1 < X \le 5)$
Solution
If we call heads a success then X has a binomial distribution with parameters n = 6 and p = 0.3.
(i) $P(X = 2) = {}^{6}C_{2}(0.3)^2(0.7)^4 = 0.324135$
(ii) $P(X = 3) = {}^{6}C_{3}(0.3)^3(0.7)^3 = 0.18522$
(iii) $P(1 < X \le 5) = P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) = 0.324 + 0.185 + 0.060 + 0.010 = 0.579$
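The binomial probabilities in Example 1 can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `binom_pmf` is an illustrative choice, not notation from the notes. It also checks numerically that the pmf sums to 1 and that the mean equals np.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p): nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 6, 0.3  # Example 1: six tosses, P(heads) = 0.3

p_two = binom_pmf(2, n, p)                                     # (i)   P(X = 2)
p_three = binom_pmf(3, n, p)                                   # (ii)  P(X = 3)
p_two_to_five = sum(binom_pmf(x, n, p) for x in range(2, 6))   # (iii) P(1 < X <= 5)

total = sum(binom_pmf(x, n, p) for x in range(n + 1))          # should equal 1
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))       # should equal n * p

print(round(p_two, 6), round(p_three, 5), round(p_two_to_five, 3))
```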
Example 2 A quality control engineer is in charge of testing whether or not 90% of the
DVD players produced by his company conform to specifications. To do this, the engineer
randomly selects a batch of 12 DVD players from each day's production. The day's
production is acceptable provided no more than 1 DVD player fails to meet specifications.
Otherwise, the entire day's production has to be tested.
a) What is the probability that the engineer incorrectly passes a day's production as acceptable if only 80% of the day's DVD players actually conform to specification?
b) What is the probability that the engineer unnecessarily requires the entire day's production to be tested if in fact 90% of the DVD players conform to specifications?
Solution
Let X denote the number of DVD players in the sample that fail to meet specifications.
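a) In part a we want $P(X \le 1)$ with binomial parameters $n = 12$ and $p = 0.2$:
$P(X \le 1) = P(X = 0) + P(X = 1) = {}^{12}C_{0}(0.2)^0(0.8)^{12} + {}^{12}C_{1}(0.2)^1(0.8)^{11} = 0.069 + 0.206 = 0.275$
b) In part b we require $P(X > 1) = 1 - P(X \le 1)$ with parameters $n = 12$ and $p = 0.1$:
$P(X \le 1) = P(X = 0) + P(X = 1) = {}^{12}C_{0}(0.1)^0(0.9)^{12} + {}^{12}C_{1}(0.1)^1(0.9)^{11} = 0.659$
So $P(X > 1) = 0.341$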
Example 3 Bits are sent over a communications channel in packets of 12. If the probability
of a bit being corrupted over this channel is 0.1 and such errors are independent, what is the
probability that no more than 2 bits in a packet are corrupted?
If 6 packets are sent over the channel, what is the probability that at least one packet will
contain 3 or more corrupted bits?
Let X denote the number of packets containing 3 or more corrupted bits. What is the
probability that X will exceed its mean by more than 2 standard deviations?
Solution
Let C denote the number of corrupted bits in a packet. Then in the first question, we want
$P(C \le 2) = P(C = 0) + P(C = 1) + P(C = 2)$
$= {}^{12}C_{0}(0.1)^0(0.9)^{12} + {}^{12}C_{1}(0.1)^1(0.9)^{11} + {}^{12}C_{2}(0.1)^2(0.9)^{10} = 0.282 + 0.377 + 0.230 = 0.889$
This implies that the probability of a packet containing 3 or more corrupted bits is
$P(C \ge 3) = 1 - P(C \le 2) = 1 - 0.889 = 0.111$
Therefore X = “number of packets containing 3 or more corrupted bits” can be modelled with a
binomial distribution with parameters $n = 6$ and $p = 0.111$. The probability that at least one
packet will contain 3 or more corrupted bits is:
$P(X \ge 1) = 1 - P(X = 0) = 1 - {}^{6}C_{0}(0.111)^0(0.889)^6 = 1 - 0.494 = 0.506$
The mean of X is $\mu = E(X) = 6(0.111) = 0.666$ and its standard deviation is
$\sigma = \sqrt{6(0.111)(0.889)} = 0.77$
So the probability that X exceeds its mean by more than 2 standard deviations is
$P(X - \mu > 2\sigma) = P(X > 2.2) = P(X \ge 3)$ since X is discrete.
Now $P(X \ge 3) = 1 - P(X \le 2) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]$
$= 1 - [{}^{6}C_{0}(0.111)^0(0.889)^6 + {}^{6}C_{1}(0.111)^1(0.889)^5 + {}^{6}C_{2}(0.111)^2(0.889)^4]$
$= 1 - (0.4936 + 0.3698 + 0.1154) = 0.021$
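The two-stage calculation in Example 3 (bits within a packet, then packets within a batch) can be checked with a short Python sketch. The variable names below are illustrative assumptions. The sketch carries the unrounded packet probability through rather than 0.111, so its results may differ from hand calculation in the third decimal place.

```python
from math import comb, sqrt


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Stage 1: C = number of corrupted bits in a packet of 12, each corrupted with probability 0.1.
p_at_most_two = sum(binom_pmf(c, 12, 0.1) for c in range(3))  # P(C <= 2)
p_bad_packet = 1 - p_at_most_two                              # P(C >= 3)

# Stage 2: X = number of "bad" packets among 6 independent packets.
packets = 6
p_at_least_one = 1 - binom_pmf(0, packets, p_bad_packet)      # P(X >= 1)

mean = packets * p_bad_packet
sd = sqrt(packets * p_bad_packet * (1 - p_bad_packet))
threshold = mean + 2 * sd

# P(X > mean + 2 sd): sum the pmf over the integer values above the threshold.
p_exceeds = sum(
    binom_pmf(x, packets, p_bad_packet)
    for x in range(packets + 1)
    if x > threshold
)

print(round(p_at_most_two, 3), round(p_at_least_one, 3), round(p_exceeds, 3))
```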
Exercise 2.1
1. A fair coin is tossed 10 times. What is the probability that exactly 6 heads will occur?
2. If 3% of the electric bulbs manufactured by a company are defective, find the probability
that in a sample of 100 bulbs exactly 5 bulbs are defective.
3. Suppose that 10% of inmates in a large prison are known to be innocent. A non-profit group randomly selects 20 inmates from this prison. Find the probability the group will
find at least 3 innocent inmates.
4. An oil exploration firm is formed with enough capital to finance 10 explorations. The probability of a particular exploration being successful is 0.1. Find the mean and variance
of the number of successful explorations.
5. Emily hits 60% of her free throws in basketball games. She had 25 free throws in last week’s game.
a) What are the expected number and the standard deviation of Emily’s hits?
b) Suppose Emily had 7 free throws in yesterday’s game. What is the probability that she
made at least 5 hits?
6. A coin is loaded so that heads has a 60% chance of showing up. This coin is tossed 3 times.
a) What are the mean and the standard deviation of the number of heads that turn up?
b) What is the probability that heads turns up at least twice?
c) What is the probability that an odd number of heads turns up in 3 flips?
7. According to the 2009 current Population Survey conducted by the U.S. Census Bureau, 40% of the U.S. population 25 years old and above have completed a bachelor’s degree or
more. Given a random sample of 50 people 25 years old or above, what is expected
number of people and the standard deviation of the number of people who have
completed a bachelor’s degree.
8. Joe throws a fair die six times and face number 3 appears twice. Is he incredibly lucky or unusual?
9. If the probability of being a smoker among a group of cases with lung cancer is 0.6, what is the probability that in a group of 8 cases you have: a) fewer than 2 smokers? b) more than
5? c) What are the expected value and variance of the number of smokers?
10. The manufacturer of the disk drives in one of the well-known brands of microcomputers expects 2% of the disk drives to malfunction during the microcomputer’s warranty
period. Calculate the probability that in a sample of 100 disk drives, not more than
three will malfunction.
11. A manufacturer of television sets knows that on average 5% of its product is defective. It sells television sets in consignments of 100 and guarantees that not more than 2 sets
will be defective. What is the probability that a consignment will fail to meet the guaranteed
quality?
12. Suppose 90% of the cars on Thika superhighway do over 17 km per litre.
a) What are the expected number and the standard deviation of the number of cars that will do over 17 km per litre in a random sample of 15 cars?
b) What is the probability that in a random sample of 15 cars exactly 10 of these will do over 17 km per litre?
2.1.2 Poisson distribution
Named after the French mathematician Siméon Poisson, the distribution is used to model the
number of events (such as the number of telephone calls at a business, number of customers in
waiting lines, number of defects in a given surface area, airplane arrivals, or the number of
accidents at an intersection) occurring within a given time interval. Other random events
where the Poisson distribution can apply include:
the number of hits to your web site in a day
the number of calls that arrive in each day on your mobile phone
the rate of job submissions in a busy computer centre per minute.
the number of messages arriving to a computer server in any one hour.
Poisson probabilities are useful when there are a large number of independent trials with a
small probability of success on a single trial and the events occur over a period of time. It
can also be used when a density of items is distributed over a given area or volume. The
formula for the Poisson probability mass function is
$P(X = x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$, $x = 0, 1, 2, \ldots$
This is abbreviated as $X \sim \mathrm{Po}(\lambda)$. The parameter $\lambda$ indicates the average number of
events in the given time interval. The mean and variance of this distribution are equal, i.e.
$\mu = \sigma^2 = \lambda$
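Let’s check that if X has a Poisson distribution, then $\sum_{x=0}^{\infty} P(X = x) = 1$. We will
need to recall that $e^{\lambda} = 1 + \dfrac{\lambda}{1!} + \dfrac{\lambda^2}{2!} + \dfrac{\lambda^3}{3!} + \dfrac{\lambda^4}{4!} + \cdots$. Consequently
$\sum_{x=0}^{\infty} P(X = x) = \sum_{x=0}^{\infty} \dfrac{e^{-\lambda} \lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \dfrac{\lambda^x}{x!} = e^{-\lambda} e^{\lambda} = e^0 = 1$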
Remark The major difference between Poisson and Binomial distributions is that the
Poisson does not have a fixed number of trials. Instead, it uses the fixed interval of time or
space in which the number of successes is recorded.
Example 1 Consider a computer system with a Poisson job-arrival stream at an average of 2
per minute. Determine the probability that in any one-minute interval there will be
a) 0 jobs; b) exactly 3 jobs; c) at most 3 arrivals; d) more than 3 arrivals.
Solution
Job arrivals with $\lambda = 2$
a) No job arrivals: $P(X = 0) = e^{-2} = 0.1353353$
b) Exactly 3 job arrivals: $P(X = 3) = \dfrac{2^3 e^{-2}}{3!} = 0.1804470$
c) At most 3 arrivals:
$P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = e^{-2}\left(1 + \dfrac{2}{1} + \dfrac{2^2}{2} + \dfrac{2^3}{3!}\right) = 0.8571$
d) More than 3 arrivals: $P(X > 3) = 1 - P(X \le 3) = 1 - 0.8571 = 0.1429$
Example 2 If there are 500 customers per eight-hour day in a check-out lane, what is the
probability that there will be exactly 3 in line during any five-minute period?
Solution
The expected value during any one five minute period would be 500 / 96 = 5.2083333. The
96 is because there are 96 five-minute periods in eight hours. So, you expect about 5.2
customers in 5 minutes and want to know the probability of getting exactly 3.
$P(X = 3) = \dfrac{(500/96)^3\, e^{-500/96}}{3!} \approx 0.1288$
Example 3 If new cases of West Nile in New England are occurring at a rate of about 2 per
month, then what’s the probability that exactly 4 cases will occur in the next 3 months?
Solution
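X ~ Poisson ($\lambda$ = 2/month)
$P(X = 4 \text{ in 3 months}) = \dfrac{(2 \times 3)^4 e^{-(2 \times 3)}}{4!} = \dfrac{6^4 e^{-6}}{4!} = 13.4\%$
Exactly 6 cases?
$P(X = 6 \text{ in 3 months}) = \dfrac{(2 \times 3)^6 e^{-(2 \times 3)}}{6!} = \dfrac{6^6 e^{-6}}{6!} = 16\%$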
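The three Poisson examples above follow the same pattern: fix λ for the interval of interest, then evaluate the pmf or sum it. A minimal Python sketch of that pattern follows; the function names `poisson_pmf` and `poisson_cdf` are illustrative, and only the standard library is assumed.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)


def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x), obtained by summing the pmf from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


# Example 1: job arrivals, lam = 2 per minute.
p_none = poisson_pmf(0, 2)
p_three = poisson_pmf(3, 2)
p_at_most_three = poisson_cdf(3, 2)
p_more_than_three = 1 - p_at_most_three

# Example 2: 500 customers in 96 five-minute periods, so lam = 500/96 per period.
p_three_in_line = poisson_pmf(3, 500 / 96)

# Example 3: 2 cases per month over 3 months, so lam = 2 * 3 = 6.
p_four_cases = poisson_pmf(4, 2 * 3)
p_six_cases = poisson_pmf(6, 2 * 3)

print(round(p_at_most_three, 4), round(p_three_in_line, 4), round(p_four_cases, 3))
```

Note that the rate is rescaled to the interval before the pmf is applied, as in Examples 2 and 3.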
Exercise 2.2
1. Calculate the Poisson probability P(X = 6) when λ (the average rate of success) is 3 and X is the Poisson random variable.
2. Customers arrive at a checkout counter according to a Poisson distribution at an average of 7 per hour. During a given hour, what are the probabilities that
a) No more than 3 customers arrive?
b) At least 2 customers arrive?
c) Exactly 5 customers arrive?
3. It is known from past experience that in a certain plant there are on average 4 industrial accidents per month. Find the probability that in a given year there will be fewer than 3
accidents.
4. Suppose that the chance of an individual coal miner being killed in a mining accident during a year is 1.1499. Use the Poisson distribution to calculate the probability that in
a mine employing 350 miners there will be at least one accident in a year.
5. The number of road construction projects that take place at any one time in a certain city follows a Poisson distribution with a mean of 3. Find the probability that exactly five road
construction projects are currently taking place in this city. (0.100819)
6. The number of road construction projects that take place at any one time in a certain city follows a Poisson distribution with a mean of 7. Find the probability that more than four
road construction projects are currently taking place in the city. (0.827008)
7. The number of traffic accidents that occur on a particular stretch of road during a month follows a Poisson distribution with a mean of 7.6. Find the probability that less than three
accidents will occur next month on this stretch of road. (0.018757)
8. The number of traffic accidents that occur on a particular stretch of road during a month follows a Poisson distribution with a mean of 7. Find the probability of observing exactly
three accidents on this stretch of road next month. (0.052129)
9. The number of traffic accidents that occur on a particular stretch of road during a month follows a Poisson distribution with a mean of 6.8. Find the probability that the next two
months will both result in four accidents each occurring on this stretch of road. (0.00985)
10. Suppose the number of babies born during an 8-hour shift at a hospital's maternity wing follows a Poisson distribution with a mean of 6 an hour. Find the probability that five
babies are born during a particular 1-hour period in this maternity wing. (0.160623)
11. The average number of claims per day made to the Insurance Company for damage or losses is 3.1. What is the probability that in any given day; (i)exactly 2 (ii) at most 2
(iii) more than 2 claims will be made?
12. The university police department must write, on average, five tickets per day to keep department revenues at budgeted levels. Suppose the number of tickets written per day
follows a Poisson distribution with a mean of 8.8 tickets per day. Find the probability that
less than six tickets are written on a randomly selected day from this distribution.
(0.128387)
13. A taxi firm has two cars which it hires out day by day. The number of demands for a car on each day follows a Poisson distribution with mean 1.5. Calculate the proportion
of days on which neither car is used and the proportion of days on which some demand
is refused.
14. If calls to your cell phone are a Poisson process with a constant rate λ = 0.5 calls per hour, what is the probability that, if you forget to turn your phone off in a 3 hour lecture, your
phone rings during that time? How many phone calls do you expect to get during this
lecture?
15. The average number of defects per wafer (defect density) is 3. The redundancy built into the design allows for up to 4 defects per wafer. What is the probability that the
redundancy will not be sufficient if the defects follow a Poisson distribution?
16. The mean number of errors due to a particular bug occurring in a minute is 0.0001.
a) What is the probability that no error will occur in 20 minutes?
b) How long would the program need to run to ensure that there will be a 99.95% chance
that an error will show up to highlight this bug?
Properties of Poisson
The mean and variance are both equal to $\lambda$.
The sum of independent Poisson variables is a further Poisson variable with mean equal to the sum of the individual means.
As well as cropping up in the situations already mentioned, the Poisson distribution provides an approximation for the Binomial distribution.
2.1.3 Geometric Distribution
Suppose a Bernoulli trial with success probability p is performed repeatedly until the first
success appears. We want to find the probability that the first success occurs on the yth trial, i.e.
let Y denote the number of trials needed to obtain the first success. The sample space is
S = {s, fs, ffs, fffs, ffffs, …}. This is an infinite sample space (though it is still discrete). What is
the probability of a sample point, say $P(fffs) = P(Y = 4)$? Since successive trials are
independent (this is implicit in the statement of the problem), we have
$P(fffs) = P(Y = 4) = q^3 p$ where $q = 1 - p$ and $0 \le p \le 1$
Definition: A r.v. Y is said to have a geometric probability distribution if and only if
$P(Y = y) = \begin{cases} q^{y-1} p, & y = 1, 2, 3, \ldots \text{ where } q = 1 - p \\ 0, & \text{otherwise} \end{cases}$
Abbreviated as Y ~ Geo(p).
The only parameter for this geometric distribution is p (i.e. the probability of success in each
trial). To be sure everything is consistent, we should check that the probabilities of all the
sample points add up to 1. Now
$\sum_{y=1}^{\infty} P(Y = y) = \sum_{y=1}^{\infty} q^{y-1} p = \dfrac{p}{1 - q} = 1$
Recall that the sum to infinity of a convergent G.P. is $S_\infty = \dfrac{a}{1 - r}$
The cdf of a geometric distribution is given by
$F(y) = P(Y \le y) = P(Y = 1) + P(Y = 2) + P(Y = 3) + \cdots + P(Y = y)$
$= p + pq + pq^2 + \cdots + pq^{y-1} = \dfrac{p(1 - q^y)}{1 - q} = 1 - q^y$
Let Y ~ Geo(p); then $E(Y) = \dfrac{1}{p}$ and $\mathrm{Var}(Y) = \dfrac{q}{p^2}$. (Show this.)
Example 1 A sharpshooter normally hits the target 70% of the time. Find the
a) probability that her first hit is on the second shot
b) mean and standard deviation of the number of shots required to realize the 1st hit
Solution
Let X be the random variable ‘the number of shots required to realize the 1st hit’.
$X \sim \mathrm{Geo}(0.7)$ and $P(X = x) = 0.7(1 - 0.7)^{x-1}$, $x = 1, 2, 3, \ldots$
a) $P(X = 2) = (1 - p)p = 0.3 \times 0.7 = 0.21$
b) $\mu = \dfrac{1}{p} = \dfrac{1}{0.7} = 1.428571$ and $\sigma = \dfrac{\sqrt{1 - p}}{p} = \dfrac{\sqrt{1 - 0.7}}{0.7} = 0.78$
Example 2
The State Department is trying to identify an individual who speaks Farsi to fill a foreign
embassy position. They have determined that 4% of the applicant pool are fluent in Farsi.
a) If applicants are contacted randomly, how many individuals can they expect to interview in order to find one who is fluent in Farsi?
b) What is the probability that they will have to interview more than 25 until they find one who speaks Farsi?
Solution
a) $\mu = \dfrac{1}{p} = \dfrac{1}{0.04} = 25$
b) $P(X \le 25) = 1 - q^{25} = 1 - 0.96^{25} = 0.6396$, so $P(X > 25) = 1 - P(X \le 25) = 0.3604$
Example 3
From past experience it is known that 3% of accounts in a large accounting population are in
error. What is the probability that 5 accounts are audited before an account in error is found?
What is the probability that the first account in error occurs in the first five accounts audited?
Solution
$P(Y > 5) = 1 - F(5) = 1 - (1 - 0.97^5) = 0.97^5 = 0.8587$
$P(Y \le 5) = 1 - 0.97^5 = 0.1413$
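The geometric examples above use only the pmf $q^{y-1}p$ and the cdf $1 - q^y$. A minimal Python sketch follows; the function names `geom_pmf` and `geom_cdf` are illustrative, and Y is assumed to count the trial on which the first success occurs, as in the definition above.

```python
from math import sqrt


def geom_pmf(y: int, p: float) -> float:
    """P(Y = y) = (1 - p)^(y - 1) * p for y = 1, 2, 3, ..."""
    return (1 - p) ** (y - 1) * p


def geom_cdf(y: int, p: float) -> float:
    """P(Y <= y) = 1 - (1 - p)^y."""
    return 1 - (1 - p) ** y


# Example 1: sharpshooter with p = 0.7.
p_first_hit_on_second = geom_pmf(2, 0.7)
mean_shots = 1 / 0.7
sd_shots = sqrt(1 - 0.7) / 0.7

# Example 2: Farsi speakers with p = 0.04.
expected_interviews = 1 / 0.04
p_more_than_25 = 1 - geom_cdf(25, 0.04)

# Example 3: accounts in error with p = 0.03.
p_no_error_in_first_five = 1 - geom_cdf(5, 0.03)
p_error_within_first_five = geom_cdf(5, 0.03)

print(round(p_first_hit_on_second, 2), round(p_more_than_25, 4), round(p_error_within_first_five, 4))
```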
Exercise 2.3
1. Over a very long period of time, it has been noted that on Fridays 25% of the customers at the drive-in window at the bank make deposits. What is the probability that it takes 4
customers at the drive-in window before the first one makes a deposit?
2. It is estimated that 45% of people in fast-food restaurants order a diet drink with their lunch. Find the probability that the fourth person orders a diet drink. Also find the
probability that the first diet drinker of the day occurs before the 5th person.
3. What is the probability of rolling a sum of seven in fewer than three rolls of a pair of dice? Hint: the random variable X is the number of rolls before a sum of 7.
4. In New York City at rush hour, the chance that a taxicab passes someone and is available
is 15%. a) How many cabs can you expect to pass you for you to find one that is free and
b) what is the probability that more than 10 cabs pass you before you find one that is free.
5. An urn contains N white and M black balls. Balls are randomly selected, one at a time, until a black ball is obtained. If we assume that each selected ball is replaced before the
next one is drawn, what is the
a) probability that (i) exactly n draws are needed? (ii) at least k draws are needed?
b) expected value and variance of the number of balls drawn?
6. In a gambling game a player tosses a coin until a head appears. He then receives 2^n dollars, where n is the number of tosses.
a) What is the probability that the player receives $8.00 in one play of the game?
b) If the player must pay $5.00 to play, what is the win/loss per game?
7. An oil prospector will drill a succession of holes in a given area to find a productive well. The probability of success is 0.2.
a) What is the probability that the 3rd hole drilled is the first to yield a productive well?
b) If the prospector can afford to drill at most 10 wells, what is the probability that he will
fail to find a productive well?
8. A well-travelled highway has its traffic lights green for 82% of the time. If a person travelling the road goes through 8 traffic intersections, complete the chart to find a)
the probability that the first red light occurs on the nth traffic light and b) the
cumulative probability that the person will hit a red light on or before the nth
traffic light.
2.1.4 The negative binomial distribution
Suppose a Bernoulli trial is performed until the rth success is realized. Then the random
variable “the number of trials until the rth success is realized” has a negative binomial
distribution.
Definition: A random variable X has the negative binomial distribution, also called the Pascal
distribution, denoted $X \sim NB(r, p)$, if there exists an integer $r \ge 1$ and a real number
$p \in (0, 1)$ such that $P(X = r + x) = {}^{r+x-1}C_{x}\, p^r (1 - p)^x$, $x = 0, 1, 2, \ldots$
If r = 1 the negative binomial distribution reduces to a geometric distribution.
2.1.6 Hyper geometric Distribution
Hyper geometric experiments occur when the trials are not independent of each other, which
happens when sampling without replacement. Hyper-geometric probabilities involve the
multiplication of two combinations together and then division by the total number of
combinations.
Suppose we have a population of N elements that possess one of two characteristics, e.g. D of
them are defective and $N - D$ are non-defective. A sample of n elements is randomly
selected from the population. The r.v. of interest, Y, is the number of defective elements in
the sample.
Definition: A r.v. Y is said to have a hyper geometric probability distribution if and only if
$P(Y = y) = \dfrac{{}^{D}C_{y}\; {}^{N-D}C_{n-y}}{{}^{N}C_{n}}$ for $y = 0, 1, 2, \ldots, n$ with $y \le D$ and $n - y \le N - D$
Theorem: If Y is a r.v. with a hyper geometric distribution, then
$\mu = E(Y) = \dfrac{nD}{N}$ and $\sigma^2 = \mathrm{Var}(Y) = \dfrac{nD}{N}\left(1 - \dfrac{D}{N}\right)\left(\dfrac{N - n}{N - 1}\right)$
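Example 1 A box contains 200 items of which 10% are defective. Find the probability that
no more than 2 defectives will be obtained in a sample of size 10 drawn without
replacement.
Solution
Let Y be the number of defectives in the sample, so that N = 200, D = 20 and n = 10.
$P(Y \le 2) = P(Y = 0) + P(Y = 1) + P(Y = 2) = \dfrac{{}^{180}C_{10}}{{}^{200}C_{10}} + \dfrac{{}^{20}C_{1}\; {}^{180}C_{9}}{{}^{200}C_{10}} + \dfrac{{}^{20}C_{2}\; {}^{180}C_{8}}{{}^{200}C_{10}}$
$= 0.3398 + 0.3974 + 0.1975 = 0.9347$
so that $P(Y > 2) = 0.0653$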
Example 2
A group of 7 people is selected at random from 7 men and 10 women. What is the probability that it consists of 3 men and 4 women?
Solution
The answer is $\dfrac{{}^{7}C_{3}\; {}^{10}C_{4}}{{}^{17}C_{7}} = \dfrac{7350}{19448} = 0.3779$ (approx)
Note that the numbers in the numerator add up to the numbers used in the combination
in the denominator (7 + 10 = 17 and 3 + 4 = 7). This can be extended to more than two groups, giving an extended
hypergeometric problem.
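Both hypergeometric examples reduce to ratios of combinations, which can be evaluated directly. The following minimal Python sketch assumes the notation of the definition above (population N, D items of the type of interest, sample size n); the function name `hypergeom_pmf` is illustrative.

```python
from math import comb


def hypergeom_pmf(y: int, N: int, D: int, n: int) -> float:
    """P(Y = y) = C(D, y) * C(N - D, n - y) / C(N, n)."""
    return comb(D, y) * comb(N - D, n - y) / comb(N, n)


# Example 1: N = 200 items, D = 20 defective, sample of n = 10 without replacement.
p_at_most_two = sum(hypergeom_pmf(y, 200, 20, 10) for y in range(3))
p_more_than_two = 1 - p_at_most_two

# Example 2: 3 men (and hence 4 women) in a group of 7 drawn from 7 men and 10 women.
p_three_men = hypergeom_pmf(3, 17, 7, 7)

# Mean from the theorem, compared with the mean computed directly from the pmf.
mean_formula = 10 * 20 / 200
mean_direct = sum(y * hypergeom_pmf(y, 200, 20, 10) for y in range(11))

print(round(p_at_most_two, 4), round(p_three_men, 4))
```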
Exercise 2.4
1. A bottle contains 4 laxative and 5 aspirin tablets. 3 tablets are drawn at random from the bottle. Find the probability that a) exactly one, b) at most 1, c) at least 2 are laxative
tablets.
2. What is the probability of getting at most 2 diamonds in 5 cards selected without replacement from a well-shuffled deck?
3. A messenger has to deliver 10 out of 16 letters to the computing department and the rest to the statistics department. She mixed up the letters and delivered 10 letters at random to the
computing department. What is the probability that only 6 letters for the computing
department actually got there?
4. In a class there are 20 students. 6 are compulsive smokers and they always keep cigarettes in their lockers. One day prefects checked 10 lockers at random. What is the
probability that they find cigarettes in at most 2 lockers?
5. A box holds 8 green, 4 white and 8 red beads. 6 beads are drawn at random without replacement from the box. What is the probability that 3 red, 2 green and 1 white beads
are drawn?
2.2 Continuous Distribution 2.2.1 Uniform (Rectangular) Distribution
A uniform density function is a density function that is constant, (Ie all the values are equally likely outcomes over the domain). Often referred as the Rectangular distribution because the graph of
the pdf has the form of a rectangle, making it the simplest kind of density function. The uniform
distribution lies between two values on the x-axis. The total area is equal to 1.0 or 100%
within the rectangle
Definition: A random variable X has a uniform distribution over the range [a, b] if

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{elsewhere} \end{cases}$$

We denote this distribution by $X \sim U(a, b)$, where a and b are the smallest and largest values,
respectively, that the variable can assume.
The mean and variance of X are given by $\mu = \dfrac{a+b}{2}$ and $\sigma^2 = \dfrac{(b-a)^2}{12}$ respectively.
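The cdf F(x) is given by

$$F(x) = \int_a^x \frac{1}{b-a}\,dt = \frac{x-a}{b-a}, \quad a \le x \le b,$$

so that

$$F(x) = \begin{cases} 0, & x < a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & x > b \end{cases}$$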
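The pdf, cdf, mean and variance formulas for U(a, b) translate directly into code. The following is a minimal Python sketch; the function names and the interval U(2, 10) are hypothetical, used only to show the formulas in action.

```python
def uniform_pdf(x, a, b):
    """Density of U(a, b): constant 1/(b - a) on [a, b], zero elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """F(x) = 0 below a, (x - a)/(b - a) on [a, b], 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_var(a, b):
    return (b - a) ** 2 / 12

# Hypothetical interval, not from the notes.
a, b = 2.0, 10.0
print(uniform_pdf(4.0, a, b))   # 0.125
print(uniform_cdf(4.0, a, b))   # 0.25
print(uniform_mean(a, b))       # 6.0
print(round(uniform_var(a, b), 4))  # 5.3333
```

An interval probability P(c < X < d) is then `uniform_cdf(d, a, b) - uniform_cdf(c, a, b)`.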
Example Prof Hinga always travels by plane. From past experience he feels that
take-off time is uniformly distributed between 80 and 120 minutes after check-in.
Determine the probability that: a) he waits for more than 105 minutes for take-off after
check-in; b) the waiting time will be within 1.5 standard deviations of the mean.
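Solution
$X \sim U(80, 120)$, so

$$f(x) = \begin{cases} \dfrac{1}{40}, & 80 \le x \le 120 \\ 0, & \text{elsewhere} \end{cases}$$

a) $P(X > 105) = 1 - P(X \le 105) = 1 - \dfrac{105 - 80}{40} = \dfrac{3}{8}$

b) $P(\mu - 1.5\sigma < X < \mu + 1.5\sigma) = \int_{\mu - 1.5\sigma}^{\mu + 1.5\sigma} \dfrac{1}{40}\,dx = \dfrac{1}{40}\Big[x\Big]_{\mu - 1.5\sigma}^{\mu + 1.5\sigma} = \dfrac{3\sigma}{40}$

But $\sigma = \dfrac{b-a}{\sqrt{12}} = \dfrac{40}{\sqrt{12}}$, so

$P(\mu - 1.5\sigma < X < \mu + 1.5\sigma) = \dfrac{3}{\sqrt{12}} \approx 0.866$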
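The two answers in this solution can be checked numerically. The sketch below is a rough Python check, assuming the U(80, 120) model and reading part (a) as P(X > 105), as the solution does; the helper name `cdf` is hypothetical.

```python
from math import sqrt

a, b = 80.0, 120.0  # take-off time range in minutes after check-in

def cdf(x):
    """cdf of U(a, b), clipped to [0, 1] outside the interval."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

# a) P(X > 105) = 1 - F(105)
p_a = 1 - cdf(105)

# b) P(mu - 1.5*sigma < X < mu + 1.5*sigma)
mu = (a + b) / 2
sigma = (b - a) / sqrt(12)
p_b = cdf(mu + 1.5 * sigma) - cdf(mu - 1.5 * sigma)

print(round(p_a, 4))  # 0.375
print(round(p_b, 4))  # 0.866
```

Both bounds in part (b), roughly 82.7 and 117.3, lie inside [80, 120], so no clipping occurs and the result equals 3σ/40.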
Exercise 2.5
1. The amount of time, in minutes, that a person must wait for a bus is uniformly distributed between 0 and 15 minutes, inclusive. What is the probability
that a person waits fewer than 12.5 minutes? What is the probability that the waiting time will be
within 0.5 standard deviations of the mean?
2. The current (in mA) measured in a piece of copper wire is known to follow a uniform distribution over the interval [0, 25]. Write down the formula for the
probability density function f(x) of the random variable X representing the current.
Calculate the mean and variance of the distribution and find the cumulative
distribution function F(x)
3. Slater customers are charged for the amount of salad they take. Sampling suggests that the amount of salad taken is uniformly distributed between 5 ounces
and 15 ounces. Let x = salad plate filling weight. Find the expected value and the
variance of x. What is the probability that a customer will take between 12 and 15
ounces of salad?
4. The thickness x of a protective coating applied to a conductor designed to work in corrosive conditions follows a uniform distribution over the interval [20, 40]
microns. Find the mean, standard deviation and cumulative distribution function
of the thickness of the protective coating. Find also the probability that the coating
is less than 35 microns thick.
5. The average number of donuts a nine-year-old child eats per month is uniformly distributed from 0.5 to 4 donuts, inclusive. Determine the probability that a
randomly selected nine-year-old child eats an average of
a) more than two donuts
b) more than two donuts, given that his or her amount is more than 1.5 donuts.
6. Starting at 5 am, every half hour there is a flight from Nairobi to Mombasa. Suppose that none of these flights is ever completely sold out and they always
have room for passengers. A person who wants to fly to Mombasa arrives at the
airport at a random time between 8.45 AM and 9.45 AM. Determine the
probability that he waits for
a) at most 10 minutes
b) at least 15 minutes
2.2.2 Exponential Distribution
The exponential distribution is often used to model the amount of time until some
specific event occurs. For example, the amount of time (beginning now) until an
earthquake occurs is commonly modelled as exponential. Other examples include the length,
in minutes, of long-distance business telephone calls and the amount of time, in
months, that a car battery lasts. The amount of change that you
have in your pocket or purse can also be modelled this way. An
exponential random variable produces fewer large values
and more small values. For example, the amount of money customers spend in one
trip to the supermarket is roughly exponential: more people
spend small amounts and fewer people spend large amounts.
The exponential distribution is widely used in the field of reliability, which deals
with the amount of time a product lasts.
In brief, this distribution is commonly used to model waiting times between
occurrences of rare events and the lifetimes of electrical or mechanical devices.
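Definition: A random variable X is said to have an exponential distribution with parameter $\lambda > 0$ if
the pdf of X is

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

We abbreviate this as $X \sim \exp(\lambda)$; $\lambda$ is called the rate parameter.
[Figure: plot of the exponential pdf]
The mean and variance of this distribution
are $\mu = \dfrac{1}{\lambda}$ and $\sigma^2 = \dfrac{1}{\lambda^2}$ respectively.
The CDF is

$$F(x) = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$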
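The exponential pdf, CDF, mean and variance can be written as short functions. This is a minimal Python sketch; the function names and the rate λ = 0.5 are hypothetical, picked only to illustrate the formulas.

```python
from math import exp

def exp_pdf(x, lam):
    """Density lam * e^(-lam * x) for x >= 0, zero otherwise."""
    if lam <= 0:
        raise ValueError("rate parameter must be positive")
    return lam * exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(x, lam):
    """F(x) = 1 - e^(-lam * x) for x >= 0, zero otherwise."""
    if lam <= 0:
        raise ValueError("rate parameter must be positive")
    return 1 - exp(-lam * x) if x >= 0 else 0.0

def exp_mean(lam):
    return 1 / lam

def exp_var(lam):
    return 1 / lam ** 2

lam = 0.5  # hypothetical rate, not from the notes
print(exp_mean(lam), exp_var(lam))   # 2.0 4.0
print(round(exp_cdf(2.0, lam), 4))   # 1 - e^(-1) = 0.6321
```

A tail probability P(X > x) is `1 - exp_cdf(x, lam)`, which simplifies to e^(-λx) for x ≥ 0.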
Example Torch batteries have a lifespan of T hours with pdf

$$f(t) = \begin{cases} 0.01 e^{-0.01t}, & t \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

Determine the probability that the battery life: a) is less than 25 hours; b) is between
35 and 50 hours; c) exceeds 120 hours; d) exceeds the mean lifespan.
Solution
a) $P(T < 25) = F(25) = \int_0^{25} 0.01 e^{-0.01t}\,dt = 1 - e^{-0.01(25)} = 0.2212$
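b) $P(35 < T < 50) = \int_{35}^{50} 0.01 e^{-0.01t}\,dt = \Big[-e^{-0.01t}\Big]_{35}^{50} = e^{-0.35} - e^{-0.50} = 0.0982$
c) $P(T > 120) = \int_{120}^{\infty} 0.01 e^{-0.01t}\,dt = \Big[-e^{-0.01t}\Big]_{120}^{\infty} = e^{-1.2} - 0 = 0.3012$
d) $\mu = \dfrac{1}{0.01} = 100$, so $P(T > 100) = \int_{100}^{\infty} 0.01 e^{-0.01t}\,dt = \Big[-e^{-0.01t}\Big]_{100}^{\infty} = e^{-1} = 0.3679$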
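The four answers above can be reproduced from the CDF alone, without integrating. The snippet below is a minimal Python check, assuming the rate 0.01 per hour from the example; the helper name `cdf` is hypothetical.

```python
from math import exp

lam = 0.01  # rate parameter from the torch battery example

def cdf(t):
    """Exponential CDF: 1 - e^(-lam * t) for t >= 0, zero otherwise."""
    return 1 - exp(-lam * t) if t >= 0 else 0.0

p_a = cdf(25)             # a) P(T < 25)
p_b = cdf(50) - cdf(35)   # b) P(35 < T < 50)
p_c = 1 - cdf(120)        # c) P(T > 120)
mean_life = 1 / lam       # mean lifespan = 1/lam = 100
p_d = 1 - cdf(mean_life)  # d) P(T > mean)

for label, p in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d)]:
    print(label, round(p, 4))
# a 0.2212
# b 0.0982
# c 0.3012
# d 0.3679
```

Part (d) gives e^(-1) for any rate, since P(T > 1/λ) = e^(-λ·(1/λ)).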
Exercise 2.6:
1. Jobs are sent to a printer at an average of 3 jobs per hour.
a) What is the expected time between jobs?
b) What is the probability that the next job is sent within 5 minutes?
2. The time required to repair a machine is an exponential random variable with rate λ = 0.5 per hour.
a) What is the probability that a repair time exceeds 2 hours?
b) What is the probability that the repair will take at least 4 hours,
given that the repairman has already been working on the machine for 3
hours?
3. Buses arrive at a bus stop with exponentially distributed inter-arrival times at rate λ = 4 buses/hour. Suppose you arrive at the bus stop at 8:00 am.
a) What is the expected time of the next bus?
b) Assume you asked one of the people waiting for the bus about the
arrival time of the last bus and he told you that the last bus left at
7:40 am. What is the expected time of the next bus?
4. Breakdowns occur on an old car at rate λ = 5 breakdowns/month. The owner of the car is planning a 4-day trip in his car.
a) What is the probability that he will return home safely in his car?
b) If the car broke down on the second day of the trip and was fixed,
what is the probability that he does not return home safely in his car?
5. Suppose that the amount of time one spends in a bank is exponentially distributed with mean 10 minutes. What is the probability that a customer will spend more
than 15 minutes in the bank? Also find the probability that the customer will
spend more than 15 minutes given that he is still in the bank after 10 minutes?
6. Suppose the lifespan T, in hundreds of hours, of a light bulb of a home lamp is exponentially distributed with λ = 0.2. Compute the probability that the light
bulb will last more than 700 hours. Also compute the probability that the light bulb will last
more than 900 hours.
7. Let X = amount of time (in minutes) a postal clerk spends with his/her customer. The time is known to have an exponential distribution with the average amount of
time equal to 4 minutes.
a) Find the probability that a clerk spends four to five minutes with a randomly selected customer.
b) Half of all customers are finished within how long? (Find the median.)
c) Which is larger, the mean or the median?
8. The length of life of a certain type of electronic tube is exponentially distributed with a mean life of 500 hours. Find the probability that
a) a tube will (i) last more than 800 hours, (ii) fail within the first 200 hours;
b) the length of life of a tube will be between 400 and 700 hours.
9. On average, a certain computer part lasts 10 years. The length of time the computer part lasts is exponentially distributed.
a) What is the probability that a computer part lasts more than 7 years?
b) On average, how long would 5 computer parts last if they are used one
after another?
c) Eighty percent of computer parts last at most how long? d) What is the probability that a computer