lec 03 - oct25
Statistics and Experimental Design
Lecture 3
Cairo University – Faculty of Engineering
Graduate Course – Fall 2009
Statistical Measures & Sampling Distributions
• Data Analysis and Statistical Measures
• Distributions
• Continuous Distributions
• Discrete Distributions
A probability distribution provides the possible values of the random variable and their corresponding probabilities.
A probability distribution can be in the form of a table, graph or mathematical formula.
Probability Density Function
$P(a < X < b) = \int_{a}^{b} f(x)\,dx$
Cumulative Distribution Function
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$
A discrete random variable is a random variable that has values that result from counting.
A continuous random variable is a random variable whose values result from measurement.
A random variable is a numerical measure of the outcome from a probability experiment, so its value is determined by chance. It is a numerical variable whose measured value can change from one replicate of the experiment to another. Random variables are denoted using letters such as X.
Random Experiments
Distribution Shape & Box Plot
[Figure: box plots showing the quartiles Q1, Q2 and Q3 for left-skewed, symmetric, and right-skewed distributions]
Important Distributions
Continuous Probability Distribution
• Uniform probability distribution
• Normal probability distribution
• Exponential probability distribution
• Weibull distribution
• …
Discrete Probability Distribution
• Bernoulli distribution
• Poisson distribution
• Binomial / Negative-Binomial distribution
• Geometric / Hyper-geometric distribution
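Continuous vs Discrete Distributions
• Probability
– Continuous: $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$
– Discrete: $P(X = x_i) = p(x_i)$
• Distribution mean
– Continuous: $\mu = \int_{-\infty}^{+\infty} x f(x)\,dx$
– Discrete: $\mu = \sum_{i=1}^{\infty} x_i\, p(x_i)$
• Distribution variance
– Continuous: $\sigma^2 = V(x) = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx$
– Discrete: $\sigma^2 = V(x) = \sum_{i=1}^{\infty} (x_i - \mu)^2\, p(x_i)$
• Sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
• Sample variance: $\hat{\sigma}^2 = S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$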
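As a minimal sketch of the sample mean and sample variance formulas, the following Python snippet computes both by hand and compares them with the standard library. The data values are made up for illustration and do not come from the lecture.

```python
import statistics

# Hypothetical sample, invented for illustration only.
data = [2.0, 4.0, 4.0, 5.0, 7.0]
n = len(data)

# Sample mean: x_bar = (1/n) * sum of x_i
x_bar = sum(data) / n

# Sample variance: S^2 = sum of (x_i - x_bar)^2, divided by (n - 1)
s2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)

print("by hand:", x_bar, s2)
print("library:", statistics.mean(data), statistics.variance(data))
```

Note the divisor n - 1 in the sample variance, which is also what `statistics.variance` uses.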
Uniform Probability Distribution
• pdf: $f(x) = 1/(b - a)$ for $a < x < b$; $f(x) = 0$ elsewhere
• Expected value: $E(x) = (a + b)/2$
• Variance: $\mathrm{Var}(x) = (b - a)^2/12$
[Figure: uniform density f(x) versus x]
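The expected value and variance of the uniform distribution can be checked numerically against the integral definitions of the mean and variance. The sketch below uses a simple midpoint rule; the endpoints a = 2 and b = 10 and the helper name `integrate` are assumptions chosen for illustration.

```python
def uniform_pdf(x, a, b):
    """Uniform density: 1/(b - a) inside (a, b), 0 elsewhere."""
    return 1.0 / (b - a) if a < x < b else 0.0

def integrate(g, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * h) for i in range(steps)) * h

a, b = 2.0, 10.0  # hypothetical interval endpoints

mean = integrate(lambda x: x * uniform_pdf(x, a, b), a, b)
var = integrate(lambda x: (x - mean) ** 2 * uniform_pdf(x, a, b), a, b)

print(mean, (a + b) / 2)        # numerical vs closed form
print(var, (b - a) ** 2 / 12)   # numerical vs closed form
```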
Normal Probability Distribution
• pdf: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}$
where:
μ = mean
σ = standard deviation
π = 3.14159
e = 2.71828
[Figure: normal curve over x, marked at μ − 3σ, μ − 2σ, μ − 1σ, μ, μ + 1σ, μ + 2σ, μ + 3σ; 68.26% of the area lies within μ ± 1σ, 95.44% within μ ± 2σ, and 99.72% within μ ± 3σ]
Normal Probability Distribution
• Characteristics:
– The distribution is symmetric; its skewness measure is zero.
– The entire family of normal probability distributions is defined by its mean μ and its standard deviation σ.
– The highest point on the normal curve is at the mean, which is also the median and mode.
– The mean can be any numerical value: negative, zero, or positive.
– The standard deviation determines the width of the curve: larger values result in wider, flatter curves.
– Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (.5 to the left of the mean and .5 to the right).
Normal Probability Distribution
• Standard normal distribution: the normal distribution with mean 0 and standard deviation 1
$z = \frac{x - \mu}{\sigma}$
$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$
The letter z is used to designate the standard normal random variable.
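The areas within 1, 2 and 3 standard deviations of the mean can be reproduced from the standard normal distribution. This is a minimal sketch using the error function from Python's `math` module; the function names are illustrative, not from the lecture.

```python
import math

def standardize(x, mu, sigma):
    """Convert x to a z-score: z = (x - mu) / sigma."""
    return (x - mu) / sigma

def std_normal_cdf(z):
    """P(Z <= z) for the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_within(k):
    """P(-k <= Z <= k): area within k standard deviations of the mean."""
    return std_normal_cdf(k) - std_normal_cdf(-k)

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

The printed values may differ from the slide's percentages in the last digit. The slide figures appear to come from a printed table rounded to four decimals.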
Exponential Probability Distribution
The exponential probability distribution is useful in describing the time it takes to complete a task.
Exponential random variables can be used to describe:
• Time between vehicle arrivals at a toll booth
• Time required to complete a questionnaire
• Distance between major defects in a highway
Exponential Probability Distribution
• Density function
$f(x) = \frac{1}{\mu}\, e^{-x/\mu}$ for $x \ge 0$, $\mu > 0$
where:
μ = mean
e = 2.71828
Exponential Probability Distribution
• Cumulative probabilities
$P(x \le x_0) = 1 - e^{-x_0/\mu}$
where:
$x_0$ = some specific value of x
Exponential Probability Distribution
• Example: Al's Full-Service Pump
The time between arrivals of cars at Al's full-service gas pump follows an exponential probability distribution with a mean time between arrivals of 3 minutes. Al would like to know the probability that the time between two successive arrivals will be 2 minutes or less.
[Figure: exponential density f(x) versus time between successive arrivals (mins.)]
$P(x \le 2) = 1 - 2.71828^{-2/3} = 1 - .5134 = .4866$
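The calculation for Al's pump can be reproduced with a few lines of Python. This is a small sketch of the cumulative probability formula; the function name is an illustrative choice.

```python
import math

def exponential_cdf(x0, mu):
    """P(x <= x0) for an exponential distribution with mean mu."""
    return 1.0 - math.exp(-x0 / mu)

# Mean time between arrivals is 3 minutes; we want P(x <= 2).
p = exponential_cdf(2.0, 3.0)
print(round(p, 4))  # 0.4866
```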
Exponential Probability Distribution
A property of the exponential distribution is that the mean, μ, and standard deviation, σ, are equal.
Thus, the standard deviation, σ, and variance, σ², for the time between arrivals at Al's full-service pump are:
$\sigma = \mu = 3$ minutes
$\sigma^2 = (3)^2 = 9$
Weibull Probability Distribution
The Weibull probability distribution is useful in describing the time to failure.
The parameters of the distribution can model systems in which:
• The number of failures increases with time (e.g., bearing wear)
• The number of failures decreases with time (e.g., semiconductors)
• The number of failures is constant with time (e.g., external shocks)
Weibull Probability Distribution
• Density function
$f(x) = \frac{\beta}{\delta} \left(\frac{x}{\delta}\right)^{\beta - 1} \exp\left[-\left(\frac{x}{\delta}\right)^{\beta}\right]$
where:
x > 0
the scale parameter δ > 0
the shape parameter β > 0
Weibull Probability Distribution
• Shape and scale parameters
$f(x) = \frac{\beta}{\delta} \left(\frac{x}{\delta}\right)^{\beta - 1} \exp\left[-\left(\frac{x}{\delta}\right)^{\beta}\right]$
Weibull Probability Distribution
• Cumulative probabilities
$P(x \le x_0) = 1 - \exp\left[-\left(\frac{x_0}{\delta}\right)^{\beta}\right]$
where:
$x_0$ = some specific value of x
Weibull Probability Distribution
• The mean and variance
$\mu = E(x) = \delta\, \Gamma\left(1 + \frac{1}{\beta}\right)$
$\sigma^2 = V(x) = \delta^2\, \Gamma\left(1 + \frac{2}{\beta}\right) - \delta^2 \left[\Gamma\left(1 + \frac{1}{\beta}\right)\right]^2$
Weibull Probability Distribution
• Example: Bearing service center
The time to failure (in hours) of a bearing in a mechanical shaft is satisfactorily modeled as a Weibull random variable with β = ½ and δ = 5000 hours.
1- Determine the mean time until failure.
2- Determine the probability that a bearing lasts at least 6000 hours.
Weibull Probability Distribution
• Example: Bearing service center
For β = ½ and δ = 5000 hours, using $\mu = E(x) = \delta\, \Gamma\left(1 + \frac{1}{\beta}\right)$ and $P(x \le x_0) = 1 - \exp\left[-\left(\frac{x_0}{\delta}\right)^{\beta}\right]$:
1- $E(x) = 5000\, \Gamma(1 + 1/0.5) = 5000\, \Gamma(3) = 5000 \cdot 2! = 10{,}000$ hours
2- $P(X > 6000) = 1 - F(6000) = e^{-\sqrt{6000/5000}} = e^{-1.095} = 0.334$
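The bearing example can be checked numerically. The sketch below implements the Weibull mean and cumulative probability formulas with `math.gamma`; the function names are illustrative.

```python
import math

def weibull_mean(delta, beta):
    """E(x) = delta * Gamma(1 + 1/beta)."""
    return delta * math.gamma(1.0 + 1.0 / beta)

def weibull_cdf(x0, delta, beta):
    """P(x <= x0) = 1 - exp(-(x0/delta)^beta)."""
    return 1.0 - math.exp(-((x0 / delta) ** beta))

delta, beta = 5000.0, 0.5

mean_life = weibull_mean(delta, beta)
p_survive_6000 = 1.0 - weibull_cdf(6000.0, delta, beta)

print(mean_life)                  # mean time until failure, in hours
print(round(p_survive_6000, 3))   # probability of lasting at least 6000 hours
```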
Important Distributions
Continuous Probability Distribution
• Uniform probability distribution
• Normal probability distribution
• Exponential probability distribution
• Weibull distribution
• …
Discrete Probability Distribution
• Bernoulli distribution
• Poisson distribution
• Binomial / Negative-Binomial distribution
• Geometric / Hyper-geometric distribution
The table below shows the probability distribution for the random variable X, where X represents the number of DVDs a person rents from a video store during a single visit.
Example of a discrete distribution
EXAMPLE Drawing a Probability Histogram
Draw a probability histogram of the following probability distribution which represents the number of DVDs a person rents from a video store during a single visit.
EXAMPLE Identifying Probability Distributions
Is the following a probability distribution?
EXAMPLE Identifying Probability Distributions
Is the following a probability distribution?
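A table is a discrete probability distribution only if every probability lies between 0 and 1 and the probabilities sum to 1. The sketch below checks both conditions. The slide tables are not part of this transcript, so all three lists are hypothetical values made up for illustration.

```python
def is_probability_distribution(probs, tol=1e-9):
    """Check 0 <= p <= 1 for every entry and that the entries sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    sums_to_one = abs(sum(probs) - 1.0) <= tol
    return in_range and sums_to_one

# Hypothetical tables (not the ones shown on the slides).
valid = [0.06, 0.58, 0.22, 0.10, 0.03, 0.01]
wrong_sum = [0.2, 0.3, 0.4]           # sums to 0.9
negative_entry = [0.5, 0.6, -0.1]     # sums to 1 but has a negative value

for table in (valid, wrong_sum, negative_entry):
    print(table, is_probability_distribution(table))
```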
EXAMPLE The Mean of a Discrete Random Variable
Compute the mean of the following probability distribution which represents the number of DVDs a person rents from a video store during a single visit.
The following data represent the number of DVDs rented by 100 randomly selected customers in a single visit. Compute the mean number of DVDs rented.
$\bar{X} = \frac{x_1 + x_2 + \dots + x_{100}}{100} = 1.49$
EXAMPLE Expected Value
A term life insurance policy will pay a beneficiary a certain sum of money upon the death of the policy holder. These policies have premiums that must be paid annually. Suppose a life insurance company sells a $250,000 one year term life insurance policy to a 49-year-old female for $520. According to the National Vital Statistics Report, Vol. 47, No. 28, the probability the female will survive the year is 0.99791. Compute the expected value of this policy to the insurance company.
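One way to set this up is to list the company's two possible outcomes with their probabilities and take the probability-weighted sum. The sketch assumes the company keeps the $520 premium in both cases and pays the $250,000 benefit if the policy holder dies during the year; other costs are ignored.

```python
premium = 520.0
payout = 250_000.0
p_survive = 0.99791
p_death = 1.0 - p_survive

# (value to the company, probability) for each outcome.
# Assumption: the premium is collected either way; the payout occurs on death.
outcomes = [
    (premium, p_survive),
    (premium - payout, p_death),
]

expected_value = sum(value * prob for value, prob in outcomes)
print(round(expected_value, 2))  # -2.5
```

Under these assumptions the expected value is slightly negative, meaning the company expects to lose about $2.50 per policy of this type.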
EXAMPLE Standard deviation
Compute the variance and standard deviation of the following probability distribution which represents the number of DVDs a person rents from a video store during a single visit.
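The mean, variance and standard deviation of a discrete random variable follow directly from the discrete formulas μ = Σ x·p(x) and σ² = Σ (x − μ)²·p(x). The slide's table is not part of this transcript, so the probabilities below are placeholder values chosen for illustration.

```python
import math

# Hypothetical distribution: x = number of DVDs rented, p = P(X = x).
# These values are illustrative and are not taken from the slide's table.
xs = [0, 1, 2, 3, 4, 5]
ps = [0.06, 0.58, 0.22, 0.10, 0.03, 0.01]

mean = sum(x * p for x, p in zip(xs, ps))
variance = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))
std_dev = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```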
Important Discrete Probability Distributions
• Bernoulli distribution
• Poisson distribution
• Binomial / Negative-Binomial distribution
• Geometric / Hyper-geometric distribution
Bernoulli Distribution
Two outcomes:
• Success, with probability p
• Failure, with probability q
p + q = 1
[Figure: probability function P(x) and cumulative distribution function F(x) of a Bernoulli random variable]
Binomial Distribution
y : The number of successes in n statistically independent Bernoulli trials
$y = \sum X_i$, where $X_1, X_2, \dots$ are independent Bernoulli trials
$p(y) = {}_{n}C_{y}\; p^{y} q^{n-y}$
Binomial Distribution
Example:
Suppose that in a very large shipment of some product, the probability of a defect for any one item is 0.35.
A sample of 17 items has been drawn.
a) What is the probability that at most 3 items have defects?
b) What is the probability that 5 or more items have defects?
c) What is the probability that exactly 12 items do not have defects?
d) What is the probability that 3, 4 or 5 items have defects?
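The four questions can be answered by summing binomial probabilities with n = 17 and p = 0.35. The following is a minimal sketch; the helper names are illustrative. Part (c) uses the fact that exactly 12 non-defective items out of 17 means exactly 5 defective items.

```python
from math import comb

def binomial_pmf(y, n, p):
    """P(Y = y) = nCy * p^y * q^(n - y), with q = 1 - p."""
    return comb(n, y) * p**y * (1.0 - p) ** (n - y)

def binomial_cdf(y, n, p):
    """P(Y <= y), summing the pmf from 0 to y."""
    return sum(binomial_pmf(k, n, p) for k in range(y + 1))

n, p = 17, 0.35

p_a = binomial_cdf(3, n, p)                          # at most 3 defects
p_b = 1.0 - binomial_cdf(4, n, p)                    # 5 or more defects
p_c = binomial_pmf(n - 12, n, p)                     # exactly 12 without defects
p_d = binomial_cdf(5, n, p) - binomial_cdf(2, n, p)  # 3, 4 or 5 defects

for label, value in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(label, round(value, 4))
```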
Geometric Distribution
X : The number of failures before the first success in a sequence of independent Bernoulli trials
$p(x) = p\, q^{x}$
Negative-Binomial Distribution
X : The number of failures before the r-th success in a sequence of independent Bernoulli trials
$P(x) = {}_{x+r-1}C_{x}\; p^{r} q^{x}$
Note: for r = 1, the negative-binomial distribution reduces to the geometric distribution.
Discrete Distribution
Example:
Historical records associated with automobiles in a city indicate that the probability that such an automobile will pass the federal EPA (Environmental Protection Agency) standards is 0.45.
How many such cars will the city inspector have to check before the probability is greater than 0.95 that he will have found 3 cars that pass the EPA standards?
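One possible approach treats a passing car as a success (p = 0.45) and uses the negative-binomial distribution with r = 3: the inspector needs x + r cars when x failures occur before the third pass. The sketch below accumulates P(x) until the total exceeds 0.95; the function names are illustrative.

```python
from math import comb

def neg_binomial_pmf(x, r, p):
    """P(x failures before the r-th success) = (x+r-1)Cx * p^r * q^x."""
    return comb(x + r - 1, x) * p**r * (1.0 - p) ** x

def min_inspections(r, p, target):
    """Smallest number of trials n with P(r-th success by trial n) > target."""
    cumulative = 0.0
    x = 0
    while True:
        cumulative += neg_binomial_pmf(x, r, p)
        if cumulative > target:
            return x + r  # x failures plus r successes
        x += 1

n_needed = min_inspections(r=3, p=0.45, target=0.95)
print(n_needed)  # 12
```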
Poisson Distribution
X : The number of occurrences in a specified time period
$p(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$
where x = 0, 1, 2, …
λ = the rate parameter (the average number of occurrences in a specified time period)
Poisson Distribution
Example:
The number of customer orders for a PC at a firm averages 20 PCs per week.
a) Find the probability that more than 20 PCs will be purchased in a particular week.
b) Find the probability that exactly 17 PCs will be purchased in a particular week.
c) There are currently 22 PCs in stock, and no more will arrive from the manufacturer until the start of the next week. What is the probability that no PC will have to be backordered?
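With λ = 20 orders per week, the three probabilities can be computed directly from the Poisson formula. This sketch reads part (c) as "weekly demand is at most 22", which is an interpretation of the question; the helper names are illustrative.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) = lam^x * e^(-lam) / x!"""
    return lam**x * math.exp(-lam) / math.factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), summing the pmf from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 20.0

p_a = 1.0 - poisson_cdf(20, lam)  # more than 20 PCs
p_b = poisson_pmf(17, lam)        # exactly 17 PCs
p_c = poisson_cdf(22, lam)        # demand <= 22, so nothing is backordered

for label, value in zip("abc", (p_a, p_b, p_c)):
    print(label, round(value, 4))
```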
Relationship between the Poisson and Exponential Distributions
• The Poisson distribution provides an appropriate description of the number of occurrences per interval.
• The exponential distribution provides an appropriate description of the length of the interval between occurrences.
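The link between the two distributions can be made explicit with a short derivation. It is a sketch under the assumption that the number of occurrences in an interval of length t is Poisson with mean λt; the symbols N(t) and T are introduced here for illustration and do not appear in the slides.

```latex
% Assumed symbols: \lambda = mean number of occurrences per unit time,
% N(t) = number of occurrences in an interval of length t,
% T = waiting time until the next occurrence, \mu = 1/\lambda.
\begin{align*}
P(N(t) = 0) &= \frac{(\lambda t)^{0} e^{-\lambda t}}{0!} = e^{-\lambda t} \\
P(T \le t) &= 1 - P(N(t) = 0) = 1 - e^{-\lambda t} = 1 - e^{-t/\mu}
\end{align*}
```

The last expression has the same form as the exponential cumulative probability with mean μ = 1/λ. For example, a mean of 3 minutes between arrivals corresponds to a rate of 1/3 arrivals per minute.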
Hyper-Geometric Distribution
N : The total number of unique items
a : The number of items having a property of interest (e.g., success)
n : The size of the sample (e.g., the number of items drawn from the N items without replacement)
x : The number of successes in the sample of n items
${}_{N}C_{n}$ : The number of equally likely samples
${}_{a}C_{x}$ : The number of ways of getting x successes
${}_{N-a}C_{n-x}$ : The number of ways of getting n − x failures
$p(x) = \frac{{}_{a}C_{x}\; {}_{N-a}C_{n-x}}{{}_{N}C_{n}}$
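The hypergeometric probability can be evaluated with `math.comb`. The numbers below (a lot of 20 items with 5 successes, and a sample of 4) are hypothetical and serve only to illustrate the formula.

```python
from math import comb

def hypergeometric_pmf(x, N, a, n):
    """P(x successes in a sample of n drawn without replacement
    from N items, of which a are successes)."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

# Hypothetical values: N = 20 items, a = 5 successes, sample size n = 4.
N, a, n = 20, 5, 4
probs = [hypergeometric_pmf(x, N, a, n) for x in range(n + 1)]

for x, prob in enumerate(probs):
    print(x, round(prob, 4))
print("total:", round(sum(probs), 4))
```

The probabilities over x = 0, …, n sum to 1, which is a quick sanity check on the counting argument.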
Questions ?