lec 03 - oct25


Upload: mohamed-abdou

Post on 26-May-2017


Page 1: Lec 03 - Oct25


Statistics and Experimental Design

Lecture 3

Cairo University – Faculty of Engineering

Graduate Course – Fall 2009

Statistical Measures & Sampling Distributions

• Data Analysis and Statistical Measures

• Distributions

• Continuous Distributions

• Discrete Distributions

A probability distribution provides the possible values of the random variable and their corresponding probabilities.

A probability distribution can be in the form of a table, graph or mathematical formula.

Probability Density Function

$P(a < X < b) = \int_{a}^{b} f(x)\,dx$

Cumulative Distribution Function

$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$

A discrete random variable is a random variable that has values that result from counting.

A continuous random variable is a random variable whose values result from measurement.

A random variable is a numerical measure of the outcome from a probability experiment, so its value is determined by chance. It is a numerical variable whose measured value can change from one replicate of the experiment to another. Random variables are denoted using letters such as X.

Random Experiments

Distribution Shape & Box Plot

[Figure: box plots marking the quartiles Q1, Q2 and Q3 for right-skewed, left-skewed and symmetric distributions]

Important Distributions

Continuous Probability Distribution

• Uniform probability distribution
• Normal probability distribution
• Exponential probability distribution
• Weibull distribution
• …

Discrete Probability Distribution

• Bernoulli distribution
• Poisson distribution
• Binomial / Negative-Binomial distribution
• Geometric / Hyper-geometric distribution


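Continuous vs Discrete Distributions

• Probability
– Continuous distribution: $P(a \le x \le b) = \int_{a}^{b} f(x)\,dx$
– Discrete distribution: $P(x_i) = p(x_i)$

• Distribution mean
– Continuous distribution: $\mu = \int_{-\infty}^{+\infty} x\,f(x)\,dx$
– Discrete distribution: $\mu = \sum_{i=1}^{\infty} x_i\,p(x_i)$

• Distribution variance
– Continuous distribution: $\sigma^2 = V(x) = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx$
– Discrete distribution: $\sigma^2 = V(x) = \sum_{i=1}^{\infty} (x_i - \mu)^2\,p(x_i)$

• Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

• Sample variance: $\hat{\sigma}^2 = S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$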
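
As a rough illustration of the sample mean and sample variance formulas, the following Python sketch computes both by hand and compares them with the standard library's `statistics` module. The values in `data` are hypothetical and were invented only for this example.

```python
import statistics

# Hypothetical measurements, invented purely for illustration.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

n = len(data)

# Sample mean: sum of the observations divided by n.
x_bar = sum(data) / n

# Sample variance: squared deviations from the sample mean, divided by n - 1.
s2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)

print("by hand:   ", x_bar, s2)

# statistics.mean and statistics.variance follow the same definitions
# (statistics.variance also divides by n - 1).
print("statistics:", statistics.mean(data), statistics.variance(data))
```

Dividing by n - 1 rather than n is what makes this the sample variance; `statistics.pvariance` would divide by n instead.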

Uniform Probability Distribution

• pdf: f(x) = 1/(b − a) for a < x < b; f(x) = 0 elsewhere
• Expected value: E(x) = (a + b)/2
• Variance: Var(x) = (b − a)²/12

[Figure: plot of the uniform density f(x) against x]

Normal Probability Distribution

• pdf: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}$

where:
μ = mean
σ = standard deviation
π = 3.14159
e = 2.71828

[Figure: normal curve over x with tick marks at μ − 3σ, μ − 2σ, μ − 1σ, μ, μ + 1σ, μ + 2σ and μ + 3σ; about 68.26% of the area lies within μ ± 1σ, 95.44% within μ ± 2σ and 99.72% within μ ± 3σ]

Normal Probability Distribution

• Characteristics:
– The distribution is symmetric; its skewness measure is zero.
– The entire family of normal probability distributions is defined by its mean μ and its standard deviation σ.
– The highest point on the normal curve is at the mean, which is also the median and mode.
– The mean can be any numerical value: negative, zero, or positive.
– The standard deviation determines the width of the curve: larger values result in wider, flatter curves.
– Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (0.5 to the left of the mean and 0.5 to the right).

Normal Probability Distribution

• pdf: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}$

• Standard normal: $z = \frac{x - \mu}{\sigma}$, with density $f(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$

The letter z is used to designate the standard normal random variable.
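
The standardization z = (x − μ)/σ can be tried out numerically. The sketch below is only an illustration: the function names are made up for this example, and the standard normal cumulative probability is obtained from the error function `math.erf` in Python's standard library. It prints the area within 1, 2 and 3 standard deviations of the mean; the computed percentages may differ in the last digit from values read off a rounded table.

```python
import math


def standard_normal_cdf(z):
    """P(Z <= z) for the standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal variable, by standardizing to z = (x - mu) / sigma."""
    return standard_normal_cdf((x - mu) / sigma)


def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * area_within(k):.2f}%")
```

Because every normal variable reduces to z in this way, a single function for the standard normal curve is enough to answer probability questions for any mean and standard deviation.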

Exponential Probability Distribution

The exponential probability distribution is useful in describing the time it takes to complete a task. Exponential random variables can be used to describe:

• Time between vehicle arrivals at a toll booth
• Time required to complete a questionnaire
• Distance between major defects in a highway


Exponential Probability Distribution

• Density Function

$f(x) = \frac{1}{\mu}\,e^{-x/\mu}$ for x ≥ 0, μ > 0

where: μ = mean, e = 2.71828

• Cumulative Probabilities

$P(x \le x_0) = 1 - e^{-x_0/\mu}$

where: x_0 = some specific value of x

Exponential Probability Distribution

• Example: Al’s Full-Service Pump

The time between arrivals of cars at Al’s full-service gas pump follows an exponential probability distribution with a mean time between arrivals of 3 minutes. Al would like to know the probability that the time between two successive arrivals will be 2 minutes or less.

[Figure: exponential density f(x) (vertical axis from .1 to .4) against time between successive arrivals in minutes (horizontal axis from 1 to 10)]

Exponential Probability Distribution

$P(x \le 2) = 1 - 2.71828^{-2/3} = 1 - 0.5134 = 0.4866$
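
The calculation for Al's pump can be checked with a few lines of Python. This is a minimal sketch; the function name `exponential_cdf` is chosen for this example and is not part of the lecture.

```python
import math


def exponential_cdf(x0, mu):
    """P(x <= x0) for an exponential random variable with mean mu."""
    return 1.0 - math.exp(-x0 / mu)


mu = 3.0  # mean time between arrivals, in minutes (from the example)
p = exponential_cdf(2.0, mu)
print(round(p, 4))  # 0.4866
```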

Exponential Probability Distribution

A property of the exponential distribution is that the mean, μ, and standard deviation, σ, are equal.

Thus, the standard deviation, σ, and variance, σ², for the time between arrivals at Al’s full-service pump are:

σ = μ = 3 minutes

σ² = (3)² = 9

Weibull Probability Distribution

The Weibull probability distribution is useful in describing the time to failure. The parameters in the distribution model systems in which:

• The number of failures increases with time (e.g. bearing wear)
• The number of failures decreases with time (e.g. semiconductors)
• The number of failures is constant with time (e.g. external shocks)


Weibull Probability Distribution

• Density Function

$f(x) = \frac{\beta}{\delta}\left(\frac{x}{\delta}\right)^{\beta-1} e^{-\left(\frac{x}{\delta}\right)^{\beta}}$

where: x > 0
the scale parameter δ > 0
the shape parameter β > 0

Weibull Probability Distribution

• Shape and scale parameters: β is the shape parameter and δ is the scale parameter in

$f(x) = \frac{\beta}{\delta}\left(\frac{x}{\delta}\right)^{\beta-1} e^{-\left(\frac{x}{\delta}\right)^{\beta}}$

Weibull Probability Distribution

• Cumulative Probabilities

$P(x \le x_0) = 1 - e^{-\left(\frac{x_0}{\delta}\right)^{\beta}}$

where: x_0 = some specific value of x

Weibull Probability Distribution

• The mean and variance

$\mu = E(x) = \delta\,\Gamma\!\left(1 + \frac{1}{\beta}\right)$

$\sigma^2 = V(x) = \delta^2\,\Gamma\!\left(1 + \frac{2}{\beta}\right) - \delta^2\left[\Gamma\!\left(1 + \frac{1}{\beta}\right)\right]^2$

Weibull Probability Distribution

• Example: Bearing service center

The time to failure (in hours) of a bearing in a mechanical shaft is satisfactorily modeled as a Weibull random variable with β = ½ and δ = 5000 hours.

1- Determine the mean time until failure.
2- Determine the probability that a bearing lasts at least 6000 hours.

Weibull Probability Distribution

• Example: Bearing service center

For β = ½ and δ = 5000 hours:

1- Using $\mu = E(x) = \delta\,\Gamma\!\left(1 + \frac{1}{\beta}\right)$:

E(x) = 5000 Γ[1 + (1/0.5)] = 5000 Γ(3) = 5000 × 2! = 10,000 hours

2- Using $P(x \le x_0) = 1 - e^{-\left(\frac{x_0}{\delta}\right)^{\beta}}$:

P(X > 6000) = 1 − F(6000) = e^(−√(6000/5000)) = e^(−1.095) = 0.334
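
Both parts of the bearing example can be reproduced with Python's standard library, which provides the gamma function as `math.gamma`. The sketch below is illustrative; the helper names `weibull_mean` and `weibull_cdf` are assumptions made for this example.

```python
import math


def weibull_mean(beta, delta):
    """Mean of a Weibull variable: delta * Gamma(1 + 1/beta)."""
    return delta * math.gamma(1.0 + 1.0 / beta)


def weibull_cdf(x0, beta, delta):
    """P(x <= x0) = 1 - exp(-(x0/delta)**beta)."""
    return 1.0 - math.exp(-((x0 / delta) ** beta))


beta, delta = 0.5, 5000.0  # shape and scale from the example

mean_life = weibull_mean(beta, delta)
p_survive = 1.0 - weibull_cdf(6000.0, beta, delta)

print("mean time to failure (hours):", round(mean_life))
print("P(X > 6000):", round(p_survive, 3))
```

Although the mean life is 10,000 hours, only about a third of the bearings last beyond 6000 hours, which reflects the long right tail of a Weibull distribution with β < 1.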



The table below shows the probability distribution for the random variable X, where X represents the number of DVDs a person rents from a video store during a single visit.

Example of a discrete distribution

EXAMPLE Drawing a Probability Histogram

Draw a probability histogram of the following probability distribution which represents the number of DVDs a person rents from a video store during a single visit.

EXAMPLE Identifying Probability Distributions

Is the following a probability distribution?

EXAMPLE Identifying Probability Distributions

Is the following a probability distribution?


EXAMPLE The Mean of a Discrete Random Variable

Compute the mean of the following probability distribution which represents the number of DVDs a person rents from a video store during a single visit.

The following data represent the number of DVDs rented by 100 randomly selected customers in a single visit. Compute the mean number of DVDs rented.

$\bar{X} = \frac{x_1 + x_2 + \dots + x_{100}}{100} = 1.49$

EXAMPLE Expected Value

A term life insurance policy will pay a beneficiary a certain sum of money upon the death of the policy holder. These policies have premiums that must be paid annually. Suppose a life insurance company sells a $250,000 one year term life insurance policy to a 49-year-old female for $520. According to the National Vital Statistics Report, Vol. 47, No. 28, the probability the female will survive the year is 0.99791. Compute the expected value of this policy to the insurance company.
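
One way to set up this expected value is sketched below in Python. It assumes the company keeps the $520 premium in both outcomes and pays the $250,000 benefit only if the policy holder dies during the year; the variable names are chosen for this example.

```python
premium = 520.0       # annual premium collected by the company
payout = 250_000.0    # benefit paid if the policy holder dies
p_survive = 0.99791   # probability of surviving the year (from the example)
p_die = 1.0 - p_survive

# Company's net result in each outcome (assumption: premium kept either way).
gain_if_survive = premium
gain_if_die = premium - payout

# Expected value: each outcome weighted by its probability.
expected_value = gain_if_survive * p_survive + gain_if_die * p_die
print(round(expected_value, 2))  # -2.5
```

Under these assumptions the company expects to lose about $2.50 per policy of this kind.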


EXAMPLE Standard deviation

Compute the variance and standard deviation of the following probability distribution which represents the number of DVDs a person rents from a video store during a single visit.
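
The DVD rental table did not survive in this transcript, so the Python sketch below uses a made-up distribution purely to show the mechanics. The probabilities in `pmf` are hypothetical and are not the lecture's data. The sketch first checks that the table is a valid probability distribution, then applies the discrete mean and variance formulas.

```python
import math

# Hypothetical distribution (NOT the lecture's table):
# key = number of DVDs rented, value = P(X = x).
pmf = {0: 0.05, 1: 0.55, 2: 0.25, 3: 0.10, 4: 0.05}


def is_probability_distribution(table, tol=1e-9):
    """Each probability must lie in [0, 1] and all must sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in table.values())
    return in_range and abs(sum(table.values()) - 1.0) <= tol


valid = is_probability_distribution(pmf)

# Mean: sum of x * p(x).
mean = sum(x * p for x, p in pmf.items())

# Variance: sum of (x - mean)^2 * p(x); standard deviation is its square root.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
std_dev = math.sqrt(variance)

print(valid, mean, variance, std_dev)
```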

Important Discrete Probability Distributions

• Bernoulli distribution
• Poisson distribution
• Binomial / Negative-Binomial distribution
• Geometric / Hyper-geometric distribution

Bernoulli Distribution

Two outcomes:
• Success, with probability p
• Failure, with probability q

p + q = 1

[Figure: probability function P(x) with bar heights q and p, and cumulative distribution function F(x) stepping from q up to 1]

Binomial Distribution

y: the number of successes in n statistically independent Bernoulli trials

y = ∑ X_i, where X_1, X_2, … are independent Bernoulli trials

$p(y) = {}^{n}C_{y}\,p^{y}\,q^{n-y}$

Binomial Distribution

Example:
Suppose that in a very large shipment of some product, the probability of a defect for any one item is 0.35. A sample of 17 items has been drawn.

a) What is the probability that at most 3 items have defects?
b) What is the probability that 5 or more items have defects?
c) What is the probability that exactly 12 items do not have defects?
d) What is the probability that 3, 4 or 5 items have defects?
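
The four questions can be answered by summing binomial probabilities. The Python sketch below is one possible way to do it with `math.comb` (available from Python 3.8); the helper name `binomial_pmf` is an assumption of this example. Note that "exactly 12 items without defects" in a sample of 17 is the same event as "exactly 5 items with defects".

```python
from math import comb


def binomial_pmf(y, n, p):
    """P(exactly y successes in n independent Bernoulli trials)."""
    return comb(n, y) * p**y * (1.0 - p) ** (n - y)


n, p = 17, 0.35  # sample size and defect probability from the example

# Probabilities of 0, 1, ..., 17 defective items.
pmf = [binomial_pmf(y, n, p) for y in range(n + 1)]

at_most_3 = sum(pmf[:4])         # a) P(Y <= 3)
five_or_more = sum(pmf[5:])      # b) P(Y >= 5)
exactly_12_good = pmf[5]         # c) 12 good items means 5 defective
three_to_five = sum(pmf[3:6])    # d) P(Y = 3, 4 or 5)

print(f"a) {at_most_3:.4f}")
print(f"b) {five_or_more:.4f}")
print(f"c) {exactly_12_good:.4f}")
print(f"d) {three_to_five:.4f}")
```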


Geometric Distribution

X: the number of failures before the first success in a sequence of independent Bernoulli trials

$p(x) = p\,q^{x}$

Negative-Binomial Distribution

X: the number of failures before the r-th success in a sequence of independent Bernoulli trials

$P(x) = {}^{x+r-1}C_{x}\,p^{r}\,q^{x}$

Note: if r = 1, the negative-binomial distribution reduces to the geometric distribution.

Discrete Distribution

Example:
Historical records associated with automobiles in a city indicate that the probability that such an automobile will pass the federal EPA (Environmental Protection Agency) standards is 0.45.

How many such cars will the city inspector have to check before the probability is greater than 0.95 that he will have found 3 cars that pass the EPA standards?
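
One reading of this question is that the inspector stops once the third passing car is found, so the number of failing cars seen before that point follows a negative-binomial distribution with r = 3 and p = 0.45. Under that assumption, the Python sketch below accumulates negative-binomial probabilities until they exceed 0.95; the function names are invented for this example.

```python
from math import comb


def negative_binomial_pmf(x, r, p):
    """P(exactly x failures occur before the r-th success)."""
    return comb(x + r - 1, x) * p**r * (1.0 - p) ** x


def cars_needed(r, p, target):
    """Smallest number of cars n with P(r-th pass found within n cars) > target."""
    prob = 0.0
    failures = 0
    while True:
        prob += negative_binomial_pmf(failures, r, p)
        if prob > target:
            # n cars checked = failures seen + r passes found
            return failures + r, prob
        failures += 1


n_cars, prob = cars_needed(r=3, p=0.45, target=0.95)
print(n_cars, round(prob, 4))
```

With r = 1 the same function answers the corresponding geometric-distribution question, since the negative-binomial formula reduces to p·q^x in that case.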

Poisson Distribution

X: the number of occurrences in a specified time period

$p(x) = \frac{\lambda^{x}\,e^{-\lambda}}{x!}$

where x = 0, 1, 2, …
λ = the rate parameter (the average number of occurrences in a specified time period)

Poisson Distribution

Example:
The number of customer orders for a PC at a firm averages 20 PCs per week.

a) Find the probability that more than 20 PCs will be purchased in a particular week.
b) Find the probability that exactly 17 PCs will be purchased in a particular week.
c) There are currently 22 PCs in stock, and no more will arrive from the manufacturer until the start of the next week. What is the probability that no PC will have to be backordered?

Relationship between the Poisson and Exponential Distributions

• The Poisson distribution provides an appropriate description of the number of occurrences per interval.
• The exponential distribution provides an appropriate description of the length of the interval between occurrences.
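
The link between the two distributions can be checked numerically: "no occurrence in an interval of length t" (a Poisson statement) is the same event as "the gap until the next occurrence exceeds t" (an exponential statement). The Python sketch below uses illustrative values, a mean gap of 3 minutes and an interval of 2 minutes, which are assumptions made for this example.

```python
import math

mean_gap = 3.0   # assumed mean time between occurrences (minutes)
t = 2.0          # assumed interval length (minutes)

# Poisson view: the expected number of occurrences in an interval of length t
# is lam = t / mean_gap, and P(X = 0) = lam**0 * exp(-lam) / 0!.
lam = t / mean_gap
p_no_occurrence = lam**0 * math.exp(-lam) / math.factorial(0)

# Exponential view: P(gap > t) = 1 - P(gap <= t) = 1 - (1 - exp(-t / mean_gap)).
p_gap_exceeds_t = 1.0 - (1.0 - math.exp(-t / mean_gap))

print(round(p_no_occurrence, 4), round(p_gap_exceeds_t, 4))
```

Both expressions reduce to e^(−t/μ), so the two printed values agree.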


Hyper-Geometric Distribution

N: the total number of unique items
a: the number of items having a property of interest (e.g. success)
n: the size of the sample (i.e. the number of items drawn from the N items without replacement)
x: the number of successes in the sample of n items

${}^{N}C_{n}$: the number of equally likely samples
${}^{a}C_{x}$: the number of ways of getting x successes
${}^{N-a}C_{n-x}$: the number of ways of getting n − x failures

$p(x) = \frac{{}^{a}C_{x}\;{}^{N-a}C_{n-x}}{{}^{N}C_{n}}$
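
The hyper-geometric formula translates directly into code. In the Python sketch below, the lot size, number of successes and sample size are hypothetical values chosen only to exercise the formula, and the function name is an assumption of this example.

```python
from math import comb


def hypergeometric_pmf(x, N, a, n):
    """P(x successes in a sample of n drawn without replacement
    from N items, of which a are successes)."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)


# Hypothetical numbers: 20 items in total, 5 with the property of interest,
# and a sample of 4 items drawn without replacement.
N, a, n = 20, 5, 4

pmf = [hypergeometric_pmf(x, N, a, n) for x in range(min(a, n) + 1)]

for x, prob in enumerate(pmf):
    print(x, round(prob, 4))
```

The probabilities over all possible values of x sum to 1, which is a quick sanity check on the counting argument behind the formula.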

Questions ?