
Post on 19-Jan-2016


1Prof. Indrajit Mukherjee, School of Management, IIT Bombay

Sampling Techniques

Samples
• Non-Probability Samples: Convenience, Stratified, Judgment, Others
• Probability Samples: Simple Random, Systematic, Stratified, Cluster

Type of sampling: Convenience
Selection strategy: Select cases based on their availability for the study.
Purpose: Saves time, money and effort, but at the expense of information and credibility.

Simple random sampling

Sampling method: The population is identified uniquely by number. Selection is by random number.
Result: Every member of the population has an equal chance of being selected into the sample.
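A minimal sketch of this selection in Python, assuming the population units are simply numbered 1 to N. The function name `simple_random_sample` and the sizes used are illustrative choices, not taken from the slides.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so every unit has the
    # same chance of ending up in the sample.
    return rng.sample(range(1, population_size + 1), sample_size)

# Hypothetical sizes: 20 units out of a population of 100.
sample = simple_random_sample(population_size=100, sample_size=20, seed=1)
print(sorted(sample))
```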

Simulating From Continuous Uniform

Random numbers r follow the Uniform [0, 1] distribution: 0 ≤ r ≤ 1.
To obtain the Uniform [a, b] distribution, stretch r by (b - a) and shift by a: x = a + r(b - a), so that a ≤ x ≤ b.
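The stretch-and-shift step can be sketched in a few lines of Python. The function name `uniform_ab` and the interval [2, 5] are assumptions made for illustration only.

```python
import random

def uniform_ab(a, b, r):
    """Map r in [0, 1] to [a, b]: stretch by (b - a), then shift by a."""
    return a + r * (b - a)

rng = random.Random(42)
# Five draws from Uniform [2, 5], built from Uniform [0, 1] random numbers.
draws = [uniform_ab(2.0, 5.0, rng.random()) for _ in range(5)]
print(draws)
```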

How to use a random number table to select a random sample

... corresponds to a number on the list of your population. In the example below, #08 has been chosen as the starting point and the first student chosen is Carol Chan.

10 09 73 25 33 76
37 54 20 48 05 64
08 42 26 89 53 19
90 01 90 25 29 09
12 80 79 99 70 80
66 06 57 47 17 34
31 06 01 08 05 45

Step 3: Move to the next number, 42, and select the person corresponding to that number into the sample. #87 – Tan Teck Wah
Step 4: Continue to the next number that qualifies and select that person into the sample. #26 – Jerry Lewis, followed by #89, #53 and #19.
Step 5: After you have selected student #19, go to the next line and choose #90. Continue in the same manner until the full sample is selected. If you encounter a number selected earlier (e.g., 90, 06 in this example), simply skip over it and choose the next number.

Starting point: move right to the end of the row, then down to the next row; move left to the end, then down to the next row, and so on.
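The procedure can be sketched in Python. This sketch assumes each row is read left to right, as Steps 3 to 5 do, and assumes a population of 99 people so that every nonzero two-digit number qualifies. The names `TABLE` and `select_from_table` are illustrative.

```python
# The random number table from the example, one list per row.
TABLE = [
    [10, 9, 73, 25, 33, 76],
    [37, 54, 20, 48, 5, 64],
    [8, 42, 26, 89, 53, 19],
    [90, 1, 90, 25, 29, 9],
    [12, 80, 79, 99, 70, 80],
    [66, 6, 57, 47, 17, 34],
    [31, 6, 1, 8, 5, 45],
]

def select_from_table(table, start_row, start_col, sample_size, population_size):
    """Read numbers from the starting point onward, skipping repeats."""
    flat = [number for row in table for number in row]
    start = start_row * len(table[0]) + start_col
    chosen = []
    for number in flat[start:]:
        # A number qualifies if it is on the population list and is new.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

# Start at #08 (third row, first column) and select ten people.
picks = select_from_table(TABLE, start_row=2, start_col=0,
                          sample_size=10, population_size=99)
print(picks)  # [8, 42, 26, 89, 53, 19, 90, 1, 25, 29]
```

The second 90 in the fourth row is skipped because it was already selected.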

Systematic sampling (contd.): “Example”

[Figure: population units numbered 1 to 100, arranged in four columns of 25.]

N = 100
Want n = 20
N/n = 5
Select a random number from 1 to 5: chose 4
Start with #4 and take every 5th unit
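A minimal Python sketch of this example, assuming N is an exact multiple of n. The function name `systematic_sample` is illustrative.

```python
def systematic_sample(population_size, sample_size, start):
    """Take every k-th unit, k = N/n, beginning at the chosen start."""
    step = population_size // sample_size
    return list(range(start, population_size + 1, step))[:sample_size]

# N = 100, n = 20, so the step is 5; the random start chosen was 4.
units = systematic_sample(population_size=100, sample_size=20, start=4)
print(units)  # 4, 9, 14, ..., 99
```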

Stratified Random Sample: Stratified by Age

Strata:
• 20 - 30 years old: homogeneous within (alike)
• 30 - 40 years old: homogeneous within (alike)
• 40 - 50 years old: homogeneous within (alike)

Heterogeneous (different) between strata

Sample Spaces and Events: Random Experiments

Noise variables affect the transformation of inputs to outputs.

[Figure: a system turns input into output, subject to controlled variables and noise variables.]

Example: Machining a shaft on a lathe

Variables: rotation speed, traverse speed, tool type, tool sharpness, shaft material, shaft length, material removal per cut, part cleanliness, coolant flow, operator, material variation, ambient temperature, coolant age

Outputs (Y’s): diameter, taper, surface finish

Four Types of Probability

• Marginal: the probability of X occurring, P(X)
• Union: the probability of X or Y occurring, P(X ∪ Y)
• Joint: the probability of X and Y occurring, P(X ∩ Y)
• Conditional: the probability of X occurring given that Y has occurred, P(X|Y)

P(A and B) (Venn Diagram)

[Venn diagram: circles P(A) and P(B); their overlap is P(A and B).]


P(A or B)

Sample Spaces and Events: Venn Diagrams

Venn diagram of four mutually exclusive events E1, E2, E3 and E4

Collectively Exhaustive Events
• Events are said to be collectively exhaustive if the list of outcomes includes every possible outcome: heads and tails as possible outcomes of a coin flip

Example 3

Draw | Mutually Exclusive | Collectively Exhaustive
Draw a spade and a club | Yes | No
Draw a face card and a number card | Yes | Yes
Draw an ace and a 3 | Yes | No
Draw a club and a nonclub | Yes | Yes
Draw a 5 and a diamond | No | No
Draw a red card and a diamond | No | No

The following circuit operates only if there is a path of functional devices from left to right. The probability that each device functions is shown on the graph. Assuming that devices fail independently, what is the probability that the circuit operates?

[Figure: two devices in parallel between points a and b; each functions with probability 0.95.]

Let T and B denote the events that the top and bottom devices operate, respectively. There is a path if at least one device operates. The probability that the circuit operates is

P(T or B) = 1 - P[(T or B)’] = 1 - P(T’ and B’)

A simple formula for the solution can be derived from the complements T’ and B’. From the independence assumption,

P(T’ and B’) = P(T’) P(B’) = (1 - 0.95)^2 = 0.05^2

P(T or B) = 1 - 0.05^2 = 0.9975
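The same complement argument can be checked numerically. This is a small sketch that assumes independent devices in parallel; the function name `parallel_reliability` is illustrative.

```python
def parallel_reliability(p_devices):
    """P(at least one device works) = 1 - P(all devices fail)."""
    p_all_fail = 1.0
    for p in p_devices:
        # Independence lets the failure probabilities multiply.
        p_all_fail *= (1.0 - p)
    return 1.0 - p_all_fail

# Top and bottom devices, each functioning with probability 0.95.
p_circuit = parallel_reliability([0.95, 0.95])
print(round(p_circuit, 4))  # 0.9975
```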

Probability(D|F)

P(D|F) = P(DF)/P(F)

[Venn diagram: P(D), P(F) and their overlap P(DF).]

Random Variables (Numeric)

Experiment | Outcome | Random Variable | Range of Random Variable
Stock 50 Xmas trees | Number of trees sold | X = number of trees sold | 0, 1, 2, …, 50
Inspect 600 items | Number acceptable | Y = number acceptable | 0, 1, 2, …, 600
Send out 5,000 sales letters | Number of people responding | Z = number of people responding | 0, 1, 2, …, 5,000
Build an apartment building | % completed after 4 months | R = % completed after 4 months | 0 ≤ R ≤ 100
Test the lifetime of a light bulb (minutes) | Time bulb lasts, up to 80,000 minutes | S = time bulb burns | 0 ≤ S ≤ 80,000


105 221 183 186 121 181 180 143

97 154 153 174 120 168 167 141

245 228 174 199 181 158 176 110

163 131 154 115 160 208 158 133

207 180 190 193 194 133 156 123

134 178 76 167 184 135 229 146

218 157 101 171 165 172 158 169

199 151 142 163 145 171 148 158

160 175 149 87 160 237 150 135

196 201 200 176 150 170 118 149

Compressive strength (in psi) of 80 aluminum-lithium alloy specimens

Frequency Distributions and Histograms

Histogram of compressive strength for 80 aluminum-lithium alloy specimens.

[Figure: histogram; x-axis: compressive strength (psi), 70 to 250; left axis: frequency, 0 to 25; right axis: relative frequency, 0 to 0.3125.]


438 450 487 451 452 441 444 461 432 471

413 450 430 437 465 444 471 453 431 458

444 450 446 444 466 458 471 452 455 445

468 459 450 453 473 454 458 438 447 463

445 466 456 434 471 437 459 445 454 423

472 470 433 454 464 443 449 435 435 451

474 457 455 448 478 465 462 454 425 440

454 441 459 435 446 435 460 428 449 442

455 450 423 432 459 444 445 454 449 441

449 445 455 441 464 457 437 434 452 439

Histograms – Useful for large data sets

Group values of the variable into bins, then count the number of observations that fall into each bin. Plot frequency (or relative frequency) versus the values of the variable.

Minitab histogram for the metal layer thickness data in table

[Figure: histogram; x-axis: metal thickness, 405 to 495; y-axis: frequency, 0 to 30.]

Histogram Example

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Histogram

Class midpoint | Frequency
5 | 0
15 | 3
25 | 6
35 | 5
45 | 4
55 | 2

No gaps between bars, since the data are continuous.
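The class frequencies can be reproduced with a short Python sketch. It assumes classes of width 10 starting at 10 ("10 up to 20", "20 up to 30", and so on), which is inferred from the class midpoints rather than stated on the slide. The function name `bin_counts` is illustrative.

```python
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

def bin_counts(values, lower, width, n_bins):
    """Count how many values fall in each class [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - lower) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

counts = bin_counts(data, lower=10, width=10, n_bins=5)
midpoints = [10 + 10 * i + 5 for i in range(5)]
for midpoint, count in zip(midpoints, counts):
    print(midpoint, count)
```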

How Many Class Intervals?

• Many (narrow class intervals)
  • may yield a very jagged distribution with gaps from empty classes
  • can give a poor indication of how frequency varies across classes

• Few (wide class intervals)
  • may compress variation too much and yield a blocky distribution
  • can obscure important patterns of variation

Calculation of Grouped Mean

Class Interval | Frequency (f) | Class Midpoint (M) | fM
20-under 30 | 6 | 25 | 150
30-under 40 | 18 | 35 | 630
40-under 50 | 11 | 45 | 495
50-under 60 | 11 | 55 | 605
60-under 70 | 3 | 65 | 195
70-under 80 | 1 | 75 | 75
Total | 50 | | 2150

$\text{grouped mean} = \frac{\sum fM}{\sum f} = \frac{2150}{50} = 43.0$
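The grouped mean calculation can be sketched in Python as follows. The variable names are illustrative; the class limits and frequencies are those in the table.

```python
# Each entry: ((lower limit, upper limit), frequency).
classes = [((20, 30), 6), ((30, 40), 18), ((40, 50), 11),
           ((50, 60), 11), ((60, 70), 3), ((70, 80), 1)]

total_f = sum(f for _, f in classes)
# Each class is represented by its midpoint M = (lower + upper) / 2.
total_fm = sum(f * (lower + upper) / 2 for (lower, upper), f in classes)
grouped_mean = total_fm / total_f

print(total_f, total_fm, grouped_mean)  # 50 2150.0 43.0
```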

Mode of Grouped Data

• Midpoint of the modal class
• Modal class has the greatest frequency

Class Interval | Frequency
20-under 30 | 6
30-under 40 | 18
40-under 50 | 11
50-under 60 | 11
60-under 70 | 3
70-under 80 | 1

$\text{Mode} = \frac{30 + 40}{2} = 35$

6 1 5 7 8 6 0 2 4 2
5 2 4 4 1 4 1 7 2 3
4 3 3 3 6 3 2 3 4 5
5 2 3 4 4 4 2 3 5 7
5 4 5 5 4 5 3 3 3 12

Frequency Distribution: Discrete Data

• Discrete data: possible values are countable

Example: An advertiser asks 200 customers how many days per week they read the daily newspaper.

Number of days read | Frequency
0 | 44
1 | 24
2 | 18
3 | 16
4 | 20
5 | 22
6 | 26
7 | 30
Total | 200

Relative Frequency

Relative frequency: what proportion is in each category

22% of the people in the sample report that they read the newspaper 0 days per week

Number of days read | Frequency | Relative frequency
0 | 44 | 0.22
1 | 24 | 0.12
2 | 18 | 0.09
3 | 16 | 0.08
4 | 20 | 0.10
5 | 22 | 0.11
6 | 26 | 0.13
7 | 30 | 0.15
Total | 200 | 1.00
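A short Python sketch of the relative frequency calculation, using the newspaper counts from the table. The variable names are illustrative.

```python
# Number of days read -> number of customers.
frequency = {0: 44, 1: 24, 2: 18, 3: 16, 4: 20, 5: 22, 6: 26, 7: 30}

total = sum(frequency.values())
# Relative frequency = class frequency divided by the total count.
relative = {days: count / total for days, count in frequency.items()}

for days, share in relative.items():
    print(days, frequency[days], f"{share:.2f}")
```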

Relative Frequency Plot and Probability Distributions

Histogram approximates a probability density function.

[Figure: f(x) plotted against x.]

Interpretations of Probability

Relative frequency of corrupted pulses sent over a communications channel.

Relative frequency of corrupted pulse = 2/10

[Figure: voltage against time for a sequence of pulses, with the corrupted pulses marked.]

Interpretations of Probability

P(E) = 30(0.01) = 0.30
Probability of the event E is the sum of the probabilities of the outcomes in E.

[Figure: sample space S of diodes containing the event E.]

Random Variables

Question | Random Variable x | Type
Family size | x = Number of dependents in family reported on tax return | Discrete
Distance from home to store | x = Distance in miles from home to the store site | Continuous
Own dog or cat | x = 1 if own no pet; = 2 if own dog(s) only; = 3 if own cat(s) only; = 4 if own dog(s) and cat(s) | Discrete

Using past data on TV sales, … a tabular representation of the probability distribution for TV sales was developed.

Units sold | Number of days
0 | 80
1 | 50
2 | 40
3 | 10
4 | 20
Total | 200

x | f(x)
0 | .40
1 | .25
2 | .20
3 | .05
4 | .10
Total | 1.00

Graphical Representation of the Probability Distribution

[Figure: bar chart; x-axis: values of random variable x (TV sales), 0 to 4; y-axis: probability, .10 to .50.]

• Uniform Probability Distribution
• Normal Probability Distribution
• Exponential Probability Distribution

[Figure: f(x) plotted against x for the uniform, normal and exponential distributions.]

Probability distributions: (a) discrete case, (b) continuous case

(a) Discrete case: probabilities p(x1), p(x2), p(x3), p(x4), p(x5) at the points x1, x2, x3, x4, x5. Sometimes called a probability mass function.
(b) Continuous case: a curve f(x) over the interval from a to b. Sometimes called a probability density function.

Throwing a Die

Distribution of X: P(X) = 1/6 for each of X = 1, 2, 3, 4, 5, 6

Example 2a

Rolling two dice results in a total of five spots showing. There are a total of 36 possible outcomes.

Outcome of Roll = 5
Die 1 | Die 2 | Probability
1 | 4 | 1/36
2 | 3 | 1/36
3 | 2 | 1/36
4 | 1 | 1/36
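Enumerating the 36 outcomes confirms the four ways of rolling a total of five. This is a minimal Python sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die 1, die 2) outcomes.
outcomes = list(product(range(1, 7), repeat=2))
favourable = [roll for roll in outcomes if sum(roll) == 5]

# Each outcome has probability 1/36, so P(total = 5) = 4/36.
p_five = Fraction(len(favourable), len(outcomes))
print(favourable, p_five)
```

`Fraction` reduces 4/36 to 1/9 when printed.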

All Samples of subgroup size 2 from a Population

Sample | X̄ | Sample | X̄ | Sample | X̄
1,1 | 1 | 3,1 | 2 | 5,1 | 3
1,2 | 1.5 | 3,2 | 2.5 | 5,2 | 3.5
1,3 | 2 | 3,3 | 3 | 5,3 | 4
1,4 | 2.5 | 3,4 | 3.5 | 5,4 | 4.5
1,5 | 3 | 3,5 | 4 | 5,5 | 5
1,6 | 3.5 | 3,6 | 4.5 | 5,6 | 5.5
2,1 | 1.5 | 4,1 | 2.5 | 6,1 | 3.5
2,2 | 2 | 4,2 | 3 | 6,2 | 4
2,3 | 2.5 | 4,3 | 3.5 | 6,3 | 4.5
2,4 | 3 | 4,4 | 4 | 6,4 | 5
2,5 | 3.5 | 4,5 | 4.5 | 6,5 | 5.5
2,6 | 4 | 4,6 | 5 | 6,6 | 6

Sampling Distribution of X̄

X̄ | P(X̄)
1 | 1/36
1.5 | 2/36
2 | 3/36
2.5 | 4/36
3 | 5/36
3.5 | 6/36
4 | 5/36
4.5 | 4/36
5 | 3/36
5.5 | 2/36
6 | 1/36
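The sampling distribution of the mean of two dice can be rebuilt by counting how often each mean occurs among the 36 samples. This is a short Python sketch; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Mean of every ordered sample (a, b) of size 2 from the faces 1..6.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))

for mean in sorted(counts):
    # Each of the 36 samples is equally likely.
    print(float(mean), f"{counts[mean]}/36")
```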

(b) Sampling distribution of X̄

[Figure: bar chart of p(X̄) against X̄ from 1 to 6; vertical axis marked at 2/36, 4/36 and 6/36.]

Sampling Distribution of X̄ for n = 5

[Figure: distribution of X̄; x-axis marked at 1, 3.5 and 6; vertical axis from 0 to .6.]

Sampling Distributions of Means

Figure: Distributions of average scores from throwing dice.

[Three panels, each with a horizontal axis from 1 to 6.]

Central Limit Theorem

As the sample size gets large enough, the sampling distribution of X̄ becomes almost normal regardless of the shape of the population.
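A small simulation can illustrate the theorem with dice. This sketch assumes a fair die, a sample size of 30 and 5,000 repetitions, all chosen for illustration; the function name `sample_means` is hypothetical. For a fair die the population mean is 3.5 and the population variance is 35/12, so the sample means should centre on 3.5 with variance close to (35/12)/n.

```python
import random
import statistics

def sample_means(n, repetitions, seed=0):
    """Simulate the mean of n fair-die throws, repeated many times."""
    rng = random.Random(seed)
    return [sum(rng.randint(1, 6) for _ in range(n)) / n
            for _ in range(repetitions)]

means = sample_means(n=30, repetitions=5000)
centre = statistics.mean(means)
spread = statistics.pvariance(means)
print(centre, spread, (35 / 12) / 30)
```

Plotting a histogram of `means` for increasing n would show the shape approaching the normal curve.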

Probability Distributions

Outcome | X | Number responding | p(X)
SA | 5 | 10 | 0.1
A | 4 | 20 | 0.2
N | 3 | 30 | 0.3
D | 2 | 40 | 0.3
SD | 1 | 50 | 0.1

[Figure: bar chart of p(X) against X = 1 to 5; vertical axis from 0 to 0.35.]

X | P(X=x) | X | P(X=x)
0 | 0.205891 | 6 | 0.001939
1 | 0.343152 | 7 | 0.000277
2 | 0.266896 | 8 | 0.000031
3 | 0.128505 | 9 | 0.000003
4 | 0.042835 | 10 | 0
5 | 0.010471 | |
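The slide does not name the distribution behind these values. As an inference only, they coincide to six decimal places with a binomial distribution with n = 15 trials and success probability p = 0.1. The Python sketch below checks that assumption; `binomial_pmf` is an illustrative name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials, success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Assumed parameters (not stated on the slide): n = 15, p = 0.1.
table = {k: round(binomial_pmf(k, 15, 0.1), 6) for k in range(11)}
for k, probability in table.items():
    print(k, probability)
```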

Probability Distributions and Probability Density Functions

Density function of a loading on a long, thin beam.

[Figure: loading plotted against x.]

Probability Distributions and Probability Density Functions

Probability determined from the area under f(x).

[Figure: P(a < X < b) shown as the area under f(x) between a and b.]

Continuous Uniform Random Variable

Continuous uniform probability density function.

[Figure: f(x) is constant at 1/(b-a) between a and b.]

Uniform Distribution or rectangular probability distribution

$f(x) = \frac{1}{b - a}$, where $a \le x \le b$

[Figure: rectangle of height 1/(b-a) over the interval from a to b.]

$\text{area} = \text{width} \times \text{height} = (b - a) \times \frac{1}{b - a} = 1$

Example
The amount of gasoline sold daily at a service station is uniformly distributed with a minimum of 2,000 gallons and a maximum of 5,000 gallons.

[Figure: f(x) is constant at 1/(5,000 - 2,000) between 2,000 and 5,000.]

Find the probability that daily sales will fall between 2,500 and 3,000 gallons.
Algebraically: what is P(2,500 ≤ X ≤ 3,000)?

Example

[Figure: f(x) is constant at 1/3,000 between 2,000 and 5,000.]

$P(2{,}500 \le X \le 3{,}000) = (3{,}000 - 2{,}500) \times \frac{1}{3{,}000} = 0.1667$

“There is about a 17% chance that between 2,500 and 3,000 gallons of gas will be sold on a given day.”
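The width-times-height calculation can be sketched in Python. The function name `uniform_probability` is illustrative; it clips the requested interval to [a, b] before computing the area.

```python
def uniform_probability(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: width times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

# Daily gasoline sales: uniform between 2,000 and 5,000 gallons.
p = uniform_probability(2500, 3000, a=2000, b=5000)
print(round(p, 4))  # 0.1667
```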

Cumulative Distribution Functions

[Figure: cumulative distribution function F(x) against x; x-axis marked at 0 and 12.5, vertical axis at 1.]

Probability Distributions and Probability Density Functions

[Figure: probability density function f(x) against x; x-axis marked at 12.5 and 12.6.]

Normal Probability Distribution

Characteristics

The distribution is symmetric and bell-shaped.

Normal Probability Distribution

Characteristics

Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (.5 to the left of the mean and .5 to the right).

The mean of a distribution

The mean is not necessarily the 50th percentile of the distribution (that’s the median).
The mean is not necessarily the most likely value of the random variable (that’s the mode).

[Figure: a distribution with µ, median and mode marked.]

Two probability distributions with different means: µ = 10 and µ = 20

Two probability distributions with same mean but different standard deviations: µ = 10, with σ = 2 and σ = 4

The normal distribution

[Figure: normal curve f(x) against x with mean µ and variance σ².]

Areas under normal distribution
• µ - 1σ to µ + 1σ: 68.26%
• µ - 2σ to µ + 2σ: 95.46%
• µ - 3σ to µ + 3σ: 99.73%
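These areas can be computed directly from the error function. The Python sketch below uses the identity P(µ - kσ < X < µ + kσ) = erf(k/√2); the function name `area_within` is illustrative. The computed one- and two-sigma values differ slightly in the last digit from the figures on the slide.

```python
from math import erf, sqrt

def area_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal distribution."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * area_within(k):.2f}%")
```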


Normal Distribution

Standardizing a normal random variable.
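The formula on this slide did not survive in the transcript. The standard textbook form of the transformation, given here as a reminder rather than as the slide's own notation, is:

```latex
Z = \frac{X - \mu}{\sigma},
\qquad
P(X \le x) = P\!\left(Z \le \frac{x - \mu}{\sigma}\right)
```

Here X is normal with mean µ and standard deviation σ, and Z is normal with mean 0 and standard deviation 1.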

Standard Normal Distribution…

A normal distribution whose mean is zero and standard deviation is one is called the standard normal distribution.

$f(x) = \frac{1}{1 \cdot \sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - 0}{1}\right)^{2}}, \quad -\infty < x < \infty$, with µ = 0 and σ = 1

As we shall see shortly, any normal distribution can be converted to a standard normal distribution with simple algebra. This makes calculations much easier.

Normal Distribution Example

[Figure: standard normal curve; z-axis marked at 0 and 1.5.]


Areas under Standardized Normal Distribution


Using Excel to Compute Standard Normal Probabilities

Formula Worksheet

Row | A | B
1 | Probabilities: standard normal distribution |
2 | P (z < 1.00) | =NORMSDIST(1)
3 | P (0.00 < z < 1.00) | =NORMSDIST(1)-NORMSDIST(0)
4 | P (0.00 < z < 1.25) | =NORMSDIST(1.25)-NORMSDIST(0)
5 | P (-1.00 < z < 1.00) | =NORMSDIST(1)-NORMSDIST(-1)
6 | P (z > 1.58) | =1-NORMSDIST(1.58)
7 | P (z < -0.50) | =NORMSDIST(-0.5)
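For readers without Excel, the same six probabilities can be computed in Python. This sketch uses `statistics.NormalDist().cdf` from the standard library as the standard normal cumulative distribution function, the role NORMSDIST plays in the worksheet. The dictionary name `rows` is illustrative.

```python
from statistics import NormalDist

# Standard normal CDF (mean 0, standard deviation 1).
phi = NormalDist().cdf

rows = {
    "P(z < 1.00)": phi(1),
    "P(0.00 < z < 1.00)": phi(1) - phi(0),
    "P(0.00 < z < 1.25)": phi(1.25) - phi(0),
    "P(-1.00 < z < 1.00)": phi(1) - phi(-1),
    "P(z > 1.58)": 1 - phi(1.58),
    "P(z < -0.50)": phi(-0.5),
}
for label, value in rows.items():
    print(f"{label} = {value:.4f}")
```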
