comparing systems powerpoints.pdf

Upload: libyaflower

Post on 26-Feb-2018


  • 7/25/2019 Comparing Systems Powerpoints.pdf

    "The method that proceeds without analysis is like the groping of a blind man." Socrates


    Questions Addressed

    What are the issues involved in comparing

    two or more system configurations?

    What is hypothesis testing?

    How do common random numbers work?


    Purpose

    Purpose: comparison of alternative system designs.

    Approach: discuss a few of many statistical

    methods that can be used to compare two or more

    system designs.

    Statistical analysis is needed to discover whether observed

    differences are due to:

    Differences in design, or

    The random fluctuation inherent in the models.


    Outline

    For two-system comparisons:

    Independent sampling.

    Correlated sampling (common random numbers).

    For multiple system comparisons:

    Bonferroni approach: confidence-interval estimation, screening, and selecting the best.


    Comparing Two Systems

    The challenge is to determine whether observed differences are
    attributable to actual differences in performance and not to statistical variation.

    Running multiple replications, or batches,

    is required.


    Hypothesis Testing

    Used when outcomes are close or high

    precision is required.

    A hypothesis is first formulated (e.g., that methods A and B both result in the same throughput)
    and then a test is made to see whether the
    results of the simulation lead us to reject the hypothesis.


    What does it mean if you

    fail to reject a hypothesis?

    1. Our hypothesis is correct

    OR

    2. The variance in the observed outcomes is too high, given the number of replications, to be conclusive (run more replications or use variance reduction techniques).


    Types of Errors

    Type I error: Reject True Hypothesis

    Type II error: Accept False Hypothesis

                 Hypothesis true    Hypothesis false
    Accept       (correct)          Type II error
    Reject       Type I error       (correct)


    Possible Confidence Intervals

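    [------|------] denotes a confidence interval for $\mu_1 - \mu_2$, drawn relative to $\mu_1 - \mu_2 = 0$.

    [Figure: three possible intervals, labeled (A), (B), and (C). One interval contains 0, giving "Fail to reject H0"; the other two lie entirely on one side of 0, giving "Reject H0".]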


    Comparison of two Buffer Strategies

    Assume we have two candidate strategies believed to

    maximize the throughput of the system.

    We offer two methods based on the Confidence Interval

    approach.

    We seek to discover whether the mean throughputs of the system under Strategy 1 and Strategy 2 are significantly different.

    We begin by estimating the mean performance of the two

    proposed strategies by simulating each strategy for 16 days

    past the warm-up period.

    The simulation experiment was replicated 10 times for

    each strategy.

    The throughput achieved by each strategy is shown next.


    Comparison of two Buffer Strategies

    (A) Replication       (B) Strategy 1 Throughput    (C) Strategy 2 Throughput
    1                     54.48                        56.01
    2                     57.36                        54.08
    3                     54.81                        52.14
    4                     56.20                        53.49
    5                     54.83                        55.49
    6                     57.69                        55.00
    7                     58.33                        54.88
    8                     57.19                        54.47
    9                     56.84                        54.93
    10                    55.29                        55.84
    Mean                  56.30                        54.63
    Standard deviation    1.37                         1.17
    Variance              1.89                         1.36


    Welch Confidence Interval

    Requires that the observations drawn
    from each simulated system be normally distributed and independent
    within a population and between populations.


    Degrees of freedom (Welch approximation): about 17.5
    t critical value: T.INV.2T(0.05, 17.5) = 2.10
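    As a rough check of the Welch calculation, the sketch below recomputes the degrees of freedom and a 95% confidence interval from the buffer-strategy throughput table. It uses only Python's standard library and takes the critical value 2.10 from the slide rather than computing it; the variable names are illustrative assumptions, not part of the original deck.

```python
import math
import statistics

# Throughput per replication, copied from the buffer-strategy table.
strategy1 = [54.48, 57.36, 54.81, 56.20, 54.83, 57.69, 58.33, 57.19, 56.84, 55.29]
strategy2 = [56.01, 54.08, 52.14, 53.49, 55.49, 55.00, 54.88, 54.47, 54.93, 55.84]

n1, n2 = len(strategy1), len(strategy2)
v1 = statistics.variance(strategy1) / n1  # s1^2 / n1
v2 = statistics.variance(strategy2) / n2  # s2^2 / n2

# Welch approximation to the degrees of freedom.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

t_crit = 2.10  # value quoted on the slide for alpha = 0.05 and df of about 17.5
diff = statistics.mean(strategy1) - statistics.mean(strategy2)
half_width = t_crit * math.sqrt(v1 + v2)
welch_ci = (diff - half_width, diff + half_width)

print(f"df = {df:.1f}, CI = [{welch_ci[0]:.2f}, {welch_ci[1]:.2f}]")
```

    The interval comes out at roughly 0.47 to 2.87. Because it lies entirely above 0, the data suggest that Strategy 1 yields the higher mean throughput.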


    Paired-t CI for Comparing 2 Systems

    Like the Welch CI method, the paired-t CI
    method requires that the observations drawn
    from each population be normally distributed
    and independent within a population.
    However, the paired-t CI method does not
    require that the observations between populations be independent.


    Paired-t CI for Comparing 2 Systems

    This allows us to use Common Random

    Numbers to force a positive correlation between

    the two populations in order to reduce the half-

    width.

    Finally, like the Welch method, the paired-t CI

    method does not require that the population have

    equal variances.

    Given n observations (n1 = n2 = n), we pair the
    observations from each population (x1 and x2) to
    define a new random variable: x(1-2)j = x1j - x2j


    Paired-t Comparison

    (A) Replication (j)   (B) Strategy 1 Throughput x1j   (C) Strategy 2 Throughput x2j   (D) Difference (B - C), x(1-2)j = x1j - x2j
    1                     54.48                           56.01                           -1.53
    2                     57.36                           54.08                           3.28
    3                     54.81                           52.14                           2.67
    4                     56.20                           53.49                           2.71
    5                     54.83                           55.49                           -0.66
    6                     57.69                           55.00                           2.69
    7                     58.33                           54.88                           3.45
    8                     57.19                           54.47                           2.72
    9                     56.84                           54.93                           1.91
    10                    55.29                           55.84                           -0.55
    Mean                                                                                  1.67
    Standard deviation                                                                    1.85
    Variance                                                                              3.42
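    The paired-t interval itself is not legible in this transcript, so the sketch below assumes the usual form: mean difference plus or minus t(n-1, alpha/2) times s_D / sqrt(n). The critical value 2.262 is the standard t-table entry for 9 degrees of freedom at a two-sided 5% level; it and the variable names are assumptions added for illustration.

```python
import math
import statistics

# Throughput per replication, copied from the paired-t comparison table.
strategy1 = [54.48, 57.36, 54.81, 56.20, 54.83, 57.69, 58.33, 57.19, 56.84, 55.29]
strategy2 = [56.01, 54.08, 52.14, 53.49, 55.49, 55.00, 54.88, 54.47, 54.93, 55.84]

# Pair the replications: x(1-2)j = x1j - x2j.
diffs = [a - b for a, b in zip(strategy1, strategy2)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)

t_crit = 2.262  # t-table value for n - 1 = 9 degrees of freedom, alpha = 0.05 (two-sided)
half_width = t_crit * sd_diff / math.sqrt(n)
paired_ci = (mean_diff - half_width, mean_diff + half_width)

print(f"mean = {mean_diff:.2f}, sd = {sd_diff:.2f}")
print(f"CI = [{paired_ci[0]:.2f}, {paired_ci[1]:.2f}]")
```

    The interval is roughly 0.35 to 2.99, which again excludes 0. The variance computed from the unrounded differences is about 3.40; the 3.42 in the table appears to be the rounded standard deviation squared.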


    Comparing More than 2 Systems

    Bonferroni approach

    ANOVA

    Factorial design and

    optimization experiments


    Factorial Design

    Input variables are the factors

    Output measures are the responses

    Tests system response(s) when

    multiple factors are being manipulated.


    Two-level Full-factorial Design

    Define a high and low level setting for each factor.

    Try every combination of factor settings.

    Run detailed studies for factors that have the

    greatest impact.
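    A two-level full-factorial design can be enumerated mechanically. The sketch below is only an illustration: the three factor names and their low/high settings are hypothetical and do not come from the slides.

```python
from itertools import product

# Hypothetical factors, each with a (low, high) setting; names are illustrative only.
factors = {
    "buffer_size": (5, 10),
    "num_operators": (2, 3),
    "conveyor_speed": (1.0, 1.5),
}

# Every combination of factor settings: 2**k design points for k factors.
design = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for point in design:
    print(point)
```

    With k = 3 factors this yields 2**3 = 8 design points, each of which would then be simulated with several replications.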


    Fractional-factorial Design

    Strategically "screen out" factors that have little or no
    impact on system performance.
    On remaining factors, run a full-factorial experiment.

    Run detailed studies for factors that have the greatest

    impact.


    Variance Reduction

    Common Random Numbers (CRN)

    Evaluates each system under the exact same
    circumstances.
    Helps ensure that observed differences of
    system designs are due to the differences in
    the designs and not to differences in experimental conditions.

    Antithetic Random Numbers (ARN)


    CRN Continued

    Table 7.10. Comparison of two buffer allocation strategies using common random numbers.

    (A) Replication    (B) Strategy 1 Throughput    (C) Strategy 2 Throughput    (D) Difference (B - C)
    1                  79.05                        75.09                        3.96
    2                  54.96                        51.09                        3.87
    3                  51.23                        49.09                        2.14
    4                  88.74                        88.01                        0.73
    5                  56.43                        53.34                        3.09
    6                  70.42                        67.54                        2.88
    7                  35.71                        34.87                        0.84
    8                  58.12                        54.24                        3.88
    9                  57.77                        55.03                        2.74
    10                 45.08                        42.55                        2.53

    Mean of the differences = 2.67
    Standard deviation of the differences = 1.16
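    To see how strongly CRN ties the two strategies together in Table 7.10, the sketch below recovers the sample covariance from the identity Var(X1 - X2) = Var(X1) + Var(X2) - 2 Cov(X1, X2) and then forms a paired-t interval. The critical value 2.262 (9 degrees of freedom, two-sided 5%) is a standard t-table value and, like the variable names, is an assumption added here.

```python
import math
import statistics

# Throughput per replication, copied from Table 7.10.
strategy1 = [79.05, 54.96, 51.23, 88.74, 56.43, 70.42, 35.71, 58.12, 57.77, 45.08]
strategy2 = [75.09, 51.09, 49.09, 88.01, 53.34, 67.54, 34.87, 54.24, 55.03, 42.55]

diffs = [a - b for a, b in zip(strategy1, strategy2)]
n = len(diffs)

var1 = statistics.variance(strategy1)
var2 = statistics.variance(strategy2)
var_diff = statistics.variance(diffs)

# Var(X1 - X2) = Var(X1) + Var(X2) - 2 Cov(X1, X2), so the sample
# covariance follows from the three sample variances.
cov12 = (var1 + var2 - var_diff) / 2
corr12 = cov12 / math.sqrt(var1 * var2)

t_crit = 2.262  # t-table value for 9 degrees of freedom, alpha = 0.05 (two-sided)
mean_diff = statistics.mean(diffs)
half_width = t_crit * math.sqrt(var_diff / n)
crn_ci = (mean_diff - half_width, mean_diff + half_width)

print(f"variance if independent: {var1 + var2:.1f}, with CRN: {var_diff:.2f}")
print(f"correlation = {corr12:.3f}, CI = [{crn_ci[0]:.2f}, {crn_ci[1]:.2f}]")
```

    Individual throughputs swing widely between replications, yet the differences are stable, so the interval (roughly 1.83 to 3.50) is much narrower than the raw variances alone would allow.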


    More Detail on Comparison of Two

    System Designs

    Goal: compare two possible system configurations, e.g.:

    two possible ordering policies in a supply-chain system,

    two possible scheduling rules in a job shop.

    Approach: the method of replications is used to analyze the

    output data.

    The mean performance measure for system i is denoted by
    $\theta_i$ (i = 1, 2). The aim is to obtain point and interval estimates for the
    difference in mean performance, namely $\theta_1 - \theta_2$.


    Comparison of Two System Designs

    Vehicle-safety inspection example:

    The station performs 3 jobs: (1) brake check, (2) headlight check, and (3) steering check.
    Vehicle arrivals: Poisson with rate 9.5/hour.
    Present system:
    Three stalls in parallel (one attendant makes all 3 inspections at each stall).
    Service times for the 3 jobs: normally distributed with means 6.5, 6.0, and 5.5 minutes, respectively.
    Alternative system:
    Each attendant specializes in a single task; each vehicle passes through three work stations in series.
    Mean service time for each job decreases by 10% (to 5.85, 5.4, and 4.95 minutes).
    Performance measure: mean response time per vehicle (total time from vehicle arrival to its departure).


    Comparison of Two System Designs

    From replication r of system i, the simulation analyst obtains an estimate
    $Y_{ir}$ of the mean performance measure $\theta_i$.
    Assuming that the estimators $Y_{ir}$ are (at least approximately) unbiased:
    $\theta_1 = E(Y_{1r}),\ r = 1, \ldots, R_1$; $\theta_2 = E(Y_{2r}),\ r = 1, \ldots, R_2$
    Goal: compute a confidence interval for $\theta_1 - \theta_2$ to compare the two system designs.
    Confidence interval for $\theta_1 - \theta_2$:
    If the c.i. is totally to the left of 0, there is strong evidence for the hypothesis that $\theta_1 - \theta_2 < 0$ ($\theta_1 < \theta_2$).
    If the c.i. is totally to the right of 0, there is strong evidence for the hypothesis that $\theta_1 - \theta_2 > 0$ ($\theta_1 > \theta_2$).
    If the c.i. contains 0, there is no strong statistical evidence that one system is better than the other.
    If enough additional data were collected (i.e., $R_i$ increased), the c.i. would
    most likely shift, and definitely shrink in length, until a conclusion of $\theta_1 < \theta_2$ or
    $\theta_1 > \theta_2$ would be drawn.
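    The three cases above amount to a simple decision rule. The sketch below is one way to write it down; the function name and the example interval endpoints are made up for illustration.

```python
def interpret_interval(lower: float, upper: float) -> str:
    """Interpret a confidence interval for theta1 - theta2 relative to 0."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if upper < 0:
        return "theta1 < theta2"   # interval totally to the left of 0
    if lower > 0:
        return "theta1 > theta2"   # interval totally to the right of 0
    return "inconclusive"          # interval contains 0


# Hypothetical intervals, one for each case.
print(interpret_interval(-3.0, -1.0))
print(interpret_interval(0.5, 2.9))
print(interpret_interval(-0.4, 1.2))
```

    An "inconclusive" result is a prompt to collect more replications, not evidence that the two systems perform equally.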


    Comparison of Two System Designs

    A two-sided 100(1-$\alpha$)% c.i. for $\theta_1 - \theta_2$ always takes the form:

    $$\bar{Y}_{.1} - \bar{Y}_{.2} \pm t_{\alpha/2,\nu}\ \mathrm{s.e.}(\bar{Y}_{.1} - \bar{Y}_{.2})$$

    where $\bar{Y}_{.i}$ is the sample mean performance measure for system i over all replications, and $\nu$ is the degrees of freedom.

    To calculate the standard error, the analyst uses one of
    two statistical techniques.
    Both techniques assume that the basic data $Y_{ir}$ are
    approximately normally distributed.
    We will discuss these two methods next.


    Comparison of Two System Designs

    Statistically significant versus practically significant

    Statistical significance: is the observed difference $\bar{Y}_{.1} - \bar{Y}_{.2}$
    larger than the variability in $\bar{Y}_{.1} - \bar{Y}_{.2}$?
    Practical significance: is the true difference $\theta_1 - \theta_2$ large
    enough to matter for the decision we need to make?
    Confidence intervals do not answer the question of
    practical significance directly. Instead, they bound the true
    difference within a range.


    Independent Sampling with Equal Variances

    [Comparison of 2 systems]

    Different and independent random number streams are
    used to simulate the two systems.
    All observations of simulated system 1 are statistically
    independent of all the observations of simulated system 2.
    The variance of the sample mean, $\bar{Y}_{.i}$, is:

    $$V(\bar{Y}_{.i}) = \frac{V(Y_{ir})}{R_i} = \frac{\sigma_i^2}{R_i}, \quad i = 1, 2$$

    For independent samples:

    $$V(\bar{Y}_{.1} - \bar{Y}_{.2}) = V(\bar{Y}_{.1}) + V(\bar{Y}_{.2}) = \frac{\sigma_1^2}{R_1} + \frac{\sigma_2^2}{R_2}$$


    Independent Sampling with Equal Variances

    [Comparison of 2 systems]

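    If it is reasonable to assume that $\sigma_1^2 = \sigma_2^2$ (approximately) or if $R_1 = R_2$, a two-sample-t confidence-interval approach can be used:

    The point estimate of the mean performance difference is:

    $$\bar{Y}_{.1} - \bar{Y}_{.2}$$

    The sample variance for system i is:

    $$S_i^2 = \frac{1}{R_i - 1}\sum_{r=1}^{R_i}\left(Y_{ir} - \bar{Y}_{.i}\right)^2 = \frac{1}{R_i - 1}\left(\sum_{r=1}^{R_i} Y_{ir}^2 - R_i \bar{Y}_{.i}^2\right)$$

    The pooled estimate of $\sigma^2$ is:

    $$S_p^2 = \frac{(R_1 - 1)S_1^2 + (R_2 - 1)S_2^2}{R_1 + R_2 - 2}, \quad \text{where } \nu = R_1 + R_2 - 2 \text{ degrees of freedom}$$

    C.I. is given by:

    $$\bar{Y}_{.1} - \bar{Y}_{.2} \pm t_{\alpha/2,\nu}\ \mathrm{s.e.}(\bar{Y}_{.1} - \bar{Y}_{.2})$$

    Standard error:

    $$\mathrm{s.e.}(\bar{Y}_{.1} - \bar{Y}_{.2}) = S_p\sqrt{\frac{1}{R_1} + \frac{1}{R_2}}$$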
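    The pooled-variance steps can be sketched as a small function. This is a minimal illustration, not the deck's own code: the function name is hypothetical, the caller supplies the t quantile, and the usage reuses the throughput values from the buffer-strategy table earlier in the deck with the standard t-table value 2.101 for 18 degrees of freedom.

```python
import math
import statistics


def pooled_t_interval(y1, y2, t_crit):
    """Two-sample-t interval for theta1 - theta2, assuming equal variances.

    t_crit is the t quantile for alpha/2 and R1 + R2 - 2 degrees of freedom,
    supplied by the caller (for example from a t table).
    Returns (lower, upper, degrees_of_freedom).
    """
    r1, r2 = len(y1), len(y2)
    s1_sq = statistics.variance(y1)
    s2_sq = statistics.variance(y2)
    dof = r1 + r2 - 2
    sp_sq = ((r1 - 1) * s1_sq + (r2 - 1) * s2_sq) / dof  # pooled variance
    se = math.sqrt(sp_sq) * math.sqrt(1 / r1 + 1 / r2)    # standard error
    diff = statistics.mean(y1) - statistics.mean(y2)      # point estimate
    return diff - t_crit * se, diff + t_crit * se, dof


strategy1 = [54.48, 57.36, 54.81, 56.20, 54.83, 57.69, 58.33, 57.19, 56.84, 55.29]
strategy2 = [56.01, 54.08, 52.14, 53.49, 55.49, 55.00, 54.88, 54.47, 54.93, 55.84]

lower, upper, dof = pooled_t_interval(strategy1, strategy2, t_crit=2.101)
print(f"dof = {dof}, CI = [{lower:.2f}, {upper:.2f}]")
```

    With equal sample sizes the pooled standard error coincides with the unequal-variance one, so only the degrees of freedom (18 here) differ.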


    Independent Sampling with Unequal Variances

    [Comparison of 2 systems]

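    If the assumption of equal variances cannot safely be made, an approximate 100(1-$\alpha$)% C.I. for $\theta_1 - \theta_2$ can be computed as:

    $$\mathrm{s.e.}(\bar{Y}_{.1} - \bar{Y}_{.2}) = \sqrt{\frac{S_1^2}{R_1} + \frac{S_2^2}{R_2}}$$

    With degrees of freedom:

    $$\nu = \frac{\left(S_1^2/R_1 + S_2^2/R_2\right)^2}{\left(S_1^2/R_1\right)^2/(R_1 - 1) + \left(S_2^2/R_2\right)^2/(R_2 - 1)}, \quad \text{rounded to an integer}$$

    A minimum number of replications $R_1 > 7$ and $R_2 > 7$ is recommended.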


    Common Random Numbers (CRN)

    [Comparison of 2 systems]

    For each replication, the same random numbers are used to simulate
    both systems.
    For each replication r, the two estimates, $Y_{r1}$ and $Y_{r2}$, are correlated.
    However, independent streams of random numbers are used on different
    replications, so $Y_{r1}$ and $Y_{s2}$ are independent whenever $r \neq s$.
    Purpose: induce positive correlation between $Y_{r1}$ and $Y_{r2}$ (for each r) to
    reduce the variance of the point estimator $\bar{Y}_{.1} - \bar{Y}_{.2}$:

    $$V(\bar{Y}_{.1} - \bar{Y}_{.2}) = V(\bar{Y}_{.1}) + V(\bar{Y}_{.2}) - 2\,\mathrm{cov}(\bar{Y}_{.1}, \bar{Y}_{.2}) = \frac{\sigma_1^2}{R} + \frac{\sigma_2^2}{R} - \frac{2\rho_{12}\sigma_1\sigma_2}{R}$$

    When $\rho_{12}$ is positive, the variance of $\bar{Y}_{.1} - \bar{Y}_{.2}$ arising from CRN is less than that of independent
    sampling (with $R_1 = R_2 = R$).


    Common Random Numbers (CRN) [Comparison of 2 systems]

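    The estimator based on CRN is more precise, leading to a
    shorter confidence interval for the difference.

    Sample variance of the differences $\bar{D} = \bar{Y}_{.1} - \bar{Y}_{.2}$:

    $$S_D^2 = \frac{1}{R - 1}\sum_{r=1}^{R}\left(D_r - \bar{D}\right)^2 = \frac{1}{R - 1}\left(\sum_{r=1}^{R} D_r^2 - R\bar{D}^2\right)$$

    where $D_r = Y_{r1} - Y_{r2}$ and $\bar{D} = \frac{1}{R}\sum_{r=1}^{R} D_r$, with $\nu = R - 1$ degrees of freedom.

    Standard error:

    $$\mathrm{s.e.}(\bar{D}) = \mathrm{s.e.}(\bar{Y}_{.1} - \bar{Y}_{.2}) = \frac{S_D}{\sqrt{R}}$$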


    Common Random Numbers (CRN)

    [Comparison of 2 systems]

    It is never enough to simply use the same seed for the
    random-number generator(s):
    The random numbers must be synchronized: each
    random number used in one model for some purpose
    should be used for the same purpose in the other
    model.
    e.g., if the ith random number is used to generate a
    service time at work station 2 for the 5th arrival in
    model 1, the ith random number should be used for
    the very same purpose in model 2.
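    One common way to keep random numbers synchronized is to dedicate a separate stream to each purpose. The toy sketch below is an assumption-laden illustration rather than the deck's method: the function, the seed, and the "extra draws" that stand in for a design consuming more random numbers per customer are all hypothetical. Only the arrival rate of 9.5 per hour is borrowed from the inspection example.

```python
import random


def interarrival_times(seed, n, extra_draws_per_customer, dedicated_streams):
    """Return the interarrival times a model would see.

    extra_draws_per_customer mimics a design that consumes additional
    random numbers (for example, more service stages) between arrivals.
    """
    if dedicated_streams:
        arrival_rng = random.Random(seed)      # used only for arrivals
        other_rng = random.Random(seed + 1)    # used for everything else
    else:
        arrival_rng = other_rng = random.Random(seed)  # one shared stream

    times = []
    for _ in range(n):
        times.append(arrival_rng.expovariate(9.5))  # arrivals per hour
        for _ in range(extra_draws_per_customer):
            other_rng.random()                      # service-time draws, etc.
    return times


# Same seed, two designs that consume different amounts of randomness.
shared_a = interarrival_times(42, 5, 1, dedicated_streams=False)
shared_b = interarrival_times(42, 5, 3, dedicated_streams=False)
sync_a = interarrival_times(42, 5, 1, dedicated_streams=True)
sync_b = interarrival_times(42, 5, 3, dedicated_streams=True)

print("shared stream, identical arrivals:", shared_a == shared_b)
print("dedicated streams, identical arrivals:", sync_a == sync_b)
```

    With a single shared stream the two designs drift apart after the first arrival even though the seed is the same; with a dedicated arrival stream both designs face exactly the same arrival pattern.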


    Comparison of Several System Designs

    To compare K alternative system designs based on some specific
    performance measure, $\theta_i$, of system i, for i = 1, 2, ..., K.
    Procedures are classified as:
    Fixed-sample-size procedures: a predetermined sample size is used to
    draw inferences via hypothesis tests or confidence intervals.
    Sequential sampling (multistage): more and more data are collected
    until an estimator with a pre-specified precision is achieved or until one
    of several alternative hypotheses is selected.
    Some goals/approaches of system comparison:
    Estimation of each parameter $\theta_i$.
    Comparison of each performance measure $\theta_i$ to a control $\theta_1$.
    All pairwise comparisons, $\theta_i - \theta_j$, for all i not equal to j.
    Selection of the best $\theta_i$.


    Bonferroni Approach [Multiple Comparisons]

    To make statements about several parameters simultaneously (where all statements are true simultaneously).
    Bonferroni inequality:

    $$P(\text{all statements } S_i \text{ are true},\ i = 1, \ldots, C) \ge 1 - \sum_{j=1}^{C} \alpha_j = 1 - \alpha_E$$

    where $\alpha_E = \sum_{j=1}^{C} \alpha_j$ is the overall error probability, which provides an upper bound on the
    probability of a false conclusion.
    The smaller $\alpha_j$ is, the wider the jth confidence interval will be.
    Major advantage: the inequality holds whether models are run with
    independent sampling or CRN.
    Major disadvantage: the width of each individual interval increases
    as the number of comparisons increases.
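    The inequality can be made concrete with a few lines of code. The sketch below assumes the overall error probability is split equally across the C comparisons, which is a common convention but only one possible choice; the function names and the example values (overall level 0.05, C = 5) are illustrative.

```python
def bonferroni_levels(overall_alpha, num_comparisons):
    """Split an overall error probability equally across C comparisons."""
    if num_comparisons < 1:
        raise ValueError("need at least one comparison")
    return [overall_alpha / num_comparisons] * num_comparisons


def joint_confidence_lower_bound(alphas):
    """Bonferroni lower bound on P(all statements are true)."""
    return 1 - sum(alphas)


# Hypothetical setting: 5 comparisons with an overall error probability of 0.05.
levels = bonferroni_levels(0.05, 5)
bound = joint_confidence_lower_bound(levels)

print(f"individual alpha_j = {levels[0]:.3f}")
print(f"P(all statements true) >= {bound:.2f}")
```

    Each of the 5 intervals is then built at the 99% level, which is why individual intervals widen as more comparisons are added.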


    Bonferroni Approach [Multiple Comparisons]

    Should be used only when a small number of comparisons are made. Practical upper limit: about 10 comparisons.
    3 possible applications:
    Individual c.i.s: Construct a 100(1-$\alpha_j$)% c.i. for parameter $\theta_i$,
    where # of comparisons = K.
    Comparison to an existing system: Construct a 100(1-$\alpha_j$)%
    c.i. for parameter $\theta_i - \theta_1$ (i = 2, 3, ..., K), where # of comparisons = K - 1.
    All pairwise: For any 2 different system designs, construct a
    100(1-$\alpha_j$)% c.i. for parameter $\theta_i - \theta_j$. Hence, total # of