kxgx6101 stats

Upload: javed765

Post on 04-Jun-2018


  • 8/13/2019 Kxgx6101 Stats

    1/31

    KXGX 6101: Research Methodology

    If we knew what we were doing, it wouldn't be called research, would it? - Albert Einstein

  • 8/13/2019 Kxgx6101 Stats

    2/31

    Science & Statistics

    The objective of statistical methods is to make the process as efficient as possible!

    [Figure: the iterative learning cycle between data (facts/phenomena) and hypothesis (conjecture/model/theory). Hypothesis H1 leads by deduction to the consequences of H1; these are compared with data, and induction leads to a modified hypothesis H2. Deduction and induction then alternate.]

    If the facts don't fit the theory, change the facts

    Albert Einstein

  • 8/13/2019 Kxgx6101 Stats

    3/31

    Introduction to Design of Experiment (DoE)

    What is it? DoE is an efficient way to quantitatively determine how numerous input variables (Xi) affect the outcome (Y).

    DoE is best used if:

    - Multiple variables affect the outcome
    - Interactions between inputs exist
    - You want to sort out, using data, which variables are significant
    - You are unsure how variables are affecting the outcome
    - You want to verify what you think you know
    - You want to quantify how a process works

    It is not Magic!

    All life is an experiment. The more experiments you make the better

    Ralph Waldo Emerson

  • 8/13/2019 Kxgx6101 Stats

    4/31

    DoE Factorial Strategy

    All possibilities are considered, from main effects to interaction effects.

    [Figure: factorial design space with axes Variable 1, Variable 2 and Variable 3.]

    $2^3$ = (Two Levels)^(Three Factors)

    The probability of anything happening is in inverse ratio to its desirability

    John W. Hazard

  • 8/13/2019 Kxgx6101 Stats

    5/31

    DoE for Mileage Example

    Problem: Gas mileage for the car is 20 mpg

    Speed (A)   Octane (B)   Tyre Pressure (C)   Mileage (Y)
    55 (-)      87 (-)       30 (-)              Y1
    65 (+)      87 (-)       30 (-)              Y2
    55 (-)      92 (+)       30 (-)              Y3
    65 (+)      92 (+)       30 (-)              Y4
    55 (-)      87 (-)       35 (+)              Y5
    65 (+)      87 (-)       35 (+)              Y6
    55 (-)      92 (+)       35 (+)              Y7
    65 (+)      92 (+)       35 (+)              Y8

    How many runs? $2^3 = 8$

    How many observations for each level? 4
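    The run list for the mileage example can be generated rather than typed by hand. The following is a minimal sketch; the function name `full_factorial` and the dictionary keys are illustrative choices, and only the level values come from the slide.

```python
from itertools import product

# Low (-) and high (+) settings taken from the mileage table.
factors = {
    "speed": (55, 65),
    "octane": (87, 92),
    "tyre_pressure": (30, 35),
}


def full_factorial(levels):
    """Return every combination of levels, with the first factor changing fastest."""
    names = list(levels)
    # product() varies its LAST argument fastest, so feed the factors in
    # reverse order and reverse each combination back afterwards.
    combos = product(*(levels[name] for name in reversed(names)))
    return [dict(zip(names, reversed(combo))) for combo in combos]


runs = full_factorial(factors)
for i, run in enumerate(runs, start=1):
    print(f"Y{i}: {run}")
```

    The rows come out in the same standard order as the table (Y1 to Y8), and each level of each factor appears in four of the eight runs.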

  • 8/13/2019 Kxgx6101 Stats

    6/31

    What DoE Tools to Use

    Current state of problem knowledge: Low -> High

    Type of design            Screening                       Fractional Factorial   Factorial
    Usual number of factors   > 10                            5-10                   1-5
    Purpose: identify         Most important factors          Some interactions      Relationships among factors
                              (the vital few)
    Purpose: estimate         Crude direction for             Some interpolation     All main effects and
                              improvement (linear effects)                           interactions

    Golden rule of an experiment: the duration of an experiment should not exceed the lifetime of the experimentalist

    Unknown Physicist

  • 8/13/2019 Kxgx6101 Stats

    7/31

    Analyzing a Full Factorial Design

    Step 1: Set up the Table of Contrast

    Example: This example relates two quantitative input variables (Temperature and Concentration) and one qualitative input (Catalyst) to Yield.

    The factors and levels:

    - Temperature: 160 °C (-1), 180 °C (+1)
    - Concentration (%): 20 (-1), 40 (+1)
    - Catalyst: Brand A (-1), Brand B (+1)

    Temp Concentration Catalyst Yield

    -1 -1 -1 ?

    1 -1 -1 ?

    -1 1 -1 ?

    1 1 -1 ?

    -1 -1 1 ?

    1 -1 1 ?

    -1 1 1 ?

    1 1 1 ?

    Everything should be made as simple as possible,

    but not one bit simpler

    Albert Einstein

  • 8/13/2019 Kxgx6101 Stats

    8/31

    Step 2: Calculating Main Effects

    We will now calculate the effects of the experiment.

    First we look at Temperature. We average the Yields of the runs at (+1), average the Yields of the runs at (-1), and take the difference between the two averages.

                Temp   Concentration   Catalyst   Yield
                -1     -1              -1         60
                1      -1              -1         72
                -1     1               -1         54
                1      1               -1         68
                -1     -1              1          52
                1      -1              1          83
                -1     1               1          45
                1      1               1          80
    Total (-)   211    267             254
    Total (+)   303    247             260
    Diff        92     -20             6
    Mean Eff    23     -5              1.5

    $$\text{Temperature Effect} = \frac{72 + 68 + 83 + 80}{4} - \frac{60 + 54 + 52 + 45}{4} = 75.75 - 52.75 = 23$$

    This can be interpreted as the Yield going up by an average of 23 points as temperature moves from low to high.
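    The same calculation can be repeated for all three factors. This is a small sketch using the coded levels and yields from the table above; the helper name `main_effect` is an illustrative choice.

```python
# Coded design matrix and yields copied from the table above.
temp = [-1, 1, -1, 1, -1, 1, -1, 1]
conc = [-1, -1, 1, 1, -1, -1, 1, 1]
cat = [-1, -1, -1, -1, 1, 1, 1, 1]
yields = [60, 72, 54, 68, 52, 83, 45, 80]


def main_effect(contrast, response):
    """Average response at the (+1) level minus average response at the (-1) level."""
    high = [y for c, y in zip(contrast, response) if c == 1]
    low = [y for c, y in zip(contrast, response) if c == -1]
    return sum(high) / len(high) - sum(low) / len(low)


effects = {
    "Temp": main_effect(temp, yields),
    "Conc": main_effect(conc, yields),
    "Cat": main_effect(cat, yields),
}
print(effects)
```

    The results agree with the "Mean Eff" row: 23 for Temperature, -5 for Concentration and 1.5 for Catalyst.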

  • 8/13/2019 Kxgx6101 Stats

    9/31

    Step 3: Calculating Interaction Effects (cont.)

    An interaction effect is represented by a contrast column obtained by multiplying the contrast columns of the factors involved. For the 2x2 example, the Temperature x Concentration interaction contrast is created by multiplying the Temp contrast by the Concentration contrast.

    Temp   Concentration   TxC
    -1     -1              1
    1      -1              -1
    -1     1               -1
    1      1               1

    It is always better to be approximately right than precisely wrong

    Unknown Engineer

  • 8/13/2019 Kxgx6101 Stats

    10/31

    Step 3: Calculating Interaction Effects

    Calculate the interaction effects for the entire matrix

                Temp (T)  Conc (C)  Cat (K)  T*C   T*K   C*K   T*C*K  Yield
                -1        -1        -1       1     1     1     -1     60
                1         -1        -1       -1    -1    1     1      72
                -1        1         -1       -1    1     -1    1      54
                1         1         -1       1     -1    -1    -1     68
                -1        -1        1        1     -1    -1    1      52
                1         -1        1        -1    1     -1    -1     83
                -1        1         1        -1    -1    1     -1     45
                1         1         1        1     1     1     1      80
    Total (-)   211       267       254      254   237   257   256
    Total (+)   303       247       260      260   277   257   258
    Diff        92        -20       6        6     40    0     2
    Mean Eff    23        -5        1.5      1.5   10    0     0.5
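    The interaction columns and their effects can be computed by multiplying the coded columns, as described for the contrast table. The following is a sketch under that reading; the names `columns` and `effect` are illustrative.

```python
from itertools import combinations
from math import prod

# Coded factor columns and yields copied from the matrix above.
columns = {
    "T": [-1, 1, -1, 1, -1, 1, -1, 1],
    "C": [-1, -1, 1, 1, -1, -1, 1, 1],
    "K": [-1, -1, -1, -1, 1, 1, 1, 1],
}
yields = [60, 72, 54, 68, 52, 83, 45, 80]


def effect(contrast, response):
    """Total at (+1) minus total at (-1), divided by the runs per level."""
    signed_total = sum(c * y for c, y in zip(contrast, response))
    return signed_total / (len(response) / 2)


effects = {}
for order in (2, 3):  # two-factor and three-factor interactions
    for names in combinations(columns, order):
        # Interaction contrast = element-wise product of the factor columns.
        contrast = [prod(signs) for signs in zip(*(columns[n] for n in names))]
        effects["*".join(names)] = effect(contrast, yields)

print(effects)
```

    The output matches the "Mean Eff" row of the matrix: the T*K interaction (10) is the only interaction of notable size.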

  • 8/13/2019 Kxgx6101 Stats

    11/31

    Step 4: Graph Main Effects Plot

    [Figure: main effects plot. For each of Temp, Conc and Cat, the average Yield at the (-1) level is joined to the average Yield at the (+1) level; the y-axis runs from about 50 to 75.]

  • 8/13/2019 Kxgx6101 Stats

    12/31

    Step 5: Graph Interaction Plot

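    [Figure: two interaction plots of mean Yield (y-axis from about 50 to 80) at the -1 and +1 levels. Panel "Interaction Plots (T*K)" is grouped by Catalyst; panel "Interaction Plots (T*C)" is grouped by Concentration. Each plotted point is the average Yield at one combination of levels, for example (+1) Cat and (+1) Temp, or (-1) Cat and (-1) Temp.]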
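    The points of an interaction plot are the average yields at each combination of two factor levels. As a rough sketch, they can be computed for the T*K plot from the yields in the design matrix; the helper name `cell_means` is an illustrative choice.

```python
from collections import defaultdict
from statistics import mean

# Coded Temp and Cat columns and yields from the full factorial matrix.
temp = [-1, 1, -1, 1, -1, 1, -1, 1]
cat = [-1, -1, -1, -1, 1, 1, 1, 1]
yields = [60, 72, 54, 68, 52, 83, 45, 80]


def cell_means(factor_a, factor_b, response):
    """Average response for every (level of A, level of B) combination."""
    cells = defaultdict(list)
    for a, b, y in zip(factor_a, factor_b, response):
        cells[(a, b)].append(y)
    return {levels: mean(values) for levels, values in cells.items()}


tk_means = cell_means(temp, cat, yields)
for (t, k), avg in sorted(tk_means.items()):
    print(f"Temp {t:+d}, Cat {k:+d}: average yield {avg}")
```

    With Catalyst at (-1), raising the temperature lifts the average yield from 57 to 70; with Catalyst at (+1) it lifts it from 48.5 to 81.5. The two lines are therefore not parallel, which is what an interaction looks like on the plot.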

  • 8/13/2019 Kxgx6101 Stats

    13/31

    Process Capability Analysis for C1

    [Figure: distribution plot of C1 on an axis from -5 to 25, with LSL and USL marked and "Within" and "Overall" curves.]

    Process Data
    USL               10.0000
    Target            *
    LSL               5.0000
    Mean              10.7039
    Sample N          20
    StDev (Within)    4.73941
    StDev (Overall)   5.12602

    Potential (Within) Capability
    Cp     0.18
    CPU    -0.05
    CPL    0.40
    Cpk    -0.05
    Cpm    *

    Overall Capability
    Pp     0.16
    PPU    -0.05
    PPL    0.37
    Ppk    -0.05

                  Observed Performance   Exp. "Within" Performance   Exp. "Overall" Performance
    PPM < LSL     150000.00              114392.51                   132912.93
    PPM > USL     750000.00              559030.23                   554607.20
    PPM Total     900000.00              673422.73                   687520.13

    $$C_{pk} = \frac{\bar{X} - \mathrm{LSL}}{3\sigma}$$
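    As a check on the capability report, the within-subgroup indices can be recomputed from the process data on this slide. The sketch below assumes the usual definitions: the formula above is the lower-side index (CPL), CPU is its upper-side counterpart, and Cpk is taken as the smaller of the two. The variable names are illustrative.

```python
# Process data read from the capability report on this slide.
usl = 10.0
lsl = 5.0
mean = 10.7039
sigma_within = 4.73941

cp = (usl - lsl) / (6 * sigma_within)    # potential capability, ignores centring
cpu = (usl - mean) / (3 * sigma_within)  # distance from mean to upper limit
cpl = (mean - lsl) / (3 * sigma_within)  # distance from mean to lower limit
cpk = min(cpu, cpl)                      # assumption: worst side governs

print(f"Cp={cp:.2f} CPU={cpu:.2f} CPL={cpl:.2f} Cpk={cpk:.2f}")
```

    Because the mean (10.7039) lies above the USL (10), CPU is negative and sets Cpk, which is consistent with the -0.05 shown in the report.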

  • 8/13/2019 Kxgx6101 Stats

    14/31

    Importance of Statistics in Industry

    Organizations around the world are constantly searching for more effective methodologies to achieve improvement (breakthrough improvement):

    - Financial performance
    - Customer satisfaction

    The improvement methodology has evolved from common sense, PDCA, Kaizen, Just-in-Time, Lean, SPC, TQM and Business Process Re-engineering to Six Sigma now.

    If your result needs a statistician then you should design a better experiment.

    Ernest Rutherford

  • 8/13/2019 Kxgx6101 Stats

    15/31

    Six sigma commonly refers to a statistically derived performance

    target of 3.4 defects for every 1 million opportunities (3.4 DPMO).

    Statistical Definition of Six Sigma

    [Figure: normal distribution between LSL and USL on an axis from -6s to +6s. A centred short-term process has 99.9999998% of output within the limits, or 0.002 DPMO. Six Sigma with a 1.5s mean shift has 99.99966% within the limits, or 3.4 DPMO.]
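    Both DPMO figures follow from the tail areas of the normal distribution. The following is a minimal sketch, assuming a normally distributed process with specification limits at plus and minus six standard deviations; the function names are illustrative.

```python
from math import erfc, sqrt


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))


def dpmo(sigma_level, mean_shift=0.0):
    """Defects per million opportunities for limits at +/- sigma_level,
    with the process mean shifted by mean_shift standard deviations."""
    above_usl = upper_tail(sigma_level - mean_shift)
    below_lsl = upper_tail(sigma_level + mean_shift)
    return (above_usl + below_lsl) * 1_000_000


centred = dpmo(6.0)
shifted = dpmo(6.0, mean_shift=1.5)
print(f"centred: {centred:.4f} DPMO, shifted by 1.5 sigma: {shifted:.2f} DPMO")
```

    The centred case gives roughly 0.002 DPMO and the shifted case roughly 3.4 DPMO, the two values quoted on the slide.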

  • 8/13/2019 Kxgx6101 Stats

    16/31

    Practical Meaning of Six Sigma

    Why 99% Good is often not Good Enough

    99% Good (3-Sigma)                                                     99.99% Good (6-Sigma)
    54,000 lost articles of mail per year                                  35 lost articles of mail per year
    Five short or long landings at most major airports per day             One short or long landing at most major airports per 10 years
    More than 40,500 newborn babies dropped by doctors/nurses each year    Three newborn babies dropped by doctors/nurses in 100 years
    Unsafe drinking water about two hours each month                       Unsafe drinking water 1 second every 16 years
    20,000 lost bags per day (baggage handling system, Houston Airport)    < 5 lost bags per day

  • 8/13/2019 Kxgx6101 Stats

    17/31

    Six Sigma DMAIC Approach

    There are five major steps involved in applying the Six Sigma approach to achieve breakthrough quality and performance:

    Define, Measure, Analyze, Improve and Control (D-M-A-I-C).

    D M A I C

    Be thankful for problems. If they were less difficult,

    someone with less ability might have your job

    Reliability Engineer

  • 8/13/2019 Kxgx6101 Stats

    18/31

    DMAIC - Systematic Problem Solving Tool

    In the Define phase, the team:

    - Defines the project
    - Defines the problem and goal statement
    - Defines the project benefits (financial analysis)
    - Defines the project charter and project scope
    - Obtains support from management

    A SMART goal statement is:

    - Specific
    - Measurable
    - Attainable
    - Relevant
    - Time-bound

    Classic American and Russian approach to a problem during a space mission!

    Statistics: the only science that enables different experts using the same figures to draw different conclusions.

    A frustrated Statistician

  • 8/13/2019 Kxgx6101 Stats

    19/31

  • 8/13/2019 Kxgx6101 Stats

    20/31

    Measure Phase cont.

    Measurement System Analysis: four characteristics to examine in a gauge system

    1) Sensitivity

    The gauge should be sensitive enough to detect differences in measurement as slight as one-tenth of the total tolerance specification.

    e.g. for a 200 ± 0.1 mm dimension, the tool should be able to measure at 0.01 mm accuracy.

    2) Reproducibility

    The reliability of the gauge system to reproduce measurements. Customarily checked by comparing the results of different operators taken at different times.

    This affects both accuracy and precision.

  • 8/13/2019 Kxgx6101 Stats

    21/31

    Measure Phase cont.

    3) Accuracy

    An unbiased true value

    Normally reported as the difference between the average of a number of measurements and the true value.

    e.g. checking a micrometer with a gauge block

    4) Repeatability/Precision

    The ability to repeat the same measurement by the same operator at the same time.

    To improve the accuracy and precision of a measurement process, it must have a defined test method and must be statistically stable.

    [Figure: three panels labelled "Precise but not accurate", "Accurate but not precise" and "Accurate and precise".]
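    Accuracy and repeatability as defined above can both be estimated from repeated readings of a reference standard. This is an illustrative sketch only: the gauge block size and the five readings are hypothetical values, not data from the slides.

```python
from statistics import mean, stdev

# Hypothetical example: one operator measures a gauge block five times.
true_value = 25.000  # assumed gauge block size in mm
readings = [25.004, 25.006, 25.005, 25.003, 25.007]  # assumed readings in mm

# Accuracy: average of the measurements minus the true value (the bias).
bias = mean(readings) - true_value

# Repeatability/precision: spread of repeated readings by the same operator.
repeatability = stdev(readings)

print(f"bias = {bias:.4f} mm, repeatability (1 s) = {repeatability:.4f} mm")
```

    In this made-up case the readings cluster tightly (about 0.0016 mm) but sit about 0.005 mm above the true value, which corresponds to the "precise but not accurate" situation.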

  • 8/13/2019 Kxgx6101 Stats

    22/31

    Measurement Error

    Repeatability & Reproducibility

    - Analysis of variance (ANOVA) is the most accurate method for quantifying repeatability and reproducibility.
    - It considers error due to the appraiser and the system.

    How to do an ANOVA test:

    1) Calculate the variance between systems/appraisers
    2) Calculate the variance within each system/appraiser
    3) Calculate the F ratio (between-group variance divided by within-group variance)
    4) If the F ratio is greater than the critical F value, reject the null hypothesis that the systems/appraisers give the same mean; otherwise, do not reject it
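    The four steps can be sketched as a one-way ANOVA computed by hand. The readings below are hypothetical (three appraisers measuring the same part three times each), and the critical value is the tabulated F(0.05; 2, 6) from a standard F table; a full gauge R&R study would use a more elaborate model.

```python
from statistics import mean

# Hypothetical readings of one part by three appraisers (not from the slides).
readings = {
    "appraiser_A": [10.1, 10.2, 10.0],
    "appraiser_B": [10.4, 10.5, 10.3],
    "appraiser_C": [10.0, 9.9, 10.1],
}


def f_ratio(groups):
    """One-way ANOVA F ratio: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Step 1: variance between appraisers.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ms_between = ss_between / (len(groups) - 1)
    # Step 2: variance within appraisers.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_within = ss_within / (len(all_values) - len(groups))
    # Step 3: F ratio.
    return ms_between / ms_within


F_CRITICAL = 5.14  # F(0.05; 2, 6) from a standard F table

f = f_ratio(list(readings.values()))
# Step 4: compare with the critical value.
reject_null = f > F_CRITICAL
print(f"F = {f:.2f}, reject null hypothesis: {reject_null}")
```

    For these made-up numbers the F ratio is about 13, well above 5.14, so the appraisers would be judged to differ.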

    Any equation longer than three inches is most likely wrong

    Unknown Physicist

  • 8/13/2019 Kxgx6101 Stats

    23/31

    Example : A Complex DoE Model (using JMP)

    [Figure: actual-by-predicted plot from JMP, with % Stress Actual on the y-axis and % Stress Predicted on the x-axis, both running from 0 to 3.5.]

  • 8/13/2019 Kxgx6101 Stats

    24/31

    Analyze Phase

    In the Analyze phase, the team:

    - Brainstorms potential root causes
    - Uses the data collected to determine root causes and opportunities for improvement
    - Verifies the hypotheses established
    - Establishes the priority for action regarding the Xs

    Two common techniques:

    i) Fish bone diagram
    ii) Why-why analysis

    Separate what we think is happening from what is really happening!

    Avoid solutions that don't solve the real problem!

    Well done is better than well said

    Benjamin Franklin

  • 8/13/2019 Kxgx6101 Stats

    25/31

    Cause-And-Effect Diagrams (Fish Bone)

    [Figure: fishbone diagram. The causes (Material, Method, Machine, Man, Mother Nature and Measurement) branch into the effect (Problem).]

  • 8/13/2019 Kxgx6101 Stats

    26/31

    Why-Why Analysis

    It is a technique for determining the root causes of a phenomenon by repeatedly asking "Why?"

    It is a variant of the 5 Why analysis used at Toyota Motor Company for discovering true causes by repeating the question "Why?" five times.

    Why?...Why?...Why?...Why?...Why? Stop!

    It is easy to see, it is hard to foresee

    Benjamin Franklin, American Scientist and Statesman

  • 8/13/2019 Kxgx6101 Stats

    27/31

    Improve Phase

    After investing much time in the Define, Measure and Analyze phases, the team needs to change gear from being detail-minded (in process analysis and data analysis) to being creative and innovative in developing solutions and changing processes.

    Pilot whenever possible before the full implementation.

    If you bet on a horse, that's gambling.

    If you bet you can make three spades, that's entertainment.

    If you bet the device will survive for twenty years, that's engineering.

    See the difference?

    Unknown Engineer

  • 8/13/2019 Kxgx6101 Stats

    28/31

    Control Phase

    This is the last phase in the DMAIC improvement process. Without control efforts, the improved process may revert to its previous state.

    What About Human Error?

    "Be more careful" is not effective.

    The old way of dealing with human error was to scold people, retrain them, and tell them to be more careful.

    We can't do much to change human nature, and people are going to make mistakes (often the same mistakes too). If we can't tolerate them, we should remove the opportunities for error.

    To err is human, to forgive is divine, but to include errors in your design is statistical

    Leslie Kish

  • 8/13/2019 Kxgx6101 Stats

    29/31

    Poka-Yoke Error Proofing

    Beep! Beep!

    Reliability is when the customer comes back, not the product.

    Unknown Reliability Manager

  • 8/13/2019 Kxgx6101 Stats

    30/31

    Key Take Away

    1. Plan the DoE matrix using $2^3$ = (Two Levels)^(Three Factors). You should have at least 8 runs for your simple DoE matrix.

    2. Calculate the mean, standard deviation (sigma) and Cpk. Microsoft Excel can easily be used for this. Calculate Cpk using the formula below:

       $$C_{pk} = \frac{\bar{X} - \mathrm{LSL}}{3\sigma}$$

    3. Plot an interaction chart to understand the interaction of the various input factors and identify the most significant factor(s).

    4. Plot a distribution curve for better visualization of your data if needed.

    5. Consider measurement errors in your data.

  • 8/13/2019 Kxgx6101 Stats

    31/31

    Thanks for your attention

    Statistics is like modern art, the more complicated it is the higher the value

    Unknown Engineer