ENGR 610 Applied Statistics Fall 2007 - Week 1
ENGR 610 Applied Statistics
Fall 2007 - Week 1
Marshall University CITE
Jack Smith
http://mupfc.marshall.edu/~smith1106
Overview for Today
  Syllabus
  Introductions
  Chapters 1-3
    Introduction to Statistics and Quality Improvement
    Tables and Charts
    Describing and Summarizing Data
  Homework assignment
Syllabus
Week 1 (Aug 23) Introduction - Descriptive Statistics (Ch 1-3)
Week 2 (Aug 30) Discrete Probability Distributions (Ch 4)
Week 3 (Sept 6) Continuous Probability Distributions (Ch 5)
Week 4 (Sept 13) Estimation Procedures (Ch 8)
Week 5 (Sept 20) Review, Exam 1 (Ch 1-5, 8)
Week 6 (Sept 27) Hypothesis Testing (Ch 9)
Week 7 (Oct 4) Hypothesis Testing (Ch 9)
Week 8 (Oct 11) Design of Experiments (Ch 10)
Week 9 (Oct 18) Design of Experiments (Ch 11)
Week 10 (Oct 25) Review, Exam 2 (Ch 9-11)
Syllabus, cont’d
Week 11 (Nov 1) Simple Linear Regression (Ch 12)
Week 12 (Nov 8) Multiple Regression (Ch 13)
Week 13 (Nov 15) More Regression (Ch 13)
Fall Break (Nov 22) (no class)
Week 14 (Nov 29) Review, Exam 3 (Ch 12-13)
Week 15 (Dec 6) (Exam 3 due)
Text -- Levine, Ramsey, Smidt, “Applied Statistics for Engineers and Scientists: Using Microsoft Excel and MINITAB” (Prentice-Hall, 2001) - with CD-ROM
Grading
  25% - Homework and attendance
  25% - Exam 1
  25% - Exam 2
  25% - Exam 3
Introductions
  Name
  Home town
  Undergraduate degree, major, where
  Major focus of study at MU
  Occupation, if working
  Background in statistics
  Hopes for this course
Introduction to Statistics (Ch 1)
  What is Statistics?
  Variables
  Operational Definitions
  Sampling
  Software
What is Statistics?
  Descriptive Statistics
    Methods that lead to the collection, tabulation, summarization and presentation of data
  Inferential Statistics
    Methods that lead to conclusions, or estimates of parameters, about a population (of size N) based on summary measures (statistics) on a sample (of size n) - in lieu of a census
Why Statistics?
  Describe numerical information
  Draw conclusions on a large population from sample information only
  Derive and test models
  Understand and control variation
  Improve quality of processes
  Design experiments to extract maximum information
  Predict or affect future behavior
Variables
  Categorical
    Nominal
    Mutually exclusive
    Collectively exhaustive
  Numerical
    Discrete or Continuous
  Scale
    Ordered
    Interval - equally spaced
    Ratio - with absolute zero
Operational Definitions
  Objective, not subjective
  Specific tests, measurements
  Specific criteria
  Agreed to by all
  Consistent between individuals
  Stable over time
Sampling
  Advantages
    Cost, time, accuracy, feasibility, scope
    Minimize destructive tests
  Probability samples
    Simple random
      With or without replacement
    Systematic random
      Random start, but constant increment or rate
  Non-probability samples
    Convenience, Judgment, Quota (representative)
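The two probability sampling schemes on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the 100-unit sampling frame, and the seed are assumptions made for the example, not part of the course material.

```python
import random

def simple_random_sample(population, n, replace=False, seed=None):
    """Draw n items so that every item has the same chance of selection."""
    rng = random.Random(seed)
    if replace:
        # With replacement: the same item may be drawn more than once.
        return [rng.choice(population) for _ in range(n)]
    # Without replacement: each item can appear at most once.
    return rng.sample(population, n)

def systematic_sample(population, step, seed=None):
    """Random start within the first `step` items, then every step-th item."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

frame = list(range(1, 101))  # hypothetical sampling frame of 100 unit IDs
srs = simple_random_sample(frame, 10, seed=1)
srs_wr = simple_random_sample(frame, 10, replace=True, seed=1)
sys_sample = systematic_sample(frame, 10, seed=1)
print(srs, srs_wr, sys_sample, sep="\n")
```

Only the starting point of the systematic sample is random; after that the increment is constant, which is why it is cheaper to carry out than a simple random sample.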
Software
  Historical (mainframe, batch)
    SAS, SPSS,…
  Specialized (workstations, stand-alone)
    SAS, SPSS, MINITAB, S-PLUS (R*), BMDP,…
  Integrated (standard desktops)
    DataDesk, JMP, SYSTAT, MINITAB
    Excel, add-ons (e.g., PHStat - from Prentice-Hall)
    MATLAB (Octave*)
  *Open Source
Introduction to Quality Improvement
  Quality = fitness for use
    Meeting user/customer needs, expectations, perceptions and experience
  Quality of…
    Design - intentional differences, grades
    Conformance - meets/exceeds design
    Performance - long-term consistency
History of Quality Improvement
Middle Ages > Industrial Revolution > Information Age
Smith, Taylor, Ford, Shewhart, Deming
Read text!
Themes of Quality Improvement
  The primary focus is on process improvement
    Shewhart-Deming cycle: Plan, Do, Study, Act
  Most of the variation in a process is systemic and not due to the individual
  Teamwork is an integral part of a quality-management organization
  Customer satisfaction - primary organizational goal
  Organizational transformation needs to occur to implement quality management
  Fear must be removed from organizations
  Higher quality costs less, not more, but it requires an investment in training
Tables and Charts (Ch 2)
  Process Flow Diagrams
  Cause-and-Effect Diagrams
  Time-Order Plots
  Numerical Data
  Concentration Diagrams
  Categorical Data
  Bivariate Categorical Data
  Graphical Excellence
Process Flow Diagrams
Cause-and-Effect Diagrams
Also known as an Ishikawa or a “fishbone” Diagram
[Figure: fishbone diagram. Cause branches (Procedures or methods; People or personnel; Environment; Materials or supplies; Machinery or equipment) lead to the Effect.]
Time-Order Plots
Tables and Charts for Numerical Data
  Stem-and-Leaf Displays
    Poor man’s histogram
  Frequency Distribution
    “Binning” by range
  Histogram
  Polygon
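A stem-and-leaf display and a binned frequency distribution can both be built with a few lines of Python. The sketch below assumes non-negative whole-number data, with the tens digit as stem and the units digit as leaf; the data values and function names are made up for illustration.

```python
from collections import Counter, defaultdict

def stem_and_leaf(values):
    """Map each stem (tens part) to its sorted leaves (units digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

def frequency_distribution(values, start, width):
    """Count values per bin [low, low + width); empty bins are omitted."""
    counts = Counter((v - start) // width for v in values)
    return {(start + k * width, start + (k + 1) * width): counts[k]
            for k in sorted(counts)}

# Made-up observations, for illustration only.
data = [12, 15, 21, 23, 23, 27, 31, 34, 35, 35, 38, 42, 47, 51]

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")

freq = frequency_distribution(data, start=10, width=10)
for (low, high), count in freq.items():
    print(f"{low}-{high}: {count}")
```

Read sideways, the rows of leaves have the same shape as a histogram with bins of width 10, which is why the display is called a poor man's histogram.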
Concentration Diagrams
  Data points overlaid on schematic or picture of object or process of interest
    By location
    Displayed as individual symbols or tallies
Tables and Charts for Categorical Data
  Bar Chart
  Pie Chart
    Almost always in percentages
  Pareto Diagram
    Sorted (usually descending)
    Overlaid with cumulative line (polygon) plot
    Separate scales
    Usually in percentages
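The numbers behind a Pareto diagram are just sorted percentages plus a running total. A minimal sketch, with hypothetical defect counts and a helper name chosen for this example:

```python
def pareto_table(counts):
    """Return (category, percent, cumulative percent) rows, largest first."""
    total = sum(counts.values())
    rows, cumulative = [], 0
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for category, count in ordered:
        cumulative += count
        rows.append((category, 100 * count / total, 100 * cumulative / total))
    return rows

# Hypothetical defect tallies, for illustration only.
defects = {"scratch": 12, "dent": 30, "misalignment": 6, "other": 2}

rows = pareto_table(defects)
for category, percent, cum_percent in rows:
    print(f"{category:<14}{percent:6.1f}%{cum_percent:7.1f}%")
```

The percent column would be drawn as bars and the cumulative column as the overlaid line, each against its own scale.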
Examples
Tables and Charts for Bivariate Categorical Data
  Contingency Table
    Cross-classification
    Joint responses
    Percentages by row, column, total
  Side-by-Side (Cluster) Bar Chart
    May prefer stacked bars with percentage data

  Example contingency table (with row and column totals):

          A   B   C   Total
  1       5   3   2   10
  2       2   3   4    9
  3       0   2   3    5
  Total   7   8   9   24
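The three kinds of percentages for a contingency table differ only in the base each count is divided by. The sketch below uses the cell counts from the slide's example table (rows 1-3, columns A-C); the variable names and the rounding to one decimal place are choices made for this illustration.

```python
# Cell counts from the example table: rows 1-3, columns A-C.
table = {
    1: {"A": 5, "B": 3, "C": 2},
    2: {"A": 2, "B": 3, "C": 4},
    3: {"A": 0, "B": 2, "C": 3},
}
columns = ("A", "B", "C")

row_totals = {r: sum(cells.values()) for r, cells in table.items()}
col_totals = {c: sum(table[r][c] for r in table) for c in columns}
grand_total = sum(row_totals.values())

def pct(count, base):
    """Percentage of `base`, rounded to one decimal place."""
    return round(100 * count / base, 1)

# Same counts, three different bases.
by_row = {r: {c: pct(table[r][c], row_totals[r]) for c in columns} for r in table}
by_col = {r: {c: pct(table[r][c], col_totals[c]) for c in columns} for r in table}
by_total = {r: {c: pct(table[r][c], grand_total) for c in columns} for r in table}

print(row_totals, col_totals, grand_total)
print(by_row[1], by_col[1], by_total[1], sep="\n")
```

Row percentages sum to 100 across each row, column percentages down each column, and total percentages over the whole table.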
Graphical Excellence
  Tufte, “The Visual Display of Quantitative Information”
    Graphical excellence… gives the viewer the largest number of ideas, in the shortest time, with the least ink - clearly, precisely, efficiently, and truthfully
  Data-ink Ratio
    (data-ink)/(total ink used in graphic)
  Chartjunk
    Non-data or redundant “ink”
  Lie Factor
    (size of effect in graph)/(size of effect in data)
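Both ratios are simple arithmetic once the quantities have been measured. The sketch below assumes that "size of effect" means relative change between two values, which is one common reading; the bar lengths and data values are illustrative numbers, not measurements from the course.

```python
def relative_change(first, second):
    """Change from first to second as a fraction of first (assumed effect size)."""
    return (second - first) / first

def lie_factor(graph_first, graph_second, data_first, data_second):
    """(size of effect in graph) / (size of effect in data)."""
    return (relative_change(graph_first, graph_second)
            / relative_change(data_first, data_second))

def data_ink_ratio(data_ink, total_ink):
    """(data-ink) / (total ink used in graphic)."""
    return data_ink / total_ink

# Illustrative: a quantity rising from 18 to 27.5 is drawn with
# lengths of 0.6 and 5.3 inches.
lf = lie_factor(0.6, 5.3, 18, 27.5)
print(f"lie factor = {lf:.2f}")
```

A lie factor near 1 means the graphic represents the data faithfully; values well above 1 mean the graphic exaggerates the effect.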
Describing and Summarizing Data - Descriptive Statistics (Ch 3)
  Measures of…
    Central Tendency
    Variation
    Shape
      Skewness
      Kurtosis
  Box-and-Whisker Plots
Measures of Central Tendency
  Mean (arithmetic)
    Average value: $\frac{1}{N} \sum_{i=1}^{N} X_i$
  Median
    Middle value - 50th percentile (2nd quartile)
  Mode
    Most popular (peak) value(s) - can be multi-modal
  Midrange
    (Max+Min)/2
  Midhinge
    (Q3+Q1)/2 - average of 1st and 3rd quartiles
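The five measures can be computed side by side with Python's standard `statistics` module. The data set is made up for illustration, and the quartiles come from `statistics.quantiles` with its default "exclusive" method, which may not match the textbook's rounding rules on every data set.

```python
import statistics

def midrange(values):
    """(Max + Min) / 2."""
    return (max(values) + min(values)) / 2

def midhinge(values):
    """(Q3 + Q1) / 2, with quartiles from statistics.quantiles (an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 + q1) / 2

# Made-up observations, for illustration only.
data = [2, 3, 3, 5, 7, 8, 8, 10, 17]

mean = statistics.mean(data)
median = statistics.median(data)
modes = statistics.multimode(data)  # all most-frequent values
mid_range = midrange(data)
mid_hinge = midhinge(data)

print(mean, median, modes, mid_range, mid_hinge)
```

In this data set the single large value 17 pulls the midrange well above the median, while the midhinge, built from the quartiles, is affected much less.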
Measures of Variation
  Range (max-min)
  Inter-Quartile Range (Q3-Q1)
  Variance
    Sum of squares (SS) of the deviations from the mean divided by the degrees of freedom (df) - see pp 113-5
      df = N, for the whole population
      df = n-1, for a sample
    2nd moment about the mean (dispersion) (1st moment about the mean is zero!)
  Standard Deviation
    Square root of variance (same units as variable)
    Sample (s², s, n) vs Population (σ², σ, N)
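The only difference between the population and sample formulas is the degrees of freedom in the denominator. A minimal sketch, with made-up data and function names chosen for this example:

```python
import math

def sum_of_squares(values):
    """Sum of squared deviations from the mean (SS)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

def variance(values, sample=True):
    """SS / df, with df = n - 1 for a sample and df = N for a population."""
    df = len(values) - 1 if sample else len(values)
    return sum_of_squares(values) / df

def std_dev(values, sample=True):
    """Square root of the variance, in the same units as the data."""
    return math.sqrt(variance(values, sample))

# Made-up observations, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)

first_moment = sum(x - mean for x in data) / len(data)  # always zero
pop_var = variance(data, sample=False)
samp_var = variance(data, sample=True)

print(first_moment, pop_var, std_dev(data, sample=False), samp_var)
```

Here SS = 32, so the population variance is 32/8 = 4 and the sample variance is 32/7, which is slightly larger.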
Quantiles
  Equipartitions of ranked array of observations
    Percentiles - 100
    Deciles - 10
    Quartiles - 4 (25%, 50%, 75%)
    Median - 2
Pn = n(N+1)/100 -th ordered observation
Dn = n(N+1)/10
Qn = n(N+1)/4
Median = (N+1)/2 = Q2 = D5 = P50
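The position formulas can be checked numerically. The sketch below computes the 1-based position k(N+1)/parts and then reads off a value; using linear interpolation for fractional positions is an assumption of this example (the textbook may use a rounding rule instead), and the data are made up.

```python
def quantile_position(k, parts, n_obs):
    """1-based position k*(N+1)/parts in the ordered array."""
    return k * (n_obs + 1) / parts

def value_at(sorted_values, position):
    """Value at a 1-based position; fractional positions are interpolated."""
    lower = int(position)
    if lower < 1:
        return sorted_values[0]
    if lower >= len(sorted_values):
        return sorted_values[-1]
    frac = position - lower
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + frac * (above - below)

# Made-up observations, already ranked (N = 9).
data = [2, 3, 3, 5, 7, 8, 8, 10, 17]
n_obs = len(data)

q1 = value_at(data, quantile_position(1, 4, n_obs))   # position 2.5
q2 = value_at(data, quantile_position(2, 4, n_obs))   # position 5.0
q3 = value_at(data, quantile_position(3, 4, n_obs))   # position 7.5
print(q1, q2, q3)
```

With N = 9, the positions for Q2, D5 and P50 all equal (N+1)/2 = 5, which is the median's position.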
Measures of Shape
  Symmetry
    Skewness - extended tail in one direction
    3rd moment about the mean
  Kurtosis
    Flatness, peakedness
      Leptokurtic - highly peaked, long tails
      Mesokurtic - “normal”, triangular, short tails
      Platykurtic - broad, even
    4th moment about the mean
  See p 118.
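Skewness and kurtosis can be sketched as standardized 3rd and 4th moments about the mean. The version below divides by N (population moments); the textbook and Excel may apply sample-size adjustments, so treat these as illustrative definitions rather than the formulas on p 118.

```python
def central_moment(values, k):
    """k-th moment about the mean, dividing by N."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """3rd central moment scaled by the 2nd moment to the power 3/2."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """4th central moment scaled by the squared 2nd moment."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

# Made-up data sets, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric), skewness(right_tailed), kurtosis(symmetric))
```

A symmetric data set gives zero skewness, and a long right tail gives a positive value. On this unadjusted scale a normal distribution has kurtosis 3, so the flat, even data set above (1.7) is platykurtic.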
Box-and-Whisker Plots
  Graphical representation of five-number summary
    Min, Max (full range)
    Q1, Q3 (middle 50%)
    Median (50th %-ile)
  See pp 123-5
  Shows symmetry (skewness) of distribution
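The five numbers behind a box-and-whisker plot can be computed directly. This sketch uses `statistics.quantiles` from the Python standard library with its default method, which may differ slightly from the textbook's quartile rule; the data are made up.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max)."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

# Made-up observations with a long right tail.
data = [1, 2, 2, 3, 3, 4, 5, 8, 14]

low, q1, median, q3, high = five_number_summary(data)
print(low, q1, median, q3, high)

# Compare the two halves of the box and the two whiskers.
box_left, box_right = median - q1, q3 - median
whisker_left, whisker_right = q1 - low, high - q3
print(box_left, box_right, whisker_left, whisker_right)
```

Here the right half of the box and the right whisker are both longer than their left counterparts, which is how the plot reveals right skewness.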
Homework
  Ch 1
    Appendix 1.2
    Excel, Analysis ToolPak, PHStat add-in
    Problems: 1.25
  Ch 2
    Appendix 2.1
    Problems: 2.54, 2.55, 2.61
  Ch 3
    Appendix 3.1
    Problems: 3.27, 3.31 (data on CD)
Next Week
  Probability and Discrete Probability Distributions (Ch 4)