walpole chapter 01

7/31/2019 Walpole Chapter 01


WalpoleCh 01: 1

Spring 2007

Probability & Statistics for Engineers & Scientists, by Walpole, Myers, Myers & Ye

Chapter 1 Notes

Class notes for ISE 201

San Jose State University
Industrial & Systems Engineering Dept.

Steve Kennedy



Statistics Intro

"There are three types of lies: lies, damn lies, and statistics," a quote attributed to Mark Twain, among others.

In academia, we hopefully pursue the truth when analyzing data. Companies, unfortunately, don't always have this goal.

We can use statistics to verify the validity of published results and find the truth, if raw data can be obtained.

Try to find some published invalid uses of statistics this semester.

Statistics, used properly, is a tool for analyzing data and discovering and proving its true meaning.



Statistics Intro

Suppose I'm trying to measure the performance of two alternative hardware/software combinations. Setup A takes 9.8 seconds to execute a given user task, and Setup B takes 10.8 seconds. What can we infer from this experiment?

The answer is, without knowing the inherent variation in the system being measured and the measurement procedure, not much.

Every experiment and data set has some inherent variation that must be understood before inferences about data can be made.

No two measurements are ever exactly the same, due to both process and measurement variability.

We must always gather a sample of several data points in order to make valid inferences.
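To see why a single pair of timings says so little, here is a minimal simulation sketch. It assumes, purely for illustration, that 9.8 and 10.8 seconds are the true mean run times and that each measurement carries normally distributed noise with a common standard deviation `sigma`; neither assumption comes from the notes, and the function name `fraction_b_faster` is hypothetical.

```python
import random


def fraction_b_faster(mean_a, mean_b, sigma, trials, seed=0):
    """Fraction of single-run comparisons in which Setup B beats Setup A.

    Assumes each run time is normally distributed around its setup's
    true mean with standard deviation sigma (an illustrative model only).
    """
    rng = random.Random(seed)
    b_wins = 0
    for _ in range(trials):
        run_a = rng.gauss(mean_a, sigma)  # one measurement of Setup A
        run_b = rng.gauss(mean_b, sigma)  # one measurement of Setup B
        if run_b < run_a:
            b_wins += 1
    return b_wins / trials


# Compare small, moderate, and large run-to-run variation.
for sigma in (0.1, 1.0, 3.0):
    print(sigma, fraction_b_faster(9.8, 10.8, sigma, trials=10_000))
```

When the variation is small relative to the 1-second gap, Setup A wins essentially every comparison. When the variation is large, the "slower" setup wins a single comparison a substantial fraction of the time, so one run of each tells us very little.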



Probability

Can anybody tell me what probability is?

When we know the underlying model that governs an experiment, we use probability to figure out the chance that different outcomes will occur.

For example, if we flip a fair coin 3 times, what is the probability of obtaining 3 heads?

By definition, probability values are between 0 and 1.

What does it mean if Outcome A of an experiment has a probability of 1/3 of occurring?

If the experiment is repeated a large number of times, Outcome A will occur about 1/3 of the time.
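The coin-flip question can be checked two ways. This sketch first counts equally likely outcomes and then repeats the experiment many times to show the long-run interpretation. The function names are illustrative, and the simulation assumes independent flips of a fair coin.

```python
import random
from fractions import Fraction
from itertools import product


def prob_all_heads(flips):
    """Exact probability of all heads, by enumerating equally likely outcomes."""
    outcomes = list(product("HT", repeat=flips))
    favorable = [o for o in outcomes if all(side == "H" for side in o)]
    return Fraction(len(favorable), len(outcomes))


def relative_frequency(flips, trials, seed=0):
    """Share of simulated experiments in which every flip came up heads."""
    rng = random.Random(seed)
    hits = sum(
        all(rng.random() < 0.5 for _ in range(flips))
        for _ in range(trials)
    )
    return hits / trials


print(prob_all_heads(3))              # 1/8
print(relative_frequency(3, 20_000))  # close to 0.125 for a large number of trials
```

The enumeration lists all 8 outcomes of three flips, of which exactly one is HHH. The simulated relative frequency settles near the same value as the number of repetitions grows.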



Probability and Statistics

In probability, we use our knowledge of the underlying model to determine the probability that different outcomes will occur.

How does statistics compare to probability?

In statistics, we don't know the underlying model governing an experiment.

All we get to see is a sample of some outcomes of the experiment.

We use that sample to try to make inferences about the underlying model governing the experiment.

So a thorough understanding of probability is essential to understanding statistics.



Probability and Statistics

Example: Suppose for a manufacturing process we have an upper limit of 5% defective items produced for the process to be in control.

We take a sample of 100 items produced and find 10 defective items. Is the manufacturing process in control?

One way to look at this is to ask: if the process has 5% defective items, what is the probability that there will be 10 or more defective items in a sample of size 100?

The probability of this outcome is called a P-value.

In this case the P-value is only 0.0282. What does this mean?

That if the process really produced 5% defective items, a result this extreme would occur by chance only 2.82% of the time. So what is the definition of a P-value?

The P-value is the probability of getting an outcome at least as extreme as the measured one (here, 10 or more defective items) if the assumed underlying model were true.
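The 0.0282 figure can be reproduced with a short calculation. This sketch assumes each of the 100 items is defective independently with probability 0.05, so that the number of defectives follows a binomial distribution. The notes do not name the model, so treat that as an assumption; the function name is illustrative.

```python
from math import comb


def binomial_upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Probability of 10 or more defectives in 100 items when 5% are defective.
p_value = binomial_upper_tail(n=100, p=0.05, k=10)
print(round(p_value, 4))  # 0.0282
```

A P-value this small says that seeing 10 or more defectives would be unusual if the process were really running at 5%, which is evidence that the process is not in control.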



Samples and Populations

In statistics, a population is the set of all possible outcomes of an experiment (may be infinite).

In probability, we will call the set of all possible outcomes the sample space.

A sample is a set of observations taken from a population.

A random sample is selected so that every element in the population has an equal chance of being selected.

Often in statistics, we compare samples from two different populations and try to determine statistically if the populations are significantly different.
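As a small illustration of the random sample definition above, the sketch below draws a sample from a finite population so that every element has the same chance of being chosen. The population of 1,000 numbered items and the helper name `draw_random_sample` are hypothetical.

```python
import random


def draw_random_sample(population, size, seed=None):
    """Draw `size` distinct elements, each equally likely to be selected."""
    rng = random.Random(seed)
    # random.sample chooses without replacement, uniformly at random.
    return rng.sample(population, k=size)


population = list(range(1, 1001))  # hypothetical population: items 1..1000
sample = draw_random_sample(population, size=10, seed=42)
print(sample)
```

Fixing the seed only makes the example repeatable. In a real study the selection would not be predetermined.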



Measures of Sample Location

The sample mean is the most important single statistic measuring the location of a sample.

What is the common term for the sample mean?

The numerical average of the sample observations. How is this calculated?

The sum of the observations divided by the sample size n.

The sample mean is an estimate of the population mean.

For a set of n observations, $x_1, x_2, \ldots, x_n$, the sample mean is calculated as follows:

\[
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}
\]
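The sample mean calculation can be sketched directly from its definition as the sum of the observations divided by n. The data values are made-up run times in seconds, used only for illustration, and the result is cross-checked against the standard library's `statistics.mean`.

```python
from statistics import mean


def sample_mean(observations):
    """Sum of the observations divided by the sample size n."""
    n = len(observations)
    if n == 0:
        raise ValueError("need at least one observation")
    return sum(observations) / n


data = [9.8, 10.8, 10.1, 9.9, 10.4]  # hypothetical run times, in seconds
print(sample_mean(data))
print(mean(data))  # standard-library result, for comparison
```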



Variability Measures: Variance

Sample variability is critical to statistical calculations.

Sample variance and standard deviation are the most important measures of variability. Does anyone know how to calculate variance?

For a set of n observations, $x_1, x_2, \ldots, x_n$, the sample variance, $s^2$, is calculated as follows:

\[
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
    = \frac{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}{n(n - 1)}
\]

n - 1 is called the degrees of freedom associated with the variance. This is the number of independent squared deviations, or pieces of information, that make up $s^2$.
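The sample variance can be computed either from the squared deviations about the mean or from the equivalent shortcut form that uses only the sum and the sum of squares. This sketch implements both and compares them with `statistics.variance`. The data values are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from statistics import variance


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Equivalent form: (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))


data = [2, 4, 4, 5, 7, 8]  # hypothetical observations; mean is 5
print(sample_variance(data))           # 24 / 5 = 4.8
print(sample_variance_shortcut(data))  # (6 * 174 - 900) / 30 = 4.8
print(variance(data))                  # standard-library check
```

The two forms agree algebraically. With floating-point data whose mean is large relative to its spread, the shortcut form can lose precision, so the deviation form is usually the safer one in code.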


Measures of Sample Variability

The standard deviation, s, is the square root of the variance. What are the units of the standard deviation?

What does it mean if the variance (and thus the standard deviation) is large?

Range is another measure of sample variability.

How is the range of a sample of data calculated? The range is equal to $x_{max} - x_{min}$.
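Both measures on this slide are short calculations. The sketch below uses hypothetical data and illustrative function names. Because the standard deviation is the square root of a quantity in squared units, it comes out in the same units as the observations themselves, as does the range.

```python
from math import sqrt
from statistics import stdev


def sample_std(xs):
    """Square root of the sample variance (divisor n - 1)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(xs) / n
    return sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))


def sample_range(xs):
    """Largest observation minus smallest observation."""
    return max(xs) - min(xs)


data = [2, 4, 4, 5, 7, 8]  # hypothetical observations
print(sample_std(data))    # square root of 4.8
print(stdev(data))         # standard-library check
print(sample_range(data))  # 8 - 2 = 6
```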


Frequency Histogram

Frequency histogram:

Given a sample of data points, we divide the data into equally spaced intervals, and count the number of data points that fall into each interval.

A histogram is a bar chart with the length of each bar proportional to the number of observations in that interval.

A histogram for a sample will be an approximation of the probability distribution of the population.
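The counting step behind a frequency histogram can be sketched as follows. The data, the number of intervals, and the function name are all hypothetical, and the convention that the largest value falls into the last interval is an assumption of this sketch rather than something the notes specify.

```python
def frequency_histogram(data, num_bins):
    """Split [min, max] into equally spaced intervals and count points in each."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all observations are identical; no interval width")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # The maximum value would land one past the end, so clamp it
        # into the last interval.
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # hypothetical sample
edges, counts = frequency_histogram(data, num_bins=4)

# Text bar chart: bar length is proportional to the count in each interval.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"[{left:.1f}, {right:.1f}) {'#' * count}")
```

For this sample the counts are 3, 5, 1, and 1. The long run of empty space before the single point at 9 is the kind of right-hand tail that a mean and standard deviation alone would not reveal.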

Probability distributions:

Show much more about a population than just the mean and standard deviation.

A distribution may be symmetric, or may be skewed to the right or the left.

The tail of a distribution shows the distance from the mean of the outlying points (for example, the 95th percentile point).
