What is statistics? Statistics is a tool for creating an understanding from a set of numbers.


Upload: helena-harper

Post on 25-Dec-2015


TRANSCRIPT

Page 1: What is statistic. Statistics is a tool for creating an understanding from a set of numbers

What is statistics?


• Statistics is a tool for creating an understanding from a set of numbers.


An Example: Stats Anxiety.


Key Statistical Concepts. . .

• Population
  – A population is the group of all items of interest to a statistics practitioner.
  – Frequently very large; sometimes infinite.
  – E.g., all 5 million Florida voters.
• Sample
  – A sample is a set of data drawn from the population.
  – Potentially very large, but less than the population.
  – E.g., a sample of 765 voters exit polled on election day.


• Parameter
  – A descriptive measure of a population.
• Statistic
  – A descriptive measure of a sample.


Descriptive Statistics. . .

• . . . are methods of organizing, summarizing, and presenting data in a convenient and informative way. These methods include:
  – Graphical Techniques, and
  – Numerical Techniques.
• The actual method used depends on what information we would like to extract.


Inferential Statistics. . .

• Descriptive statistics describes the data set that’s being analyzed, but doesn’t allow us to draw any conclusions or make any inferences about the data. Hence we need another branch of statistics: inferential statistics.

• Inferential statistics is also a set of methods, but it is used to draw conclusions or inferences about characteristics of populations based on data from a sample.


• We use statistics to make inferences about parameters.
• Therefore, we can make an estimate, prediction, or decision about a population based on sample data.
• Thus, we can apply what we know about a sample to the larger population from which it was drawn!
• Rationale:
  – Large populations make investigating each member impractical and expensive.
  – Easier and cheaper to take a sample and make estimates about the population from the sample.


• However: Such conclusions and estimates are not always going to be correct. For this reason, we build into the statistical inference “measures of reliability,” namely confidence level and significance level.


Confidence and Significance Levels. . .

• The confidence level is the proportion of times that an estimating procedure will be correct.

• E.g., a confidence level of 95% means that estimates based on this form of statistical inference will be correct 95% of the time.

• When the purpose of the statistical inference is to draw a conclusion about a population, the significance level measures how frequently the conclusion will be wrong in the long run.

• E.g. a 5% significance level means that, in the long run, this type of conclusion will be wrong 5% of the time.


• If we use α (Greek letter “alpha”) to represent significance, then our confidence level is 1 − α.

• This relationship can also be stated as: Confidence Level + Significance Level = 1

• Consider a statement from polling data you may hear about in the news: “This poll is considered accurate within 3.4 percentage points, 19 times out of 20.”

• In this case, our confidence level is 95% (19/20 = 0.95), while our significance level is 5%.
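
The "correct 95% of the time" reading of a confidence level can be checked by simulation. The sketch below is only an illustration under stated assumptions: the population is normal with a known standard deviation, the interval is the usual z-interval with z = 1.96, and the function name and all numeric settings are hypothetical rather than taken from the slides.

```python
import random
import statistics

def coverage(true_mean=50.0, sigma=10.0, n=30, trials=2000, z=1.96, seed=1):
    """Fraction of simulated 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    # Half-width of a known-sigma z-interval (an assumption of this sketch).
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        if abs(xbar - true_mean) <= half_width:
            hits += 1
    return hits / trials

print(coverage())  # should land close to 0.95
```

The long-run proportion of intervals that miss the true mean plays the role of the significance level α.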


Random Variables. .

• Variability is omnipresent in the business world. To model variability probabilistically, we need the concept of a random variable.

• A random variable is a numerically valued variable which takes on different values with given probabilities.


• Examples:
  – The return on an investment in a one-year period
  – The price of an equity
  – The number of customers entering a store
  – The sales volume of a store on a particular day
  – The turnover rate at your organization next year


Types of Random Variables. . .

• Discrete Random Variable:
  – one that takes on a countable number of possible values, e.g.,
    • total of roll of two dice: 2, 3, . . . , 12
    • number of desktops sold: 0, 1, . . .
    • customer count: 0, 1, . . .


• Continuous Random Variable:
  – one that takes on an uncountable number of possible values, e.g.,
    • interest rate: 3.25%, 6.125%, . . .
    • task completion time: a nonnegative value
    • price of a stock: a nonnegative value
• Basic Concept: Integer or rational numbers are discrete, while real numbers are continuous.


Probability Distributions. . .

• Random variables have values that are determined by chance events. The future price of a share of stock is a random variable because its value is determined by chance factors such as market conditions, the accomplishment of revenue targets by the company, interest rates, and so on.


• Random variables can be either discrete or continuous. A random variable is discrete if it can assume only a finite number of values or if its values are distinct and separate units.

• For example, the number of boxes of cookies produced during a given shift is a discrete random variable, because each box is a distinct, whole unit; a manufacturer would not produce or measure half a box of cookies.


• Continuous random variables can assume any value along a continuum. Consider boxes of cookies again. The weight of a box of cookies is a continuous random variable because it can be measured using an infinite range of fractional values.

• For example, the weight could assume values such as 16 ounces, 16.24 ounces, 16.2411 ounces, or any of a range of fractional values.


• Consider the experiment of tossing a single die. Define X as the number of spots on the up face of the die after a toss. Then R = {1, 2, 3, 4, 5, 6}. Assume the die is loaded so that the probability that a given face lands up is proportional to the number of spots showing. The discrete probability distribution for this random experiment is given by P(X = x) = x/21 for x = 1, 2, . . . , 6, since the probabilities must sum to 1 and 1 + 2 + · · · + 6 = 21.
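
The loaded-die distribution can be tabulated directly from the proportionality rule. This is a minimal sketch; the table on the original slide is not in the transcript, so the variable names here are illustrative only.

```python
from fractions import Fraction

faces = range(1, 7)
total = sum(faces)  # 21, the normalizing constant

# P(X = x) is proportional to x, so divide each face value by the total.
pmf = {x: Fraction(x, total) for x in faces}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```

Exact fractions make it easy to confirm that the probabilities sum to 1.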


Population Mean — Expected Value. . .

• The population mean is the weighted average of all of its values. The weights are specified by the probability mass function. This parameter is also called the expected value of X and is denoted by E(X).

• The formal definition is similar to computing sample mean for grouped data:
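
The formula itself did not survive in this transcript. The standard definition for a discrete random variable X with probability mass function P(x), matching the weighted-average description, would be the following (using µ for the population mean is an assumption borrowed from the variance slide):

```latex
E(X) = \mu = \sum_{\text{all } x} x \, P(x)
```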


• Example: Expected No. of TVs
• Let X be the number of TVs in a household.
• Then, E(X) = 0 · 0.012 + 1 · 0.319 + · · · + 5 · 0.028 = 2.084


Population Variance. . .

• The population variance is calculated similarly. It is the weighted average of the squared deviations from the mean. Formally

• Since (2) is an expected value (of (X − µ)²), it should be interpreted as the long-run average of squared deviations from the mean. Thus, the parameter σ² is a measure of the extent of variability in successive realizations of X.
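
Both weighted averages are easy to compute once a probability mass function is written down. The sketch below reuses the loaded die from the earlier slide (P(X = x) proportional to x) because the full TV table is not in the transcript. The variance is computed as the sum of (x − µ)² P(x), the standard definition that the text describes; the function names are illustrative.

```python
from fractions import Fraction

# Loaded die from the earlier slide: P(X = x) is proportional to x.
pmf = {x: Fraction(x, 21) for x in range(1, 7)}

def expected_value(pmf):
    """Probability-weighted average of the values."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Probability-weighted average of squared deviations from the mean."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

mu = expected_value(pmf)
var = variance(pmf)
print(mu, float(mu))    # 13/3, about 4.33
print(var, float(var))  # 20/9, about 2.22
```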


• 1. Terminals on an on-line computer system are attached to a communication line to the central computer system. The probability that any terminal is ready to transmit is 0.95.
  – Let X = number of terminals polled until the first ready terminal is located.

• 2. Toss a coin repeatedly.
  – Let X = number of tosses to first head

• 3. It is known that 20% of products on a production line are defective. Products are inspected until first defective is encountered.
  – Let X = number of inspections to obtain first defective
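
The slide that names the distribution for these three examples is missing from the transcript. Each one counts independent trials until the first success, which is usually modelled with the geometric distribution, so the sketch below assumes that model with P(X = k) = (1 − p)^(k − 1) p. The function name is illustrative.

```python
def geometric_pmf(k, p):
    """P(X = k): the first success occurs on trial k (k = 1, 2, ...)."""
    if k < 1:
        return 0.0
    return (1 - p) ** (k - 1) * p

# Example 1: a terminal is ready with probability 0.95.
print(geometric_pmf(1, 0.95))  # first terminal polled is ready
print(geometric_pmf(2, 0.95))  # first not ready, second ready

# Example 3: 20% defective; under this model the mean number of inspections is 1/p.
print(1 / 0.20)
```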


Poisson distribution

• The Poisson distribution is a discrete distribution. It is often used as a model for the number of events (such as the number of telephone calls at a business, number of customers in waiting lines, number of defects in a given surface area, airplane arrivals, or the number of accidents at an intersection) in a specific time period.

• The major difference between Poisson and Binomial distributions is that the Poisson does not have a fixed number of trials. Instead, it uses the fixed interval of time or space in which the number of successes is recorded.


• λ is the parameter which indicates the average number of events in the given time interval.

• Parameters: The mean is λ. The variance is λ.


• Consider a computer system with Poisson job-arrival stream at an average of 2 per minute. Determine the probability that in any one-minute interval there will be
  – (i) 0 jobs;
  – (ii) exactly 2 jobs;
  – (iii) at most 3 arrivals.
  – (iv) What is the maximum number of jobs that should arrive in one minute with 90% certainty?
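
The exercise can be worked numerically with the standard Poisson probability mass function, P(X = k) = e^(−λ) λ^k / k!. The worked solution is not in the transcript, so this is only a sketch. In particular, part (iv) is interpreted here as the smallest k for which P(X ≤ k) is at least 0.90, which is an assumption about what the question intends.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) = e^(-lam) * lam^k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k)"""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 2  # average arrivals per minute, from the exercise

p_zero = poisson_pmf(0, lam)       # (i)
p_two = poisson_pmf(2, lam)        # (ii)
p_at_most_3 = poisson_cdf(3, lam)  # (iii)

# (iv) read here as: smallest k with P(X <= k) >= 0.90
k = 0
while poisson_cdf(k, lam) < 0.90:
    k += 1

print(round(p_zero, 4), round(p_two, 4), round(p_at_most_3, 4), k)
```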


Hypergeometric Distribution

• The probability distribution of a hypergeometric random variable is called a hypergeometric distribution

• The following notation is helpful, when we talk about hypergeometric distributions and hypergeometric probability.

• N: The number of items in the population.
• k: The number of items in the population that are classified as successes.
• n: The number of items in the sample.
• x: The number of items in the sample that are classified as successes.
• kCx: The number of combinations of k things, taken x at a time.
• h(x; N, n, k): hypergeometric probability - the probability that an n-trial hypergeometric experiment results in exactly x successes, when the population consists of N items, k of which are classified as successes.


Hypergeometric Experiments

• A hypergeometric experiment is a statistical experiment that has the following properties:

• A sample of size n is randomly selected without replacement from a population of N items.

• In the population, k items can be classified as successes, and N - k items can be classified as failures.

• Consider the following statistical experiment. You have an urn of 10 marbles - 5 red and 5 green. You randomly select 2 marbles without replacement and count the number of red marbles you have selected. This would be a hypergeometric experiment.


Hypergeometric Distribution

• A hypergeometric random variable is the number of successes that result from a hypergeometric experiment. The probability distribution of a hypergeometric random variable is called a hypergeometric distribution.

• Given x, N, n, and k, we can compute the hypergeometric probability based on the following formula:

• Hypergeometric Formula. Suppose a population consists of N items, k of which are successes, and a random sample drawn from that population consists of n items, x of which are successes. Then the hypergeometric probability is:

  h(x; N, n, k) = [ kCx ] [ N-kCn-x ] / [ NCn ]


• The hypergeometric distribution has the following properties:

• The mean of the distribution is equal to n * k / N .

• The variance is n * k * ( N - k ) * ( N - n ) / [ N² * ( N - 1 ) ] .


Example 1

• Suppose we randomly select 5 cards without replacement from an ordinary deck of playing cards. What is the probability of getting exactly 2 red cards (i.e., hearts or diamonds)?

• Solution: This is a hypergeometric experiment in which we know the following:
  – N = 52; since there are 52 cards in a deck.
  – k = 26; since there are 26 red cards in a deck.
  – n = 5; since we randomly select 5 cards from the deck.
  – x = 2; since 2 of the cards we select are red.

• We plug these values into the hypergeometric formula as follows:
  h(x; N, n, k) = [ kCx ] [ N-kCn-x ] / [ NCn ]
  h(2; 52, 5, 26) = [ 26C2 ] [ 26C3 ] / [ 52C5 ]
  h(2; 52, 5, 26) = [ 325 ] [ 2600 ] / [ 2,598,960 ] = 0.32513

• Thus, the probability of randomly selecting 2 red cards is 0.32513.
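
The card example can be reproduced with Python's built-in `math.comb`. This is a minimal sketch that follows the h(x; N, n, k) notation and the mean and variance expressions from the slides; the function name is illustrative.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k) = C(k, x) * C(N - k, n - x) / C(N, n)"""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Exactly 2 red cards in a 5-card hand.
p = hypergeom_pmf(2, 52, 5, 26)
print(round(p, 5))

# Mean and variance from the properties slide.
N, n, k = 52, 5, 26
mean = n * k / N
variance = n * k * (N - k) * (N - n) / (N ** 2 * (N - 1))
print(mean, round(variance, 4))
```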


Multinomial

• The Binomial distribution was based on having a series of events that could take on only two states: success/failure, sick/well, heads/tails, et cetera.

• But what if there are several possible events, like left/right/center, or Africa/Eurasia/Australia/Americas? The Multinomial distribution extends the Binomial distribution for such cases.

• The Binomial case could be expressed with one parameter, p, which indicated success with probability p and failure with probability 1 − p. The Multinomial case requires k variables, p1, . . . , pk, such that p1 + p2 + · · · + pk = 1.


• The binomial distribution allows one to compute the probability of obtaining a given number of binary outcomes. For example, it can be used to compute the probability of getting 6 heads out of 10 coin flips. The flip of a coin is a binary outcome because it has only two possible outcomes: heads and tails. The multinomial distribution can be used to compute the probabilities in situations in which there are more than two possible outcomes.


• For example, suppose that two chess players had played numerous games and it was determined that the probability that Player A would win is 0.40, the probability that Player B would win is 0.35, and the probability that the game would end in a draw is 0.25. The multinomial distribution can be used to answer questions such as: "If these two chess players played 12 games, what is the probability that Player A would win 7 games, Player B would win 2 games, and the remaining 3 games would be drawn?" The following formula gives the probability of obtaining a specific set of outcomes when there are three possible outcomes for each event:
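
The formula image is not in this transcript. The standard multinomial probability for three outcomes, written with the symbols defined on the next slide, would be:

```latex
p = \frac{n!}{n_1!\, n_2!\, n_3!} \; p_1^{n_1} \, p_2^{n_2} \, p_3^{n_3}
```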


• where
  – p is the probability,
  – n is the total number of events,
  – n1 is the number of times Outcome 1 occurs,
  – n2 is the number of times Outcome 2 occurs,
  – n3 is the number of times Outcome 3 occurs,
  – p1 is the probability of Outcome 1,
  – p2 is the probability of Outcome 2, and
  – p3 is the probability of Outcome 3.


• For the chess example,
  – n = 12 (12 games are played),
  – n1 = 7 (number won by Player A),
  – n2 = 2 (number won by Player B),
  – n3 = 3 (the number drawn),
  – p1 = 0.40 (probability Player A wins),
  – p2 = 0.35 (probability Player B wins),
  – p3 = 0.25 (probability of a draw).
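
Plugging the chess values into the standard multinomial formula, n! / (n1! n2! n3!) · p1^n1 · p2^n2 · p3^n3, can be done in a few lines. This is a sketch with an illustrative function name; the numeric answer from the original slides is not in the transcript.

```python
from math import factorial

def multinomial_prob(counts, probs):
    """n! / (n1! * n2! * ...) * p1^n1 * p2^n2 * ..."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)  # exact integer division at every step
    prob = float(coef)
    for c, p in zip(counts, probs):
        prob *= p ** c
    return prob

# Player A wins 7, Player B wins 2, and 3 draws out of 12 games.
p = multinomial_prob([7, 2, 3], [0.40, 0.35, 0.25])
print(round(p, 4))
```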


Continuous Distributions


Normal Distribution


The lognormal distribution

• A random variable x is lognormally distributed if ln(x) is normally distributed
  – If x is normal, and ln(y) = x (or $y = e^{x}$), then y is lognormal
  – If continuously compounded stock returns are normal then the stock price is lognormally distributed

• Product of lognormal variables is lognormal
  – If $x_1$ and $x_2$ are normal, then $y_1 = e^{x_1}$ and $y_2 = e^{x_2}$ are lognormal.
  – The product of $y_1$ and $y_2$: $y_1 \times y_2 = e^{x_1} \times e^{x_2} = e^{x_1 + x_2}$
  – Since $x_1 + x_2$ is normal, $e^{x_1 + x_2}$ is lognormal


Lognormal Distribution – Probability Density Function

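• A random variable X is said to have the Lognormal Distribution with parameters µ and σ, where µ > 0 and σ > 0, if the probability density function of X is:

$$f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \, e^{-\frac{1}{2\sigma^2} (\ln x - \mu)^2}, \quad \text{for } x > 0$$

$$f(x) = 0, \quad \text{for } x \le 0$$

[Figure: plot of f(x) against x]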


• If X ~ LN(µ, σ), then Y = ln(X) ~ N(µ, σ)


Lognormal Distribution - Probability Distribution Function

$$F(x) = P(X \le x) = \Phi\!\left(\frac{\ln x - \mu}{\sigma}\right)$$

where Φ(z) is the cumulative probability distribution function of N(0,1)


Lognormal Distribution - Example

• A theoretical justification based on a certain material failure mechanism underlies the assumption that ductile strength X of a material has a lognormal distribution.

• If the parameters are µ=5 and σ=0.1, find:
  (a) µx and σx
  (b) P(X > 120)
  (c) P(110 ≤ X ≤ 130)
  (d) The median ductile strength
  (e) The expected number having strength at least 120, if ten different samples of an alloy steel of this type were subjected to a strength test.
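
The worked solution slides are not in the transcript, so the following is only a sketch of how the five parts could be computed. It assumes µ and σ are the mean and standard deviation of ln(X), and it uses the standard lognormal results: mean exp(µ + σ²/2), standard deviation equal to the mean times sqrt(exp(σ²) − 1), and median exp(µ). Part (e) is treated as ten independent tests.

```python
import math
from statistics import NormalDist

mu, sigma = 5.0, 0.1       # parameters of ln(X), from the exercise
std_normal = NormalDist()  # N(0, 1)

def lognormal_cdf(x):
    """P(X <= x) = Phi((ln x - mu) / sigma) for x > 0."""
    return std_normal.cdf((math.log(x) - mu) / sigma)

mean_x = math.exp(mu + sigma ** 2 / 2)               # (a) mean of X
sd_x = mean_x * math.sqrt(math.exp(sigma ** 2) - 1)  # (a) standard deviation of X
p_gt_120 = 1 - lognormal_cdf(120)                    # (b)
p_110_130 = lognormal_cdf(130) - lognormal_cdf(110)  # (c)
median_x = math.exp(mu)                              # (d)
expected_count = 10 * p_gt_120                       # (e)

for name, value in [("mean", mean_x), ("sd", sd_x), ("P(X>120)", p_gt_120),
                    ("P(110<=X<=130)", p_110_130), ("median", median_x),
                    ("expected count", expected_count)]:
    print(f"{name}: {value:.4f}")
```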


The probability density function of a log-normal distribution is


Negative binomial

• Say that we have a sequence of Bernoulli draws. How many failures will we see before we see n successes? If p percent of cars are illegally parked, and a meter reader hopes to write n parking tickets, the Negative binomial tells her the odds that she will be able to stop with n + x cars.
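
The parking-ticket question can be sketched with the standard negative binomial probability mass function, P(x failures before the n-th success) = C(x + n − 1, x) pⁿ (1 − p)ˣ. The slide with the formula is not in the transcript, and the numbers below (10% illegally parked, 3 tickets) are hypothetical values chosen only for illustration.

```python
from math import comb

def neg_binomial_pmf(x, n, p):
    """P(exactly x failures before the n-th success), success probability p."""
    return comb(x + n - 1, x) * p ** n * (1 - p) ** x

# Hypothetical numbers: 10% of cars illegally parked, 3 tickets wanted.
p, n = 0.10, 3
for x in (0, 10, 27):
    # Probability that the reader stops after exactly n + x cars.
    print(n + x, neg_binomial_pmf(x, n, p))
```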


Gamma distribution

• A better name for the Gamma distribution in the statistical context would be ‘Negative Poisson,’ because it relates to the Poisson distribution in the same way the Negative binomial relates to the Binomial.

• If the timing of events follows a Poisson distribution, meaning that events come by at the rate of λ per period, then this distribution tells us how long we would have to wait until the nth event occurs.

• The form of the Gamma distribution is typically expressed in terms of a scale parameter θ ≡ 1/λ, where λ is the Poisson parameter.
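
The waiting-time interpretation can be checked by simulation. The sketch below assumes that gaps between Poisson events are exponential with rate λ, so the wait for the nth event is a sum of n such gaps. It compares that with direct Gamma draws using shape n and θ = 1/λ passed as the scale argument of `random.gammavariate`. The values n = 3 and λ = 2 are hypothetical.

```python
import random

def wait_for_nth_event(n, lam, rng):
    """Waiting time until the n-th event: sum of n exponential gaps with rate lam."""
    return sum(rng.expovariate(lam) for _ in range(n))

rng = random.Random(0)
n, lam = 3, 2.0   # hypothetical: third event, 2 events per period
theta = 1 / lam   # theta = 1/lambda, used as the scale of the Gamma draws

simulated = [wait_for_nth_event(n, lam, rng) for _ in range(20000)]
direct = [rng.gammavariate(n, theta) for _ in range(20000)]

mean_sim = sum(simulated) / len(simulated)
mean_direct = sum(direct) / len(direct)
print(mean_sim, mean_direct, n * theta)  # both sample means should be near n * theta
```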


• Just as the Gamma distribution is named for the Gamma function, the Beta distribution is named after the Beta function—whose parameters are typically notated as α and β


Bivariate Distributions. . .

• Up to now, we have looked at univariate distributions, i.e., probability distributions in one variable.

• Bivariate distributions, also called joint distributions, are probabilities of combinations of two variables.

• For discrete variables X and Y , the joint probability distribution or joint probability mass function of X and Y is defined as:

• P(x, y) ≡ P(X = x and Y = y) for all pairs of values x and y.
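
A small joint probability mass function makes the definition concrete. The example below is illustrative and not from the slides: toss a fair coin twice, let X be the number of heads on the first toss and Y the total number of heads. Summing the joint probabilities over one variable gives the distribution of the other (its marginal).

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Illustrative setup: two fair coin tosses.
# X = heads on the first toss (0 or 1), Y = total heads (0, 1, or 2).
joint = defaultdict(Fraction)
for first, second in product((0, 1), repeat=2):
    joint[(first, first + second)] += Fraction(1, 4)

def marginal(joint, axis):
    """Sum the joint pmf over the other variable."""
    out = defaultdict(Fraction)
    for pair, p in joint.items():
        out[pair[axis]] += p
    return dict(out)

print(dict(joint))         # P(x, y) for every pair with positive probability
print(marginal(joint, 0))  # distribution of X
print(marginal(joint, 1))  # distribution of Y
```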


Marginal Probabilities


• Covariance and correlation describe how two variables are related.

• Variables are positively related if they move in the same direction.

• Variables are inversely related if they move in opposite directions.

• Both covariance and correlation indicate whether variables are positively or inversely related. Correlation also tells you the degree to which the variables tend to move together.


• You are probably already familiar with statements about covariance and correlation that appear in the news almost daily.

• For example, you might hear that as economic growth increases, stock market returns tend to increase as well. These variables are said to be positively related because they move in the same direction. You may also hear that as world oil production increases, gasoline prices fall. These variables are said to be negatively, or inversely, related because they move in opposite directions.


• To determine the actual relationships of these variables, you would use the formulas for covariance and correlation.

Covariance

• Covariance indicates how two variables are related. A positive covariance means the variables are positively related, while a negative covariance means the variables are inversely related. The formula for calculating covariance of sample data is shown below.
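
The formula image is missing from this transcript. A common form of the sample covariance, assuming the usual n − 1 denominator (the original slide may have used a different convention), is:

```latex
\mathrm{COV}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}
```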


• To understand how covariance is used, consider the table below, which describes the rate of economic growth (xi) and the rate of return on the S&P 500 (yi).

• Using the covariance formula, you can determine whether economic growth and S&P 500 returns have a positive or inverse relationship. Before you compute the covariance, calculate the mean of x and y


Correlation

• Correlation also tells you the degree to which the variables tend to move together.

• Covariance depends on the units in which the variables are measured. Using covariance, you can determine whether the variables move in the same or in opposite directions, but you cannot measure the degree to which they move together, because covariance is not expressed in one standard unit of measurement. To measure the degree to which variables move together, you must use correlation.


• Correlation standardizes the measure of interdependence between two variables and, consequently, tells you how closely the two variables move. The correlation measurement, called a correlation coefficient, will always take on a value between 1 and – 1:


• If the correlation coefficient is one, the variables have a perfect positive correlation. This means that if one variable moves a given amount, the second moves proportionally in the same direction.

• If the correlation coefficient is zero, no linear relationship exists between the variables. If one variable moves, you can make no linear prediction about the movement of the other variable; they are uncorrelated.

• If the correlation coefficient is –1, the variables are perfectly negatively correlated (or inversely correlated) and move in opposition to each other. If one variable increases, the other variable decreases proportionally.


• To understand how correlation is used, consider the table below, which describes the rate of economic growth (xi) and the rate of return on the S&P 500 (yi).

• Using the correlation formula, you can determine whether economic growth and S&P 500 returns have a positive or inverse relationship.


• You know that the covariance of S&P 500 returns and economic growth was calculated to be 1.53. Now you need to determine the standard deviation of each of the variables. You would calculate the standard deviation of the S&P 500 returns and of the economic growth.


• Using the information from above, you know that
  – COV(x,y) = 1.53
  – sx = 0.90
  – sy = 2.58

• Now you can calculate the correlation coefficient by substituting the numbers above into the correlation formula, as shown below.
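
The substitution can be sketched as follows, assuming the usual definition of the correlation coefficient as the covariance divided by the product of the two standard deviations. The formula image is not in the transcript, and the function name is illustrative.

```python
def correlation(cov_xy, s_x, s_y):
    """r = COV(x, y) / (s_x * s_y)"""
    return cov_xy / (s_x * s_y)

# Values quoted on the slide above.
r = correlation(1.53, 0.90, 2.58)
print(round(r, 2))
```

Rounded to two decimals this agrees with the coefficient of .66 quoted on the next slide.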


• A correlation coefficient of .66 tells you two important things:

• Because the correlation coefficient is a positive number, returns on the S&P 500 and economic growth are positively related.

• Because .66 is relatively far from zero (the value that would indicate no correlation), the correlation between returns on the S&P 500 and economic growth is fairly strong.