hypothesis testing

Fundamentals of Hypothesis Testing (Part I)

presented by Zoheb Alam Khan

Learning objectives

• What is a hypothesis?
• Types of hypothesis
• Normal distribution curve
• Hypothesis testing
– Level of significance
– Types of errors
– p value
– One & two tail tests
– Degree of freedom
• Data analysis

What is a Hypothesis?

• An educated guess

• A tentative point of view

• A proposition not yet tested

• A preliminary explanation

• A preliminary Postulate

• A hypothesis is a claim (assumption) about a population parameter

Various authors

• “A hypothesis is a conjectural statement of the relation between two or more variables.” (Kerlinger, 1956)

• “Hypothesis are single tentative guesses, good hunches – assumed for use in devising theory or planning experiments intended to be given a direct experimental test when possible”. (Eric Rogers, 1966)

• “Hypothesis is a formal statement that presents the expected relationship between an independent and dependent variable.” (Creswell, 1994)

• “A hypothesis is a logical supposition, a reasonable guess, an educated conjecture. It provides a tentative explanation for a phenomenon under investigation.” (Leedy and Ormrod, 2001)

A Hypothesis:
• must make a prediction
• must identify at least two variables
• should have an elucidating power
• should strive to furnish an acceptable explanation or accounting of a fact
• must be falsifiable, meaning hypotheses must be capable of being refuted based on the results of the study
• must be formulated in simple, understandable terms
• should correspond with existing knowledge
• In general, a hypothesis needs to be unambiguous, specific, quantifiable, testable and generalizable.

Characteristics of a Testable Hypothesis

1. A hypothesis must be conceptually clear
– concepts should be clearly defined
– the definitions should be commonly accepted
– the definitions should be easily communicable

2. The hypothesis should have empirical reference
– Variables in the hypothesis should be empirical realities
– If they are not, it would not be possible to make the observation and ultimately the test

3. The hypothesis must be specific
– Place, situation and operation

4. A hypothesis should be related to available techniques of research
– Either the techniques are already available or
– The researcher should be in a position to develop suitable techniques

5. The hypothesis should be related to a body of theory
– The hypothesis has to be supported by theoretical argumentation
– It should depend on the existing body of knowledge
In this way
– the study could benefit from the existing knowledge and
– later on, through testing, the hypothesis could contribute to the reservoir of knowledge

Categorizing Hypotheses
Can be categorized in different ways

1. Based on their formulation
• Null Hypotheses and Alternate Hypotheses

2. Based on direction
• Directional and Non-directional Hypothesis

3. Based on their derivation
• Inductive and Deductive Hypotheses

The Null Hypothesis, H0
• States the claim or assertion to be tested
• Is always about a population parameter, not about a sample statistic
• Testing begins with the assumption that the null hypothesis is true
– Similar to the notion of innocent until proven guilty
• Refers to the status quo
• Always contains the “=”, “≤” or “≥” sign
• May or may not be rejected
• States that the independent variable has no effect and that there will be no difference between the two groups.

The Alternative Hypothesis, H1
• Is the opposite of the null hypothesis
• Challenges the status quo
• Never contains the “=”, “≤” or “≥” sign
• May or may not be proven
• Is generally the hypothesis that the researcher is trying to prove
• States that the independent variable has an effect and that there will be a difference between the two groups.

Categorizing Hypotheses (Cont…)
2. Directional Hypothesis and Non-directional Hypothesis

• Simply based on the wording of the hypothesis we can tell the difference between directional and non-directional

– If the hypothesis simply predicts that there will be a difference between the two groups, then it is a non-directional hypothesis. It is non-directional because it predicts that there will be a difference but does not specify how the groups will differ.

– If, however, the hypothesis uses so-called comparison terms, such as “greater,” “less,” “better,” or “worse,” then it is a directional hypothesis. It is directional because it predicts that there will be a difference between the two groups and it specifies how the two groups will differ.

3. Inductive and Deductive Hypotheses (Theory Building and Theory Testing)

• Classified in terms of how they were derived:
– Inductive hypothesis: a generalization based on observation
– Deductive hypothesis: derived from theory

• Deductive (theory testing): Theory → Hypothesis → Observation → Confirmation
• Inductive (theory building): Observation → Pattern → Hypothesis → Theory

Normal Distribution Curve
• A normal distribution curve is a symmetrical, bell-shaped curve defined by the mean and standard deviation of a data set.
• The normal curve is a probability distribution with a total area under the curve of 1.
• The mean of the data in a standard normal distribution is 0 and the standard deviation is 1.
• A standard normal distribution is the set of all z-scores.
• One standard deviation away from the mean (μ) in either direction on the horizontal axis accounts for around 68 percent of the data. Two standard deviations away from the mean account for roughly 95 percent of the data, and three standard deviations for about 99.7 percent of the data.
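The 68 / 95 / 99.7 percentages can be checked numerically. A minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `area_within` is only illustrative:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k SDs."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Share of the data within 1, 2 and 3 standard deviations of the mean.
coverage = {k: area_within(k) for k in (1, 2, 3)}
for k, area in coverage.items():
    print(f"within {k} SD of the mean: {area:.4f}")
```

The printed areas are the exact values that the rounded 68, 95 and 99.7 percent figures approximate.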


6 Steps in Hypothesis Testing
1. State the null hypothesis, H0, and the alternative hypothesis, H1.
2. Choose the level of significance, α, and the sample size, n.
3. Determine the appropriate test statistic (two-tail or one-tail, Z or t distribution) and sampling distribution.
4. Determine the critical values that divide the rejection and non-rejection regions, based mainly on three criteria: (i) significance level, (ii) degrees of freedom, (iii) one- or two-tailed test.
5. Collect data and compute the value of the test statistic.
6. Make the statistical decision and state the managerial conclusion. If the test statistic falls into the non-rejection region, do not reject the null hypothesis H0. If the test statistic falls into the rejection region, reject the null hypothesis. Express the managerial conclusion in the context of the problem.

Steps in Hypothesis Testing (flowchart)

Problem definition → Clearly state the null and alternate hypotheses → Choose the relevant test and the appropriate probability distribution → Determine the significance level → Determine the degrees of freedom → Decide if one- or two-tailed test → Choose the critical value → Compute relevant test statistic → Compare test statistic and critical value → Does the test statistic fall in the critical region?
– Yes: Reject null
– No: Do not reject null

Level of Significance, α
• Defines the unlikely values of the sample statistic if the null hypothesis is true
• Indicates the percentage of sample means that lie outside the cut-off limits (critical values)
• It is the maximum probability of rejecting the null hypothesis when it is true.
– Defines the rejection region of the sampling distribution
• Is designated by α (level of significance)
– Typical values are 0.01, 0.05, or 0.10
• Is selected by the researcher at the beginning
• Provides the critical value(s) of the test
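How α provides the critical value(s) can be sketched for a Z test. This assumes a standard normal sampling distribution and uses the standard-library `NormalDist`; `z_critical` is an illustrative helper, not a fixed API:

```python
from statistics import NormalDist

def z_critical(alpha: float, tail: str = "two") -> float:
    """Positive critical z value for a given level of significance.

    tail="two": alpha is split equally between both tails.
    tail="one": all of alpha sits in a single tail.
    """
    z = NormalDist()
    if tail == "two":
        return z.inv_cdf(1 - alpha / 2)
    if tail == "one":
        return z.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two' or 'one'")

# Critical values for the typical levels of significance.
for alpha in (0.01, 0.05, 0.10):
    print(alpha, round(z_critical(alpha, "two"), 3), round(z_critical(alpha, "one"), 3))
```

For a lower-tail test the critical value is the negative of the one-tail value.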

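Level of Significance and the Rejection Region

[Figure: three sampling distributions centred at 0, with the rejection region shaded and the critical value(s) marked; level of significance = α]
• Lower-tail test: H0: μ ≥ 3, H1: μ < 3; rejection region of area α in the lower tail
• Upper-tail test: H0: μ ≤ 3, H1: μ > 3; rejection region of area α in the upper tail
• Two-tail test: H0: μ = 3, H1: μ ≠ 3; rejection regions of area α/2 in each tail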

Errors in Making Decisions
• Type I Error
– Reject a true null hypothesis
– Considered a serious type of error
– The probability of a Type I error is α
– Called the level of significance of the test
– Set by the researcher in advance
• Type II Error
– Fail to reject a false null hypothesis
– The probability of a Type II error is β

Testing of hypotheses: Type I and Type II Errors

Decision                  H0 true / HA false          H0 false / HA true
Accept H0 / reject HA     OK (p = 1 − α)              Type II error (β) (p = β)
Reject H0 / accept HA     Type I error (α) (p = α)    OK (p = 1 − β)

α: level of significance
1 − β: power of the test

No study is perfect, there is always the chance for error

Testing of hypotheses: Type I and Type II Errors

• The probability of making a Type I error (α) can be decreased by altering the level of significance.
• With α = 0.05 there are only 5 chances in 100 that a result termed "significant" could occur by chance alone.
• With a smaller α:
– it will be more difficult to find a significant result
– the power of the test will be decreased
– the risk of a Type II error will be increased

Type I & II Error Relationship
• Type I and Type II errors cannot happen at the same time
– A Type I error can only occur if H0 is true
– A Type II error can only occur if H0 is false
• If the Type I error probability (α) increases, then the Type II error probability (β) decreases

Factors affecting type II error

All else equal:
– β increases when the difference between the hypothesized parameter and its true value decreases
– β increases when α decreases
– β increases when σ increases
– β increases when n decreases
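These directions can be illustrated for one simple case. A minimal sketch, assuming an upper-tail Z test (H0: μ ≤ μ0 against H1: μ > μ0) with known σ; the function name and every numeric value are hypothetical, chosen only to show the pattern:

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0: float, mu_true: float, sigma: float, n: int, alpha: float) -> float:
    """beta for an upper-tail Z test when the true mean is mu_true > mu0."""
    z = NormalDist()
    se = sigma / sqrt(n)                       # standard error of the mean
    cutoff = mu0 + z.inv_cdf(1 - alpha) * se   # reject H0 if sample mean > cutoff
    # beta = probability that the sample mean stays below the cutoff
    return z.cdf((cutoff - mu_true) / se)

# Hypothetical baseline, then one factor changed at a time.
base = type_ii_error(mu0=3.0, mu_true=3.2, sigma=0.5, n=25, alpha=0.05)
smaller_difference = type_ii_error(3.0, 3.1, 0.5, 25, 0.05)
smaller_alpha = type_ii_error(3.0, 3.2, 0.5, 25, 0.01)
larger_sigma = type_ii_error(3.0, 3.2, 1.0, 25, 0.05)
smaller_n = type_ii_error(3.0, 3.2, 0.5, 10, 0.05)

print(f"baseline beta = {base:.3f}, power = {1 - base:.3f}")
```

Each of the four variations gives a larger β than the baseline, matching the list above.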

Testing of hypotheses: Type I and Type II Errors

• The probability of making a Type II error (β) can be decreased by increasing the level of significance.
– This will increase the chance of a Type I error.
• Which type of error are you willing to risk?

Degree of Freedom
• The number of "free" or unconstrained pieces of data used in calculating a sample statistic or test statistic
• It refers to the scores in a distribution that are free to change without changing the mean of the distribution.
• A sample mean (X̄) has n degrees of freedom
• A sample variance (s²) has (n − 1) degrees of freedom
• This number is used to determine power, because the more subjects, the greater the power

One-Tail Test
• In many cases, the alternative hypothesis focuses on a particular direction
• Determines whether a particular population parameter is larger or smaller than some predefined value
• Uses one critical value of the test statistic
• H0: μ ≥ 3, H1: μ < 3: this is a lower-tail test since the alternative hypothesis is focused on the lower tail below the mean of 3
• H0: μ ≤ 3, H1: μ > 3: this is an upper-tail test since the alternative hypothesis is focused on the upper tail above the mean of 3

Two-Tailed Test
• Determines the likelihood that a population parameter is within certain upper and lower bounds
• May use one or two critical values

Confidence interval and significance test

Value for null hypothesis    p-value           Decision
Within the 95% CI            p-value > 0.05    Null hypothesis is not rejected
Outside the 95% CI           p-value < 0.05    Null hypothesis is rejected

p-Value Approach to Testing

• p-value: probability of obtaining a test statistic at least as extreme (≤ or ≥) as the observed sample value, given H0 is true
• Also called the observed level of significance
• Smallest value of α for which H0 can be rejected
• Convert the sample statistic (e.g., X̄) to a test statistic (e.g., Z statistic)
• Obtain the p-value from a table or computer
• Compare the p-value with α
– If p-value < α, reject H0
– If p-value ≥ α, do not reject H0
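The p-value approach can be sketched in a few lines for a Z statistic. This assumes a standard normal test statistic; the helper names `p_value_from_z` and `decide` are illustrative only:

```python
from statistics import NormalDist

def p_value_from_z(z_stat: float, tail: str = "two") -> float:
    """p-value of a Z statistic for a two-tail, upper-tail or lower-tail test."""
    z = NormalDist()
    if tail == "two":
        return 2 * (1 - z.cdf(abs(z_stat)))
    if tail == "upper":
        return 1 - z.cdf(z_stat)
    if tail == "lower":
        return z.cdf(z_stat)
    raise ValueError("tail must be 'two', 'upper' or 'lower'")

def decide(p_value: float, alpha: float) -> str:
    """Decision rule: reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

p = p_value_from_z(2.5, tail="two")
print(f"p = {p:.4f}: {decide(p, alpha=0.05)}")
```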


Fundamentals of Hypothesis Testing (Part II)

presented by Zoheb Alam Khan

Data Analysis
Statistics: a powerful tool for analyzing data

1. Descriptive statistics provide an overview of the attributes of a data set. They describe aspects such as the most common value, the average and the range of values. These include measurements of central tendency (frequency, histograms, mean, median and mode) and dispersion (range, variance and standard deviation).

2. Inferential statistics infer whether the differences or relationships between groups represent a persistent and reproducible trend. They measure how well your data support your hypothesis and whether your data are generalisable beyond what was tested (significance tests).

Selection of appropriate inferential statistical test: It is determined by the following considerations:

The scale of measurement used to obtain the data( nominal, ordinal, interval, ratio)

The number of groups used in an investigation ( one or two or more than two)

Whether the measurements were obtained from independent subjects or from repeated measurements on the same subject.

Number of subjects in the study (sample size)

1. Nominal data: synonymous with categorical data; names or categories are assigned based on characteristics, without ranking between categories. Ex. male/female, yes/no, death/survival

2. Ordinal data: ordered or graded data, expressed as scores or ranks. Ex. pain graded as mild, moderate and severe

3. Interval data: an equal and definite interval between two measurements; can be continuous or discrete. Ex. weight expressed as 20, 21, 22, 23, 24; the interval between 20 and 21 is the same as between 23 and 24

4. Ratio data: measurements where there is always an absolute zero that is meaningful. This means that you can construct a meaningful fraction (or ratio) with a ratio variable.

The First Question
After examining your data, ask: does what you're testing seem to be a question of relatedness or a question of difference?

• If relatedness (between your control and your experimental samples, or between your dependent and independent variables), we will be using tests for correlation (positive or negative) or regression.

• If difference (your control differs from your experimental), we will be testing for independence between distributions, means or variances. Different tests will be employed if your data show parametric or non-parametric properties.

Parametric or Non-parametric

Parametric tests: estimate at least one population parameter from sample statistics and are restricted to data that:
1) show a normal distribution
2) are independent of one another
3) are on the same continuous scale of measurement
4) require certain assumptions about the parameters of the population, such as knowing μ and σ

Non-parametric tests: are used on data that:
1) show an other-than-normal distribution
2) are dependent or conditional on one another
3) in general, do not have a continuous scale of measurement
4) do not require assumptions about the parameters of the population; knowing μ and σ is not needed

Parametric and nonparametric tests of significance
(Nonparametric tests: nominal data and ordinal data. Parametric tests: ordinal, interval, ratio data.)

• One group
– Nominal data: Chi square goodness of fit
– Ordinal data: Wilcoxon signed rank test
– Parametric: One group t-test
• Two unrelated groups
– Nominal data: Chi square
– Ordinal data: Wilcoxon rank sum test, Mann-Whitney test
– Parametric: Student’s t-test
• Two related groups
– Nominal data: McNemar’s test
– Ordinal data: Wilcoxon signed rank test
– Parametric: Paired Student’s t-test
• K unrelated groups
– Nominal data: Chi square test
– Ordinal data: Kruskal-Wallis one way analysis of variance
– Parametric: ANOVA
• K related groups
– Ordinal data: Friedman matched samples
– Parametric: ANOVA with repeated measurements


Types of Parametric tests
1. Large sample tests
– Z-test
2. Small sample tests
– t-test
  * Independent/unpaired t-test
  * Paired t-test
– ANOVA (Analysis of variance)
  * One way ANOVA
  * Two way ANOVA

Z test:
• It is used to test the null hypothesis for a single sample when the population variance is known.
• A z-test is used for testing the mean of a population versus a standard, or comparing the means of two populations, with large (n ≥ 30) samples, whether you know the population standard deviation or not.
• It is used to judge the significance of several statistical measures, particularly the mean.
• It compares a sample mean with the sampling distribution, i.e. the sample is part of the sampling distribution.
• It is also used for testing the proportion of some characteristic versus a standard proportion, or comparing the proportions of two populations.
– Ex. Comparing the average engineering salaries of men versus women.
– Ex. Comparing the fraction defectives from two production lines.

Formula in Computing the Test Statistic Using Z Test (Two Sample Mean Test)

• when the given means are sample means:
Z = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)

• when the given means are population means:
Z = (μ1 − μ2) / √(σ1²/n1 + σ2²/n2)

where
x̄1 = mean of the 1st sample
x̄2 = mean of the 2nd sample
μ1 = mean of the 1st population
μ2 = mean of the 2nd population
s1 = standard deviation of the 1st sample
s2 = standard deviation of the 2nd sample
σ1 = standard deviation of the 1st population
σ2 = standard deviation of the 2nd population
n1 = size of the 1st sample or population
n2 = size of the 2nd sample or population
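A minimal sketch of the two-sample computation, assuming the usual large-sample form Z = (mean1 − mean2) / √(sd1²/n1 + sd2²/n2). The function name and the numbers are hypothetical and are not taken from the slides:

```python
from math import sqrt

def two_sample_z(mean1: float, mean2: float,
                 sd1: float, sd2: float,
                 n1: int, n2: int) -> float:
    """Z statistic for the difference between two means (large samples)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Hypothetical summary data for two large samples.
z_stat = two_sample_z(mean1=52.0, mean2=50.0, sd1=4.0, sd2=3.0, n1=64, n2=36)
print(f"Z = {z_stat:.3f}")
```

The resulting Z is then compared with the critical value for the chosen α and tail.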

One tailed Z test:
• A directional test in which a prediction is made that the population represented by the sample is either below or above the general population

Ha: μ0 < μ1 or Ha: μ0 > μ1

Two tailed Z test:
• A non-directional test in which a prediction is made that the population represented by the sample will differ from the general population, but the direction of the difference is not predicted

Ha: μ0 ≠ μ1

Example
• H0: Children who learn by the whole language approach do not statistically significantly differ from the average child in word recognition (µ = 75%, σ = 5%). In symbols: H0: µ = 75%.

• H1: Children who learn by the whole language approach statistically significantly differ from the average child with respect to word recognition (µ = 75%, σ = 5%). In symbols: H1: µ ≠ 75%.

• α = 0.05, thus the critical values (C.V.) are ±1.96.
Sample mean = 78%, population mean = 75%, σ = 5%, n = 50

Z = (x̄ − μ) / (σ/√n) = (0.78 − 0.75) / (0.05/√50) = 0.03 / 0.00707 = 4.24
(This is the test statistic, which is a z-score; unit: standard deviation)

We reject the null hypothesis and conclude that children who learn the whole language statistically significantly differ from the average child in word recognition, z = 4.24, p < .05.
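The arithmetic of this example can be reproduced as follows. A minimal sketch using the figures stated in the example (sample mean 78%, population mean 75%, σ = 5%, n = 50, α = 0.05); the variable names are illustrative:

```python
from math import sqrt
from statistics import NormalDist

sample_mean, pop_mean, sigma, n = 0.78, 0.75, 0.05, 50
alpha = 0.05

# Test statistic: distance of the sample mean from mu in standard-error units.
z_stat = (sample_mean - pop_mean) / (sigma / sqrt(n))

# Two-tailed critical value and p-value from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

reject_h0 = abs(z_stat) > z_crit
print(f"z = {z_stat:.2f}, critical value = ±{z_crit:.2f}, reject H0: {reject_h0}")
```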


t test: Derived by W. S. Gosset in 1908

• It is based on the t-distribution
• It indicates the number of standard deviation units the sample mean is from the mean of the sampling distribution
• Used to judge the significance of a sample mean, or the difference between the means of two samples, for small samples (usually < 30) when the population variance is not known
• Properties of the t distribution:
i. It has mean 0
ii. It has variance greater than one
iii. It is a bell-shaped distribution, symmetrical about the mean
• Assumptions for the t test:
i. Sample must be random, observations independent
ii. Standard deviation is not known
iii. Normal distribution of population

Uses of t test:
i. The mean of the sample
ii. The difference between means, or to compare two samples
iii. Correlation coefficient

Types of t test:
a. Paired t test
b. Unpaired t test

Paired t test:
• Consists of a sample of matched pairs of similar units, or one group of units that has been tested twice (a "repeated measures" t-test).
• Ex. where subjects are tested prior to a treatment, say for high blood pressure, and the same subjects are tested again after treatment with a blood-pressure lowering medication
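A minimal sketch of the paired t statistic, computed as the mean of the within-pair differences divided by its standard error. The blood-pressure readings below are hypothetical and serve only to show the calculation:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical systolic BP readings (mmHg) for the same five subjects.
before = [150.0, 160.0, 155.0, 170.0, 165.0]
after = [140.0, 155.0, 150.0, 160.0, 150.0]

# Work on the within-subject differences (before minus after).
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

# t = mean difference / (sample SD of the differences / sqrt(n)), with n - 1 df.
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
df = n - 1

print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The computed t is then compared with the table value of t for `df` degrees of freedom at the chosen α.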


Unpaired t test:
• When two separate sets of independent and identically distributed samples are obtained, one from each of the two populations being compared.
• Ex:
1. Compare the height of girls and boys.
2. Compare 2 stress reduction interventions, when one group practiced mindfulness meditation while the other learned progressive muscle relaxation.


One tailed t test:
• A directional test in which a prediction is made that the population represented by the sample is either below or above the general population

Ha: μ0 < μ1 or Ha: μ0 > μ1

Two tailed t test:
• A non-directional test in which a prediction is made that the population represented by the sample will differ from the general population, but the direction of the difference is not predicted

Ha: μ0 ≠ μ1

ANOVA
• Prof. R. A. Fisher was the first to use the term variance and developed a theory concerning ANOVA
• ANOVA (Analysis of Variance) compares the means of two or more parametric samples.
• It tests the difference among different groups of data for homogeneity
• The basic principle of ANOVA is to test for differences among the means of the populations by examining the amount of variation within each of these samples, relative to the amount of variation between samples.
• The statistic for ANOVA is called the F statistic, which we get from the F test:

F = (estimate of population variance based on between-sample variance) / (estimate of population variance based on within-sample variance)

• If we take one factor and investigate the differences amongst its various categories, we use one way ANOVA
• If we investigate 2 factors at the same time, we use two way ANOVA
• The ANOVA test has 2 degrees of freedom:
– N − I (total number sampled − number of groups)
– I − 1 (number of groups − 1)
• Assumptions for the ANOVA test:
i. Normal distribution of population
ii. 3 or more groups
iii. Variables are independent
iv. Data is interval or ratio
v. Homogeneity of variance
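The F ratio for a one way ANOVA can be sketched directly from its definition. The function name and the three small groups below are hypothetical, chosen so the arithmetic is easy to follow by hand:

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    n_total, k = len(all_values), len(groups)

    # Between-sample variation: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-sample variation: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1          # I - 1
    df_within = n_total - k     # N - I
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical measurements for three groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F value is then compared with the table value of F for (I − 1, N − I) degrees of freedom.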

Difference between one & two way ANOVA
• An example of when a one-way ANOVA could be used is if we want to determine if there is a difference in the mean height of stalks of three different types of seeds. Since there is more than one mean, we can use a one-way ANOVA, since there is only one factor that could be making the heights different.

• Now, if we take these three different types of seeds, and then add the possibility that three different types of fertilizer are used, then we would want to use a two-way ANOVA.

• The mean height of the stalks could be different for a combination of several reasons.

• The types of seed could cause the change, the types of fertilizer could cause the change, and/or there is an interaction between the type of seed and the type of fertilizer.

• There are two factors here (type of seed and type of fertilizer), so, if the assumptions hold, then we can use a two-way ANOVA.

Pearson Correlation coefficient:
• It measures the relationship between two variables.
• Denoted by ‘r’
• A unitless quantity; it is a pure number.
• Values lie between −1 and +1
• If the variables are not correlated, the correlation coefficient will be zero.
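A minimal sketch of r, assuming the standard product-moment formula (sum of cross-products of deviations divided by the square root of the product of the sums of squares). The function name and data are illustrative:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a perfect positive and a perfect negative relationship.
x = [1.0, 2.0, 3.0, 4.0]
r_pos = pearson_r(x, [2.0, 4.0, 6.0, 8.0])
r_neg = pearson_r(x, [8.0, 6.0, 4.0, 2.0])
print(r_pos, r_neg)
```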


Summary of parametric tests applied for different types of data

Sl no   Type of group                                     Parametric test
1.      Comparison of two paired groups                   Paired ‘t’ test
2.      Comparison of two unpaired groups                 Unpaired ‘t’ test
3.      Comparison of three or more matched groups        Two way ANOVA
4.      Comparison of three or more unmatched groups      One way ANOVA
5.      Correlation between two variables                 Pearson correlation

Commonly used non parametric tests
• Commonly used non parametric tests are:
− Chi Square test
− The Sign Test
− Wilcoxon Signed-Ranks Test
− Mann–Whitney U or Wilcoxon rank sum test
− The Kruskal Wallis or H test
− Friedman ANOVA
− The Spearman rank correlation test
− Cochran's Q test

Chi Square test
• First used by Karl Pearson
• Simplest & most widely used non-parametric test in statistical work.
• Calculated using the formula:

χ² = ∑ (O − E)² / E

where O = observed frequencies and E = expected frequencies
• The greater the discrepancy between observed & expected frequencies, the greater the value of χ².
• The calculated value of χ² is compared with the table value of χ² for the given degrees of freedom.

Chi Square test

• Application of chi-square test:
– Test of association (smoking & cancer, treatment & outcome of disease, vaccination & immunity)
– Test of proportions (compare frequencies of diabetics & non-diabetics in groups weighing 40-50 kg, 50-60 kg, 60-70 kg & >70 kg)
– The chi-square for goodness of fit (determine if actual numbers are similar to the expected/theoretical numbers)

Chi Square test
• Attack rates among children vaccinated & unvaccinated against measles
• Prove the protective value of vaccination by the χ² test at the 5% level of significance

Group                      Attacked    Not attacked    Total
Vaccinated (observed)      (a) 10      (b) 90          (a+b) 100
Unvaccinated (observed)    (c) 26      (d) 74          (c+d) 100
Total                      (a+c) 36    (b+d) 164       200

Chi Square test

Group                      Attacked    Not attacked    Total
Vaccinated (expected)      18          82              100
Unvaccinated (expected)    18          82              100
Total                      36          164             200

Chi Square test

χ² = ∑ (O − E)²/E
= (10 − 18)²/18 + (90 − 82)²/82 + (26 − 18)²/18 + (74 − 82)²/82
= 64/18 + 64/82 + 64/18 + 64/82
= 8.67

Calculated value (8.67) > 3.84 (table value corresponding to P = 0.05)

Direct formula: χ² = (ad − bc)² × N / [(a+b)(c+d)(a+c)(b+d)]

Null hypothesis is rejected. Vaccination is protective.
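The vaccination example can be recomputed from the observed 2 × 2 table. A minimal sketch; the p-value line relies on the fact that, for one degree of freedom only, P(χ² > x) equals erfc(√(x/2)):

```python
from math import erfc, sqrt

# Observed counts: vaccinated (a attacked, b not), unvaccinated (c, d).
a, b, c, d = 10, 90, 26, 74
n = a + b + c + d
observed = [[a, b], [c, d]]

# Expected count in each cell = row total * column total / grand total.
row_totals = [a + b, c + d]
col_totals = [a + c, b + d]
expected = [[r * col / n for col in col_totals] for r in row_totals]

chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(2) for j in range(2))

# Direct formula for a 2 x 2 table.
chi2_direct = (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))

# p-value for 1 degree of freedom (valid for a 2 x 2 table only).
p_value = erfc(sqrt(chi2 / 2))

print(f"chi-square = {chi2:.2f}, direct formula = {chi2_direct:.2f}")
```

Both routes give the same statistic, which exceeds the 3.84 table value for one degree of freedom.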

• Yates’ correction: applies when we have two categories (one degree of freedom)
• Used when sample size is ≥ 40, and expected frequency is < 5 in one cell
• Subtracts 0.5 from the absolute difference between each observed value and its expected value in a 2 × 2 contingency table
• χ² = ∑ (|O − E| − 0.5)² / E
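To show the mechanics of the correction only, here it is applied to the observed and expected counts of the measles table. This is an illustration: that table has no expected frequency below 5, so the correction would not normally be called for there.

```python
# Observed and expected counts from the measles example, cell by cell.
observed = [10, 90, 26, 74]
expected = [18.0, 82.0, 18.0, 82.0]

# Yates' correction: shrink each absolute difference by 0.5 before squaring.
chi2_yates = sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

# Uncorrected statistic for comparison.
chi2_plain = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"uncorrected = {chi2_plain:.2f}, with Yates' correction = {chi2_yates:.2f}")
```

The corrected value is always somewhat smaller than the uncorrected one, which makes the test more conservative.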

The Chi-Square Test for Goodness-of-Fit (cont.)

• The null hypothesis specifies the proportion of the population that should be in each category.

• The proportions from the null hypothesis are used to compute expected frequencies that describe how the sample would appear if it were in perfect agreement with the null hypothesis.

The Chi-Square Test for Independence

• The second chi-square test, the chi-square test for independence, can be used and interpreted in two different ways:

1. Testing hypotheses about the relationship between two variables in a population, or

2. Testing hypotheses about differences between proportions for two or more populations.

Sign Test
• Used for paired data, can be ordinal or continuous
• Simple and easy to interpret
• Makes no assumptions about distribution of the data
• Not very powerful
• To evaluate H0 we only need to know the signs of the differences
• If half the differences are positive and half are negative, then the median = 0 (H0 is true).
• If the signs are more unbalanced, then that is evidence against H0.

Example:
– Children in an orthodontia study were asked to rate how they felt about their teeth on a 5 point scale.
– Survey administered before and after treatment.

How do you feel about your teeth?
1. Wish I could change them
2. Don’t like, but can put up with them
3. No particular feelings one way or the other
4. I am satisfied with them
5. Consider myself fortunate in this area

Sign Test

• Use the sign test to evaluate whether these data provide evidence that orthodontic treatment improves children’s image of their teeth.
• First, for each child, compute the difference between the two ratings.

Child   Rating before   Rating after   Change   Sign
1       1               5              4        +
2       1               4              3        +
3       3               1              -2       -
4       2               3              1        +
5       4               4              0        0
6       1               4              3        +
7       3               5              2        +
8       1               5              4        +
9       1               4              3        +
10      4               4              0        0
11      1               1              0        0
12      1               4              3        +
13      1               4              3        +
14      2               4              2        +
15      1               4              3        +
16      2               5              3        +
17      1               4              3        +
18      1               5              4        +
19      4               4              0        0
20      3               5              2        +

• The sign test looks at the signs of the differences
– 15 children felt better about their teeth (+ difference in ratings)
– 1 child felt worse (− difference)
– 4 children felt the same (difference = 0)
• If H0 were true we’d expect an equal number of positive and negative differences.

(P value from table 0.004)
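The sign counts, and an exact binomial tail probability for them, can be computed from the ratings. A minimal sketch: it assumes the usual convention of dropping zero differences and treating each remaining sign as + or − with probability 0.5 under H0. An exact tail probability need not match a value read from a printed table.

```python
from math import comb

# Ratings for the 20 children, before and after treatment.
before = [1, 1, 3, 2, 4, 1, 3, 1, 1, 4, 1, 1, 1, 2, 1, 2, 1, 1, 4, 3]
after = [5, 4, 1, 3, 4, 4, 5, 5, 4, 4, 1, 4, 4, 4, 4, 5, 4, 5, 4, 5]

diffs = [a - b for b, a in zip(before, after)]
n_pos = sum(d > 0 for d in diffs)
n_neg = sum(d < 0 for d in diffs)
n_zero = sum(d == 0 for d in diffs)

# Zero differences carry no sign and are left out of the test.
n = n_pos + n_neg
k = min(n_pos, n_neg)

# Probability of k or fewer of the rarer sign when + and - are equally likely.
p_one_sided = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
p_two_sided = min(1.0, 2 * p_one_sided)

print(f"+: {n_pos}, -: {n_neg}, 0: {n_zero}, two-sided p = {p_two_sided:.5f}")
```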


Wilcoxon signed-rank test
• Nonparametric equivalent of the paired t-test.
• Similar to the sign test, but takes into consideration the magnitude of the difference among the pairs of values. (The sign test only considers the direction of the difference, not its magnitude.) For example:

• The 14 difference scores in BP among hypertensive patients after giving drug A were:

-20, -8, -14, -16, -26, +6, -18, -10, -12, -10, -8, +4, +2, -18

• The statistic T is found by calculating the sum of the positive ranks and the sum of the negative ranks.

• The smaller of the two values is considered.

Wilcoxon signed-rank test

Score    Rank
+2       1
+4       2
+6       3
-8       4.5
-8       4.5
-10      6.5
-10      6.5
-12      8
-14      9
-16      10
-18      11.5
-18      11.5
-20      13
-26      14

Sum of positive ranks = 6
Sum of negative ranks = 99
T = 6

For N = 14, and α = .05, the critical value of T = 21. If T is equal to or less than T critical, then null hypothesis is rejected i.e., drug A decreases the BP among hypertensive patients.
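The ranking step can be sketched as follows, using the 14 scores listed in the rank table. The helper `average_ranks` is illustrative; tied absolute values share the mean of the ranks they occupy:

```python
def average_ranks(values):
    """Rank values in ascending order; ties receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = tied_rank
        i = j + 1
    return ranks

# Difference scores as listed in the rank table.
scores = [2, 4, 6, -8, -8, -10, -10, -12, -14, -16, -18, -18, -20, -26]

# Rank the absolute differences, then sum the ranks by sign.
ranks = average_ranks([abs(s) for s in scores])
sum_pos = sum(r for s, r in zip(scores, ranks) if s > 0)
sum_neg = sum(r for s, r in zip(scores, ranks) if s < 0)
t_stat = min(sum_pos, sum_neg)

t_critical = 21  # table value quoted for N = 14, alpha = .05
print(f"T = {t_stat}, reject H0: {t_stat <= t_critical}")
```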

Mann-Whitney U test
• Mann-Whitney U: similar to the Wilcoxon signed-ranks test except that the samples are independent and not paired.
• Null hypothesis: the population means are the same for the two groups.
• Rank the combined data values for the two groups. Then find the average rank in each group.
• Then the U value is calculated using the formula:

U = N1·N2 + Nx(Nx + 1)/2 − Rx
(where Rx is the larger rank total and Nx is the size of the group with that rank total)

• To be statistically significant, the obtained U has to be equal to or LESS than the critical value.

Example

• 10 dieters following the Atkins diet vs. 10 dieters following the Jenny Craig diet
Hypothetical RESULTS:
• Atkins group loses an average of 34.5 lbs.
• J. Craig group loses an average of 18.5 lbs.
• Conclusion: Atkins is better?
• When the individual data are seen:
– Atkins, change in weight (lbs): +4, +3, 0, -3, -4, -5, -11, -14, -15, -300
– J. Craig, change in weight (lbs): -8, -10, -12, -16, -18, -20, -21, -24, -26, -30

• RANK the values, 1 being the least weight loss and 20 being the most weight loss.
• Atkins
– Values: +4, +3, 0, -3, -4, -5, -11, -14, -15, -300
– Ranks: 1, 2, 3, 4, 5, 6, 9, 11, 12, 20
• J. Craig
– Values: -8, -10, -12, -16, -18, -20, -21, -24, -26, -30
– Ranks: 7, 8, 10, 13, 14, 15, 16, 17, 18, 19

• Sum of Atkins ranks: 1 + 2 + 3 + 4 + 5 + 6 + 9 + 11 + 12 + 20 = 73
• Sum of Jenny Craig’s ranks: 7 + 8 + 10 + 13 + 14 + 15 + 16 + 17 + 18 + 19 = 137
• Jenny Craig clearly ranked higher.
• Calculated U value (18) < table value (27). Null hypothesis is rejected.
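The rank sums and U for this example can be reproduced as follows. A minimal sketch that relies on the data containing no tied values; the variable names are illustrative:

```python
# Change in weight (lbs) for the two diet groups.
atkins = [4, 3, 0, -3, -4, -5, -11, -14, -15, -300]
craig = [-8, -10, -12, -16, -18, -20, -21, -24, -26, -30]

# Rank 1 = least weight loss, rank 20 = most weight loss (no ties here).
combined = sorted(atkins + craig, reverse=True)
rank_of = {value: position + 1 for position, value in enumerate(combined)}

r_atkins = sum(rank_of[v] for v in atkins)
r_craig = sum(rank_of[v] for v in craig)

n1, n2 = len(atkins), len(craig)
# Rx is the larger rank total, Nx the size of the group it belongs to.
r_x, n_x = max((r_atkins, n1), (r_craig, n2))
u_stat = n1 * n2 + n_x * (n_x + 1) // 2 - r_x

u_table = 27  # table value quoted in the example
print(f"rank sums: {r_atkins}, {r_craig}; U = {u_stat}; reject H0: {u_stat <= u_table}")
```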

Kruskal-Wallis One-way ANOVA

• It’s more powerful than the Chi-square test.
• It is computed exactly like the Mann-Whitney test, except that there are more groups (>2 groups).
• Applied on independent samples with the same shape (but not necessarily normal).

Friedman ANOVA

• Friedman ANOVA: When either a matched-subjects or repeated-measure design is used and the hypothesis of a difference among three or more (k) treatments is to be tested, the Friedman ANOVA by ranks test can be used.

Spearman rank-order correlation

• Used to assess the relationship between two ordinal variables or two skewed continuous variables.
• Nonparametric equivalent of the Pearson correlation.
• It is a relative measure which varies from -1 (perfect negative relationship) to +1 (perfect positive relationship).
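A minimal sketch of the rank correlation, assuming the shortcut formula ρ = 1 − 6∑d² / (n(n² − 1)), which holds only when there are no tied values. The function name and data are hypothetical:

```python
def spearman_rho(x, y):
    """Spearman rank correlation for data without tied values."""
    n = len(x)

    def ranks(values):
        ordered = sorted(values)
        # With no ties, each value's rank is its position in sorted order.
        return [ordered.index(v) + 1 for v in values]

    rank_x, rank_y = ranks(x), ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

# Hypothetical data: y rises with x, but not linearly.
x = [1, 2, 3, 4, 5]
rho_up = spearman_rho(x, [2, 4, 8, 16, 32])
rho_down = spearman_rho(x, [32, 16, 8, 4, 2])
print(rho_up, rho_down)
```

Because only the ranks are used, a monotonic but non-linear relationship still gives ρ = +1 or −1.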
