mba2216 week 11 data analysis part 01

Data Analysis Part 1: Preparation, Frequencies, Hypothesis Testing MBA2216 BUSINESS RESEARCH PROJECT by Stephen Ong Visiting Fellow, Birmingham City University, UK Visiting Professor, Shenzhen University

Upload: stephen-ong

Post on 07-May-2015

DESCRIPTION

Data preparation, Frequencies, Hypothesis testing

TRANSCRIPT

Page 1: Mba2216 week 11 data analysis part 01

Data Analysis Part 1: Preparation, Frequencies, Hypothesis Testing

MBA2216 BUSINESS RESEARCH PROJECT

by Stephen Ong

Visiting Fellow, Birmingham City University, UK

Visiting Professor, Shenzhen University

Page 2: Mba2216 week 11 data analysis part 01

LEARNING OUTCOMES

After this lecture, you should be able to:

1. Know when a response is really an error and should be edited

2. Appreciate coding of pure qualitative research

3. Understand the way data are represented in a data file

4. Understand the coding of structured responses including a dummy variable approach

5. Appreciate the ways that technological advances have simplified the coding process

Page 3: Mba2216 week 11 data analysis part 01

6. Know what descriptive statistics are and why they are used

7. Create and interpret simple tabulation tables

8. Understand how cross-tabulations can reveal relationships

9. Perform basic data transformations

10. List different computer software products designed for descriptive statistical analysis

11. Understand a researcher’s role in interpreting the data

12. Implement the hypothesis-testing procedure

13. Use p-values to assess statistical significance


Page 4: Mba2216 week 11 data analysis part 01

14. Test a hypothesis about an observed mean compared to some standard

15. Know the difference between Type I and Type II errors

16. Know when a univariate χ2 test is appropriate and how to conduct one

17. Recognize when a bivariate statistical test is appropriate

18. Calculate and interpret a χ2 test for a contingency table

19. Calculate and interpret an independent samples t-test comparing two means


Page 5: Mba2216 week 11 data analysis part 01

Remember this:

Garbage in, garbage out! If data are collected improperly or coded incorrectly, the research results are “garbage”.

Page 6: Mba2216 week 11 data analysis part 01

Stages of Data Analysis

Raw Data: The unedited responses from a respondent exactly as indicated by that respondent.

Nonrespondent Error: Error that the respondent is not responsible for creating, such as when the interviewer marks a response incorrectly.

Data Integrity: The notion that the data file actually contains the information that the researcher is trying to obtain to adequately address research questions.

Page 7: Mba2216 week 11 data analysis part 01


EXHIBIT 19.1 Overview of the Stages of Data Analysis

Page 8: Mba2216 week 11 data analysis part 01

Editing

Editing: The process of checking the completeness, consistency, and legibility of data and making the data ready for coding and transfer to storage.

E.g. the question “How long have you stayed at your current address?” is answered “45”. The researchers need to adjust or reconstruct such responses.

Field Editing (useful in personal interviews): Preliminary editing by a field supervisor on the same day as the interview to catch technical omissions, check legibility of handwriting, and clarify responses that are logically or conceptually inconsistent.

In-House Editing: A rigorous editing job performed by a centralized office staff.

Page 9: Mba2216 week 11 data analysis part 01

Editing – what to do?

Checking for Consistency

Respondents match the defined population – e.g. SBS?

Check for consistency within the data collection framework – e.g. items listed by the respondents are within the definition.

Taking Action When Response is Obviously in Error: Change/correct responses only when there are multiple pieces of evidence for doing so.

Editing Technology: Computer routines can check for consistency automatically.

Page 10: Mba2216 week 11 data analysis part 01

Editing for Completeness

Item Nonresponse: The technical term for an unanswered question on an otherwise complete questionnaire, resulting in missing data.

Most of the time the researchers will do nothing about it. But sometimes the question is linked to another question, so the researchers have to fill in the blank.

Plug Value: An answer that an editor “plugs in” to replace blanks or missing values so as to permit data analysis.

Choice of value is based on a predetermined decision rule, e.g. take an average value or neutral value.

Several choices: leave it blank; plug in alternate choices; randomly select an answer; impute a missing value.
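A plug-value decision rule can be sketched in a few lines of Python. This is only an illustration: the function name `plug_missing`, the 1-5 scale, and the sample ratings are assumptions, not part of the lecture.

```python
from statistics import mean

def plug_missing(responses, rule="mean", neutral=3):
    """Replace None (item nonresponse) with a plug value chosen by a fixed rule."""
    answered = [r for r in responses if r is not None]
    if rule == "mean":
        plug = mean(answered)      # average of the items that were answered
    elif rule == "neutral":
        plug = neutral             # assumed midpoint of a 1-5 scale
    else:
        raise ValueError(f"unknown rule: {rule}")
    return [plug if r is None else r for r in responses]

# Hypothetical ratings; None marks an unanswered item.
ratings = [4, None, 5, 3, None]
filled_mean = plug_missing(ratings, rule="mean")
filled_neutral = plug_missing(ratings, rule="neutral")
```

Fixing the rule before looking at the data is the point: the editor applies the same predetermined rule to every blank.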

Page 11: Mba2216 week 11 data analysis part 01

Editing …

Impute: To fill in a missing data point through the use of a statistical process providing an educated guess for the missing response based on available information.

I.e. based on the respondent’s answers to other questions.

Page 12: Mba2216 week 11 data analysis part 01

Editing for Completeness (cont’d)

What about missing data? Many statistical software programs require complete data for an analysis to take place.

List-wise deletion: The entire record for a respondent that has left a response missing is excluded from use in statistical analysis.

Pair-wise deletion: Only the actual variables for a respondent that do not contain information are eliminated from use in statistical analysis.
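The difference between the two deletion rules shows up when you compute a mean. The sketch below uses made-up records and hypothetical helper names (`listwise`, `pairwise_mean`); real statistical packages apply these rules internally.

```python
from statistics import mean

# Hypothetical records: one dict per respondent; None marks a missing answer.
records = [
    {"age": 25, "income": 40, "satisfaction": 4},
    {"age": 31, "income": None, "satisfaction": 5},
    {"age": None, "income": 55, "satisfaction": 3},
    {"age": 42, "income": 60, "satisfaction": 2},
]

def listwise(records):
    """Keep only respondents with no missing values at all."""
    return [r for r in records if all(v is not None for v in r.values())]

def pairwise_mean(records, variable):
    """Average one variable over every respondent who answered it."""
    return mean(r[variable] for r in records if r[variable] is not None)

complete = listwise(records)
listwise_mean = mean(r["satisfaction"] for r in complete)
pairwise = pairwise_mean(records, "satisfaction")
```

List-wise deletion keeps two of the four respondents here, while pair-wise deletion uses all four satisfaction scores, so the two means differ.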

Page 13: Mba2216 week 11 data analysis part 01

Please take note:

When a questionnaire has too many missing answers, it may not be suitable for the planned data analysis. In such a situation, that particular questionnaire has to be dropped from the sample.

Page 14: Mba2216 week 11 data analysis part 01

Facilitating the Coding Process

Editing and Tabulating “Don’t Know” Answers

Legitimate don’t know (no opinion)

Reluctant don’t know (refusal to answer)

Confused don’t know (does not understand)

Page 15: Mba2216 week 11 data analysis part 01

Editing (cont’d)

Pitfalls of Editing: Allowing subjectivity to enter into the editing process.

Data editors should be intelligent, experienced, and objective.

A systematic procedure for assessing the questionnaire should be developed by the research analyst so that the editor has clearly defined decision rules.

Pretesting Edit: Editing during the pretest stage can prove very valuable for improving questionnaire format and for identifying poor instructions or inappropriate question wording.

Page 16: Mba2216 week 11 data analysis part 01

Coding Qualitative Responses

Coding: The process of assigning a numerical score or other character symbol to previously edited data.

Codes: Rules for interpreting, classifying, and recording data in the coding process.

The actual numerical or other character symbols assigned to raw data.

Dummy Coding: Numeric “1” or “0” coding where each number represents an alternate response such as “female” or “male.”

If k is the number of categories for a qualitative variable, k-1 dummy variables are needed.
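The k-1 rule can be illustrated with a small Python sketch. The function `dummy_code`, the `is_` column prefix, and the region data are assumptions made for this example.

```python
def dummy_code(values, reference):
    """Return k-1 dummy columns (a dict of lists) for a categorical variable.

    The reference category gets no column: it is the case where every dummy is 0.
    """
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError("reference must be one of the observed categories")
    return {
        f"is_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != reference
    }

# Hypothetical variable with k = 3 categories, so 2 dummy variables result.
region = ["north", "south", "east", "south"]
dummies = dummy_code(region, reference="east")
```

A respondent from the reference category ("east") is identified by zeros in both columns, which is why a third column would be redundant.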

Page 17: Mba2216 week 11 data analysis part 01

Data File Terminology

Field: A collection of characters that represents a single type of data, usually a variable.

String Characters: Computer terminology to represent formatting a variable using a series of alphabetic characters (nonnumeric characters) that may form a word.

Record: A collection of related fields that represents the responses from one sampling unit.

Page 18: Mba2216 week 11 data analysis part 01

Data File Terminology (cont’d)

Data File: The way a data set is stored electronically in spreadsheet-like form in which the rows represent sampling units and the columns represent variables.

Value Labels: Unique labels assigned to each possible numeric code for a response.

Page 19: Mba2216 week 11 data analysis part 01

Code Construction Two Basic Rules for Coding Categories:

1. They should be exhaustive, meaning that a coding category should exist for all possible responses.

2. They should be mutually exclusive and independent, meaning that there should be no overlap among the categories to ensure that a subject or response can be placed in only one category.

Test Tabulation – especially useful for open-ended questions

Tallying of a small sample of the total number of replies to a particular question in order to construct coding categories.

Purpose is to preliminarily identify the stability and distribution of answers that will determine a coding scheme.

Page 20: Mba2216 week 11 data analysis part 01

Test Tabulation

E.g.

1st respondent: I don’t like to use Facebook because it is wasting time.

2nd respondent: I don’t know what is Facebook.

3rd respondent: Facebook takes me a lot of time.

Based on the above three answers, you can form two groups of answers:

1st group: Time factor

2nd group: No knowledge of Facebook

Page 21: Mba2216 week 11 data analysis part 01

Devising the Coding Scheme

A coding scheme should not be too elaborate. The coder’s task is only to summarize the data.

Categories should be sufficiently unambiguous that coders will not classify items in different ways.

Code book: Identifies each variable in a study and gives the variable’s description, code name, and position in the data matrix.

Page 22: Mba2216 week 11 data analysis part 01

The Nature of Descriptive Analysis

Descriptive Analysis: The elementary transformation of raw data in a way that describes the basic characteristics such as central tendency, distribution, and variability.

Histogram: A graphical way of showing a frequency distribution in which the height of a bar corresponds to the observed frequency of the category.

Page 23: Mba2216 week 11 data analysis part 01


EXHIBIT 20.1 Levels of Scale Measurement and Suggested Descriptive Statistics

Page 24: Mba2216 week 11 data analysis part 01

Creating and Interpreting Tabulation

Tabulation: The orderly arrangement of data in a table or other summary format showing the number of responses to each response category.

Tallying is the term used when the process is done by hand.

Frequency Table: A table showing the different ways respondents answered a question. Sometimes called a marginal tabulation.
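A frequency table with counts, percentages, and cumulative percentages can be produced with the standard library alone. The function name `frequency_table` and the coded answers are assumptions for illustration.

```python
from collections import Counter

def frequency_table(responses):
    """Return rows of (category, count, percent, cumulative percent)."""
    counts = Counter(responses)
    total = sum(counts.values())
    rows, cumulative = [], 0
    for category in sorted(counts):
        cumulative += counts[category]
        rows.append((
            category,
            counts[category],
            round(100 * counts[category] / total, 1),
            round(100 * cumulative / total, 1),
        ))
    return rows

# Hypothetical coded answers to one question (codes 1, 2, 3).
answers = [1, 2, 2, 3, 3, 3, 3, 2, 1, 3]
table = frequency_table(answers)
for row in table:
    print(row)
```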

Page 25: Mba2216 week 11 data analysis part 01

Frequency Table Example

Page 26: Mba2216 week 11 data analysis part 01

Cross-Tabulation

Cross-Tabulation: Addresses research questions involving relationships among multiple less-than-interval variables.

Results in a combined frequency table displaying one variable in rows and another variable in columns.

Contingency Table: A data matrix that displays the frequency of some combination of responses to multiple variables.

Marginals: Row and column totals in a contingency table, which are shown in its margins.
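As a rough sketch, a contingency table and its marginals can be built from two parallel lists of coded responses. The helper `crosstab` and the data are hypothetical.

```python
from collections import Counter

def crosstab(rows, cols):
    """Build a contingency table plus its marginals from two parallel lists."""
    cells = Counter(zip(rows, cols))   # joint frequency of each combination
    row_totals = Counter(rows)         # row marginals
    col_totals = Counter(cols)         # column marginals
    return cells, row_totals, col_totals

# Hypothetical responses: sex of respondent and whether they shop at a store.
sex = ["F", "F", "M", "M", "F", "M"]
shops = ["yes", "no", "yes", "yes", "yes", "no"]
cells, row_totals, col_totals = crosstab(sex, shops)
```

Each cell count answers "how many respondents gave this combination of answers", and the marginals are simply the one-variable frequency tables.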

Page 27: Mba2216 week 11 data analysis part 01


EXHIBIT 20.2 Cross-Tabulation Tables from a Survey Regarding AIG and Government Bailouts

Page 28: Mba2216 week 11 data analysis part 01


EXHIBIT 20.3 Different Ways of Depicting the Cross-Tabulation of Biological Sex and Target Patronage

Page 29: Mba2216 week 11 data analysis part 01

Cross-Tabulation (cont’d)

Percentage Cross-Tabulations

Statistical base – the number of respondents or observations (in a row or column) used as a basis for computing percentages.

Elaboration and Refinement

Elaboration analysis – an analysis of the basic cross-tabulation for each level of a variable not previously considered, such as subgroups of the sample.

Moderator variable – a third variable that changes the nature of a relationship between the original independent and dependent variables.

Page 30: Mba2216 week 11 data analysis part 01

EXHIBIT 20.4 Cross-Tabulation of Marital Status, Sex, and Responses to the Question “Do You Shop at Target?”

Page 31: Mba2216 week 11 data analysis part 01

Cross-Tabulation (cont’d)

How Many Cross-Tabulations?

Every possible response becomes a possible explanatory variable.

When hypotheses involve relationships between two categorical variables, cross-tabulations are the right tool for the job.

Quadrant Analysis: An extension of cross-tabulation in which responses to two rating-scale questions are plotted in four quadrants of a two-dimensional table.

Example: importance-performance analysis

Page 32: Mba2216 week 11 data analysis part 01

EXHIBIT 20.5 An Importance-Performance or Quadrant Analysis of Hotels

Page 33: Mba2216 week 11 data analysis part 01

Data Transformation

Data Transformation: Process of changing the data from their original form to a format suitable for performing a data analysis addressing research objectives.

Page 34: Mba2216 week 11 data analysis part 01

Problems with Data Transformations

Median Split: Dividing a data set into two categories by placing respondents below the median in one category and respondents above the median in another.

The approach is best applied only when the data do indeed exhibit bimodal characteristics.

Inappropriate collapsing of continuous variables into categorical variables ignores the information contained within the untransformed values.
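A median split is easy to code, which is part of why it gets overused. In this sketch the function name, the tie rule, and the scores are all assumptions for illustration.

```python
from statistics import median

def median_split(scores):
    """Label each score 'low' or 'high' relative to the sample median.

    Scores equal to the median are labelled 'low' here; that tie rule is an
    arbitrary choice for this sketch.
    """
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

# Hypothetical unimodal scores with one extreme value.
scores = [48, 49, 50, 51, 52, 90]
groups = median_split(scores)
```

With these scores the respondents at 50 and 51 land in different groups, while 51 and 90 land in the same one. That is the information loss the slide warns about.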

Page 35: Mba2216 week 11 data analysis part 01


EXHIBIT 20.6 Bimodal Distributions Are Consistent with Transformations into Categorical Values

Page 36: Mba2216 week 11 data analysis part 01


EXHIBIT 20.7 The Problem with Median Splits with Unimodal Data

Page 37: Mba2216 week 11 data analysis part 01

Index Numbers

Index Numbers: Scores or observations recalibrated to indicate how they relate to a base number.

Price indexes: Represent simple data transformations that allow researchers to track a variable’s value over time and compare a variable (or variables) with other variables.

Recalibration allows scores or observations to be related to a certain base period or base number.
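Recalibrating to a base of 100 is a one-line calculation. The function `to_index` and the weekly figures below are made up for illustration.

```python
def to_index(values, base_position=0):
    """Express each value relative to a base value, with the base set to 100."""
    base = values[base_position]
    return [round(100 * v / base, 1) for v in values]

# Made-up figures: hours per week in four successive periods.
weekly_hours = [40.0, 42.0, 45.0, 50.0]
index = to_index(weekly_hours)
```

An index of 125 in the last period reads directly as "25 percent above the base period".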

Page 38: Mba2216 week 11 data analysis part 01


EXHIBIT 20.8 Hours of Television Usage per Week

Page 39: Mba2216 week 11 data analysis part 01

Calculating Rank Order

Rank Order: Ranking data can be summarized by performing a data transformation.

The transformation involves multiplying the frequency by the ranking score for each choice, resulting in a new scale.
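The frequency-times-rank transformation might look like this in Python. The city names and counts are hypothetical, and the sketch assumes rank 1 means most preferred, so the lowest total wins.

```python
def rank_scores(rank_frequencies):
    """Sum frequency x rank for each choice; with 1 = best, lower totals are preferred."""
    return {
        choice: sum(rank * freq for rank, freq in enumerate(freqs, start=1))
        for choice, freqs in rank_frequencies.items()
    }

# Hypothetical counts: how many of 10 executives ranked each city 1st, 2nd, 3rd.
frequencies = {
    "City A": [5, 3, 2],
    "City B": [3, 5, 2],
    "City C": [2, 2, 6],
}
scores = rank_scores(frequencies)
preferred = min(scores, key=scores.get)
```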

Page 40: Mba2216 week 11 data analysis part 01


EXHIBIT 20.9 Executive Rankings of Potential Conference Destinations

Page 41: Mba2216 week 11 data analysis part 01


EXHIBIT 20.10 Frequencies of Conference Destination Rankings

Page 42: Mba2216 week 11 data analysis part 01


EXHIBIT 20.11 Pie Charts Work Well with Tabulations and Cross-Tabulations

Page 43: Mba2216 week 11 data analysis part 01

Computer Programs for Analysis

Statistical Packages

Spreadsheets: Excel

Statistical software: SAS, SPSS (Statistical Package for the Social Sciences), MINITAB

Page 44: Mba2216 week 11 data analysis part 01

Computer Graphics and Computer Mapping

Box and Whisker Plots: Graphic representations of central tendencies, percentiles, variabilities, and the shapes of frequency distributions.

Interquartile Range: A measure of variability.

Outlier: A value that lies outside the normal range of the data.
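The quantities behind a box and whisker plot can be computed with Python's `statistics` module. The 1.5 x IQR cutoff used here is a common convention, not something the slides specify, and the data are invented.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return (Q1, Q3, IQR, outliers) using fences at k * IQR beyond the quartiles."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return q1, q3, iqr, outliers

# Hypothetical observations with one extreme value.
data = [2, 4, 5, 6, 7, 8, 9, 10, 30]
q1, q3, iqr, outliers = iqr_outliers(data)
```

Different packages use slightly different quartile definitions, so Q1 and Q3 from SPSS or Excel may not match this sketch exactly.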

Page 45: Mba2216 week 11 data analysis part 01

EXHIBIT 20.15 Computer Drawn Box and Whisker Plot

Page 46: Mba2216 week 11 data analysis part 01

SPSS Windows

The main program in SPSS is FREQUENCIES. It produces a table of frequency counts, percentages, and cumulative percentages for the values of each variable. It gives all of the associated statistics.

If the data are interval scaled and only the summary statistics are desired, the DESCRIPTIVES procedure can be used.

The EXPLORE procedure produces summary statistics and graphical displays, either for all of the cases or separately for groups of cases. Mean, median, variance, standard deviation, minimum, maximum, and range are some of the statistics that can be calculated.

Page 47: Mba2216 week 11 data analysis part 01

SPSS Windows

To select these procedures, click:

Analyze>Descriptive Statistics>Frequencies

Analyze>Descriptive Statistics>Descriptives

Analyze>Descriptive Statistics>Explore

The major cross-tabulation program is CROSSTABS. This program will display the cross-classification tables and provide cell counts, row and column percentages, the chi-square test for significance, and all the measures of the strength of the association that have been discussed.

To select these procedures, click:

Analyze>Descriptive Statistics>Crosstabs

Page 48: Mba2216 week 11 data analysis part 01

SPSS Windows

The major program for conducting parametric tests in SPSS is COMPARE MEANS. This program can be used to conduct t tests on one sample or independent or paired samples. To select these procedures using SPSS for Windows, click:

Analyze>Compare Means>Means …

Analyze>Compare Means>One-Sample T Test …

Analyze>Compare Means>Independent-Samples T Test …

Analyze>Compare Means>Paired-Samples T Test …

Page 49: Mba2216 week 11 data analysis part 01

SPSS Windows

The nonparametric tests discussed in this chapter can be conducted using NONPARAMETRIC TESTS.

To select these procedures using SPSS for Windows, click:

Analyze>Nonparametric Tests>Chi-Square …

Analyze>Nonparametric Tests>Binomial …

Analyze>Nonparametric Tests>Runs …

Analyze>Nonparametric Tests>1-Sample K-S …

Analyze>Nonparametric Tests>2 Independent Samples …

Analyze>Nonparametric Tests>2 Related Samples …

Page 50: Mba2216 week 11 data analysis part 01


Page 51: Mba2216 week 11 data analysis part 01

SPSS Windows: Frequencies

1. Select ANALYZE on the SPSS menu bar.

2. Click DESCRIPTIVE STATISTICS and select FREQUENCIES.

3. Move the variable “Familiarity [familiar]” to the VARIABLE(s) box.

4. Click STATISTICS.

5. Select MEAN, MEDIAN, MODE, STD. DEVIATION, VARIANCE, and RANGE.

Page 52: Mba2216 week 11 data analysis part 01

SPSS Windows: Frequencies

6. Click CONTINUE.

7. Click CHARTS.

8. Click HISTOGRAMS, then click CONTINUE.

9. Click OK.

Page 53: Mba2216 week 11 data analysis part 01

Introduction of a Third Variable in Cross-Tabulation

[Flowchart: Original Two Variables. If there is Some Association between the Two Variables, Introduce a Third Variable; the result is a Refined Association between the Two Variables, No Association between the Two Variables, or No Change in the Initial Pattern. If there is No Association between the Two Variables, Introduce a Third Variable; the result may be Some Association between the Two Variables.]

Page 54: Mba2216 week 11 data analysis part 01


Page 55: Mba2216 week 11 data analysis part 01

SPSS Windows: Cross-tabulations

1. Select ANALYZE on the SPSS menu bar.

2. Click on DESCRIPTIVE STATISTICS and select CROSSTABS.

3. Move the variable “Internet Usage Group [iusagegr]” to the ROW(S) box.

4. Move the variable “Sex[sex]” to the COLUMN(S) box.

5. Click on CELLS.

6. Select OBSERVED under COUNTS and COLUMN under PERCENTAGES.

Page 56: Mba2216 week 11 data analysis part 01

SPSS Windows: Cross-tabulations

7. Click CONTINUE.

8. Click STATISTICS.

9. Click on CHI-SQUARE, PHI AND CRAMER’S V.

10. Click CONTINUE.

11. Click OK.

Page 57: Mba2216 week 11 data analysis part 01

Interpretation

The process of drawing inferences from the analysis results.

Inferences drawn from interpretations lead to managerial implications and decisions.

From a management perspective, the qualitative meaning of the data and their managerial implications are an important aspect of the interpretation.

Page 58: Mba2216 week 11 data analysis part 01

Hypothesis Testing

Types of Hypotheses

Relational hypotheses: Examine how changes in one variable vary with changes in another.

Hypotheses about differences between groups: Examine how some variable varies from one group to another.

Hypotheses about differences from some standard: Examine how some variable differs from some preconceived standard. These tests typify univariate statistical tests.

Page 59: Mba2216 week 11 data analysis part 01

Types of Statistical Analysis

Univariate Statistical Analysis: Tests of hypotheses involving only one variable. Testing of statistical significance.

Bivariate Statistical Analysis: Tests of hypotheses involving two variables.

Multivariate Statistical Analysis: Statistical analysis involving three or more variables or sets of variables.

Page 60: Mba2216 week 11 data analysis part 01

The Hypothesis-Testing Procedure

Process

1. The specifically stated hypothesis is derived from the research objectives.

2. A sample is obtained and the relevant variable is measured.

3. The measured sample value is compared to the value either stated explicitly or implied in the hypothesis.

If the value is consistent with the hypothesis, the hypothesis is supported.

If the value is not consistent with the hypothesis, the hypothesis is not supported.

Page 81: Mba2216 week 11 data analysis part 01

Univariate Hypothesis Test Utilizing the t-Distribution: An Example

$H_0: \mu = 20$ (the population mean is equal to 20)

$H_1: \mu \neq 20$ (the population mean is not equal to 20)

$S_{\bar{X}} = S / \sqrt{n} = 5 / \sqrt{25} = 1$

Page 82: Mba2216 week 11 data analysis part 01

Univariate Hypothesis Test Utilizing the t-Distribution: An Example (cont’d)

The researcher desires 95 percent confidence, so the significance level is 0.05.

The researcher must then find the upper and lower limits of the confidence interval to determine the region of rejection. Thus, the value of t is needed. For 24 degrees of freedom (n - 1 = 25 - 1), the t-value is 2.064.

Page 83: Mba2216 week 11 data analysis part 01

Univariate Hypothesis Test Utilizing the t-Distribution: An Example (cont’d)

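Lower limit $= \mu - t_{c.l.} S_{\bar{X}} = 20 - 2.064 \left( \frac{5}{\sqrt{25}} \right) = 17.936$

Upper limit $= \mu + t_{c.l.} S_{\bar{X}} = 20 + 2.064 \left( \frac{5}{\sqrt{25}} \right) = 22.064$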

Page 84: Mba2216 week 11 data analysis part 01

Univariate Hypothesis Test Utilizing the t-Distribution: An Example (cont’d)

Univariate Hypothesis Test: t-Test

$t_{obs} = \frac{\bar{X} - \mu}{S_{\bar{X}}} = \frac{22 - 20}{1} = \frac{2}{1} = 2$

This is less than the critical t-value of 2.064 at the 0.05 level with 24 degrees of freedom, so the hypothesis is not supported.
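The arithmetic of this example can be checked with a short Python sketch. The function name `one_sample_t` is hypothetical; the inputs (sample mean 22, S = 5, n = 25, standard of 20, critical t of 2.064) are the values from the slides, and the critical value is passed in rather than looked up.

```python
from math import sqrt

def one_sample_t(sample_mean, sample_sd, n, hypothesized_mean, critical_t):
    """Return the standard error, the two limits, and the observed t."""
    se = sample_sd / sqrt(n)                          # S_xbar = S / sqrt(n)
    lower = hypothesized_mean - critical_t * se
    upper = hypothesized_mean + critical_t * se
    t_obs = (sample_mean - hypothesized_mean) / se
    return se, lower, upper, t_obs

se, lower, upper, t_obs = one_sample_t(22, 5, 25, 20, 2.064)
reject = abs(t_obs) > 2.064   # False: the sample mean lies inside the limits
```

The two views agree: the sample mean of 22 falls between the limits, and the observed t of 2 is below the critical value.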

Page 85: Mba2216 week 11 data analysis part 01

The Chi-Square Test for Goodness of Fit

Chi-square (χ²) test: Tests for statistical significance. Is particularly appropriate for testing hypotheses about frequencies arranged in a frequency or contingency table.

Goodness-of-Fit (GOF): A general term representing how well some computed table or matrix of values matches some population or predetermined table or matrix of the same size.

Page 86: Mba2216 week 11 data analysis part 01

The Chi-Square Test for Goodness of Fit: An Example

Page 87: Mba2216 week 11 data analysis part 01

The Chi-Square Test for Goodness of Fit: An Example (cont’d)

$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$

χ² = chi-square statistic

O_i = observed frequency in the ith cell

E_i = expected frequency in the ith cell
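The chi-square formula translates directly into code. The counts below are hypothetical, and the critical value of 3.84 (0.05 level, 1 d.f.) is the one quoted later in the lecture.

```python
def chi_square(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 100 outlets in two location types, 50/50 split expected.
observed = [60, 40]
expected = [50, 50]
statistic = chi_square(observed, expected)
significant = statistic > 3.84   # two cells, so 1 degree of freedom
```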

Page 88: Mba2216 week 11 data analysis part 01

Chi-Square Test: Estimation for Expected Number for Each Cell

$E_{ij} = \frac{R_i C_j}{n}$

R_i = total observed frequency in the ith row

C_j = total observed frequency in the jth column

n = sample size

Page 89: Mba2216 week 11 data analysis part 01

Hypothesis Test of a Proportion

Is conceptually similar to the test used when the mean is the characteristic of interest, but differs in the mathematical formulation of the standard error of the proportion.

$Z_{obs} = \frac{p - \pi}{S_p}$

π is the population proportion

p is the sample proportion

π is estimated with p
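A sketch of the Z calculation follows. The slide does not give the formula for S_p; this example assumes the usual estimate S_p = sqrt(p(1 - p)/n), and the proportions and sample size are invented.

```python
from math import sqrt

def proportion_z(p, pi, n):
    """Z = (p - pi) / S_p, with S_p estimated as sqrt(p * (1 - p) / n) (assumed)."""
    s_p = sqrt(p * (1 - p) / n)
    return (p - pi) / s_p

# Hypothetical: 20% of 100 respondents vs. a hypothesized 15% in the population.
z = proportion_z(p=0.2, pi=0.15, n=100)
significant = abs(z) > 1.96   # two-tailed critical Z at the 0.05 level
```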

Page 90: Mba2216 week 11 data analysis part 01

What Is the Appropriate Test of Difference?

Test of Differences

An investigation of a hypothesis that two (or more) groups differ with respect to measures on a variable.

Behaviour, characteristics, beliefs, opinions, emotions, or attitudes

Bivariate Tests of Differences

Involve only two variables: a variable that acts like a dependent variable and a variable that acts as a classification variable.

Differences in mean scores between groups or in comparing how two groups’ scores are distributed across possible response categories.

Page 91: Mba2216 week 11 data analysis part 01


EXHIBIT 22.1 Some Bivariate Hypotheses

Page 92: Mba2216 week 11 data analysis part 01

Cross-Tabulation Tables: The χ2 Test for Goodness-of-Fit

Cross-Tabulation (Contingency) Table: A joint frequency distribution of observations on two or more variables.

χ² Distribution: Provides a means for testing the statistical significance of a contingency table.

Involves comparing observed frequencies (O_i) with expected frequencies (E_i) in each cell of the table.

Captures the goodness- (or closeness-) of-fit of the observed distribution with the expected distribution.

Page 93: Mba2216 week 11 data analysis part 01

Chi-Square Test

$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$

χ² = chi-square statistic

O_i = observed frequency in the ith cell

E_i = expected frequency in the ith cell

$E_{ij} = \frac{R_i C_j}{n}$

R_i = total observed frequency in the ith row

C_j = total observed frequency in the jth column

n = sample size

Page 94: Mba2216 week 11 data analysis part 01

Degrees of Freedom (d.f.)

d.f.=(R-1)(C-1)
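Putting the pieces together for a contingency table: expected counts from the marginals, the chi-square sum, and (R-1)(C-1) degrees of freedom. The function name and the 2 x 2 counts are assumptions for illustration.

```python
def contingency_chi_square(table):
    """Return (chi-square, degrees of freedom, expected counts) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # E_ij = R_i * C_j / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Hypothetical 2 x 2 table of observed counts.
table = [[30, 20],
         [10, 40]]
chi2, df, expected = contingency_chi_square(table)
```

The resulting statistic is compared with the critical value for the computed degrees of freedom, for example 3.84 at the 0.05 level with 1 d.f.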

Page 95: Mba2216 week 11 data analysis part 01

Example: Papa John’s Restaurants

Univariate Hypothesis: Papa John’s restaurants are more likely to be located in a stand-alone location or in a shopping center.

Bivariate Hypothesis: Stand-alone locations are more likely to be profitable than are shopping center locations.

Page 96: Mba2216 week 11 data analysis part 01

Example: Papa John’s Restaurants (cont’d)

In this example, χ² = 22.16 with 1 d.f.

From Table A.4, the critical value at the 0.05 level with 1 d.f. is 3.84.

Thus, we are 95 percent confident that the observed values do not equal the expected values.

But are the deviations from the expected values in the hypothesized direction?

Page 97: Mba2216 week 11 data analysis part 01

χ2 Test for Goodness-of-Fit Recap

Testing the hypothesis involves two key steps:

1. Examine the statistical significance of the observed contingency table.

2. Examine whether the differences between the observed and expected values are consistent with the hypothesized prediction.

Page 98: Mba2216 week 11 data analysis part 01

The t-Test for Comparing Two Means

Independent Samples t-Test: A test for hypotheses stating that the mean scores for some interval- or ratio-scaled variable, grouped based on some less-than-interval classificatory variable, are not the same.

$t = \frac{\text{Sample Mean 1} - \text{Sample Mean 2}}{\text{Variability of random means}}$

$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}}$

Page 99: Mba2216 week 11 data analysis part 01

The t-Test for Comparing Two Means (cont’d)

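Pooled Estimate of the Standard Error: An estimate of the standard error for a t-test of independent means that assumes the variances of both groups are equal.

$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left( \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} \right) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$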
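The pooled standard error and the resulting t can be computed as follows. This is a minimal sketch assuming equal variances in the two groups; the function name `pooled_t` and the scores are made up.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group1, group2):
    """Independent-samples t using the pooled (equal-variance) standard error."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / se
    return t, n1 + n2 - 2   # t statistic and degrees of freedom

# Hypothetical scores for two independent groups of five respondents.
t, df = pooled_t([12, 14, 16, 18, 20], [8, 10, 12, 14, 16])
```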

Page 100: Mba2216 week 11 data analysis part 01

© 2010 South-Western/Cengage Learning. All rights reserved. May not be scanned, copied or duplicated, or posted to a publically accessible website, in whole or in part.

EXHIBIT 22.2 Independent Samples t-Test Results

Page 101: Mba2216 week 11 data analysis part 01

Comparing Two Means (cont’d)

Paired-Samples t-Test: Compares the scores of two interval variables drawn from related populations.

Used when means need to be compared that are not from independent samples.
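The slides give no formula for the paired test. The sketch below uses the standard approach of running a one-sample t on the per-respondent differences, with invented scores for two attitude items.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """t statistic on per-respondent differences, with n - 1 degrees of freedom."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical ratings by the same four respondents on two attitude scales.
internet = [4, 6, 3, 6]
technology = [3, 4, 2, 5]
t, df = paired_t(internet, technology)
```

Because each respondent supplies both scores, the test works on the differences rather than treating the two columns as separate samples.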

Page 102: Mba2216 week 11 data analysis part 01


EXHIBIT 22.4 Example Results for a Paired Samples t-Test

Page 103: Mba2216 week 11 data analysis part 01


Page 104: Mba2216 week 11 data analysis part 01

SPSS Windows: One Sample t Test

1. Select ANALYZE from the SPSS menu bar.

2. Click COMPARE MEANS and then ONE SAMPLE T TEST.

3. Move “Familiarity [familiar]” in to the TEST VARIABLE(S) box.

4. Type “4” in the TEST VALUE box.

5. Click OK.

Page 105: Mba2216 week 11 data analysis part 01

SPSS Windows: Two Independent Samples t Test

1. Select ANALYZE from the SPSS menu bar.

2. Click COMPARE MEANS and then INDEPENDENT SAMPLES T TEST.

3. Move “Internet Usage Hrs/Week [iusage]” in to the TEST VARIABLE(S) box.

4. Move “Sex[sex]” to GROUPING VARIABLE box.

5. Click DEFINE GROUPS.

6. Type “1” in GROUP 1 box and “2” in GROUP 2 box.

7. Click CONTINUE.

8. Click OK.

Page 106: Mba2216 week 11 data analysis part 01

SPSS Windows: Paired Samples t Test

1. Select ANALYZE from the SPSS menu bar.

2. Click COMPARE MEANS and then PAIRED SAMPLES T TEST.

3. Select “Attitude toward Internet [iattitude]” and then select “Attitude toward technology [tattitude].” Move these variables in to the PAIRED VARIABLE(S) box.

4. Click OK.

Page 107: Mba2216 week 11 data analysis part 01

Further Reading

Cooper, D.R. and Schindler, P.S. (2011) Business Research Methods, 11th edn, McGraw Hill.

Zikmund, W.G., Babin, B.J., Carr, J.C. and Griffin, M. (2010) Business Research Methods, 8th edn, South-Western.

Saunders, M., Lewis, P. and Thornhill, A. (2012) Research Methods for Business Students, 6th edn, Prentice Hall.

Saunders, M. and Lewis, P. (2012) Doing Research in Business & Management, FT Prentice Hall.