Copyright © 2011 by Saunders, an imprint of Elsevier Inc.

Chapter 11

Understanding Statistics in Research


Clinical Uses of Statistics

Reading or critiquing published research

Examining outcomes of nursing practice by analyzing data collected in a clinical site

Developing administrative reports with support data

Analyzing research done by nursing staff and other health professionals at a clinical site

Demonstrating a problem or need and conducting a study


Stages in Data Analysis

1. Prepare data for analysis.
2. Describe the sample.
3. Test the reliability of measurement methods.
4. Conduct exploratory analysis.
5. Conduct confirmatory analysis guided by hypotheses, questions, or objectives.
6. Conduct post hoc analyses.


Preparing the Data for Analysis

1. Enter data into the computer using means designed to reduce errors.

2. Clean the data to ensure accuracy.

3. Correct all identified errors.

4. Identify missing data points.

5. Add missing data when possible.


Describing the Sample

Purpose: to obtain as complete a picture of the sample as possible

Determine frequencies of variables related to the sample
• Age
• Education
• Gender
• Health status
• Ethnicity


Describing the Sample (cont’d)

Examine averages and variation of demographic variables.

If there are study groups, compare using variables such as age, education, health status, gender, and ethnicity.

Determine the comparability of groups.

If groups are not comparable, planned comparative analyses cannot be performed.


Testing Reliability of Measurement

Use Cronbach’s alpha coefficient to examine the reliability of study scales before testing hypotheses, questions, or objectives.

Values should be at least 0.70.


Cronbach’s Alpha Coefficient

Tests internal consistency of measurement scale

To what extent is the measure a true reflection of subject’s responses?

Reliability of 0.7 is the lowest acceptable alpha.

This means that about 70% of the variance in scale scores reflects consistent measurement across items and about 30% reflects measurement error.
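As a rough illustration of how the coefficient is obtained, the sketch below computes Cronbach's alpha from the item variances and the variance of the total scores, using the standard formula alpha = k/(k - 1) × (1 - sum of item variances / variance of totals). The response data and the function name `cronbach_alpha` are hypothetical and are not taken from the chapter.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Return Cronbach's alpha for rows of item scores (one row per subject)."""
    k = len(responses[0])                  # number of items on the scale
    items = list(zip(*responses))          # transpose: one tuple of scores per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical data: 5 subjects answering a 3-item scale (not from the source).
responses = [
    [3, 4, 3],
    [5, 5, 4],
    [2, 2, 3],
    [4, 4, 5],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")   # compare against the 0.70 minimum
```

A value below 0.70 from a calculation like this would suggest that the scale is not reliable enough for the planned analyses.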


Conducting Exploratory Analysis

Determine the nature of data in variables used to test hypotheses, questions, and objectives.

Identify outliers (subjects or data points with extreme values or values unlike the rest of the sample).

Examine relationships among variables.


Conducting Confirmatory Analyses

Perform analyses designed to test hypotheses, research questions, or objectives.

Generalize findings from sample to appropriate populations (inference).


Performing Post Hoc Analyses

Necessary when ANOVA is used in studies with three or more groups

Necessary with chi-square analyses

Purpose: to determine which groups are significantly different


Probability Theory

Deductive

Used to explain:
• Extent of a relationship
• Probability of an event occurring
• Probability that an event can be accurately predicted

Expressed as lowercase p, with values written as decimals and read as percents


Probability

If probability is 0.23, then p = 0.23.

There is a 23% probability that a particular event will occur.

For a result to be considered statistically significant, probability is usually expected to be p < 0.05.


Decision Theory

Inductive reasoning

Assumes that all the groups in a study used to test a hypothesis are components of the same population relative to the variables under study.

It is up to the researcher to provide evidence that there really is a difference.

To test the assumption of no difference, a cutoff point is selected before analysis.


Alpha (α)

Risk of making a type I error

The threshold at which statistical significance is reached


Cutoff Point

Referred to as level of significance or alpha (α)

Point at which the results of statistical analysis are judged to indicate a statistically significant difference between groups

For most nursing studies, level of significance is 0.05.

Sometimes written as α = 0.05


Cutoff Point (cont’d)

The cutoff point is absolute.

If the value obtained is only a fraction above the cutoff point, the groups are considered to be from the same population.

No meaning can be attributed to differences between the groups.

Results that are significant at 0.001 are not considered more significant than results that just meet the cutoff point.
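The absolute nature of the cutoff can be sketched as a simple decision rule. The function name `is_significant` and the sample p values are illustrative assumptions; the 0.05 level is the one named above for most nursing studies.

```python
ALPHA = 0.05  # level of significance, selected before analysis

def is_significant(p, alpha=ALPHA):
    """Apply the cutoff as an absolute rule: significant only if p < alpha."""
    return p < alpha

# Hypothetical p values: one far below, one just below, one just above the cutoff.
for p in (0.001, 0.049, 0.051):
    label = "significant" if is_significant(p) else "not significant"
    print(f"p = {p}: {label}")
```

Note that the rule returns only yes or no: p = 0.001 and p = 0.049 receive the same verdict, and p = 0.051 is treated as no difference.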


Levels of Acceptable Significance

0.05 0.01 0.005 0.001


Inference

A conclusion or judgment based on evidence

Judgments are made based on statistical results

Statistical inferences must be made cautiously and with great care

Decision theory rules were designed to increase the probability that inferences are accurate


Generalization

A generalization is the application of information that has been acquired from a specific instance to a general situation.

Generalizing requires making an inference.

Both inference and generalization require the use of inductive reasoning.


Generalization (cont’d)

An inference is made from a specific case and extended to a general truth, from a part to a whole, from the known to the unknown.

In research, an inference is made from the study findings to a more general population.


Normal Curve

A theoretical frequency distribution of all possible values in a population

No real distribution exactly fits the normal curve.

However, in most sets of data, the distribution is similar to the normal curve.

Levels of significance and probability are based on the logic of the normal curve.


Normal Curve (cont’d)


Tailedness

An extreme score can occur in either tail of the normal curve.

An extreme score is higher or lower than 95% of the population.

Mean scores of a population also can be extreme and occur in the tail of the normal curve.


Tailedness (cont’d)

If the mean score is an extreme value, the population is not likely to be the same as that represented by the normal curve; it is significantly different.

However, extreme values that are members of the population do occur. Thus there is always a risk of making an error in deciding that the groups are different.


Two-Tailed Test

Assumes that an extreme score can occur in either tail of the normal curve

Nondirectional hypothesis: tests for significance in either tail

Hypothesis: the extreme score is higher or lower than 95% of the population; thus the sample with the extreme score is not a member of the same population

A two-tailed test of significance is used.


Two-Tailed Test (cont’d)


One-Tailed Test

Extreme values occur on a single tail of the curve.

The hypothesis is directional, so a one-tailed test of significance is used.

The 5% of statistical values considered significant will be in one tail rather than two.

Extreme values in the other tail are not considered significantly different.

One-tailed tests are more powerful than two-tailed tests.
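To see why placing the whole 5% in one tail makes the test more powerful, the sketch below compares the cutoff values on a standard normal curve for α = 0.05. It assumes a standard normal distribution (mean 0, standard deviation 1); the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist(mu=0, sigma=1)

# Two-tailed: alpha is split, 2.5% in each tail.
two_tailed_cutoff = standard_normal.inv_cdf(1 - alpha / 2)

# One-tailed: the full 5% sits in a single tail.
one_tailed_cutoff = standard_normal.inv_cdf(1 - alpha)

print(f"two-tailed cutoff: +/-{two_tailed_cutoff:.3f}")
print(f"one-tailed cutoff: {one_tailed_cutoff:.3f}")
```

The one-tailed cutoff (about 1.645 standard deviations) is closer to the mean than the two-tailed cutoff (about 1.96), so a less extreme value reaches significance, but only in the predicted direction.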


One-Tailed Test (cont’d)


Type I and Type II Errors

Type I error occurs when the researcher rejects the null hypothesis when it is true.
• The results indicate that there is a significant difference, when in reality there is not.

Type II error occurs when the researcher regards the null hypothesis as true but it is false.
• The results indicate there is no significant difference, when in reality there is a difference.


Occurrence of Type I and Type II Errors

Data analysis indicates:                      In reality, the null    In reality, the null
                                              hypothesis is true:     hypothesis is false:
Results significant (null rejected)           Type I error            Correct decision
Results not significant (null not rejected)   Correct decision        Type II error


Risk of Type I Error


Power and Risk for Type II Error

Power analysis: a power of 0.80 is the minimum acceptable level

Power is influenced by the sample size and the effect size


Degrees of Freedom

You have six scores and the mean = 6.

Score 1 = 5    Score 4 = 4
Score 2 = 7    Score 5 = 4
Score 3 = 8    Score 6 = ?

What is the value of score #6? Can the value of score #6 vary? Can the other five scores vary?

The number of scores that can vary is your degrees of freedom.
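The arithmetic behind this exercise can be worked through in a few lines. Once the mean and the first five scores are fixed, the sixth score is determined, which is why only five scores are free to vary. The variable names are illustrative.

```python
known_scores = [5, 7, 8, 4, 4]   # scores 1 through 5 from the slide
n = 6                            # total number of scores
mean = 6                         # the mean is fixed in advance

# The total must equal n * mean, so the last score has no freedom left.
score_6 = n * mean - sum(known_scores)
degrees_of_freedom = n - 1

print(f"score #6 = {score_6}, degrees of freedom = {degrees_of_freedom}")
```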


Using Statistics to Describe

Descriptive statistics are also referred to as summary statistics.

In any study in which the data are numerical, data analysis begins with descriptive statistics.

In simple descriptive studies, analysis may be limited to descriptive statistics.


Types of Descriptive Statistics

Frequency distributions
• Ungrouped frequency distributions
• Grouped frequency distributions
• Percentage distributions

Measures of central tendency

Measures of dispersion


Example of an Ungrouped Frequency Distribution

Data are presented in raw, counted form.

1: /
2: /////
3: ///
4: /
5: //
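A minimal sketch of how such a tally could be produced from raw scores is shown below. The list of scores is reconstructed from the tally marks in this example; the percentage column is an added illustration of a percentage distribution for the same data.

```python
from collections import Counter

# Raw scores reconstructed from the tally marks in the example.
scores = [1] + [2] * 5 + [3] * 3 + [4] + [5] * 2

counts = Counter(scores)          # ungrouped frequency distribution
total = sum(counts.values())

for value in sorted(counts):
    tally = "/" * counts[value]
    percent = 100 * counts[value] / total
    print(f"{value}: {tally:<5} n = {counts[value]}  ({percent:.1f}%)")
```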


Example of a Grouped Frequency Distribution

Data are pregrouped into categories.

Ages 20 to 39: 14
Ages 40 to 59: 43
Ages 60 to 79: 26
Ages 80 to 100: 4


Example of Percentage Distribution

Salaries: 41.7%
Maintenance: 8.3%
Equipment: 16.7%
Fixed costs: 8.3%
Supplies: 25%


Commonly Used Graphic Displays of Frequency Distribution


Measures of Central Tendency

What is a typical score?


Mode

Is the numerical value or score that occurs with greatest frequency

Is expressed graphically

Is not always the center of the distribution


Bimodal Distribution


Median

Is the value in exact center of ungrouped frequency distribution

Is obtained by rank ordering the values

When the number of values is even, may not be an actual value in the data set


Mean

Is the sum of values divided by the number of values being summed

Like the median, the mean may not be a data set value.
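The three measures can be compared on one data set. The sketch below reuses the 12 scores from the ungrouped frequency distribution example (one 1, five 2s, three 3s, one 4, two 5s) and relies on Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# The 12 scores tallied in the ungrouped frequency distribution example.
scores = [1] + [2] * 5 + [3] * 3 + [4] + [5] * 2

print("mode:  ", mode(scores))              # most frequent value
print("median:", median(scores))            # middle of the rank-ordered values
print("mean:  ", round(mean(scores), 2))    # sum divided by number of values
```

Because there is an even number of scores, the median falls halfway between the two middle values (2 and 3), so neither the median nor the mean is an actual value in this data set.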


Measures of Dispersion

Range

Variance

Standard deviation

Standardized scores

Scatterplots


Range

Is obtained by subtracting lowest score from highest score

Uses only the two extreme scores

Very crude measure and sensitive to outliers


Difference Scores

The sum of all difference scores in a data set is zero, making the sum itself useless as a measure of dispersion.

Difference scores are the basis for many statistical analysis procedures.


Difference Scores (cont’d)

Are obtained by subtracting the mean from each score

Are sometimes referred to as deviation scores because they indicate the extent to which a score deviates from the mean


Standard Deviation

Is the square root of the variance

Just as the mean is the “average” value, the standard deviation is the “average” difference score.
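The path from difference scores to the standard deviation can be traced step by step. The sketch below reuses the six scores from the degrees of freedom exercise (with the sixth score equal to 8, as the mean of 6 requires) and assumes the sample formula, which divides the sum of squared difference scores by n - 1.

```python
from math import sqrt

scores = [5, 7, 8, 4, 4, 8]      # six scores with a mean of 6
n = len(scores)
mean = sum(scores) / n

# Difference (deviation) scores: each score minus the mean.
difference_scores = [x - mean for x in scores]
print("sum of difference scores:", sum(difference_scores))

# Squaring removes the signs so the differences no longer cancel out.
sum_of_squares = sum(d ** 2 for d in difference_scores)
variance = sum_of_squares / (n - 1)          # sample variance, df = n - 1
standard_deviation = sqrt(variance)

print(f"variance = {variance:.2f}, standard deviation = {standard_deviation:.2f}")
```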


Standardized Scores

Raw scores that cannot be compared directly are transformed into standardized scores.

A common standardized score is the Z-score.

Provides a way to compare scores on a common scale
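A Z-score expresses a raw score as the number of standard deviations it lies from the mean: z = (score - mean) / standard deviation. The sketch below applies this to two hypothetical tests with different scales; the test names, means, and standard deviations are invented for illustration.

```python
def z_score(raw_score, mean, standard_deviation):
    """Return how many standard deviations a raw score lies from the mean."""
    return (raw_score - mean) / standard_deviation

# Hypothetical tests with different scales (values are not from the source).
z_test_a = z_score(85, mean=75, standard_deviation=10)
z_test_b = z_score(60, mean=50, standard_deviation=5)

print(f"Test A: raw 85 -> z = {z_test_a}")
print(f"Test B: raw 60 -> z = {z_test_b}")
```

Although 85 is the higher raw score, the score of 60 is further above its own group mean once both are placed on the same standardized scale.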


Scatterplots

Have two scales: horizontal axis (X) and vertical axis (Y)

Illustrate a relationship between two variables


Structure of a Plot


Example of a Scatterplot


Chi-Square Test of Independence

Used with nominal or ordinal data

Tests for differences between the frequencies expected if the groups are alike and the frequencies actually observed in the data


Example of Chi-Square Table

          Regular Exercise    No Regular Exercise    Total
Male             35                   15               50
Female           10                   40               50
Total            45                   55              100
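As a sketch of how expected and observed frequencies are compared, the code below computes the chi-square statistic for the exercise table. It assumes the plain Pearson formula without a continuity correction, and it compares the result with 3.841, the standard chi-square cutoff for df = 1 at α = 0.05. Variable names are illustrative.

```python
# Observed frequencies from the table (rows: male, female).
observed = [
    [35, 15],   # regular exercise, no regular exercise
    [10, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Frequency expected in this cell if the groups were alike.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_VALUE = 3.841   # chi-square cutoff for df = 1 at alpha = 0.05

print(f"chi-square = {chi_square:.2f}, df = {df}")
print("significant" if chi_square > CRITICAL_VALUE else "not significant")
```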


Chi-Square Results

Indicates that there is a significant difference between some of the cells in the table

The difference may be between only two of the cells, or there may be differences among all of the cells.

Chi-square results will not tell you which cells are different.


Example of Chi-Square Results

χ² = 4.98, df = 2, p = 0.05


Pearson Product-Moment Correlation

Tests for the presence of a relationship between two variables

Called bivariate correlation

Types of correlation are available for all levels of data.

Best results are obtained using interval data.


Correlation

Performed on data collected from a single sample

Measures of the two variables to be examined must be available for each subject in the data set.


Correlation (cont’d)

Results
• Nature of the relationship (positive or negative)
• Magnitude of the relationship (–1 to +1)
• Testing the significance of a correlation coefficient

Does not identify the direction of causation (it does not show that one variable causes the other)

Are symmetrical
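The coefficient itself can be sketched from paired difference scores. The function `pearson_r` below is a plain implementation of the usual product-moment formula, and the paired data (hours of study and quiz scores for five subjects) are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation for paired measurements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]          # difference scores for x
    dy = [b - mean_y for b in y]          # difference scores for y
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

# Hypothetical paired data: both measures are available for each subject.
hours = [1, 2, 3, 4, 5]
quiz_scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, quiz_scores)
print(f"r = {r:.2f}")
print("symmetrical:", pearson_r(quiz_scores, hours) == r)
```

The sign of r gives the nature of the relationship and its absolute value gives the magnitude; swapping the two variables leaves r unchanged.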


Correlation Results

r = 0.56 (p = 0.03)

r = –0.13 (p = 0.2)

r = 0.65 (p < 0.002)

Which ones are significant?


Explained Variance

Definition: R² is the proportion of variance shared by two variables, expressed as a percentage.
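Applied to the three correlation results listed earlier, explained variance is simply r squared, shown here as a percentage next to a significance check at an assumed α of 0.05. For the third result the source reports only p < 0.002, so 0.002 is used as an upper bound.

```python
ALPHA = 0.05   # assumed level of significance

# (r, p) pairs from the correlation results; 0.002 is an upper bound for the third.
results = [(0.56, 0.03), (-0.13, 0.2), (0.65, 0.002)]

summary = []
for r, p in results:
    explained_percent = r ** 2 * 100      # explained variance as a percentage
    summary.append((r, explained_percent, p < ALPHA))

for r, explained_percent, significant in summary:
    print(f"r = {r:+.2f}: explains {explained_percent:.1f}% of variance, "
          f"{'significant' if significant else 'not significant'}")
```

Squaring removes the sign, so a negative correlation explains variance in the same way as a positive one of equal magnitude.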


Factor Analysis

Examines relationships among large numbers of variables

Disentangles those relationships to identify clusters of variables most closely linked

Sorts variables according to how closely related they are to the other variables

Closely related variables grouped into a factor


Factor Analysis (cont’d)

Several factors may be identified within a data set.

The researcher must explain why the analysis grouped the variables in a specific way.

Statistical results indicate the amount of variance in the data set that can be explained by each factor and the amount of variance in each factor that can be explained by a particular variable.


Usefulness of Factor Analysis

Aids in development of theoretical constructs

Aids in development of measurement scales


Regression Analysis

Used when one wishes to predict the value of one variable based on the value of one or more other variables

For example, one might wish to predict the possibility of passing the credentialing exam based on grade point average (GPA) from a graduate program.


Regression Analysis (cont’d)

Regression analysis could also be used to predict the length of stay in a neonatal unit based on the combined effect of multiple variables such as gestational age, birth weight, number of complications, and sucking strength.


Regression Analysis (cont’d)

The outcome of analysis is the multiple correlation coefficient R.

When R is squared, it indicates the amount of variance in the data that is explained by the equation.

R² is also called the coefficient of multiple determination.


Regression Results

R² = 0.63

This result indicates that 63% of the variance in length of stay can be predicted by the combined effect of age, weight, complications, and sucking strength.
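A simplified sketch of where R² comes from is given below. It fits a least-squares line with a single predictor rather than the four predictors in the example, and the gestational ages and lengths of stay are hypothetical values chosen only to show the calculation.

```python
def fit_line(x, y):
    """Least-squares line for one predictor; returns slope, intercept, R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * a for a in x]
    ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_residual / ss_total

# Hypothetical data (not from the source): gestational age in weeks, stay in days.
gestational_age = [30, 32, 34, 36, 38]
length_of_stay = [40, 30, 22, 12, 6]

slope, intercept, r_squared = fit_line(gestational_age, length_of_stay)
print(f"predicted stay = {intercept:.1f} + ({slope:.1f} x gestational age)")
print(f"R squared = {r_squared:.2f}")
```

R² here is the share of the total variation in length of stay that the fitted line accounts for; the remainder is unexplained variance.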


Overlay of Scatterplot and Best-Fit Line


t-Test

Requires interval level measures

Tests for significant differences between two samples

Most commonly used test of differences


Example of t-Test Results

t = 4.169 (p < 0.05)
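A t value like the one above comes from dividing the difference between two group means by its standard error. The sketch below assumes two independent groups and the pooled-variance form of the test; the group data and the function name `independent_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_1, group_2):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group_1), len(group_2)
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * variance(group_1)
                       + (n2 - 1) * variance(group_2)) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (mean(group_1) - mean(group_2)) / standard_error
    return t, df

# Hypothetical interval-level scores for two small groups (not from the source).
treatment = [5, 7, 9]
control = [2, 4, 6]

t, df = independent_t(treatment, control)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is then compared with the tabled cutoff for its degrees of freedom at the chosen level of significance.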


Analysis of Variance (ANOVA)

Tests for differences between means

More flexible than other analyses in that it can examine data from two or more groups


ANOVA (cont’d)

Multiple versions of ANOVA are available that can be used in studies examining multiple outcome variables, or repeated measures of outcome variables across several time periods.

Can look at between-group variance, within-group variance, and total variance
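The relationship between between-group and within-group variance can be sketched for the simplest case, a one-way ANOVA. The F statistic is the between-group mean square divided by the within-group mean square. The three groups of scores and the function name `one_way_anova` are hypothetical.

```python
def one_way_anova(groups):
    """Return F, df between groups, and df within groups for a one-way ANOVA."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group variation: how far each score is from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three groups (not from the source).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df_between, df_within = one_way_anova(groups)
print(f"F = {f:.2f} ({df_between}, {df_within})")
```

A large F means the group means differ more than the spread of scores inside the groups would lead one to expect.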


Results of ANOVA

F = 9.75 (2, 95) (p = 0.002)

If there are more than two groups under study, the F value alone does not show where the significant differences are.

Post hoc tests are used to determine the location of differences.


Analysis of Covariance (ANCOVA)

Allows the researcher to examine the effect of a treatment apart from the effect of one or more potentially confounding variables

Potentially confounding variables that are commonly of concern include pretest scores, age, education, social class, and anxiety level.


ANCOVA (cont’d)

The effects of the potentially confounding variables on the study variables are statistically removed by performing regression analysis before performing ANOVA.

Allows the effect of the treatment to be examined more precisely


Information Needed for Algorithm

1. Determine whether the research question focuses on differences (I) or associations (relationships) (II).

2. Determine level of measurement (A, B, or C).

3. Select the design listed that most closely fits the study you are critiquing (1, 2, or 3).

4. Determine whether the study samples are independent (a), dependent (b), or mixed (c).
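The kind of decision these steps lead to can be sketched as a small lookup. This is a heavily simplified illustration, not the algorithm itself: it covers only the tests described in this chapter, ignores the design and the dependent versus independent distinction, and the function name `suggest_test` and its arguments are hypothetical.

```python
def suggest_test(focus, level, n_groups=None):
    """Suggest a test from this chapter, or None if the simplified rules do not apply.

    focus: "differences" or "associations"
    level: "nominal", "ordinal", or "interval"
    n_groups: number of groups being compared (for differences only)
    """
    if focus == "associations" and level == "interval":
        return "Pearson product-moment correlation"
    if focus == "differences":
        if level in ("nominal", "ordinal"):
            return "chi-square test of independence"
        if level == "interval" and n_groups == 2:
            return "t-test"
        if level == "interval" and n_groups is not None and n_groups > 2:
            return "ANOVA"
    return None   # outside this sketch; consult the full algorithm

print(suggest_test("differences", "interval", n_groups=3))
print(suggest_test("associations", "interval"))
```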


Algorithm for Choosing a Statistical Test


Judging Statistical Suitability

Factors that must be considered include:
• Study purpose
• Hypotheses, questions, or objectives
• Design
• Level of measurement


Judging Statistical Suitability (cont’d)

Requires you to be familiar with the statistical procedures used in the study

Requires you to compare the statistical procedures used with other statistics that could have been used to greater advantage

Are there dependent or independent groups?


Judging Statistical Suitability (cont’d)

You must judge whether the procedure was performed appropriately and the results were interpreted correctly.

Judgments required
• Whether the data for analysis were treated as nominal, ordinal, or interval
• The number of groups in the study
• Whether the groups were dependent or independent


Types of Results

Significant and predicted results

Nonsignificant results

Mixed results

Unexpected results


Significant and Predicted Results

Are in keeping with those predicted by researcher and support logical links developed by researcher among the framework, questions, variables, and measurement tools


Nonsignificant Results

Also called negative or inconclusive results

Analysis showed no significant differences or relationships.

Could be a true reflection of reality. If so, the researcher, or the theory used by the researcher to develop the hypothesis, is in error. In this case, negative findings are an important addition to the body of knowledge.


Nonsignificant Results (cont’d)

Results could stem from a type II error

Causes of type II error include:
• Inappropriate methods
• Biased or small sample
• Internal validity problems


Nonsignificant Results (cont’d)

• Inadequate measurement
• Weak statistical measures
• Faulty analysis


Significant and Unpredicted Results

Are opposite of those predicted

Indicate flaws in the logic of both the researcher and the theory being tested

If valid, are an important addition to the body of knowledge


Mixed Results

Most common outcome of studies

One variable may uphold predicted characteristics, whereas another does not.

Or two dependent measures of the same variable may show opposite results

May be caused by methodology problems

May indicate need to modify existing theory


Unexpected Results

Relationships between variables that were not hypothesized and not predicted from the framework being used

Can be useful in theory development, modification of existing theory, and development of later studies


Unexpected Results (cont’d)

Serendipitous results are important as evidence in developing the implications of the study.

They must be evaluated carefully because the study was not designed to examine these results.


Findings

Results of the study that have been translated and interpreted

A consequence of evaluating evidence


Conclusions

A synthesis of the findings using:
• Logical reasoning
• Creative formation of meaningful whole from pieces of information obtained through data analysis and findings from previous studies
• Receptivity to subtle clues in data
• Alternative explanations of data


Conclusions (cont’d)

Risk in developing conclusions is going beyond the data
• Forming conclusions not warranted by data
• Occurs more frequently in published studies than one would like to believe


Implications

The meanings of conclusions for the body of nursing knowledge, theory, and practice

Based on, but more specific than, conclusions

Provide specific suggestions for implementing the findings


Significance of Findings

Associated with importance to the nursing body of knowledge

May be associated with:
• Amount of variance explained
• Control in the study design to eliminate unexplained variance
• Detection of statistically significant differences


Clinical Significance

Findings can have statistical significance but not clinical significance.

Related to practical importance of the findings

No common agreement in nursing about how to judge clinical significance
• Effect size?
• Difference sufficiently important to warrant changing the patient’s care?


Clinical Significance (cont’d)

Who should judge clinical significance?
• Patients and their families?
• Clinician/researcher?
• Society at large?

Clinical significance is ultimately a value judgment.


Generalizing the Findings

Extends the implications of the findings:
• From the sample studied to a larger population
• From the situation studied to a more general situation

How far can generalizations be made?


Empirical Generalizations

Are based on accumulated evidence from many studies

Are important for verification of theoretical statements or for development of new theory

Are the basis of a science

Contribute to scientific conceptualization


Suggesting Further Studies

Researcher gains knowledge and experience from conducting the study that can be used to design a better study next time.

Researcher often makes suggestions for future studies that logically emerge from the present study.


Suggesting Further Studies (cont’d)

Replications

Different design

Larger sample

Hypotheses emerging from findings

Strategies to further test framework in use


Critiquing Statistics in a Study

What statistics were used to describe the characteristics of the sample?

Are the data analysis procedures clearly described?

Did statistics address the purpose of the study?


Critiquing Statistics in a Study (cont’d)

Did the statistics address the objectives, questions, or hypotheses of the study?

Were the statistics appropriate for the level of measurement of each variable?
