class 7: 10/22/12 intro to statistical methods cont


Upload: scott-maxwell

Post on 18-Jan-2018


Page 1: Class 7: 10/22/12 intro to statistical methods cont

class 7: 10/22/12
intro to statistical methods cont.

Page 2: Class 7: 10/22/12 intro to statistical methods cont

• Being wrong in science is fine, and even necessary—as long as scientists recognize that they blew it, report their mistake openly instead of disguising it as a success, and then move on to the next thing—until they come up with the very occasional breakthrough. But as long as careers remain contingent on producing a stream of research that’s dressed up to seem more right than it is, scientists will keep delivering exactly that.

Science is a noble endeavor, but it is also a low-yield endeavor. I’m not sure that more than a very small percentage of medical research is ever likely to lead to major improvements in clinical outcomes and quality of life. We should be very comfortable with that fact. (p. 86)

Freedman, David H. (2010, November). Lies, damned lies, and medical science. The Atlantic, 306(4), 76-86.

Page 3: Class 7: 10/22/12 intro to statistical methods cont

all researchers must learn the trick and avoid the mistake

• trick: begin with the question and then figure out the best way(s) to answer that question

• mistake: begin with a specific method and fit the question to that method

Page 4: Class 7: 10/22/12 intro to statistical methods cont

more on models
• models should meet three criteria:
– generality, precision, accuracy
• can generally satisfy any two, at the cost of sacrificing the third.
– climatology settles for generality & accuracy
– ecologists focusing on particular species, for precision & accuracy
– rigorous history & ethnography often give up generality for precision & accuracy; results can still be important
• Kitcher, Philip. (2012, May 24). The trouble with scientism: Why history and the humanities are also a form of knowledge. The New Republic, 243, 20-25.

Page 5: Class 7: 10/22/12 intro to statistical methods cont

• research using
– measurement description
– statistical analysis
critical for answering certain kinds of important questions

Page 6: Class 7: 10/22/12 intro to statistical methods cont

strengths of measurement description

• precise descriptions
• often efficient: one can make confident predictions based on relatively small samples, if the samples are good
• increasingly sophisticated ways of analyzing measurement data
• powerful stat packages now available for desktop computers, e.g., Systat, SPSS, SAS

Page 7: Class 7: 10/22/12 intro to statistical methods cont

cautions
• measure only what can be measured
– “to replace the unmeasureable with the unmeaningful is not progress” (Achen, 1977, p. 806)

• value precision but realize that a precise description may not be an accurate one

• scientific method (drawing inferences from observations) comprises many research methods—its strength does not come from any one specific method

Page 8: Class 7: 10/22/12 intro to statistical methods cont

my personal recommendations
• whatever your Ph.D. Research Specialization, take at least one stat course, preferably 2 or 3

• whatever your methodological expertise, find people with similar interests but different methodological expertise and work with them—the best research often uses many approaches

Page 9: Class 7: 10/22/12 intro to statistical methods cont

• The statistician knows, for example, that in nature there never was a normal distribution, there never was a straight line, yet with normal and linear assumptions, known to be false, he can often derive results which match, to a useful approximation, those found in the real world. (Box, p. 792)

• All models are false, but some are useful.

– Box, George E. P. (1976). Science and statistics. Journal of the American Statistical Association, 71, 791-799.

Page 10: Class 7: 10/22/12 intro to statistical methods cont

a caution
• Statistics today is in a conceptual and theoretical mess. The discipline is divided into two rival camps, the frequentists and the Bayesians, and neither camp offers the tools that science needs for objectively representing and interpreting statistical data as evidence. (Royall, pp. 127-128)

– Royall, Richard (2004). The likelihood paradigm for statistical evidence. In M. L. Taper & S. R. Lele (Eds.), The nature of scientific evidence: Statistical, philosophical, and empirical considerations (pp. 119-152). Chicago: University of Chicago Press.

Page 11: Class 7: 10/22/12 intro to statistical methods cont

• It is possible to spend a lifetime analysing data without realising that there are two very different fundamental approaches to statistics: Bayesianism and Frequentism.

• Bayesians address the question everyone is interested in, by using assumptions no-one believes

• Frequentists use impeccable logic to deal with an issue of no interest to anyone (Louis Lyons, 2007)

Page 12: Class 7: 10/22/12 intro to statistical methods cont

K ch 19: inferential statistics
• inferential statistics allow one to infer the characteristics of a population from a representative sample
– estimate characteristics of population within a determined range with a given probability
– determine (in general) with a given probability whether effect beyond sampling and chance error exists

Page 13: Class 7: 10/22/12 intro to statistical methods cont

• parameters: refer to population
• statistics: refer to sample
• sampling distribution: distribution of a descriptive statistic calculated from repeated sampling
• confidence intervals: range that includes the population value with a given probability

Page 14: Class 7: 10/22/12 intro to statistical methods cont

confidence level:
• the probability that the interval will contain the population value: conventionally 68%, 95%, and 99% (2 to 1, 19 to 1, 99 to 1 respectively)
• the wider the interval the more likely it contains the population value (and the less valuable the information)
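
A minimal sketch of how the three conventional confidence levels translate into intervals, assuming a normal approximation for a sample mean. The sample values (mean 50, standard deviation 10, 100 cases) and the function name `confidence_interval` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def confidence_interval(mean, sd, n, level):
    """Normal-approximation interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    margin = z * sd / n ** 0.5                   # z times the standard error
    return mean - margin, mean + margin

# Hypothetical sample: mean 50, standard deviation 10, 100 cases.
for level in (0.68, 0.95, 0.99):
    low, high = confidence_interval(50, 10, 100, level)
    odds = level / (1 - level)
    print(f"{level:.0%}: {low:.2f} to {high:.2f} (odds about {odds:.0f} to 1)")
```

The interval widens as the confidence level rises, which is the trade-off the slide describes: more certainty that the population value is covered, less information about where it lies.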

Page 15: Class 7: 10/22/12 intro to statistical methods cont

• hypothesis testing (traditionally takes form of rejecting the null hypothesis, i.e., that there is no effect beyond sampling and chance error)
• alpha level: the risk one accepts of rejecting a null hypothesis that is actually true; set by the researcher in advance, traditionally .10, .05, .01, .001 (N.B., no good reason for these and not others)
• p-level: the probability, if the null hypothesis were true, of a result at least as extreme as the one found; it is then compared to the alpha level

Page 16: Class 7: 10/22/12 intro to statistical methods cont

two-tailed test:
• non-directional, splits the alpha level between both ends. used when one does not expect results in one direction
one-tailed test:
• directional, puts alpha level at one end (determined by researcher). increases probability of finding statistically significant result in that direction
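
The difference between the two kinds of test can be sketched with a z statistic, assuming a normal sampling distribution. The value z = 1.80 and the function name `p_values` are hypothetical, and the one-tailed figure only applies when the result falls in the direction the researcher predicted.

```python
from statistics import NormalDist

def p_values(z):
    """Return (one-tailed, two-tailed) p-values for a z statistic."""
    upper_tail = 1 - NormalDist().cdf(abs(z))   # area beyond |z| at one end
    return upper_tail, 2 * upper_tail

one_tailed, two_tailed = p_values(1.80)   # hypothetical z, in the predicted direction
print(f"one-tailed p = {one_tailed:.3f}, two-tailed p = {two_tailed:.3f}")
```

With alpha set at .05, this z passes the one-tailed test but not the two-tailed one, which is why the choice has to be made before looking at the data.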

Page 17: Class 7: 10/22/12 intro to statistical methods cont

common statistical tests
t test of difference between means
• common and simple test for differences between means of two groups
chi-square
• common test for categorical data and frequencies
– are cell values different from what would be expected

Page 18: Class 7: 10/22/12 intro to statistical methods cont

chi-square examples
Jefferson & Madison Combined
Years in Kindergarten by SES

          poor*   not poor   total
2-year      24       34        58
1-year       6       65        71
total       30       99       129

chi-square: 19.4 (1 df) p < .0001

* eligible for free or reduced lunch

Page 19: Class 7: 10/22/12 intro to statistical methods cont

Jefferson and Madison Combined
Years in Kindergarten by race

          non-white   white   total
2-year        10        48      58
1-year         9        62      71
total         19       110     129

chi-square: .530 (1 df) p < .5
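
The chi-square values reported for the two kindergarten tables can be checked by hand. The sketch below assumes the plain Pearson statistic with no continuity correction, and it uses a p-value shortcut that holds only for 1 degree of freedom. The function name `chi_square_2x2` is made up for this example.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p = erfc(sqrt(chi2 / 2))   # upper-tail probability, valid for 1 df only
    return chi2, p

by_ses = [[24, 34], [6, 65]]    # rows: 2-year, 1-year; columns: poor, not poor
by_race = [[10, 48], [9, 62]]   # rows: 2-year, 1-year; columns: non-white, white

for label, table in (("SES", by_ses), ("race", by_race)):
    chi2, p = chi_square_2x2(table)
    print(f"{label}: chi-square = {chi2:.3f}, p = {p:.5f}")
```

Each expected count is the row total times the column total divided by the grand total, so the statistic measures how far the observed cells sit from what independence would predict.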

Page 20: Class 7: 10/22/12 intro to statistical methods cont

ANOVA (analysis of variance)
• experimental designs where two or more groups or multiple conditions are being compared (common in psychology and ed psych, and in educational research in general)
• powerful:
– accurate measure of error variance
– tests significance of each variable as well as combined effect
– avoids inflation of probabilities problem
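
The "inflation of probabilities" problem can be illustrated with a short calculation. This is a sketch under a simplifying assumption: the separate tests are treated as independent, which pairwise comparisons among the same groups are not, so the figures are only approximate. The function name `familywise_error` is hypothetical.

```python
def familywise_error(alpha, comparisons):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** comparisons

# Comparing every pair among k groups takes k * (k - 1) / 2 separate t tests.
for groups in (2, 3, 4, 5):
    pairs = groups * (groups - 1) // 2
    rate = familywise_error(0.05, pairs)
    print(f"{groups} groups, {pairs} comparisons: {rate:.3f}")
```

With five groups there are ten pairwise comparisons, and under this assumption the chance of at least one spurious "significant" result is about 40%, far above the nominal .05. A single ANOVA tests all groups at once and so avoids that build-up.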

Page 21: Class 7: 10/22/12 intro to statistical methods cont

(not in K)
regression analysis
• explains (predicts) variability of a dependent variable using information about one or more independent variables.
• predicts expected change in dependent variable given specific changes in the independent variable
• not used in educational research as much as ANOVA, but more useful for policy purposes

Page 22: Class 7: 10/22/12 intro to statistical methods cont

regression example

achievement* = 77.5 - .8 SES**

* combined math & reading scores, ITBS
** percent of low income students
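
To read the regression example, it may help to plug in a few values. The sketch below uses only the intercept and slope given on the slide. The school percentages and the function name `predicted_achievement` are hypothetical.

```python
def predicted_achievement(percent_low_income):
    """Fitted line from the slide: achievement = 77.5 - 0.8 * SES."""
    return 77.5 - 0.8 * percent_low_income

for percent in (0, 25, 50):   # hypothetical schools
    print(f"{percent}% low income: predicted score {predicted_achievement(percent):.1f}")
```

The slope says that each additional percentage point of low income students goes with a predicted drop of .8 points in the combined score. This describes an association in the data, not necessarily a cause.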

Page 23: Class 7: 10/22/12 intro to statistical methods cont

errors of inference
• type I error (alpha error), rejecting a null hypothesis that is true: a concern when theory testing (K, “when validating a finding”)
• type II error (beta error), failing to reject a null hypothesis that is false: a concern when theory building (K: “when exploring”)
• decreasing the probability of one type increases the probability of the other
• pointless to talk about Type I or II error absent discussion of what is at stake

Page 24: Class 7: 10/22/12 intro to statistical methods cont

cost of type I error in theory testing
• dominant theory not challenged
• knowledge production stopped
cost of type II error in theory building
• possibly important explanations etc. ignored
• knowledge production stopped

(one of the many challenges the late and great Lee Cronbach (1916-2001) made to the accepted wisdom of the day)

Page 25: Class 7: 10/22/12 intro to statistical methods cont

statistical power: 1-beta
• increasing statistical power:
– increase size of effect (stronger treatment)
– increase sample size
– reduce variability
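
The three ways of increasing power listed above can be seen in a rough calculation. This sketch assumes a two-tailed z test comparing two group means with equal group sizes and a known, common standard deviation. All the numbers and the function name `power_two_group` are hypothetical.

```python
from statistics import NormalDist

def power_two_group(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed z test comparing two group means."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sd * (2 / n_per_group) ** 0.5)   # true difference in standard errors
    # Probability the test statistic lands beyond either critical value.
    return (1 - normal.cdf(z_crit - shift)) + normal.cdf(-z_crit - shift)

baseline = power_two_group(effect=5, sd=15, n_per_group=50)
print(f"baseline:         {baseline:.2f}")
print(f"stronger effect:  {power_two_group(8, 15, 50):.2f}")
print(f"larger sample:    {power_two_group(5, 15, 100):.2f}")
print(f"less variability: {power_two_group(5, 10, 50):.2f}")
```

Each change raises power for the same reason: it makes the true difference larger relative to the standard error, so the test statistic is more likely to clear the critical value.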

Page 26: Class 7: 10/22/12 intro to statistical methods cont

statistical & practical significance
• statistical: confidence at a given probability that result is not due to chance
• practical: is the result important enough, big enough, feasible, affordable? These are all value judgments
– if one apple a day keeps the doctor away, but it takes three grapefruit, then…?

Page 27: Class 7: 10/22/12 intro to statistical methods cont

• no statistic or statistical test can make a practical decision

• whether one risks being wrong cautiously (type I) or wrong incautiously (type II) cannot be decided absent cost and risk, needs, what’s at stake, etc.

• no statistical analysis better than numbers (descriptions) fed into it: garbage in, garbage out

Page 28: Class 7: 10/22/12 intro to statistical methods cont

statistical significance refers only to samples from population
• it does not refer to size of effect; ceteris paribus larger effects are more likely to be statistically significant, but with large samples very small effects will be too
• if you have the population, any effects are real, no matter the size
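
The point that very small effects become statistically significant in large samples can be sketched as follows, assuming a two-tailed z test on the difference between two group means with a known standard deviation. The half-point difference, the standard deviation of 15, and the function name `two_tailed_p` are hypothetical.

```python
from statistics import NormalDist

def two_tailed_p(difference, sd, n_per_group):
    """Two-tailed p-value for a difference between two group means (z test)."""
    standard_error = sd * (2 / n_per_group) ** 0.5
    z = abs(difference) / standard_error
    return 2 * (1 - NormalDist().cdf(z))

# Hypothetical: a half-point difference on a scale with standard deviation 15.
for n in (100, 1_000, 10_000, 100_000):
    print(f"n per group = {n:>7}: p = {two_tailed_p(0.5, 15, n):.4f}")
```

The effect is identical in every row, yet p drops below .05 once each group reaches 10,000 cases. The p-level reflects sample size as much as effect size, so it cannot stand in for a judgment about whether the effect matters.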

Page 29: Class 7: 10/22/12 intro to statistical methods cont

no proof in science:
• a statistically significant result (assuming appropriate analysis etc.) does not prove that the hypothesis is true, only that it has escaped disconfirmation
• the more often a hypothesis passes the test and the more demanding the tests it passes, the more certain we can be that we know something, that is, the more we have reduced uncertainty

Page 30: Class 7: 10/22/12 intro to statistical methods cont

other terms
• parametric: assumes random sampling, from distribution with known parameters, often normal distribution
• nonparametric: when data do not come from known distribution, often with nominal or ordinal data
• robust test: accurate even when assumptions violated
• effect size: too long and too often ignored; journals now require estimates of effect size

Page 31: Class 7: 10/22/12 intro to statistical methods cont

thinking
simple statistical way to find out what people may not be willing to admit
• ask people to flip coin
– if head, answer “head: no answer”
– if tail and have done X, answer “head: no answer”
– if tail and have not done X, answer “no”
• thus, the number of “no’s” estimates half of those who have not done X
• thus, N minus twice the number of “no’s” gives estimate of those who have done X
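
The coin-flip procedure above can be tried out in a small simulation. This is a sketch under stated assumptions: a fair coin, respondents who follow the rules honestly, and a made-up population in which 3,000 of 10,000 people have done X. The function name `coin_flip_survey` is hypothetical.

```python
import random

def coin_flip_survey(has_done_x, rng):
    """Collect one coin-flip answer per respondent, then estimate how many did X."""
    no_count = 0
    for done in has_done_x:
        tails = rng.random() < 0.5
        if tails and not done:
            no_count += 1          # the only respondents who ever say "no"
        # heads, or tails and done X: respondent says "head: no answer"
    return len(has_done_x) - 2 * no_count

rng = random.Random(7)                         # fixed seed so reruns match
respondents = [True] * 3000 + [False] * 7000   # hypothetical: 3,000 of 10,000 did X
estimate = coin_flip_survey(respondents, rng)
print(f"true count: 3000, estimated count: {estimate}")
```

No individual answer reveals whether that person did X, because "head: no answer" is given both by everyone who flipped heads and by those who did X. The researcher still recovers a usable estimate for the group, with some sampling error from the coin flips.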

Page 32: Class 7: 10/22/12 intro to statistical methods cont

Monty Hall Problem
1. behind door: Lamborghini
   Monty reveals either goat
   switching loses
2. behind door: goat A
   Monty must reveal Goat B
   switching wins
3. behind door: goat B
   Monty must reveal Goat A
   switching wins
• initially player has .33 chance of selecting the car, Goat A, or Goat B. Switching results in a win 2/3 of the time
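
For anyone who doubts the 2/3 figure, a simulation is one way to check it. The sketch below assumes the standard rules: the host always opens a door that hides a goat and is not the player's pick, choosing at random when two such doors are available. The function name `play` and the seed are arbitrary.

```python
import random

def play(switch, rng):
    """Play one round; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the first pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

rng = random.Random(2012)   # fixed seed so reruns match
rounds = 100_000
for switch in (False, True):
    wins = sum(play(switch, rng) for _ in range(rounds))
    print(f"switch={switch}: win rate {wins / rounds:.3f}")
```

Staying wins only when the first pick was the car, which happens about a third of the time. Switching wins in the other two cases listed above.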

Page 33: Class 7: 10/22/12 intro to statistical methods cont

Vogt
• nominal scale
• operational definition
• outlier
• parsimony
• path diagram
• practical significance
• Pygmalion effect

Page 34: Class 7: 10/22/12 intro to statistical methods cont

Vogt
• regression toward the mean
• reliability
• sample space
• sampling frame
• scatter plot
• self-selection bias
• sleeper effect
• sociogram
• spurious relation (or correlation)
• suppressor variable

Page 35: Class 7: 10/22/12 intro to statistical methods cont

Sieber ch 6: Strategies for Assuring Confidentiality

6.1 Confidentiality refers to agreements with people about what can be done with data

• states steps that will be taken to ensure privacy

• states legal limitations to assurances of confidentiality

Page 36: Class 7: 10/22/12 intro to statistical methods cont

6.2 why an issue (be able to discuss the cases)
6.3 confidentiality or anonymity
6.4 procedural approaches to assuring confidentiality
6.4.1 cross-sectional research
– anonymity
– temporarily identified responses
– separately identified responses

Page 37: Class 7: 10/22/12 intro to statistical methods cont

6.4.2 longitudinal data (requires links)
– aliases
6.4.3 interfile linkage
6.5 statistical strategies for assuring confidentiality (coin flip example)
6.6 certificates of confidentiality
– researchers do NOT have testimonial privilege unless they have certificate of confidentiality from Dept of Health and Human Services

Page 38: Class 7: 10/22/12 intro to statistical methods cont

6.7 confidentiality and consent:
– consent statement must specify promises of confidentiality researcher cannot make; be aware of state reporting laws, e.g., on child abuse
6.8 data sharing
– when data shared publicly, all identifiers must be removed and researcher must ensure no way to deduce identity
– techniques

Page 39: Class 7: 10/22/12 intro to statistical methods cont

lit review
• review section
– review lit, follow explicit and logical scheme.
– 3-5 sections, with subsections if useful
– end sections and subsections with a discussion
• discussion section
– synthesize the review (discussion of discussions)

Page 40: Class 7: 10/22/12 intro to statistical methods cont

• conclusion section (< 1 p)
– address original question(s)
• personal reflections section (1 p)
– discuss briefly what you learned in the process of doing the lit review
• references
– make sure all citations in references
– make sure all references cited
• additional references (optional)
– references not cited but which you want to record

Page 41: Class 7: 10/22/12 intro to statistical methods cont

APA
• use first person to talk about yourself, not third, e.g., “The researcher . . .”
• use we (us, our, etc.) only to refer to you and your co-authors (69-70)
• do not italicize Latin abbreviations: e.g., et al., etc., and so on
• for seriation see 63-65
• single quotation marks only within double
• periods and commas always inside quotation marks

Page 42: Class 7: 10/22/12 intro to statistical methods cont

• italicize new, technical, or key terms or labels the first time, e.g., “The term peer response . . .” (104-106)

• do not separate compound verbs with comma: “She walked down the block past her house and then turned into the driveway.”

• avoid beginning sentences with “however”

• avoid “throat-clearings” to begin sentences, e.g., furthermore, therefore, also, additionally

Page 43: Class 7: 10/22/12 intro to statistical methods cont

colon (80-81)
• between a grammatically complete intro clause and a final clause that illustrates, extends, or amplifies the first. If second clause a complete sentence, capitalize.
– Kelly presented two findings: Teachers preferred . . .
• do not use a colon after intro that is not a complete sentence
– The students were Ben, Akiko, Mustafa. . . .

Page 44: Class 7: 10/22/12 intro to statistical methods cont

this week free and cheap
• wed: 9 Billion People and 1 Earth. Andrew Revkin, Pace University. 4pm, Alice Campbell Alumni Center, 601 S. Lincoln. free, reception to follow
• thurs: Krannert Uncorked, 5pm, free.
• thurs: Creating Community through African Art. Krannert Art Museum, Gallery, free.
• thurs: When China Met Africa. film. 7pm, Urbana Free Library, free.
• sat: Brahms Instrumental Music with Piano, Ian Hobson. 7:30, Smith Hall. $5-10.