Administrivia

Lab 2 distributed this Thurs

Introduction to Experiments

Research Design

Correlational

• Variation comes from nature

• Researcher draws conclusions from hands-off observation

Experimental

• Researcher manipulates variables directly

• In a pure experiment, conditions are randomly assigned

Experiments vs Non-Experiments

An experiment is a study in which variation is introduced by the researcher.

The type of experiment (such as a true, randomized experiment) depends on several factors: the elements of design dictate the type of experiment you have.

Correlational studies include responses from natural groups (e.g., most surveys are non-experiments)

Just because you manipulate something, it does not make it an experiment.

http://imgs.xkcd.com/comics/the_data_so_far.png

Why Experiments?

Because we want to test specific, directional hypotheses.

Because we want to isolate cause from effect.

Because we want tight control of conditions to rule out spurious relationships.

Experiments can be useful for exploration, but only when we want to isolate and explore specific mechanisms.

Because we need carefully-constructed designs that can be replicated to verify findings across environments or situations.
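A minimal sketch of how random assignment rules out a spurious relationship, using simulated data only. The hidden trait, the effect sizes, and the function name simulate are hypothetical choices made for illustration. The treatment is given no effect at all, so any gap between treated and untreated units in the observational case comes purely from self-selection.

```python
import random
import statistics


def simulate(n=20000, randomize=False, seed=0):
    """Return mean outcome of treated units minus mean outcome of untreated units.

    The true treatment effect is zero by construction. A hidden trait drives
    the outcome and, in the observational case, also drives who gets treated.
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        hidden = rng.gauss(0, 1)  # unobserved confounder (hypothetical)
        if randomize:
            # Experimental: the researcher flips a coin for each unit.
            gets_treatment = rng.random() < 0.5
        else:
            # Correlational: units with a high hidden trait select themselves in.
            gets_treatment = hidden + rng.gauss(0, 1) > 0
        outcome = 2.0 * hidden + rng.gauss(0, 1)  # treatment is NOT in this formula
        (treated if gets_treatment else untreated).append(outcome)
    return statistics.mean(treated) - statistics.mean(untreated)


observational_gap = simulate(randomize=False)
experimental_gap = simulate(randomize=True)
print(f"observational gap: {observational_gap:.2f}")
print(f"experimental gap:  {experimental_gap:.2f}")
```

Under these assumptions the observational comparison shows a large gap even though the treatment does nothing, while the randomized comparison shows a gap close to zero.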

Why Experiments? Consider Pros and Cons of Smoking Research

We have decades of observational research on the link between smoking and health outcomes such as cancer. But, despite overwhelming correlational evidence, it is actually not possible to establish a causal relationship through observation alone.

Smoking Research Example

What would a “true experiment” of smoking and health look like?

Randomly assign people from the population to be smokers or non-smokers (principle of random assignment)

Long-term study over lifetime (since we would need to have enough time for smoking to have negative effects).

We might have to withhold treatments to help with negative health effects in order to get good data on negative health outcomes.

In sum, this would be a terribly unethical thing to do, which is why most researchers accept the overwhelming observational research even though no true experiment on the effects of smoking in humans has ever been conducted by scientists.

But… that doesn’t mean that researchers have not pushed the boundaries of ethics over the years in an attempt to isolate certain causes and effects…

First, a few famous Experiments…for good and bad reasons

Tuskegee Syphilis Study: 1932-1972 clinical study of natural progression of untreated syphilis in rural African American men who thought they were getting free health care from US Government. Enrolled about 400 men who had syphilis but didn’t know it, and 200 without the disease. No one was told that they had syphilis, and those with the disease were not treated in order to observe the progression.

Important because: researchers withheld known treatment to observe the disease running its course. Led to the establishment of the Office for Human Research Protections (OHRP) and institutional review boards (IRBs).

Milgram Obedience Study

Famous study about the role of obedience to authority figures. Conducted in the early 1960s, at a time when Nazi war criminals were on trial.

The larger question that Milgram was interested in was whether people like the Nazis were just “following orders.” Could any average person be pushed to harm others simply by the power of an authority asking him or her to behave a certain way?

Milgram Obedience Study

Basic design: Experimenter asked the teacher to give questions to a student (also called a learner). The learner was a confederate (someone in on the study). The teacher would give electric shocks when the student did not give correct answers to questions. The experimenter would simply tell the teacher that “the study must continue”, no matter how high the electric shocks went.

Conditions included putting the “student” and “teacher” in same room or not, as well as having the experimenter in the same room as the teacher or not.

Important because: raised serious issues of ethics in lab studies, but also showed how tough concepts (such as obedience) could be studied in a controlled setting.

Stanford Prison Experiment

In a similar vein to Milgram’s study, the goal was to learn about the psychological effects of being a prison guard or a prisoner.

University students were randomly assigned to be “prisoners” or “guards”. Researchers watched to see how they behaved over time.

Important because: it changed how studies are reviewed by IRB (Institutional Review Board). The psychological trauma to some of the participants raised an important issue about detrimental effects after a study is over.

Video Links to 3 Examples Above

Tuskegee Syphilis Study: http://www.c-spanvideo.org/program/Syp

Milgram Obedience Study: http://www.youtube.com/watch?v=fCVlI-_4GZQ

Stanford Prison Experiment: http://www.youtube.com/watch?v=sZwfNs1pqG0

In Sum…

Many, many types of experiments, some with small samples (such as lab experiments), others with thousands or even millions of participants (such as online field experiments).

Data scientists working with very large data often conduct experiments with users, and unlike academic research, these experiments are not necessarily disclosed to participants (e.g., the Wikimedia Foundation (Wikipedia), Google search data, Facebook, insurance companies, and rewards and purchasing programs at companies like Wal-Mart, Target, and grocery stores).

In Sum (continued)…

Data scientists are especially interested in experiments because they let us test specific causal mechanisms in a way that may not be possible in other forms of data collection (such as scraping data from the web, or compiling data from multiple sources).

But, even if we could construct a good “true” experiment, we also have to balance issues of ethics. We have seen some classic examples of ethical problems, but that doesn’t mean the issue has gone away. Today, data scientists routinely come up against issues of privacy and data collection, security of sensitive information, and trust between those who collect and analyze information and those who provide it.

True Randomized Experimental Design

(1) Independent Variables must be manipulated (usually by experimenter, sometimes by context)

(2) Participants must be assigned randomly to various conditions or groups

Courtesy http://psychology.ucdavis.edu/SommerB/

Example: Asch Conformity Experiments: “True Experiment” Design

Study: Would individuals conform to peer pressure even if they knew their peers were wrong? Participants sat in room with confederates and had to say which line (A, B, C) matched the line in the other box.

Independent variable manipulated by experimenter: Whether there was any peer pressure to give an incorrect answer or not.

Random Assignment to Control group or Treatment Group:

In treatment group, confederates answered incorrectly and the subject answered last

In control condition, no pressure to give incorrect answer (no confederates were present)

Meanings of the term “Controls” in Research

At least three different meanings:

▪ Control in an Experiment

▪ Controlled Studies

▪ Controls used in a study or analysis (e.g., controlling for gender while examining relationship between education and income)

Active versus Attribute Independent Variables in Experiments

Active independent variable(s): The I.V. is “given” to the participants, usually for some specified time period. It is often manipulated and controlled by the investigator. (e.g., presence or lack of peer pressure in the Asch conformity experiment)

Attribute independent variable(s): A predictive, defining characteristic of individuals. Cannot be manipulated. (e.g., Gender of participants)

Two Different Types of Randomization: Randomization in Sample and Assignment

Random Sample

System for choosing participants from a population

Generally, larger sampling population leads to better generalizability.

Random Assignment

Method for assigning participants randomly to experiment conditions
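The two kinds of randomization can be sketched in a few lines. This is only an illustration: the population of 1,000 numeric IDs, the sample size of 100, and the 50/50 split are assumptions rather than details from any study discussed here.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of 1,000 participant IDs.
population = list(range(1000))

# Random sample: every member of the population has the same chance
# of being chosen, and nobody is chosen twice.
sample = rng.sample(population, k=100)

# Random assignment: shuffle the sampled participants, then split
# them into two equally sized conditions.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment_group = shuffled[:50]
control_group = shuffled[50:]

print(len(sample), len(treatment_group), len(control_group))
```

Sampling decides who is in the study (and so how far results generalize), while assignment decides which condition each participant experiences (and so whether groups are comparable).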

Pitfalls in Assignment to Conditions

▪ Salk Vaccine Trial

▪ Huge vaccine study for polio. Wealthy families were more likely to consent to the study, but incidence of polio was correlated with quality of living environment. Again, allowing self-selection into a study rather than true random assignment can affect the interpretation of results.

Pre-test, experimental manipulation and post-testing

Pre-test: allows us to check group equivalence before the intervention X is introduced.

Experimental manipulation/Treatment: An independent variable (X) that the experimenter manipulates.

Post-test: allows us to compare the groups after intervention X has been introduced; a difference that was not present at pre-test can then be attributed to the treatment.

Hypothesis → Random Assignment → Measure D.V. (pre-test) → Treatment → Measure D.V. (post-test)
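A rough sketch of the pre-test, treatment, post-test sequence using simulated scores. The effect size (TRUE_EFFECT), the score distribution, and the number of participants are all assumed values chosen for illustration. The pre-test gap checks that random assignment produced equivalent groups; the post-test gap estimates the effect of treatment X.

```python
import random
import statistics

rng = random.Random(7)

TRUE_EFFECT = 5.0  # assumed effect of treatment X on the D.V.
N = 20000          # hypothetical number of participants

# Random assignment: exactly half get the treatment (True), in random order.
assignment = [i < N // 2 for i in range(N)]
rng.shuffle(assignment)

pre = {True: [], False: []}
post = {True: [], False: []}
for treated in assignment:
    baseline = rng.gauss(50, 10)  # pre-test measurement of the D.V.
    drift = rng.gauss(0, 3)       # ordinary change between the two tests
    # Post-test measurement: only the treatment group receives the effect.
    after = baseline + drift + (TRUE_EFFECT if treated else 0.0)
    pre[treated].append(baseline)
    post[treated].append(after)

pretest_gap = statistics.mean(pre[True]) - statistics.mean(pre[False])
posttest_gap = statistics.mean(post[True]) - statistics.mean(post[False])
print(f"pre-test gap:  {pretest_gap:.2f}")
print(f"post-test gap: {posttest_gap:.2f}")
```

With groups this large, the pre-test gap should sit near zero and the post-test gap near the assumed effect.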

Common types of true experiments

Pretest-Posttest Control

R O X O

R O   O

Post-only Control

R X O

R   O

Solomon 4-group

R O X O (1)

R O   O (2)

R   X O (3)

R     O (4)

Three Major Types of Experiments

True experiments protect against both time and group threats to internal validity by randomly assigning subjects to treatment and control groups. The treatment (independent variable) is active.

If we cannot randomly assign subjects to treatment and control groups, then it is a quasi-experiment. The independent variable is active.

If we cannot randomly assign subjects to groups because the groups contain the attribute of interest, and if we give all groups the same treatment, then it is an associational non-experiment. The independent variable is not active.

We have already talked a lot about true experiments by example, so let’s focus on the other two major types of experiments…

Example: Pen Study

Question: Do individuals in the US and Japan make different choices about ‘unique’ versus ‘less unique’ items when given a choice?

Pen Study

Independent Variables

Cultural difference: Japanese students compared to US students

Pen layout (4 pens of one color, 2 pens of another color)

▪ Everyone received the same treatment

Assignment

Subjects were not randomly assigned because they already fell into one of the two societies.

Dependent Variable:

Would they choose the ‘common’ pen or the ‘unique’ one?

Example: Website Credibility Study

Question: Do people infer different amounts of credibility in websites with different URLs?

Web Credibility Study Example

Independent Variable

Experiment Condition (2 conditions): Website has professional URL. Website has dubious URL. (www.gooogle.com/webuygoldandsellprescriptions)

Assignment

We cannot control who goes to which website; we can only wait for enough individuals to go to each site and respond to the questionnaire about website credibility.

Dependent Variable

Credibility rating (a series of questionnaire items)

Internal Validity

Design Concerns: Internal Validity, External Validity, Ecological Validity

Photo credit: http://www.indiana.edu/~p1013447/dictionary/int_val.htm


Internal Validity

Internal Validity deals with questions about whether changes in the dependent variable were caused by the treatment.

[Diagram: Cause → Effect, with question marks]

Threats to Internal Validity

History

▪ additional I.V. that occurs between pre-test and post-test

Maturation

▪ Subjects simply get older and change during experiment

Testing

▪ Subjects “get used” to being tested

Regression to the Mean

▪ Issue with studies of “high and low extremes” on some key variable
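Regression to the mean is easiest to see in a simulation. The sketch below is illustrative only: the score scale, the noise level, and the top-10% selection rule are assumptions. Subjects are chosen because they scored extremely high on a first test, receive no treatment at all, and still score lower on average when tested again.

```python
import random
import statistics

rng = random.Random(1)
N = 20000  # hypothetical number of subjects

# Each subject has a stable true score; each test adds independent noise.
true_scores = [rng.gauss(100, 10) for _ in range(N)]
test1 = [t + rng.gauss(0, 10) for t in true_scores]
test2 = [t + rng.gauss(0, 10) for t in true_scores]

# Select the "high extreme" group: roughly the top 10% on the first test only.
cutoff = sorted(test1)[int(0.9 * N)]
selected = [i for i in range(N) if test1[i] >= cutoff]

mean_first = statistics.mean(test1[i] for i in selected)
mean_second = statistics.mean(test2[i] for i in selected)
print(f"selected group, first test:  {mean_first:.1f}")
print(f"selected group, second test: {mean_second:.1f}")
```

The drop on the second test happens without any intervention, which is why a study that selects extreme scorers and then measures them again can mistake this drift for a treatment effect.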

Contamination and Internal Validity

Demand Characteristics

▪ Anything in the experiment that could guide subjects to expected outcome

General Demand Characteristics

Three Key Demand Characteristics:

Evaluation Apprehension: Subjects know that they are being evaluated and this changes their behavior; creates anxiety and can interfere with task performance.

Social Desirability: Subjects act how they think the experimenter wants them to act, or in the way they think would be proper behavior.

Placebo effect: just being treated or being given *anything* can lead to measurable improvement over nothing at all (e.g., sugar pill cures headache, etc).

Solutions to Demand Characteristics

Double-blind experiments

Experiments in natural setting (i.e., subjects do not know they are in an experiment)

Hidden measurements

Cover stories

▪ Provide a reasonably believable story that explains what the researcher is doing, while concealing the actual intention of the study.

Expectancy Effects

Experimenter Expectancy

▪ Researcher behavior that guides subjects to expected outcome (self-fulfilling prophecy)

Reducing expectancy effects

Blind

Researcher is unaware of the experiment condition that he/she is administering.

Naïve experimenter

Those conducting study are not aware of theory or hypotheses in the experiment

Standardization

Experimenter follows a script, and only the script

“Canned” Experimenter

Audio/Video/Print material gives instructions

And More Potential Problems with Experiments!

Selection Bias

▪ Issue with non-random selection of subjects (e.g., using a sample of convenience and mistaking this for random, such as picking a few people “at random” from the same street corner)

Mortality/Attrition

▪ Departure of subjects in the experiment

Diffusion, Sharing of Treatments

▪ Control group unexpectedly obtains treatment

Other ‘social’ threats?

▪ “Compensatory rivalry” and “resentful demoralization”: the control group may realize their status and be upset that they do not get the real treatment.

In Sum…

The validity of an experiment is subject to a large variety of threats.

You should be aware that experiments absolutely depend on stamping out these sources of bias.

This is the fun and the challenge of experimental designs

Three key threats to external validity (generalizability) in experiments

Setting: Physical and social context of the experiment

Population: Is there something specific about the sample that interacts with the treatment?

History: Is there something about the time that interacts with the treatment?

External Validity: How far does the given study/experiment generalize to similar groups, individuals, etc?

Ecological Validity

Ecological Validity: Approximation of ‘real-life’ situations

This is such a tough one: for any lab experiment it is really very hard to get high ecological validity, even with computer-based studies.

The Validity Tradeoff: Truth and Myth

Internal Validity

External Validity

Balance is important between the types of validity, but internal validity is usually (if not always) the more important factor.

Experiments in the Field

Some experiments can be conducted in a real-world setting while maintaining random assignment and manipulation of treatments

Field Experiment Example

Milliman (1986) Study of music tempo and restaurant customer behavior

Natural Experiments

Variation is provided by natural or unplanned event

Treatment is exogenous (no self-selection)

Natural disasters

Vietnam war draft

School admissions cutoffs

Olympics reduces air pollution

Distance from school to ISP central office

Provides an identification strategy: a way to identify a causal relationship in the data

http://www.williams.edu/astronomy/eclipse/eclipse1998/1998total/images/composite/98cor+s.jpg

Pros and Cons of Experiments

Pros

Gives researcher tight control over independent factors

Allows researcher to test key relationships with as few confounding factors as possible

Allows for direct causal testing

Cons

Often very small N in lab studies; enough for statistical purposes but not ideal for generalizability.

Sometimes give up large amounts of external validity in favor of internal validity (and vice versa)

Require a large amount of planning, training, and time, sometimes to test relationship between only 2 factors!
