Week 3. Experimental design and the responsible use of numbers (I)
GRS LX 865 Topics in Linguistics


Post on 21-Dec-2015


Why we do experiments

In this context, we’re generally interested in the state and developmental course of children’s linguistic knowledge.
  What does the child know?
  To what extent does it differ from what an adult knows?
We have logical, abstract reasons to believe that a lot of what kids ultimately know about language is not deduced from their language input. But what evidence is there?

Universal Grammar

The poverty of the stimulus
  1 2 3: what’s next?
  1 2 3 5: what’s next?
Properties of “innateness”?
  Independence from external evidence.
  Universality?
  Early emergence?
Constraints and the absence of negative evidence.
  Candidate A:
    Who does Arnold wanna make breakfast for?
    Who does Arnold wanna make breakfast?
  Candidate B:
    Who does Arnold wanna make breakfast for?
    *Who does Arnold wanna make breakfast?

Hypotheses

Where you have two hypotheses that make different predictions, you use an experiment to determine which predictions are actually borne out.
A standard setup in child language studies is pitting an experimental hypothesis (H1) such as Children (Like Those We Are Testing) Know Grammatical Constraint X against the null hypothesis (H0), They Don’t.

This should be difficult

Experiments are naturally subject to error. We’re measuring things in the real world.
We only want to reject the null hypothesis if we’re sure.
So, we want to “stack the deck against” H1. Exclude the possibility that kids give the correct answers for the wrong reasons.

                   H0 actually true   H0 actually false
Reject H0          Type I error       Correct
Do not reject H0   Correct            Type II error
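
The four cells of the table can be spelled out as a small lookup. This is only an illustrative sketch; the function name classify_outcome is made up here and is not from the slides.

```python
def classify_outcome(h0_is_true: bool, rejected_h0: bool) -> str:
    """Name the cell of the decision table for one experiment."""
    if rejected_h0:
        # Rejecting a true H0 is the error we most want to avoid.
        return "Type I error" if h0_is_true else "Correct"
    return "Correct" if h0_is_true else "Type II error"


for h0_is_true in (True, False):
    for rejected_h0 in (True, False):
        print(f"H0 true={h0_is_true}, rejected={rejected_h0}: "
              f"{classify_outcome(h0_is_true, rejected_h0)}")
```

“Stacking the deck against H1” amounts to accepting more Type II errors in exchange for fewer Type I errors.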

Production and comprehension

Two things to look at to assess kids’ grammatical knowledge.
Naturalistic production (e.g., CHILDES transcripts) is good for some things, but not so good for others.
If we’re interested in a particular construction, often this needs to be elicited in an experimental setting in order to get enough examples.
Non-production: Constraint? Or preference?
Test comprehension.
  Act-out
  Grammaticality judgment
  Truth value judgment

Elicited production

Wanna contraction
  Who do you want to kiss?
  Who do you wanna kiss?
  I want to kiss Bill
  I wanna kiss Bill
  Who do you want to kiss Bill?
  *Who do you wanna kiss Bill?
All signs point to the last one being good. But it isn’t. Do kids know this?
So: try to get kids to say “Who do you wanna kiss Bill?”
Great. Suppose they don’t. Have we shown that they know the constraint?
  Hint: No.

But I don’t wanna contract.

The fact that a kid never says “Who do you wanna kiss Bill?” doesn’t tell us anything unless the kid would have otherwise contracted.
So we need controls in order to determine whether the kid knows/uses wanna contraction to begin with.
So: Try to get the kid to say “Who do you wanna kiss?” as well as trying to get them to say “Who do you wanna kiss Bill?”
If the kid never says wanna, we have no evidence of anything.
  Also: Maybe the kid knows all there is to know about wanna contraction, but prefers not to contract.

Here’s how it might be done.

Design the experiment to be some kind of game, to keep the kid interested and willing to continue.
Use a puppet. Kids are more willing to interact with the puppet, more willing to disagree with a puppet.
  Though: Gordon (1996) relates a tale of testing Kadiweu kids, who “had never encountered puppets before and reacted with a mixture of curiosity and fear that often led to tears”.
Ratty missed snacktime and is probably hungry. Various toy food items are arrayed in the playspace.
  E: The rat looks kind of hungry. I bet he wants to eat something. Ask him what.
  C: What do you want?
  R: Huh?
  C: What do you wanna eat?
  R: Is that pepperoni pizza over there? I’ll have some of that.
  E: I bet the rat wants someone to brush his teeth for him. Ask him who.
  C: Who do you want to brush your teeth?

Analyzing the results

Find the kids who
  Produced both subject and object questions
  Produced wanna sometimes.
H1: Kids know that want to cannot contract to wanna over a (subject) wh-trace.
H0: They don’t.
Expectation given H0 would be that kids would not distinguish subjects and objects; they would be as likely to contract in one case as in the other.
If kids contract often with objects and never with subjects, that points to H1. We could reject H0.
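
One way to put a number on “often with objects and never with subjects” is a one-sided Fisher exact test on the 2x2 table of counts. The slides do not name a test, so this is a sketch under stated assumptions: each question is treated as an independent observation (not strictly true when one child contributes several), the function name is made up, and the counts are invented for illustration.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(at least `a` contractions fall in the object row), assuming H0.

    Table layout (rows = question type, columns = contracted or not):
        object questions:   a contracted, b uncontracted
        subject questions:  c contracted, d uncontracted
    Under H0 contraction ignores question type, so with the margins fixed
    the top-left count follows a hypergeometric distribution.
    """
    object_total = a + b
    contracted_total = a + c
    n = a + b + c + d
    upper = min(object_total, contracted_total)
    tail = sum(
        comb(object_total, k) * comb(n - object_total, contracted_total - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, contracted_total)


# Invented counts: 18 of 30 object questions contracted, 0 of 30 subject questions.
p_value = fisher_one_sided(18, 12, 0, 30)
print(f"one-sided p = {p_value:.3g}")
```

A small p-value says that a split this lopsided would be rare if kids were as likely to contract in one case as in the other.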

Testing interpretation

Do kids assign the same meanings to a sentence as adults? (More meanings? Fewer meanings?)
  Constraints on meaning (e.g., Binding Theory)
The Truth Value Judgment task is a popular way to approach this.
Pros:
  Fun for the kids.
  Minimal extra cognitive demands
  Gets at alternative meanings an act-out task can’t reliably exclude
Con:
  A trial takes a long time, not many data points collected.

TVJ

The idea: Set up a context by telling a story. Provide a test sentence which is either true or false of the situation (and have the puppet say it). Kid then either agrees with the puppet and rewards it, or disagrees and punishes it. If the puppet is wrong, the kid is asked “What really happened?”
Also: Kids often like the puppet to be right, and will more readily agree with the puppet.
So: stack the deck against H1, and have adult-impossible readings correspond to “yes” responses.

Principle C, for example.

Jumping competition (Crain & Thornton 1998)
This is a story about a jumping competition. The judge is Robocop. Last year he won the jumping competition, so this year he gets to be judge. This year, these guys, Cookie Monster, the Troll, and Grover are in the jumping competition. They have to try and jump over this log, the barrels, and the benches over here.
R: The winner of the competition gets a great prize: colored pasta! See, it’s in this barrel right here.
R: Troll, you jumped very well. You didn’t crash into anything at all. You could be the winner. But let me judge Grover before I decide.
  Now Troll winning is a possibility.
R: Grover, your jumps were very good, too. You didn’t knock anything down, and you were also very fast. So, I think you were the best jumper. You win the prize, this colored pasta. Well done, Grover. Great job!
  Robocop will remain by Grover as a reminder.

Against all odds

T: No, Robocop, you’re wrong! I am the best jumper. I think I should get the prize. I’m going to take some colored pasta for myself.
K: Let me try to say what happened. That was a story about Robocop, who was the judge, and Cookie Monster, and Grover, and there was the Troll. I know one thing that happened. He said that the Troll is the best jumper.
C: No!! Bad Kermit. Eat this rag.
  Yet “he_i said that Troll_i is the best jumper” is true.

TVJ

Distinguishing meaning 1 (disallowed by adult constraint) and meaning 2 (allowed by adult constraint).
Test sentence should be true on meaning 1, false on meaning 2.
Child judges puppet’s report to be true (reward) or false (punishment).
Evidence for meaning 1 should be acted out last.
Linguistic antecedent for meaning 1 should be mentioned (by puppet) last.
What really happened?
Ensure that the test sentence is relevant; it must be clear why it is true or false (condition of plausible dissent).

Experimental design

What we’re trying to determine is the degree to which variables in the situation affect one another.
  Does a Principle C configuration preclude a certain interpretation?
  Does a subject wh-extraction preclude wanna contraction?
Independent variables are the presumed causal variables.
Dependent variables are the presumed caused variables.
Nuisance or confounding variables are other factors that may introduce systematic “noise”.

Between- and within-subjects

Between-subjects designs vary independent variables with the subjects, so each subject represents one of the values (levels) of the independent variable.
  Age, for example (“Cross-sectional”).
Within-subjects designs vary independent variables for each subject, so each subject sees all of the levels of the independent variable.
  Subject extraction and object extraction, for example.
  Age, for another (“Longitudinal”).

Task considerations

Ideally, we want test items to be distinguished by just the factor we’re looking at.
This is important because other things may play a role and may confound the result.
If we find that kids are slower on:
  Who did Pat say met Chris?
Than on:
  Who met Chris?
Can we conclude that it takes more time to process a longer-distance extraction?
Well, it could just take longer because there are more words. We need to rule that out if we want to conclude that it has to do with long-distance extraction.

Things that matter

Performing the task on the test item changes the subject.
  Present the same item to them, they’ll remember, it’ll affect how they act.
  Seeing a pattern in all of the items they get may lead to an irrelevant strategy.
Items should include controls.
  Ensure that the subjects are performing the task.
  Rule out confounding variables.
Items should include fillers.
  Irrelevant items to mask the actual goal of the experiment
Items should be presented in different orders
  ruling out another confounding variable
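
Mixing fillers in with the test items and giving each subject a different order might be sketched as below. The function name, the item labels, and the choice to seed the shuffle by subject number are all assumptions made for illustration, not something the slides prescribe.

```python
import random


def presentation_order(test_items, fillers, subject_id):
    """Shuffle test items and fillers together, differently for each subject."""
    # Seeding by subject keeps each subject's order reproducible for later analysis.
    rng = random.Random(subject_id)
    order = [("test", item) for item in test_items]
    order += [("filler", item) for item in fillers]
    rng.shuffle(order)
    return order


# Placeholder labels; real lists would hold the actual sentences.
tests = ["test-1", "test-2", "test-3"]
fillers = ["filler-1", "filler-2", "filler-3", "filler-4"]
for subject_id in (1, 2):
    print(subject_id, presentation_order(tests, fillers, subject_id))
```

A plain shuffle can still put two test items next to each other; whether that matters depends on the task.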

Things that matter

The instructions given matter a lot. Is the task clear?
  Circle 1 for grammatical, 5 for ungrammatical.
  Circle 1 if the sentence sounds ok, 5 if you would never use it.
  Give the puppet a cookie if his sentence makes sense, and give him a rag if his sentence is silly.
Practice with feedback
  To confirm that the subjects understand the task (and are comfortable that they do), run a couple of practice trials.
  Practice items should not be test items. Should be relatively easy.

Things that matter

Balance the responses
  If a “no box” will be considered to have scored perfectly, there is a huge uncontrolled confound.
  If you are testing for obliviousness to a constraint, but obliviousness would yield all “yes” responses (or a big preponderance), subjects may start to “second guess” themselves.
  Fillers/controls are for this.
Balance the items
  There should be the same number of items at each level of your independent variable(s).
  This maximizes the power of statistical analysis later.
  If a subject misses one, not a huge problem, but design it as a nice “square” if you can.
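
A quick sanity check that an item list is “square” could look like this. The dictionary layout and the names items_per_level and is_balanced are assumptions for the sake of the example.

```python
from collections import Counter


def items_per_level(items, factor):
    """Count how many items fall at each level of one independent variable."""
    return Counter(item[factor] for item in items)


def is_balanced(items, factor):
    """True when every observed level of `factor` has the same number of items."""
    counts = items_per_level(items, factor)
    return len(set(counts.values())) <= 1


# Hypothetical item list: two factors, one item per cell, then one item dropped.
items = [
    {"extraction": "subject", "complementizer": "with that"},
    {"extraction": "subject", "complementizer": "without that"},
    {"extraction": "object", "complementizer": "with that"},
    {"extraction": "object", "complementizer": "without that"},
]
print(is_balanced(items, "extraction"))        # the full list
print(is_balanced(items[:-1], "extraction"))   # after dropping one object item
```

The check only sees levels that actually occur in the list, so a level with zero items has to be caught separately.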

Balance the conditions

Eliminate confounds.
  Lexical items (Known? Frequent? Ambiguous? Long?)

Complexity

The sandwich arrived.
The sandwich the judge ordered arrived.
The sandwich the judge the president appointed ordered arrived.
The president appointed the judge who ordered the sandwich that arrived.

Complexity

The nanny was adored by all the children.
The nanny who the agency sent was adored by all the children.
The nanny who the agency that the neighbors recommended sent was adored by all the children.
The neighbors recommended the agency that sent the nanny who was adored by all the children.

What makes those first sentences so difficult?

Some kind of processing difficulty.
Obvious candidate (Chomsky & Miller 1963, Kimball 1973): You can’t keep track of more than two sentences at a time.
  [ The sandwich [ the judge [ the president appointed ] ordered ] arrived ].
If at any point you need more than two verbs to finish, it’s hard.
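
The “more than two verbs to finish” idea can be made concrete with a toy counter. This is a sketch only: the tag sequences are written by hand (no parser is involved), and the S/V/O tagging scheme is an assumption made for this example.

```python
def max_pending_verbs(tags):
    """Largest number of verbs still owed at any point in the sentence.

    `tags` is a hand-made left-to-right sequence: "S" for a subject noun
    phrase (or subject relative pronoun) that still needs its verb, "V"
    for a verb, and "O" for anything that predicts no further verb.
    """
    pending = 0
    worst = 0
    for tag in tags:
        if tag == "S":
            pending += 1
        elif tag == "V":
            pending -= 1
        worst = max(worst, pending)
    return worst


# The sandwich the judge the president appointed ordered arrived.
center_embedded = ["S", "S", "S", "V", "V", "V"]
# The president appointed the judge who ordered the sandwich that arrived.
right_branching = ["S", "V", "O", "S", "V", "O", "S", "V"]

print(max_pending_verbs(center_embedded))
print(max_pending_verbs(right_branching))
```

On this count the center-embedded sentence owes three verbs at its worst point, while the right-branching paraphrase never owes more than one.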

Processing load

The idea behind this is that the human sentence processing mechanism has some limited amount of storage capacity. It’s memory-related, in some sense.
(Cf. the 7 ± 2 digit span: short-term memory has limits, and the parser is sensitive to those or similar limits.)

That’s easy enough

The celebrity [that attacked the photographer] apologized on national TV.
The celebrity [that the photographer attacked] applied for a restraining order.
The first one is slightly easier, but we have no explanation for it under the “two sentences” view.
What’s different?

Perhaps it’s “floating” θ-roles

The celebrity that _ attacked the photographer apologized.
  Never more than one floating θ-role.
The celebrity that the photographer attacked _ applied…
  At one point, two floating θ-roles.
There seems to be something about hanging onto these nouns without having something to hook them onto. (Also sounds digit-span-like… There’s a reason phone numbers are divided).

Complexity

The nanny [who the agency [that John recommended] sent] was adored by all the children.
(Thanks!) The nanny [who the agency [that you recommended] sent] was adored by all the children.
Well, that’s funny. Now what’s different?

Reference

The nanny [who the agency [that you recommended] sent] was adored…
The nanny [who the agency [that John recommended] sent] was adored…
The nanny [who the agency [that the neighbor recommended] sent] was adored…
The nanny [who the agency [that they recommended] sent] was adored…
It seems like there’s a real difference. Is there?

Here is where the psycholinguistic experiment comes in.

Suppose we want to test: what’s the real difference in processing difficulty between these?
  pronouns with a referent (you)
  proper names (John)
  definite descriptions (the student)
  pronouns without a referent (they)

Designing an experiment

A couple of ways to go about this…
Questionnaire:
  The rat the cat the dog chased caught died.
  (bad) 1 2 3 4 5 (good)
On-line reaction time processing:
  The rat --- --- --- --- ------ ------ ----.
  --- --- the cat --- --- ------ ------ ----.
  --- --- --- --- the dog ------ ------ ----.
  --- --- --- --- --- --- chased ------ ----.
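
The masked displays for the on-line task can be generated mechanically. This is a minimal sketch: the split into regions is chosen by hand to match the display above, the function names are made up, and recording the time between key presses (the actual measurement) is left out.

```python
def mask(text):
    """Replace letters and digits with dashes, keeping spaces and punctuation."""
    return "".join("-" if ch.isalnum() else ch for ch in text)


def moving_window_frames(regions):
    """One display per region: that region visible, all the others masked."""
    frames = []
    for visible in range(len(regions)):
        parts = [region if i == visible else mask(region)
                 for i, region in enumerate(regions)]
        frames.append(" ".join(parts))
    return frames


regions = ["The rat", "the cat", "the dog", "chased", "caught", "died."]
for frame in moving_window_frames(regions):
    print(frame)
```

Each frame would stay on screen until the subject presses a key, and the reading time for a region is the gap between two presses.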

Designing an experiment

Questionnaires are quick and easy to administer.
They give you only coarse-grained judgments about the whole sentence (probably about the point of maximum complexity).
On-line experiments are more difficult, but we can see where people get bogged down.

Conditions

At the outset, we need to define what we’re going to test for.
Suppose we’re going to do a simple test of the that-trace effect.
The question is: are sentences that violate the that-trace filter worse than those that don’t?
  Who did John say that left?
  Which capybara did Madonna meet on Mars?

Confounds

Controlling for confounds is one of the most important things you have to do.
That-trace filter violations are not the only things that differentiate these sentences.
  Who did John say that left?
  Which capybara did Madonna meet on Mars?

Confounds

  Who did John say that left?
  Which capybara did Madonna meet on Mars?
Differences in lexical frequency can have a big effect on processing difficulty/time.
Differences in plausibility can have a big effect on ratings from subjects.
Differences in length can conceivably play a role.
Differences in structure can have an effect.

Confounds

  Who did John say that left?
  Which capybara did Madonna meet on Mars?
The point is: If we find that one sentence is judged worse than the other, we’ve learned nothing. We have no idea to what extent the that-trace violation played a role in the difference.

Confounds

You want to do everything you can to be testing exactly what you mean to be testing for.
We can’t control frequency, familiarity, plausibility very reliably, but we can control for them to some extent.
  Who did John say that left?
  Who did John say left?
Keep everything the same and at least they don’t differ in structure, frequency, plausibility; they differ only in that-trace. (Well, and here, length.)
  However: note that length now works against that-trace, unless shorter sentences are harder.

Conditions

To start, we might say we want to test two conditions:
  Sentences with a that-trace violation
  Sentences with no that-trace violation
But we can’t build these without a length confound: holding everything else constant, we still have one more word in the that-trace case. How do we solve this?
How can we show that the effect of the extra word “that” isn’t responsible for the overall effect?

Conditions

The trick we’ll use is to have a second set of conditions, testing only the exact length issue. There’s no that-trace problem in object questions, so we can compare:
  Who did John say Mary met?
  Who did John say that Mary met?
to see how the difference compares to:
  Who did John say met Mary?
  Who did John say that met Mary?

Factors

We now have two “factors”; our sentences differ in terms of:
    subject vs. object question
    presence vs. absence of “that”

When we analyze the results, we can determine the extent of the influence of the second factor on its own by looking at the object condition, and compare it to the (predicted to be disproportionately larger) effect of the presence of “that” in the subject condition.
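One way to picture this comparison is as a difference of differences. The sketch below is only an illustration: the mean ratings are invented (on an assumed scale where 1 is good and 4 is bad), and the names mean_rating, that_effect, and interaction are hypothetical, not part of any particular analysis package.

```python
# Hypothetical mean ratings (1 = good, 4 = bad), invented purely for
# illustration; these are NOT results from any real experiment.
mean_rating = {
    ("subject", "without_that"): 1.4,
    ("subject", "with_that"): 3.2,
    ("object", "without_that"): 1.3,
    ("object", "with_that"): 1.6,
}


def that_effect(means, extraction):
    """How much worse the rating gets when 'that' is added, within one extraction type."""
    return means[(extraction, "with_that")] - means[(extraction, "without_that")]


def interaction(means):
    """Difference of differences: the cost of 'that' in subject questions
    beyond the cost of 'that' in object questions (the length baseline)."""
    return that_effect(means, "subject") - that_effect(means, "object")


print("cost of 'that' in object questions: ", round(that_effect(mean_rating, "object"), 2))
print("cost of 'that' in subject questions:", round(that_effect(mean_rating, "subject"), 2))
print("difference of differences:          ", round(interaction(mean_rating), 2))
```

If the extra word were the whole story, the two costs should come out about the same and the difference of differences should be near zero. Whether a nonzero difference is reliable is a separate statistical question that this sketch does not address.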

2x2 factorial design

Often this is drawn in a table, with each factor on a different dimension.

This is known as a 2x2 factorial design.

                      without that                    with that
Subject extraction    Who do you think likes John?    Who do you think that likes John?
Object extraction     Who do you think John likes?    Who do you think that John likes?

Context

It turns out that the context also seems to have an effect on people’s ratings of sentences.

What comes before can color your subjects’ opinions. This too needs to be controlled for.

One aspect of this is that we generally avoid showing a single subject two versions of the same sentence (more relevant when the sentences are more distinctive than the John and Mary sentences): the reaction to the second viewing may be based a lot on the first one.

Another is that you want to give the sentences in a different order to different subjects.
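Both points can be handled mechanically when the stimulus lists are built. The sketch below uses a Latin square rotation, which is a common way to do this but is an assumption here rather than something the slides prescribe; the function names and the use of the subject number as a random seed are likewise hypothetical.

```python
import random

# The four cells of the 2x2 design.
CONDITIONS = [
    ("subject", "without_that"),
    ("subject", "with_that"),
    ("object", "without_that"),
    ("object", "with_that"),
]


def latin_square_list(n_items, list_index):
    """Pair each item with exactly one condition, rotating the pairing by list.

    Across the four lists, every item appears once in every condition,
    but no single list contains two versions of the same item.
    """
    return [
        (item, CONDITIONS[(item + list_index) % len(CONDITIONS)])
        for item in range(n_items)
    ]


def presentation_order(n_items, subject_id):
    """Trials for one subject: one version of each item, in a subject-specific order."""
    trials = latin_square_list(n_items, subject_id % len(CONDITIONS))
    rng = random.Random(subject_id)  # seeded so each subject's order can be reproduced
    rng.shuffle(trials)
    return trials


for subject in range(2):
    print(subject, presentation_order(8, subject))
```

With the number of items a multiple of four, each subject also sees the same number of items in each condition.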

Strategy

You also don’t want your subjects to “catch on” to what you’re testing for. They will often see that they’re getting a lot of sentences with a particular structure and start responding to them based on their own theory of whether the sentence should be good or not, no longer performing the task.

Nor do you want to include people who seem to simply have a crazy grammar (or, more likely, just aren’t understanding or doing the task).

Fillers

The solution to both problems is traditionally to use “fillers,” sentences which are not really part of the experiment.

These can provide a baseline to show that a given subject is behaving “normally” and can serve to obscure the real “test items.”

There’s no single answer to “how many fillers should there be?” but there shouldn’t be fewer fillers than test items, and a 2:1 (filler:test item) ratio is probably a good idea.

Fillers can’t be all good! About half should be bad.
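As a rough sketch of how these rules of thumb translate into a stimulus list, the function below mixes test items with fillers at a 2:1 ratio, half of them bad. The function name, its parameters, and the placeholder filler strings are all hypothetical; the test sentences are the ones from the 2x2 table.

```python
import random


def build_session(test_items, good_fillers, bad_fillers, ratio=2, seed=0):
    """Mix test items with `ratio` times as many fillers, about half of them bad."""
    n_fillers = ratio * len(test_items)
    n_bad = n_fillers // 2
    n_good = n_fillers - n_bad
    if len(good_fillers) < n_good or len(bad_fillers) < n_bad:
        raise ValueError("not enough fillers for the requested ratio")

    rng = random.Random(seed)
    session = [("test", s) for s in test_items]
    session += [("good_filler", s) for s in rng.sample(good_fillers, n_good)]
    session += [("bad_filler", s) for s in rng.sample(bad_fillers, n_bad)]
    rng.shuffle(session)  # scatter the test items among the fillers
    return session


# Placeholder filler pools; real fillers would be actual sentences.
good_pool = [f"(good filler {i})" for i in range(6)]
bad_pool = [f"(bad filler {i})" for i in range(6)]
tests = [
    "Who do you think likes John?",
    "Who do you think that likes John?",
    "Who do you think John likes?",
    "Who do you think that John likes?",
]

for kind, sentence in build_session(tests, good_pool, bad_pool):
    print(kind, sentence)
```

Keeping the filler type alongside each sentence makes it easy to check afterwards whether a given subject rated the good fillers better than the bad ones.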

Instructions and practice

Another vital aspect of this procedure is to be sure that the subjects understand the task that they are supposed to be performing (and that they all understand it in the same way).

The wording of the instructions and of the rating scales is very important, and it’s a good idea to give subjects a few “practice” items before the test begins (clear cases for which the answers are provided).

Instructions

“Is the sentence grammatical?” is not a good instruction.
    The closest the naïve subject can come to “grammatical” will probably be to evaluate based on prescriptive rules learned in grammar classes; the term does not have the same meaning in common usage.

“Is this a good sentence?” also has problems.
    “I’d never say that, I’d say it another way.”
    “That could never happen.”

Numerical/category ratings

How do you ask people to judge?

Good/bad
    Forces a choice. For anything other than “certainly good” and “certainly bad” there’s a chance that the answer doesn’t reflect the subject’s actual opinion, and there is no differentiation between “great!” and “well, kind of ok.”

Good/neutral/bad
    Neutral also tends to get used for “I can’t decide,” which is different from “I’m confident it has an in-between status” (it doesn’t change much if you call it “in-between”).

Numerical/category ratings

Rate the sentence: (good) 1 2 3 4 5 (bad)
    Some people will never use the ends of the scale, which is likely to confound certainty with acceptability. Also, for certain applications, “3” is unusable.

Rate the sentence: (good) 1 2 3 4 (bad)
    Can be treated as a categorical judgment, and may be able to factor out some personality aspects. This is the one I tend to like best.
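One possible reading of “treated as a categorical judgment” is that 1 and 2 count as “good” and 3 and 4 as “bad,” so that how far toward the ends of the scale a subject is willing to go no longer matters. That cutoff is an assumption made for this sketch, as are the function names.

```python
def to_category(rating):
    """Collapse a 4-point rating (1 = good ... 4 = bad) into good/bad.

    The 1-2 vs. 3-4 split is an assumed cutoff, not one given in the slides.
    """
    if rating not in (1, 2, 3, 4):
        raise ValueError(f"rating must be 1, 2, 3, or 4, got {rating!r}")
    return "good" if rating <= 2 else "bad"


def proportion_good(ratings):
    """Share of ratings that fall on the 'good' side of the scale."""
    categories = [to_category(r) for r in ratings]
    return categories.count("good") / len(categories)


# A cautious rater (never uses 1 or 4) and a bold rater who agree on
# which side of the scale each sentence falls.
print(proportion_good([2, 3, 2, 3]))
print(proportion_good([1, 4, 1, 4]))
```

Because an even-numbered scale has no midpoint, every response lands on one side or the other.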

Online tasks

The nice thing about an online experiment is that, to some extent, it takes the judgment “out of their hands.” The subject simply reacts, and we time it.

Nevertheless, it is still important to ensure that the subject is performing the task and paying attention.
    This can often be addressed by questions about the sentence that the subject must answer afterwards.
    Feedback can strengthen the motivation.

Your task (Lab #3)

Negative questions in child English often don’t have the adult shape.
    Where he couldn’t eat the raisin?
    What did he didn’t wanna bring to school?
    Why can you not eat chocolate?

Suppose that we’re testing the hypothesis that children have difficulty with negation on an inverted I (negation has to stay within the IP).

Predicts: trouble with “What didn’t you buy?” but not necessarily with “Who didn’t you meet?”

Design an experiment to test the prediction.
    What type of experiment?
    What factors?
    Give an example of at least two different trials.
    What would you expect to see if the hypothesis is true?