individual differences reveal stages of l2 grammatical...

17
Bilingualism: Language and Cognition http://journals.cambridge.org/BIL Additional services for Bilingualism: Language and Cognition: Email alerts: Click here Subscriptions: Click here Commercial reprints: Click here Terms of use : Click here Individual differences reveal stages of L2 grammatical acquisition: ERP evidence DARREN TANNER, JUDITH MCLAUGHLIN, JULIA HERSCHENSOHN and LEE OSTERHOUT Bilingualism: Language and Cognition / FirstView Article / August 2012, pp 1 16 DOI: 10.1017/S1366728912000302, Published online: 16 August 2012 Link to this article: http://journals.cambridge.org/abstract_S1366728912000302 How to cite this article: DARREN TANNER, JUDITH MCLAUGHLIN, JULIA HERSCHENSOHN and LEE OSTERHOUT Individual differences reveal stages of L2 grammatical acquisition: ERP evidence. Bilingualism: Language and Cognition, Available on CJO 2012 doi:10.1017/S1366728912000302 Request Permissions : Click here Downloaded from http://journals.cambridge.org/BIL, IP address: 128.118.88.243 on 17 Aug 2012

Upload: others

Post on 11-Aug-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

Bilingualism: Language and Cognitionhttp://journals.cambridge.org/BIL

Additional services for Bilingualism: Language and Cognition:

Email alerts: Click hereSubscriptions: Click hereCommercial reprints: Click hereTerms of use : Click here

Individual differences reveal stages of L2 grammatical acquisition: ERP evidence

DARREN TANNER, JUDITH MCLAUGHLIN, JULIA HERSCHENSOHN and LEE OSTERHOUT

Bilingualism: Language and Cognition / FirstView Article / August 2012, pp 1 ­ 16DOI: 10.1017/S1366728912000302, Published online: 16 August 2012

Link to this article: http://journals.cambridge.org/abstract_S1366728912000302

How to cite this article:DARREN TANNER, JUDITH MCLAUGHLIN, JULIA HERSCHENSOHN and LEE OSTERHOUT Individual differences reveal stages of L2 grammatical acquisition: ERP evidence. Bilingualism: Language and Cognition, Available on CJO 2012 doi:10.1017/S1366728912000302

Request Permissions : Click here

Downloaded from http://journals.cambridge.org/BIL, IP address: 128.118.88.243 on 17 Aug 2012

Page 2: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

Bilingualism: Language and Cognition: page 1 of 16 C© Cambridge University Press 2012 doi:10.1017/S1366728912000302

Individual differences revealstages of L2 grammaticalacquisition: ERP evidence∗

DA R R E N TA N N E RPennsylvania State University, Departmentof PsychologyJ U D I T H M C L AU G H L I NUniversity of Washington, Department of PsychologyJ U L I A H E R S C H E N S O H NUniversity of Washington, Department of LinguisticsL E E O S T E R H O U TUniversity of Washington, Department of Psychology

(Received: August 19, 2011; final revision received: April 23, 2012; accepted: June 11, 2012)

Here we report findings from a cross-sectional study of morphosyntactic processing in native German speakers and nativeEnglish speakers enrolled in college-level German courses. Event-related brain potentials were recorded while participantsread sentences that were either well-formed or violated German subject–verb agreement. Results showed that grammaticalviolations elicited large P600 effects in the native Germans and learners enrolled in third-year courses. Grand meanwaveforms for learners enrolled in first-year courses showed a biphasic N400–P600 response. However, subsequentcorrelation analyses revealed that most individuals showed either an N400 or a P600, but not both, and that brain responsetype was associated with behavioral measures of grammatical sensitivity. These results support models of second languageacquisition which implicate qualitative changes in the neural substrates of second language grammar processing associatedwith learning. Importantly, we show that new insights into L2 learning result when the cross-subject variability is treated as asource of evidence rather than a source of noise.

Keywords: N400, P600, second language acquisition, individual differences, morphosyntax, ERP

Introduction

Over the last several decades, there has been anenormous increase in interest in the neural substratesof language processing. Accordingly, there has beena proliferation of studies using event-related brainpotentials (ERPs) which have revealed a great dealabout how and when different types of information areintegrated during real-time comprehension in native (L1)speakers of a language. ERPs have also been used tostudy neurocognitive aspects of second language (L2)processing, as ERPs’ multidimentional nature allows theinvestigation of fundamental questions about the cognitiveprocesses subserving late-learned languages. In studies of

* We would like to thank the participants, as well as members ofthe Cognitive Neuroscience of Language Lab at the University ofWashington, including Missy Takahashi, Ilona Pitkänen, and GeoffValentine for their help in collecting the data for this study. Wewould especially like to thank Kayo Inoue for valuable discussionand insights into individual variation. This research was supportedby grant R01DC01947 from the National Institute on Deafness andOther Communication Disorders to Lee Osterhout. Portions of thismanuscript were prepared while Darren Tanner was supported bya Neurolinguistics Dissertation Fellowship from the William OrrDingwall Foundation and by NSF OISE-0968369. We would alsolike to thank two anonymous reviewers who provided thoughtfulcomments on earlier versions of this article. Any remaining errorsare, of course, our own.

Address for correspondence:Darren Tanner, Pennsylvania State University, Center for Language Science, 4F Thomas Building, University Park, PA 16802, [email protected]

L2 learning, identifying whether or not learners’ ERPwaveforms approximate those of native speakers hassometimes been taken as a ‘litmus test’ for whether L2processing is fundamentally similar to or different fromnative language processing. For example, an experimentaleffect size smaller than that found in native speakers isoften taken to mean less robust processing in the learnerpopulation, while a qualitatively different ERP effect orthe inability to detect some effect is often taken to reflecta fundamental difference in the neural substrates of L2processing or a lack of that specific neurocognitive processin the group of learners, respectively (see e.g., Rossi,Gugler, Friederici & Hahne, 2006; Sabourin & Stowe,2008, for examples of these types of inferences).

An important caveat, however, is that much of thepublished work has reported ERPs that represent averagesover both trials and individuals. In order to achieve anadequate signal-to-noise ratio, voltages from the rawelectroencephalogram (EEG) in a time epoch of interestare averaged over all trials in a given experimentalcondition within subjects, and then averaged againover subjects. These grand mean waveforms representbrainwave activity which is time- and phase-locked tothe onset of the stimulus of interest and consistent acrossboth trials and subjects (see Handy, 2005; Luck, 2005). Interms of L1 processing, researchers generally assume thatmonolingual native speakers of a language will exhibit

Page 3: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

2 Darren Tanner, Judith McLaughlin, Julia Herschensohn and Lee Osterhout

similar neural signatures of language processing. Thisassumption seems reasonable, as there is a remarkableconsistency in ERP responses seen across experiments andlanguages. However, L2 learning is subject to significantindividual variation, which in turn can lead to problemsof interpretation for traditional ERP analyses. We showhere that in certain circumstances this variability is highlysystematic. We show further that new insights into L2learning result when the cross-subject variability is treatedas a source of evidence rather than a source of noise.

Within the context of native languages, the use of grandmeans has proven to be a useful tool for studying languageprocessing. One of the most remarkably consistentand replicable results over 30 years of cross-linguisticlanguage-related ERP research is that lexico-semanticand morphosyntactic manipulations elicit qualitativelydifferent brain responses. All content words elicit anegative-going brain wave with a peak at around 400ms after presentation (the N400), but the size of thispeak can be modulated by numerous factors, such asa word’s semantic relatedness to a preceding context,cloze probability, and corpus frequency (Bentin, 1987;Kutas & Federmeier, 2000; Kutas & Hillyard, 1980;Osterhout & Nicol, 1999). Larger peak amplitudes arethought to reflect greater difficulty with lexical accessand integration (the N400 ‘effect’). On the other hand,relative to well-formed controls, a wide range of sentence-embedded morphosyntactic anomalies (such as violationsof agreement, tense, case, and verb subcategorization)elicit a large positive-going wave with a peak around 600ms poststimulus (the P600: Ainsworth-Darnell, Shulman& Boland, 1998; Friederici, Hahne & Mecklinger, 1996;Hagoort, Brown & Groothusen, 1993; Kaan, Harris,Gibson & Holcomb, 2000; Osterhout & Holcomb, 1992,1995). Some studies of morphosyntactic processing havereported an additional negative-going wave with anonset of 100–400 ms poststimulus with a largely leftanterior distribution preceding the P600 (the Left AnteriorNegativity, or LAN: Friederici et al., 1996; Neville,Nicol, Barss, Forster & Garrett, 1991; Osterhout &Holcomb, 1992). Given the reliability of these resultsacross languages, experimental manipulations, and taskdemands, it is clear that ERPs are differentially sensitive todistinct levels of processing, and that grand mean analysescapture this consistency.1

1 There are some exceptions to the generalization that semantic andsyntactic violations always elicit N400 and P600 effects, respectively.Some studies have reported N400 effects to syntactic agreementviolations (e.g., Bentin & Deutsch, 2001; Severens et al., 2008),and others have reported P600 effects to certain types of semanticviolations (e.g., Kim & Osterhout, 2005; Kolk, Chwilla, van Herten& Oor, 2003; Nieuwland & Van Berkum, 2005; van de Meerendonk,Kolk, Vissers & Chwilla, 2010). However, the semantics/N400 andsyntax/P600 correlation holds under the conditions investigated in

Other research has shown that ERPs are also sensitiveto individual differences in L1 processing. For example,the amplitude and onset of ERP effects can be modulatedby individuals’ working memory capacity (King &Kutas, 1995; Vos, Gunter, Kolk & Mulder, 2001). Morerecent research has indicated that individuals’ brainresponses to syntactic anomalies can vary systematicallywith differences in language proficiency, even amongmonolingual native speakers of a language. Pakulakand Neville (2010) reported a correlation betweenwaveform characteristics (the laterality of an early LANcomponent and the amplitude of the P600 component)and participants’ L1 (English) proficiency. The anomalieselicited a more left-lateralized LAN and a larger-amplitude P600 in more proficient participants. Otherresearchers have shown that not only can quantitativeaspects of ERP responses vary across individuals,but also the type of response. For example, somestudies have demonstrated that under certain conditions,biphasic negative–positive responses to anomalies seen ingrand mean waveforms may not represent true biphasicresponses within individuals, but rather be an artifact ofaveraging across individuals, some of whom show anN400 and some of whom show a P600 (Nieuwland & VanBerkum, 2008; Osterhout, 1997; Osterhout, McLaughlin,Kim, Greewald & Inoue, 2004). More recently, resultsfrom Inoue & Osterhout (2012) indicate that within andacross individuals, N400 and P600 effect magnitudesare negatively correlated, such that as one increasesin magnitude, the other decreases to a similar degree.Furthermore, Nakano and colleagues (Nakano, Saron &Swaab, 2010) showed that working memory span canmodulate type of response to verb–argument animacyviolations. In their study, those with lower span measuresshowed N400 effects to animacy violations whereasthose with higher span measures showed P600 effects. Ittherefore seems that systematic individual variation existsbut that this variability is obscured by traditional grandmean ERP waveforms.

For L2 learners the assumption of homogeneity ofresponses across individuals may be even more tenuous.Unlike L1 acquisition, success in L2 learning hasbeen shown to correlate with a number of individualfactors such as general intelligence, specific languageaptitude, learning strategy, and motivation (Dörnyei &Skehan, 2003; Naiman, Fröhlich, Stern & Todesco, 1996;Robinson, 2002; Skehan, 1989). McDonald (2006) hasshown that L2 learners’, but not native speakers’, accuracyand reaction time in a grammaticality judgment task werecorrelated with working memory and lexical decodingmeasures. L2 learners also have shown variabilityin grammaticality judgment accuracy across testing

this study (i.e., the presentation of relatively simple sentences, seeKuperberg, 2007, for discussion).

Page 4: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

Individual differences in L2 ERPs 3

sessions, even when identical items were used on bothoccasions (Johnson, Shenkman, Newport & Medin,1996). Learners additionally can show knowledge of L2grammatical information in offline tasks, but no sensitivityin online tasks, suggesting greater variability in the timingof access and integration of that knowledge relative tonatives (Clahsen & Felser, 2006). This variability is madeapparent in a study of reaction time and accuracy ina grammaticality judgment task by McDonald (2000):the reported standard deviations for late L2 learnerswere generally two and three times larger for reactiontime and accuracy, respectively, than for native speakercontrols. In terms of its implications for ERP researchinto L2 processing, this greater variation, both betweenindividuals and between trials within individuals, meansthat there may be increased fluctuation in the timing andnature of neural responses to L2 stimuli, thus obscuringwhat may be true effects from surfacing in grand meanwaveforms.

Indeed, ERP research into L2 processing has yieldedsomewhat mixed results regarding the nature and statusof syntactic processes in non-native speakers. Severalstudies have shown that P600s can be reliably elicited innon-native speakers, suggesting some continuity betweennative and non-native syntactic processing systems,especially for grammatical features shared across theL1 and L2, and for novel L2 features for learners athigh L2 proficiency (Foucart & Frenck-Mestre, 2011;Frenck-Mestre, Osterhout, McLaughlin & Foucart, 2008;Gillon Dowens, Guo, Guo, Barber & Carreiras, 2011;Hahne, Mueller & Clahsen, 2006; Morgan-Short, Sanz,Steinhauer & Ullman, 2010; Morgan-Short, Steinhauer,Sanz & Ullman, 2012; Rossi et al., 2006; Tokowicz &MacWhinney, 2005). Others have failed to find robustP600 effects to syntactic anomalies, usually when theL2 feature is not found or is realized differently inthe L1 (Foucart & Frenck-Mestre, 2011; Hahne &Friederici, 2001; Ojima, Nakata & Kakigi, 2005; Sabourin& Haverkort, 2003; Sabourin & Stowe, 2008). Stillothers have reported that syntactic anomalies can elicitqualitatively different responses in L2 learners versusnative speakers, usually in the form of a negativity ratherthan a positivity (Chen, Shu, Liu, Zhao & Li, 2007; Guo,Guo, Yan, Jiang & Peng, 2009; Sabourin & Stowe, 2008)or a biphasic negative–positive response (Weber & Lavric,2008).

It should be noted that the studies mentioned abovewhich have failed to find classic P600 effects in L2learners used traditional grand averages to study ERPeffects. However, as pointed out by Osterhout andcolleagues (Osterhout, McLaughlin, Pitkänen, Frenck-Mestre & Molinaro, 2006) and McLaughlin andcolleagues (McLaughlin, Tanner, Pitkänen, Frenck-Mestre, Inoue, Valentine & Osterhout, 2010), null resultsin L2 ERP research are especially problematic to interpret,

since a given electrophysiological effect may be presenton most trials in a few individuals, or on a few trialsin most individuals, but be obscured in the averagingprocess due to noise in the raw electroencephalogram.Variability in timing of the effect across trials andindividuals can additionally reduce effect sizes in ERPgrand means, even when the true amplitude of agiven electrophysiological effect is consistent acrosstrials (Luck, 2005). Nonetheless, given the findings ofsystematic variability in L1 processing reported above,it seems likely that at least some variability between L2learners may also be systematic and therefore observablein analyses of individuals’ ERPs.

Only a few studies have investigated individualdifferences in ERP correlates of L2 syntactic processing.These experiments have generally used grouped designs toinvestigate the impact of some individual-level variable,such as age of arrival (Weber-Fox & Neville, 1996) orL2 proficiency (Rossi et al., 2006), on learners’ brainresponses to syntactic violations. Using this approachcould reduce problematic between-subject variability, asgroup members would be relatively homogenous withregard to some individual difference dimension (e.g.,proficiency; see Steinhauer, White & Drury, 2009; vanHell & Tokowicz, 2010, for discussion). Others havefurther reduced between-subject variability by adoptingwithin-subjects longitudinal designs to study changes inindividuals’ brain responses over time as L2 proficiencyincreases (McLaughlin et al., 2010; Morgan-Short etal., 2010; Morgan-Short et al., 2012; Osterhout et al.,2006), or artificial language training paradigms whichallow learners to reach high proficiency in a very shortamount of time (Friederici, Steinhauer & Pfeifer, 2002;Morgan-Short et al., 2010; Morgan-Short et al., 2012).Another possibility for studying individual differencesin processing is to use regression-based statisticaltechniques, as regression models have the ability tocapture potentially linear and graded effects of individualdifferences measures. However, only a few studies haveused this approach, and nearly all of these have focusedon modulations of the N400 component associated withsemantic processing (Moreno & Kutas, 2005; Newman,Tremblay, Nichols, Neville & Ullman, 2012; Ojima,Matsuba-Kurita, Nakamura, Hoshino & Hagiwara, 2011;though see Bond, Gabriele, Fiorentino & Alemán Bañón,2011).

In the study reported below we cross-sectionallyinvestigated grammatical processing in English-speakinglearners of L2 German who were enrolled in classroom-based university German courses. Our participantsare therefore representative of a common L2 learnerpopulation in the United States. We recorded participants’brain responses as they read sentences that were eitherwell-formed or contained violations of German subject–verb agreement. Verb agreement is a grammatical feature

Page 5: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

4 Darren Tanner, Judith McLaughlin, Julia Herschensohn and Lee Osterhout

shared by both English and German.2 Shared featuresare often transferred from the L1 to the L2, and shouldthus be acquired early during the L2 learning process(MacWhinney, 2005; Sabourin, Stowe & de Haan, 2006;Schwartz & Sprouse, 1996). We quantified learners’brain responses first using grand mean analyses, andthen analyzed individual variation among learners’ ERPresponses with regression-based models. We demonstratethat although grand mean analyses showed statisticallyrobust findings in L2 learners of all levels, they obscuredsystematic, qualitative and quantitative differences amonglearners’ brain responses to L2 grammatical anomalies.

Method

Participants

Our participants included 13 native speakers of German(mean age: 28 years; range: 18–51; eight female) and 33native English-speaking students enrolled in university-level second language German courses. Twenty werenovice learners enrolled in the final course of the first-yearGerman sequence (mean hours of instruction = 123.8,SD = 10.0; mean age: 20 years; range: 18–25; 10 female)and 13 were enrolled in third-year German courses (meanage: 20 years; range: 19–24; six female). All participantswere healthy and had normal or corrected-to-normalvision and gave their informed consent after the natureand possible consequences of the study were explained.Participants received a small monetary compensation fortaking part in the study.

Materials

Stimuli were sentences in German consisting of lexicalitems chosen from the first seven chapters of thetextbook used in first-year German courses at theUniversity of Washington. Sixty sentence pairs werecreated, with one member of each pair being semanticallycoherent and grammatical and the second memberbeing identical, except for showing incorrect agreementbetween the subject pronoun and verb (e.g., Ichwohne/∗wohnt in Berlin, “I live/∗lives in Berlin”). Allperson/number combinations in German are marked withovert, phonologically realized morphemes. Grammatical

2 There are, however, differences between English and German in howthey realize this agreement. English marks agreement on the copulabe and marks 3rd person singular agreement on lexical verbs (with-s), but only in the present tense. German, on the other hand, marksfour unique person–number combinations in both present and pasttense, but the morphological expression of agreement in German ishighly regular (Durrell, 2006). Nonetheless, given the similarity ofthe grammatical contrast across the two languages and the regularityof the German rule, we expect English learners of German to showlittle difficulty with German agreement.

and ungrammatical sentence pairs were distributedacross two lists in a Latin-square design, such thateach list contained only one version of each sentence.Experimental sentences were randomized among 140filler sentences (70 ungrammatical) containing othertypes of syntactic anomalies. Sixty sentences containedviolations of number agreement between a determiner orquantifier and noun (e.g., Viele/∗ein Bücher liegen aufdem Tisch, “Many/∗a books are on the table”) and 10sentences contained an extra auxiliary verb (e.g., ∗MeinBruder macht sind seine Arbeit, “My brother does are hiswork”). Each list contained a total of 200 sentences, halfof which were ungrammatical.

Procedure

Participants were tested in a single session lastingapproximately 85 minutes (including about 30 minutes ofexperimental preparation). Upon arrival in the laboratory,each participant was asked to fill out an abridged version ofthe Edinburgh Handedness Questionnaire and a languagehistory questionnaire. Each participant was randomlyassigned to one of the stimulus lists and was seatedin a comfortable recliner in front of a CRT monitor.Participants were instructed to relax and minimizemovements while reading and to read each sentence asnormally as possible. Each trial consisted of the followingevents: each sentence was preceded by a blank screenfor 1000 ms, followed by a fixation cross, followed bya stimulus sentence, presented one word at a time. Thefixation cross and each word appeared on the screenfor 475 ms followed by a 250 ms blank screen betweenwords. Sentence-ending words appeared with a full stopfollowed by a “Good/Bad” response prompt. Participantswere instructed to respond “good” if they felt it was awell-formed, grammatical sentence in German and “bad”if they felt it was ungrammatical or violated some ruleof German. Participants were randomly assigned to useeither their left or right hand for the “good” response.

Data acquisition and analysis

Continuous EEG was recorded from 19 tin electrodesattached to an elastic cap (Eletro-cap International) inaccordance with the 10–20 system (Jasper, 1958). Eyemovements and blinks were monitored by two electrodes,one placed beneath the left eye and one placed to the rightof the right eye. The electrodes were referenced to anelectrode placed over the left mastoid and were amplifiedwith a bandpass of 0.01–100 Hz (3 dB cutoff) by an SAInstruments bioamplifier system. EEG was recorded froman additional electrode placed on the right mastoid toidentify if there were any experimental effects detectableover the mastoids; no such effects were found. Impedances

Page 6: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

Individual differences in L2 ERPs 5

at scalp and mastoid electrodes were held below 5 k� andbelow 15 k� at eye electrodes.

Continuous analog-to-digital conversion of the EEGand stimulus trigger codes was performed at a samplingfrequency of 200 Hz. ERPs, time-locked to the onsetof the critical word, were averaged off-line for eachparticipant at each electrode site in each condition. Adigital low-pass filter of 30 Hz was applied to individuals’averaged waveforms prior to analysis. Grand averagewaveforms were created by averaging over participants.Trials characterized by eye blinks, excessive muscleartifact, or amplifier blocking were not included in theaverages; 11.8% of trials overall were removed dueto artifacts. The number of rejections did not differsignificantly between conditions or groups.

Behavioral results were quantified both using d-primescores (Wickens, 2002) and proportion correct in thegrammatical and ungrammatical sentence conditions.Behavioral results were analyzed with ANOVAs usinggroup (native, third year, first year) as a between-subjects factor; ANOVAs on proportion correct containedgrammaticality (grammatical, ungrammatical) as anadditional repeated-measures factor. ERP components ofinterest were quantified by computer as mean voltagewithin a window of activity. In accordance with previousliterature and visual inspection of the data, the followingtime windows were chosen: 50–150 ms (N1), 150–300 ms (P2), 300–500 ms (N400), and 500–800 ms(P600), relative to a 100 ms prestimulus baseline.Within each time window ANOVAs were calculatedwith grammaticality (grammatical, ungrammatical) as awithin-subjects factor. Data from midline (Fz, Cz, Pz),medial–lateral (right hemisphere: Fp2, F4, C4, P4, O2;left hemisphere: Fp1, F3, C3, P3, O1), and lateral–lateral (right hemisphere: F8, T8, P8; left hemisphere:F7, T7, P7) electrode sites were treated separately inorder to identify topographic and hemispheric differences.ANOVAs on midline electrodes included electrode as anadditional within-subjects factor (three levels), ANOVAson medial–lateral electrodes included hemisphere (twolevels) and electrode pair (five levels) as additionalwithin-subjects factors, and ANOVAs over lateral–lateralelectrodes included hemisphere (two levels) and electrodepair (three levels) as additional within-subjects factors.The Greenhouse–Geisser correction for inhomogenetityof variance was applied to all repeated measures on ERPdata with greater than one degree of freedom in thenumerator. In such cases, the corrected p-value is reported.

Results

Behavioral results

Mean d-prime scores, proportions judged correctly, andstandard deviations are reported in Table 1. On average,

Table 1. Mean d-prime scores and proportion ofsentences judged correctly for native speakers,third-year learners, and first-year learners. Standarddeviations are reported in parentheses.

Proportion correct

Group d-prime Grammatical Ungrammatical

Native speakers 4.85 (0.93) 0.95 (0.05) 0.98 (0.05)

Third year 3.61 (1.40) 0.93 (0.05) 0.90 (0.14)

First year 3.20 (1.34) 0.91 (0.74) 0.88 (0.13)

Total 3.78 (1.41) 0.93 (0.06) 0.92 (0.12)

Note: A d-prime of 0 indicates chance performance on the acceptabilityjudgment task; a d-prime of 4 indicates near-perfect discrimination betweenwell-formed and ill-formed sentences.

all participants, including first-year learners, performedvery well in the acceptability judgment task. Statisticalanalyses for d-prime scores showed a main effect of group,F(2,43) = 6.991, MSE = 1.578, p = .002. A Tukey’s HSDpost-hoc test showed significant differences between thefirst-year learners and native speakers, p = .002, andbetween the third-year learners and native speakers,p = .040. There were no differences between the first andthird-year learners, p = .637. An ANOVA on proportionjudged correctly showed a main effect of group,F(2,43) = 4.216, MSE = 0.010, p = .021, but no effect ofgrammaticality, F < 1, and no grammaticality by groupinteraction, F(2,43) = 1.185, MSE = 0.007, p = .316.A Tukey’s HSD post-hoc test showed a significantdifference between native speakers and first-year learners,p = .016, but no differences between the other groups,ps > .167.

Event-related potentials results

Grand mean analysesGrand mean waveforms for native speakers are plottedin Figure 1. In these and all subsequent waveforms,the general shapes of the waveforms were consistentwith previous data using visually presented languagestimuli (e.g., Osterhout & Holcomb, 1992; Osterhout &Mobley, 1995). Statistical analyses of native speakers’ERP responses showed that there were no reliable effectsin the early time windows; however, there was a trendtoward a main effect of grammaticality in the 300–500ms time window [midline: F(1,12) = 3.651, MSE = 6.284,p = .080; medial–lateral: F(1,12) = 4.047, MSE = 9.992,p = .067], suggesting the onset of a positivity toungrammatical verbs. In the 500–800 ms time windowthere was a significant main effect of grammaticality,indicating a P600 effect to ungrammatical verbs [midline:

Page 7: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

6 Darren Tanner, Judith McLaughlin, Julia Herschensohn and Lee Osterhout

Figure 1. Grand average ERP waveforms for native German speakers (n = 13) to grammatical (solid line) andungrammatical (dashed line) verbs. Onset of the verb is indicated by the vertical bar. Calibration bar shows 3μV of activity;each tick mark represents 100 ms of time. Positive voltage is plotted down.

F(1,12) = 26.407, MSE = 6.956, p = .0003; medial–lateral: F(1,12) = 31.163, MSE = 16.473, p = .0001;lateral–lateral: F(1,12) = 19.302, MSE = 7.075, p =.0009] that was largest over posterior electrodes[grammaticality × electrode interaction, midline:F(2,24) = 3.766, MSE = 1.291, p = .045; medial–lateral:F(4,48) = 5.249, MSE = 3.329, p = .014; lateral–lateral:F(2,24) = 5.098, MSE = 0.949, p = .024]. The P600additionally showed a slight right-hemisphere bias overlateral–lateral electrodes [grammaticality × hemisphereinteraction: F(1,12) = 6.273, MSE = 1.044, p = .028].

Third-year learners’ brain responses (Figure 2) showedno significant effects in the N1, P2, or N400 timewindows. In the 500–800 ms time window there wasa main effect of grammaticality, indicating a reliableP600 effect to violations of subject–verb agreement[midline: F(1,12) = 18.316, MSE = 9.731, p = .001;medial–lateral: F(1,12) = 22.103, MSE = 18.826, p <

.0005; lateral–lateral: F(1,12) = 17.804, MSE = 5.566,

p = .001]. However, there were no significant interactionswith electrode or hemisphere in this time window.

ERPs from first-year learners (Figure 3) showed nosignificant effects in the N1 and P2 time windows, butthere was a significant main effect of grammaticalityover midline electrodes and a near-significant effectover lateral sites in the 300–500 ms window, indicatingan N400-like negativity to disagreeing verbs [midline:F(1,19) = 5.776, MSE = 7.251, p = .027; medial–lateral:F(1,19) = 3.759, MSE = 16.772, p = .068; lateral–lateral:F(1,19) = 3.751, MSE = 5.412, p = .068]. There were nointeractions with electrode or hemisphere. This N400was followed by a trend toward a P600 effect overmidline electrodes in the 500–800 ms time window [maineffect of grammaticality: F(1,19) = 3.156, MSE = 13.493,p = .092]; there were no significant or near-significanteffects over medial–lateral or lateral–lateral sites. Thus,grand mean waveforms to disagreeing verbs showed asmall biphasic response: ungrammatical verbs elicited a

Page 8: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

Individual differences in L2 ERPs 7

Figure 2. Grand average ERP waveforms for learners enrolled in third-year German courses (n = 13) to grammatical (solidline) and ungrammatical (dashed line) verbs. Onset of the verb is indicated by the vertical bar. Calibration bar shows 3μV ofactivity; each tick mark represents 100 ms of time. Positive voltage is plotted down.

broadly distributed negativity in the 300–500 ms timewindow, but a small positivity in the 500–800 ms timewindow that did not reach full significance.

Analyses of individuals’ ERP responsesAs noted above, first-year learners’ grand meanwaveforms showed a small biphasic response todisagreeing verbs. However, inspection of individuals’waveforms showed that most learners did not show thisbiphasic response. Rather, for most subjects the responseto these words was either dominated by an enhancedN400 or by the later positivity. Following Inoue andOsterhout (2012), we further investigated this by firstcomputing the magnitude of the N400 and P600 effectsfor each individual, and then regressing the N400 effectmagnitude onto that of the P600 effect for first-yearlearners. N400 effect magnitude was computed as meanamplitude in the 300–500 ms window in the grammaticalcondition minus mean amplitude in the ungrammatical

condition, averaged over midline electrodes; P600 effectmagnitude was computed as mean amplitude in the 500–800 ms window in the ungrammatical condition minusthe mean amplitude in the grammatical condition, againaveraged over midline electrodes. The two effects weresignificantly negatively correlated, r = –.616, p = .004.As can be seen in Figure 4, learners’ brain responsesshowed a similar function to that reported by Inoue andOsterhout for native speakers of Japanese processingcase violations: brain responses varied along an N400–P600 continuum such that as one response increased,the other decreased. First-year learners were dividedinto N400 (n = 9) and P600 (n = 11) groups, based onwhether the individual’s response showed an N400 orP600 dominance. Grand mean waveforms for learnersin the N400 group showed mild differences in theprestimulus baseline, so a corrected 50 ms prestimulusto 50 ms poststimulus baseline was used for this group.ERP responses over midline electrodes for these separate

Page 9: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

8 Darren Tanner, Judith McLaughlin, Julia Herschensohn and Lee Osterhout

Figure 3. Grand average ERP waveforms for learners enrolled in first-year German courses (n = 20) to grammatical (solidline) and ungrammatical (dashed line) verbs. Onset of the verb is indicated by the vertical bar. Calibration bar shows 3μV ofactivity; each tick mark represents 100 ms of time. Positive voltage is plotted down.

groups are shown in Figure 5. Learners in the N400group showed a significant effect of grammaticalityin the 300–500 ms time window, indicating a reliableN400 effect, F(1,8) = 10.020, MSE = 6.940, p = .013,but no significant effects in the P600 window, Fs <

1. Learners in the P600 group showed no effects overmidline electrodes in the N400 time window, Fs < 1.5;however, there was a significant effect of grammaticalityin the later time window, F(1,10) = 37.290, MSE = 4.609,p = .0001. Thus, the biphasic response seen in the grandmean waveform was in fact an artifact of averagingover individuals who showed qualitatively different brainresponses to disagreeing verbs (see Nieuwland & VanBerkum, 2008; Osterhout, 1997; Osterhout & Inoue,2012).

In order to investigate what factors may havebeen important in predicting the type and magnitudeof response learners showed to disagreeing verbs,we conducted a series of correlation analyses usingindividuals’ N400 and P600 effect magnitudes over

electrode Pz, where ERP effects were the largest. First-year learners’ P600 effect magnitudes were reliablycorrelated with d-prime scores, r = .532, p = .016; forthird-year learners the correlation neared significance,r = .504, p = .079; and for all learners (first- and third-year) combined the correlation was highly significant,r = .534, p = .001 (Figure 6). Thus, learners’ P600 effectmagnitudes increased linearly with their ability to detectagreement anomalies. For native speakers there wasno relationship between d-prime and P600 amplitude,r = .274, p = .344. Since learners’ P600 responses wereassociated with better performance in the acceptabilityjudgment task, it is also possible that N400 responseswere associated with poorer performance. Correlationsdid not reach significance for first-year learners, r = –.297,p = .204; third-year learners, r = –.406, p = .169; or nativespeakers, r = –.322, p = .261. However, the correlationfor all learners combined was weak, but did reachstatistical significance, r = .367, p = .036 (Figure 7). Inthis time window enhanced negativities to ungrammatical

Page 10: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

Individual differences in L2 ERPs 9

Figure 4. Scatterplot showing the distribution of N400 and P600 effect magnitudes across first-year learners, averaged acrossthree midline electrodes (Fz, Cz, and Pz). Each dot represents a data point from a single learner. The solid line shows thebest-fit line for the data from the regression analysis. The dashed line represents equal N400 and P600 effect magnitudes andshows where learners were divided into groups: individuals above/to the left of the dashed line showed primarily an N400effect to German verb agreement violations, while individuals below/to the right of the dashed line showed primarily a P600effect.

Figure 5. ERPs over midline electrodes to grammatical(solid line) and ungrammatical (dashed line) verbs forfirst-year learners who showed either N400-dominant (leftpanel; n = 9) or P600-dominant (right panel; n = 11) brainresponses. Onset of the verb is indicated by the vertical bar.Calibration bar shows 3 μV of activity; each tick markrepresents 100 ms of time. Positive voltage is plotted down.

verbs were associated with poorer performance in theacceptability judgment task.

In a study of word learning in L2 French, McLaughlinand colleagues (McLaughlin, Osterhout & Kim, 2004)

found that learners’ individual N400 amplitudes toFrench-like pseudowords were highly correlated with thenumber of hours of instruction the subjects had beenexposed to during the first quarter of classroom Frenchinstruction. In order to test for a similar correlation in thecurrent data, the number of hours of classroom exposurewas computed for all first-year learners. There was nocorrelation between hours of exposure and amplitudedifference over any of the midline electrodes in the N400or in the P600 time window. Moreover a regression modelincluding d-prime score as an independent variable andP600 magnitude at Pz as the dependent variable wassignificant, R2

Adjusted = .243, F(1,18) = 7.098, p = .016;a model including both d-prime score and hours ofinstruction only neared significance, R2

Adjusted = .198,F(2,17) = 3.352, p = .059. Whereas d-prime scores aloneaccount for approximately 24% of the variance in P600effect magnitude, including hours of instruction as anindependent variable actually removed predictive powerfrom the overall model. Partial correlations in the secondmodel show that after controlling for effects of d-prime score, there was no relationship between hoursof instruction and P600 magnitude, r = .004, while d-prime remained significant after controlling for hoursof instruction, r = .514, p = .02. No regression modelincluding d-prime score, hours of instruction, or acombination of the two accounted for a significant portionof the variance in N400 amplitudes. It therefore seemsthat the variation in individuals’ brain responses is morea function of grammatical learning than pure classroomexposure.

Page 11: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

10 Darren Tanner, Judith McLaughlin, Julia Herschensohn and Lee Osterhout

Figure 6. Correlation between P600 effect magnitude and d-prime scores from the acceptability judgment task for alllearners. P600 effect magnitude is quantified as mean amplitude in 500–800 ms time window over Pz in the ungrammaticalminus grammatical condition. More positive values on the y-axis reflect larger P600 effects.

Figure 7. Correlation between N400 effect magnitude and d-prime scores from the acceptability judgment task for alllearners. N400 effect size is quantified as mean amplitude between 300–500 ms time window over Pz in the grammaticalminus ungrammatical condition. More positive values on the y-axis reflect larger N400 effects.

A further issue is the relationship between participants’end-of-sentence judgments and their online brainresponses. One possibility is that the correlation betweend-prime scores and P600 magnitude might reflect ascenario where individuals who performed more poorlyon the grammaticality judgment task showed a smallerP600 effect on any given trial than those who performedbetter. Alternately, participants might show a full P600on any trial when they recognized the agreement error,but no P600 on the other trials; the result after averagingwould then be that those who recognized fewer errors(and who had lower d-prime scores) would show smalleraverage P600 effects. Moreover, it is also possible

that correctly- and incorrectly-judged trials elicitedqualitatively different ERP effects, such that incorrectlyjudged ungrammatical trials would elicit an N400, whilecorrectly judged ungrammatical trials would elicit a P600effect. The net result would be that those who show poorerjudgment performance would show an N400-dominantbrain response, whereas those who show better judgmentwould show a P600-dominant response.

To investigate these possibilities, we computedresponse-contingent averages, including only thosetrials which were ultimately judged correctly by theparticipants. Difference waveforms comparing all-trialaverages and response-contingent averages for third-year

Page 12: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

Individual differences in L2 ERPs 11

Figure 8. Grand average difference waves for the ungrammatical minus grammatical conditions, comparing effect sizes forthe all-trial and response-contingent analyses. Positive or negative deviations from zero indicate a positivity or negativity inthe ungrammatical condition relative to the grammatical condition, respectively. Difference waveforms were filtered with a10 Hz low-pass filter for presentation purposes. Onset of the verb is indicated by the vertical bar. Calibration bar shows 3 μVof activity; each tick mark represents 100 ms of time. Positive voltage is plotted down.

learners and first-year learners in the N400- and P600-dominant groups are shown in Figure 8. As can beseen, the two sets of averages show very similareffects of the grammaticality manipulations. N400 andP600 effect magnitudes in the all-trial and response-contingent averages were nearly perfectly correlatedwithin learners (N400 effect magnitude: r = .907, p <

.000001; P600 effect magnitude: r = .969, p < .000001).Additionally, the correlation between d-prime scores andP600 magnitude for all learners remained significant evenwhen including only correctly-judged trials in the ERPaverages, r = .494, p = .004. Overall this indicates thatN400 effects were not driven only by incorrectly-judgedtrials, as the N400 effects remained robust even whenconsidering only correctly-judged trials. The remainingcorrelation between d-prime and P600 magnitude alsoindicates that the full P600 on correctly-judged trials/noP600 on incorrectly-judged trials account is incorrect.Nonetheless, this does not unequivocally show that P600effects on any given trial were consistently smaller acrossall trials in those with lower d-prime scores. Indeed,the linear change in P600 magnitude may still havebeen driven by cross-trial differences in effect amplitude.The present results simply indicate that these differenceswere not directly associated with an individual’s eventualjudgment about a given sentence (see McLaughlin etal., 2004; Tokowicz & MacWhinney, 2005, for evidenceof dissociations between on-line brain responses and

off-line judgments). Future research on trial-levelmodeling of brain responses may shed light on this issue(see Zayas, Greenwald & Osterhout, 2010).

Discussion

The study reported here investigated morphosyntacticprocessing in native German speakers and in English-speaking university students enrolled in their first or thirdyear of German instruction. Our most striking findingwas the existence of systematic individual differences inthe learners’ ERP responses to subject–verb agreementanomalies. These anomalies elicited an N400 effectin some learners and a P600 effect in others. Theamplitudes of these effects were negatively correlatedacross learners, and accuracy in the sentence-acceptabilityjudgment task predicted the amplitude of the ERPresponse to ungrammatical stimuli, with greater accuracybeing associated with more positive-going brain activitythroughout the N400 and P600 windows.

Prior work has shown that individuals differ withrespect to working memory capacity, vocabularyknowledge, neural efficiency, and in many other waysthat could impact language processing (Prat, 2011). Onepossibility, therefore, is that the individual differencesamong the German learners reflect durable subjectvariables (i.e., “traits”) that persist over time. If so, then theindividual differences observed here might be expected

Page 13: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

12 Darren Tanner, Judith McLaughlin, Julia Herschensohn and Lee Osterhout

to persist even as the learner becomes more proficientin the L2. An alternative possibility, however, is thatlearners were progressing between two distinct processingstages (as manifested in the N400 and P600 responses tomorphosyntactic anomalies), and that individual learnersvaried with respect to the rate of transition between the twostages. A compelling test of these different interpretationsrequires a longitudinal design that tracks learners over anextended period of L2 instruction. Some relevant evidenceis provided by a longitudinal ERP study of first-yearFrench learners (Osterhout et al., 2012; see McLaughlinet al., 2010; Osterhout et al., 2006, for preliminary reports;see also Morgan-Short et al., 2010; Morgan-Short et al.,2012, for similar findings from an artificial languagelearning study). ERPs were recorded to violations ofFrench subject–verb agreement. Most learners respondedto these anomalies with an N400 effect after about onemonth of L2 instruction and a P600 effect after about sevenmonths of instruction. When tested during the middleof the instructional period (after about four months ofinstruction), the grand average ERP revealed a small-amplitude biphasic N400–P600 effect. Inspection ofindividual subjects’ ERPs showed that the grand averageobscured robust individual differences, such that somelearners showed an N400 effect and others a P600 effectto the same set of agreement anomalies. Learners’ N400and P600 effect magnitudes were negatively correlated ateach testing session.

Collectively, the evidence seems to indicate thatindividual learners progress through distinct stages oflearning, but that the rate of progression varies acrosslearners. An important goal is to characterize thefunctional significance of the developmental stages.Whereas the current state of the field does not allowone to draw a direct link between a given ERP effectand a specific underlying cognitive or linguistic process,some parallels exist between claims in the broaderpsycholinguistics literature and the pattern of resultsobtained here. For example, some have argued thatalthough native speakers typically compute detailedsyntactic representations, they may sometimes useshallow (or ‘good enough’) processing heuristics insteadof full syntactic parses during language comprehensionin complex syntactic situations, such as passiveconstructions and garden path sentences (Christianson,2008; Christianson, Hollingworth, Halliwell & Ferreira,2001; Ferreira, Bailey & Ferraro, 2002). Theoristshave proposed a link between the use of a shallower,heuristic or lexical processing stream and a deeper,rule-based or combinatorial processing stream, and theN400 and P600 components, respectively (Kuperberg,2007; Severens, Jansma & Hartsuiker, 2008; Tanner,2011). One possibility is that novice L2 learners weremore reliant on these shallower lexical or probabilisticprocessing heuristics than native speakers and more

advanced learners for even simple grammatical relations,like agreement. The shift to a P600-dominant responsemight reflect the gradual development of a more abstract,rule-based processing stream for L2 grammar, as istypically employed by L1 speakers in these constructions.The additional negative correlation between the N400and P600 effect magnitudes might be explainable interms of processing models which posit a “competitivedynamic” between the two streams (Jackendoff, 2007;Kim & Osterhout, 2005; MacWhinney, Bates & Kliegl,1984).

The current data also share some features withpredictions made by Ullman’s Declarative/Procedural(D/P) model (Ullman, 2001, 2004, 2005). For example,the N400 effect elicited by morphosyntactic violationsin early-stage L2 acquisition is compatible with the D/Pmodel’s prediction that both grammatical and lexicalprocessing in novice learners will show heavy relianceon the declarative memory system. With increasingproficiency, grammatical processing should then showincreased reliance on the procedural memory system.However, Ullman argues that use of procedural memorywill be indexed by a LAN effect in response togrammatical anomalies. LANs were not found in anygroup (learners or native) in this study. Moreover, LANeffects are missing in native speakers in many studies ofsyntactic processing (e.g., Ainsworth-Darnell et al., 1998;Allen, Badecker & Osterhout, 2003; Frenck-Mestre et al.,2008; Hagoort, 2003; Hagoort & Brown, 1999; Hagoortet al., 1993; Kaan, 2002; Nevins, Dillon, Malhotra &Phillips, 2007; Severens et al., 2008), so it is difficultto interpret the absence of this effect in the currentstudy as reflecting incomplete grammatical acquisitionor deficient processing in our more advanced learnersor native speakers. More research is needed to preciselyidentify the experimental conditions under which LANeffects are reliably elicited.

The qualitative change in processing seen in early-stage L2 learners is incompatible with some recentproposals about L2 processing (Clahsen & Felser,2006; Clahsen, Felser, Neubauer, Sato & Silva, 2010).Clahsen and colleagues argue that L2 learners arerestricted to the shallower, ‘good enough’ parses that aresometimes available to native speakers, regardless of L2proficiency or L1–L2 pairing. However, the current dataindicate that adult L2 learners can move beyond shallowprocessing heuristics and develop deeper grammaticalprocessing strategies within only a few months ofclassroom instruction. Moreover, L2 proficiency canhave an effect on a learner’s depth of processing, asa relative reliance on the shallower lexical/heuristic ordeeper grammatical/combinatorial processing stream wasassociated with behavioral measures of grammaticallearning in the current study. The data reported hereprovide strong evidence that learners at different stages

Page 14: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

Individual differences in L2 ERPs 13

of development use qualitatively different processingstreams to deal with L2 grammatical information, andare consistent with longitudinal findings that individualsmay shift dominance from one stream to the other astheir L2 competence increases over time (see McLaughlinet al., 2010; Steinhauer et al., 2009, for further discussionabout the possible functional significance of the N400–P600 shift).

In the present study, effect magnitude correlated withaccuracy in the grammaticality judgment task for L2learners. The relationship between effect magnitude andd-prime is reminiscent of that reported by Pakulak andNeville (2010), who found that participants’ P600 effectmagnitudes were linearly related to their proficiency in L1English. Our findings are also consistent with suggestionsmade by Steinhauer and colleagues (Steinhauer et al.,2009; see also van Hell & Tokowicz, 2010) that increasingL2 proficiency co-occurs with a more L1-like profile ofERP responses to L2 anomalies. However, this resultmight not generalize to all situations or populations oflanguage learners. Recent results from our lab showa similar profile of brain responses in high-proficiencylate L1-Spanish–L2-English bilinguals with long-term L2immersion as seen here in novice L2 learners (Tanner,Inoue & Osterhout, 2012; see also Tanner, 2011). Insteadof proficiency, motivation provided the strongest correlateof brain response type in that study. However, there areseveral demographic differences between the Spanish–English bilinguals and the novice learners in the currentstudy, including age of acquisition and amount of L2exposure. Also, the types of linguistic input received byimmersed versus classroom L2 learners may have had animpact on brain response profiles. More research is neededin order to identify how input and other individual-levelvariables interact in shaping L2 learning and processing(see e.g., Morgan-Short, Faretta, Brill, Wong & Wong,2012). Language proficiency may therefore be one ofmany factors responsible for determining the neuralsubstrates of syntactic processing (Prat, 2011).

Finally, the present findings compellingly illustratethe dangers inherent in the exclusive use of grand-average ERPs to characterize L2 sentence processing.In some cases a thorough investigation of between-learner variability can be more informative than inspectionof grand mean waveforms. Our results add to resultsfrom the lexico-semantic processing domain showingthat regression-based statistical methods can be usedon ERP data to model individual difference profiles(Moreno & Kutas, 2005; Newman et al., 2012; Ojimaet al., 2011). However, our strongest predictor of brainresponses, namely d-prime scores, was an experiment-internal variable. A logical next question is what othervariables may be at play in predicting how quickly novicelearners grammaticalize L2 features (i.e., how quicklythey move from N400 to P600 responses to subject–

verb agreement violations), or what factors predict themagnitude of ERP effects at higher levels of proficiency.Behavioral research has long noted correlations betweenlearning measures and certain cognitive and affectivevariables. It remains to be seen how these variables maponto the neurocognitive correlates of learning that wereport here (see Bond et al., 2011, for a first attempt to linkindividuals’ specific language aptitude and non-verbalreasoning ability with L2 ERP effects). Nonetheless, theapproach taken here demonstrates that ERPs provide avaluable tool for understanding individual factors in L2grammatical learning, and encourage us to hope thatfuture research will elucidate determinants of the rate andsuccess of L2 acquisition.

References

Ainsworth-Darnell, K., Shulman, R., & Boland, J. (1998).Dissociating brain responses to syntactic and semanticanomalies: Evidence from event-related brain potentials.Journal of Memory and Language, 38, 112–130.

Allen, M., Badecker, W., & Osterhout, L. (2003). Morphologicalanalysis in sentence processing: An ERP study. Languageand Cognitive Processes, 18, 405–430.

Bentin, S. (1987). Event-related potentials, semantic processes,and expectancy factors in word recognition. Brain andLanguage, 31, 308–327.

Bentin, S., & Deutsch, A. (2001). Syntactic and semantic factorsin processing gender agreement in Hebrew: Evidencefrom ERPs and eye movements. Journal of Memory andLanguage, 45, 200–224.

Bond, K., Gabriele, A., Fiorentino, R., & Alemán Bañón, J.(2011). Individual differences and the role of the L1 inL2 processing: An ERP investigation. In D. Tanner & J.Herschensohn (eds.), Proceedings of the 11th GenerativeApproaches to Second Language Acquisition Conference(GASLA 2011), pp. 17–29. Somerville, MA: CascadillaProceedings Project.

Chen, L., Shu, H. U. A., Liu, Y., Zhao, J., & Li, P. (2007).ERP signatures of subject–verb agreement in L2 learning.Bilingualism: Language and Cognition, 10, 161–174.

Christianson, K. (2008). Sensitivity to syntactic changesin garden path sentences. Journal of PsycholinguisticResearch, 37, 391–403.

Christianson, K., Hollingworth, A., Halliwell, J. F., & Ferreira,F. (2001). Thematic roles assigned along the garden pathlinger. Cognitive Psychology, 42, 368–407.

Clahsen, H., & Felser, C. (2006). Grammatical processing inlanguage learners. Applied Psycholinguistics, 27, 3–42.

Clahsen, H., Felser, C., Neubauer, K., Sato, M., & Silva, R.(2010). Morphological structure in native and nonnativelanguage processing. Language Learning, 60, 21–43.

Dörnyei, Z., & Skehan, P. (2003). Individual differences insecond language learning. In C. J. Doughty & M. H.Long (eds.), The handbook of second language acquisition,pp. 589–630. Malden, MA: Blackwell.

Durrell, M. (2006). Hammer’s German grammar and usage.London: Arnold.

Page 15: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

14 Darren Tanner, Judith McLaughlin, Julia Herschensohn and Lee Osterhout

Ferreira, F., Bailey, K., G. D., & Ferraro, V. (2002).Good-enough representations in language comprehension.Current Directions in Psychological Science, 11, 11–15.

Foucart, A., & Frenck-Mestre, C. (2011). Grammaticalgender processing in L2: Electrophysiological evidenceof the effect of L1–L2 syntactic similarity. Bilingualism:Language and Cognition, 14, 379–399.

Frenck-Mestre, C., Osterhout, L., McLaughlin, J., & Foucart,A. (2008). The effect of phonological realization ofinflectional morphology on verbal agreement in French:Evidence from ERPs. Acta Psychologica, 128, 528–536.

Friederici, A. D., Hahne, A., & Mecklinger, A. (1996).Temporal structure of syntactic processing: Early and lateevent-related potential effects. Journal of ExperimentalPsychology: Learning, Memory, and Cognition, 22, 1219–1248.

Friederici, A. D., Steinhauer, K., & Pfeifer, E. (2002). Brainsignatures of artificial language processing: Evidencechallenging the critical period hypothesis. Proceedings ofthe National Academy of Sciences USA, 99, 529–534.

Gillon Dowens, M., Guo, T., Guo, J., Barber, H., & Carreiras, M.(2011). Gender and number processing in Chinese learnersof Spanish – Evidence from event related potentials.Neuropsychologia, 49, 1651–1659.

Guo, J., Guo, T., Yan, Y., Jiang, N., & Peng, D. (2009).ERP evidence for different strategies employed by nativespeakers and L2 learners in sentence processing. Journalof Neurolinguistics, 22, 123–134.

Hagoort, P. (2003). Interplay between syntax and semanticsduring sentence comprehension: ERP effects of combiningsyntactic and semantic violations. Journal of CognitiveNeuroscience, 15, 883–899.

Hagoort, P., & Brown, C. M. (1999). Gender electrified: ERPevidence on the syntactic nature of gender processing.Journal of Psycholinguistic Research, 28, 715–728.

Hagoort, P., Brown, C. M., & Groothusen, J. (1993). Thesyntactic positive shift as an ERP measure of syntacticprocessing. Language and Cognitive Processes, 8, 439–484.

Hahne, A., & Friederici, A. D. (2001). Processing a secondlanguage: Late learners’ comprehension mechanisms asrevealed by event-related brain potentials. Bilingualism:Language and Cognition, 4, 123–141.

Hahne, A., Mueller, J. L., & Clahsen, H. (2006). Morphologicalprocessing in a second language: Behavioral andevent-related brain potential evidence for storage anddecomposition. Journal of Cognitive Neuroscience, 18,121–134.

Handy, T. C. (2005). Event-related potentials: A methodshandbook. Cambridge, MA: MIT Press.

Inoue, K., & Osterhout, L. (2012). Sentence processing as aneural seesaw. Ms., University of Washington, Seattle, WA.

Jackendoff, R. (2007). A parallel architecture perspective onlanguage processing. Brain Research, 1146, 2–22.

Jasper, H. H. (1958). The ten–twenty system of theInternational Federation. Electroencephalography andClinical Neurophysiology, 10, 371–375.

Johnson, J., Shenkman, K., Newport, E., & Medin, D. (1996).Indeterminacy in the grammar of adult language learners.Journal of Memory and Language, 35, 335–352.

Kaan, E. (2002). Investigating the effects of distance and numberinterference in processing subject–verb dependencies: AnERP study. Journal of Psycholinguistic Research, 31, 165–193.

Kaan, E., Harris, A., Gibson, E., & Holcomb, P. (2000). The P600as an index of syntactic integration difficulty. Language andCognitive Processes, 15, 159–201.

Kim, A., & Osterhout, L. (2005). The independence ofcombinatory semantic processing: Evidence from event-related potentials. Journal of Memory and Language, 52,205–225.

King, J. W., & Kutas, M. (1995). Who did what and when? Usingword- and clause-level ERPs to monitor working memoryusage in reading. Journal of Cognitive Neuroscience, 7,376–395.

Kolk, H. H. J., Chwilla, D. J., van Herten, M., & Oor, P. J. W.(2003). Structure and limited capacity in verbal workingmemory: A study with event-related potentials. Brain andLanguage, 85, 1–36.

Kuperberg, G. (2007). Neural mechanisms of languagecomprehension: Challenges to syntax. Brain Research,1146, 23–49.

Kutas, M., & Federmeier, K. D. (2000). Electrophysiologyreveals semantic memory use in language comprehension.Trends in Cognitive Sciences, 4, 463–470.

Kutas, M., & Hillyard, S. A. (1980). Reading senselesssentences: Brain potentials reflect semantic anomaly.Science, 207, 203–205.

Luck, S. J. (2005). An introduction to the event-related potentialtechnique. Cambridge, MA: MIT Press.

MacWhinney, B. (2005). A unified model of languageacquisition. In J. F. Kroll & A. M. B. De Groot (eds.),Hanbook of bilingualism: Psycholinguistic approaches, pp.49–67. Oxford: Oxford University Press.

MacWhinney, B., Bates, E., & Kliegl, R. (1984). Cue validityand sentence intepretation in English, German, and Italian.Journal of Verbal Learning and Verbal Behavior, 23, 127–150.

McDonald, J. L. (2000). Grammaticality judgments in a secondlanguage: Influences of age of acquisition and nativelanguage. Applied Psycholinguistics, 21, 395–423.

McDonald, J. L. (2006). Beyond the critical period: Processing-based explanations for poor grammaticality judgmentperformance by late second language learners. Journal ofMemory and Language, 55, 381–401.

McLaughlin, J., Osterhout, L., & Kim, A. (2004). Neuralcorrelates of second-language word learning: Minimalinstruction produces rapid change. Nature Neuroscience,7, 703–704.

McLaughlin, J., Tanner, D., Pitkänen, I., Frenck-Mestre,C., Inoue, K., Valentine, G., & Osterhout, L. (2010).Brain potentials reveal discrete stages of L2 grammaticallearning. Language Learning, 60, 123–150.

Moreno, E. M., & Kutas, M. (2005). Processing semanticanomalies in two languages: An electrophysiolog-ical exploration in both languages of Spanish–English bilinguals. Cognitive Brain Research, 22, 205–220.

Morgan-Short, K., Faretta, M., Brill, K., Wong, F., & Wong, P.(2012). Declarative and procedural memory as individual

Page 16: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

Individual differences in L2 ERPs 15

differences in second language acquisition. Ms., Universityof Illinois at Chicago.

Morgan-Short, K., Sanz, C., Steinhauer, K., & Ullman, M. T.(2010). Second language acquisition of gender agreementin explicit and implicit training conditions: An event-related potentials study. Language Learning, 60, 154–193.

Morgan-Short, K., Steinhauer, K., Sanz, C., & Ullman, M.T. (2012). Explicit and implicit second language trainingdifferentially affect the achievement of native-like brainactivation patterns. Journal of Cognitive Neuroscience, 24,933–947.

Naiman, N., Fröhlich, M., Stern, H. H., & Todesco, A. (1996).The good language learner. Philadelphia, PA: MultilingualMatters.

Nakano, H., Saron, C., & Swaab, T. Y. (2010). Speech and span:Working memory capacity impacts the use of animacybut not of world knowledge during spoken sentencecomprehension. Journal of Cognitive Neuroscience, 22,2886–2898.

Neville, H. J., Nicol, J., Barss, A., Forster, K., & Garrett, M.(1991). Syntactically based sentence processing classes:Evidence from event-related brain potentials. Journal ofCognitive Neuroscience, 3, 151–165.

Nevins, A., Dillon, B., Malhotra, S., & Phillips, C. (2007). Therole of feature-number and feature-type in processing Hindiverb agreement violations. Brain Research, 1164, 81–94.

Newman, A. J., Tremblay, A., Nichols, E. S., Neville, H.J., & Ullman, M. T. (2012). The influence of languageproficiency on lexical semantic processing in native andlate learners of English. Journal of Cognitive Neuroscience,24, 1205–1223.

Nieuwland, M. S., & Van Berkum, J. J. A. (2005). Testingthe limits of the semantic illusion phenomenon: ERPsreveal temporary semantic change deafness in discoursecomprehension. Cognitive Brain Research, 24, 691–701.

Nieuwland, M. S., & Van Berkum, J. J. A. (2008). Theinterplay between semantic and referential aspects ofanaphor noun phrase resolution: Evidence from ERPs.Brain and Language, 106, 119–131.

Ojima, S., Matsuba-Kurita, H., Nakamura, N., Hoshino, T., &Hagiwara, H. (2011). Age and the amount of exposure toa foreign language during childhood: Behavioral and ERPdata on the semantic comprehension of spoken English byJapanese children. Neuroscience Research, 70, 197–205.

Ojima, S., Nakata, H., & Kakigi, R. (2005). An ERP studyof second language learning after childhood: Effects ofProficiency. Journal of Cognitive Neuroscience, 17, 1212–1228.

Osterhout, L. (1997). On the brain response to syntacticanomalies: Manipulations of word position and word classreveal individual differences. Brain and Language, 59,494–522.

Osterhout, L., Frenck-Mestre, C., Inoue, K., McLaughlin, J.,Tanner, D., & Herschensohn, J. (2012). Morphosyntacticlearning and second language acquisition: Evidence fromevent-related potentials. Ms., University of Washington.

Osterhout, L., & Holcomb, P. J. (1992). Event-related brainpotentials elicited by syntactic anomaly. Journal of Memoryand Language, 31, 785–806.

Osterhout, L., & Holcomb, P. J. (1995). Event-related brainpotentials and language comprehension. In M. D. Rugg& M. G. H. Coles (eds.), Electrophysiology of mind:Event-related brain potentials and cognition, pp. 171–215.Oxford: Oxford University Press.

Osterhout, L., McLaughlin, J., Kim, A., Greewald, R., & Inoue,K. (2004). Sentences in the brain: Event-related potentialsas real-time reflections of sentence comprehension andlanguage learning. In M. Carreiras & C. Clifton (eds.),The on-line study of sentence comprehension: Eyetracking-ERPs, and beyond, pp. 271–308. New York: PsychologyPress.

Osterhout, L., McLaughlin, J., Pitkänen, I., Frenck-Mestre, C., &Molinaro, N. (2006). Novice learners, longitudinal designs,and event-related potentials: A means for exploring theneurocognition of second language processing. LanguageLearning, 56, 199–230.

Osterhout, L., & Mobley, L. (1995). Event-related brainpotentials elicited by failure to agree. Journal of Memoryand Language, 34, 739–773.

Osterhout, L., & Nicol, J. (1999). On the distinctiveness,independence, and time course of the brain responses tosyntactic and semantic anomalies. Language and CognitiveProcesses, 14, 282–317.

Pakulak, E., & Neville, H. J. (2010). Proficiency differencesin syntactic processing of monolingual native speakersindexed by event-related potentials. Journal of CognitiveNeuroscience, 22, 2728–2744.

Prat, C. S. (2011). The brain basis of individual differencesin language comprehension abilities. Language andLinguistic Compass, 5, 635–649.

Robinson, P. (2002). Individual differences and instructedlanguage learning. Amsterdam: John Benjamins.

Rossi, S., Gugler, M. F., Friederici, A. D., & Hahne, A. (2006).The impact of proficiency on syntactic second-languageprocessing of German and Italian: Evidence from event-related potentials. Journal of Cognitive Neuroscience, 18,2030–2048.

Sabourin, L., & Haverkort, M. (2003). Neural substratesof representation and processing of a second language.In R. van Hout, A. Hulk, F. Kuiken & R. Towell(eds.), The lexicon–syntax interface in second languageacquisition, pp. 175–195. Amsterdam & Philadelphia, PA:John Benjamins.

Sabourin, L., & Stowe, L. A. (2008). Second languageprocessing: When are first and second languagesprocessed similarly? Second Language Research, 24, 397–430.

Sabourin, L., Stowe, L. A., & de Haan, G. J. (2006). Transfereffects in learning a second language grammatical gendersystem. Second Language Research, 22, 1–29.

Schwartz, B. D., & Sprouse, R. (1996). L2 cognitive statesand the full transfer/full access model. Second LanguageResearch, 12, 40–72.

Severens, E., Jansma, B. M., & Hartsuiker, R. J. (2008).Morphophonological influences on the comprehension ofsubject–verb agreement: An ERP study. Brain Research,1228, 135–144.

Skehan, P. (1989). Individual differences in second-languagelearning. New York: Arnold.

Page 17: Individual differences reveal stages of L2 grammatical ...faculty.washington.edu/losterho/BLC_tanner_etal_2012.pdf · Keywords: N400, P600, second language acquisition, individual

http://journals.cambridge.org Downloaded: 17 Aug 2012 IP address: 128.118.88.243

16 Darren Tanner, Judith McLaughlin, Julia Herschensohn and Lee Osterhout

Steinhauer, K., White, E. J., & Drury, J. E. (2009). Temporaldynamics of late second language acquisition: Evidencefrom event-related brain potentials. Second LanguageResearch, 25, 13–41.

Tanner, D. (2011). Agreement mechanisms in native andnonnative language processing: Electrophysiological cor-relates of complexity and interference. Ph.D. dissertation,University of Washington.

Tanner, D., Inoue, K., & Osterhout, L. (2012). Brain-based individual differences in on-line L2 sentencecomprehension. Ms., Pennsylvania State University.

Tokowicz, N., & MacWhinney, B. (2005). Implicit and explicitmeasures of sensitivity to violations in second languagegrammar – An event-related potential investigation. Studiesin Second Language Acquisition, 27, 173–204.

Ullman, M. T. (2001). The neural basis of lexicon and grammarin first and second language: The declarative/proceduralmodel. Bilingualism: Language and Cognition, 4, 105–122.

Ullman, M. T. (2004). Contributions of memory circuits tolanguage: The declarative/procedural model. Cognition, 92,231–270.

Ullman, M. T. (2005). A cognitive neuroscience perspective onsecond language acquisition: The declarative/proceduralmodel. In C. Sanz (ed.), Mind and context in adultsecond language acquisition, pp. 141–178. Washington,DC: Georgetown University Press.

van de Meerendonk, N., Kolk, H. H. J., Vissers, C. T. W. M., &Chwilla, D. J. (2010). Monitoring in language perception:Mild and strong conflicts elicit different ERP patterns.Journal of Cognitive Neuroscience, 22, 67–82.

van Hell, J. G., & Tokowicz, N. (2010). Event-relatedbrain potentials and second language learning: Syntacticprocessing in late L2 learners at different L2 proficiencylevels. Second Language Research, 26, 43–74.

Vos, S. H., Gunter, T. C., Kolk, H. H. J., & Mulder, G. (2001).Working memory constraints on syntactic processing: Anelectrophysiological investigation. Psychophysiology, 38,41–63.

Weber, K., & Lavric, A. (2008). Syntactic anomaly elicits alexico-semantic (N400) ERP effect in the second languagebut not the first. Psychophysiology, 45, 920–925.

Weber-Fox, C. M., & Neville, H. J. (1996). Maturationalconstraints on functional specializations for languageprocessing: ERP and behavioral evidence in bilingualspeakers. Journal of Cognitive Neuroscience, 8, 231–256.

Wickens, T. (2002). Elementary signal detection theory. Oxford:Oxford University Press.

Zayas, V., Greenwald, A., & Osterhout, L. (2010). Unitentionalcovert motor activations predict behavioral effects:Multilevel modeling of trial-level electrophysiologicalmotor activations. Psychophysiology, 48, 208–217.