
Language Assessment of Bilingual Arabic-German Heritage

and Refugee Children: Comparing Performance

on LITMUS Repetition Tasks

Lina Abed Ibrahim, Cornelia Hamann, and István Fekete


1. Introduction and theoretical background

In the last four decades, bilingualism and multilingualism have been a

focus of linguistic research with manifold aims, such as the exploration of a

human disposition for multilingualism or of the factors determining trajectories

and outcomes of (second) language learning. In Germany, research questions

were heavily influenced by socio-political developments as the country has a

long history of immigration and experience with heritage and refugee

children in the educational system. The first studies on the untutored acquisition

of German as a second language (L2) concerned adult learners (Clahsen, Meisel

& Pienemann, 1983) while subsequent work focused on language development in

the second generation, mostly children growing up with two languages from birth

(2L1) or learning German at kindergarten age (eL2) (Meisel, 1990).

Despite the arrival of two major waves of refugees (from the Balkans in

the 1990s and from Syria in 2015) and the heterogeneity of bilingual populations

as to age and mode of acquisition, the majority of the studies interested in

linguistic development investigate children and adults who are 2L1 or eL2

speakers. Most linguistic long-term research projects up to the present base their

findings on such individuals or groups (see Chilla, 2008; Gagarina, 2017; Hamann

& Abed Ibrahim, 2017; Rothweiler, 2006; Tuller et al., 2018 among many others).

The two language assessment tools developed in Germany, standardized for

monolingual and bilingual children, Russian SKRUK (Gagarina et al., 2010) and

LiSe-DaZ (Schulz & Tracy, 2011), likewise are normed for these populations

only. The same holds for most research on academic success of bilinguals in

German schools with many studies investigating reading and writing skills based

on 2L1 and eL2 children (Chilla, Krupp & Wulff, 2019; Gantefort & Roth, 2008).

* Lina Abed Ibrahim, Cornelia Hamann and István Fekete: University of Oldenburg.

Corresponding author: Lina Abed Ibrahim, [email protected]. This research was

carried out within the BiliSAT project financed by DFG grants to S. Chilla (CH 1112/4-1)

and C. Hamann (HA 2335/7-1). We are grateful to Solveig Chilla for her support with

data collection and analysis. Our special thanks go to the participating children and their

parents and collaborating teachers and SLTs.

© 2020 Lina Abed Ibrahim, Cornelia Hamann, and István Fekete. Proceedings of the 44th Boston University Conference on Language Development, ed. Megan M. Brown and

Alexandra Kohut, 1-17. Somerville, MA: Cascadilla Press.

Due to the influx of almost a million refugees in September 2015, many

thousands of children and adolescents with different ages of arrival, different

schooling experience, different language backgrounds, different asylum status

and different itineraries had to be integrated into the German school system, which

differs widely from state to state (see von Maurice & Roßbach, 2017). Since

knowledge of the German language is undoubtedly a necessary condition for

academic and later professional success, these refugee children learn German in

different programs and with different curricula all over the country (Ahrenholz,

Fuchs & Birnbaum, 2016). Importantly, irrespective of different models of

schooling in the 16 Federal States of Germany, the general design already

established in the 1980s, based more or less on the study of simultaneous or early

successive bilingualism, remains: refugee children are expected to acquire oral

and literacy L2 skills within six to eighteen months before they are to participate

in regular classes. Adding to the problem is that most schools establish separate

language classes irrespective of the students’ age or overall development (for

details see Massumi & v. Dewitz 2015).

The short time (6-18 months) allotted to L2-acquisition is clearly problematic

since more pessimistic estimates of the time needed to catch up in important areas

of language are found in the recent literature (see Schönenberger, Rothweiler &

Sterner, 2012; Paradis & Jia, 2016). First results comparing bilingual children

born in Germany (primary school age) with refugees (10-17 years of age) with

respect to their vocabulary development and their ability to write an essay show a

big gap in language skills between the two groups (Montanari, 2017). This study

also shows that it is not easy to choose and apply language assessment tools and

that the lack of bilingual norms is an additional challenge when using standardized

methods. The above considerations will have made it quite evident that systematic

studies on the language development of refugees who enter the school system

when they are older than seven years are needed urgently not only for guiding the

choice of assessment tools but ultimately for shaping expectations and models of

participation.

It is at this point that our investigation makes a contribution. We compare a

group of Syrian refugees with different educational backgrounds to younger

Arabic heritage speakers, i.e. 2L1 or eL2 speakers. In the wake of COST Action

IS0804, cross-linguistically valid tools were developed. Since non-word and

sentence repetition have high accuracy for diagnosing Developmental Language

Disorder (DLD) in monolinguals (Conti-Ramsden et al., 2001), these LITMUS

tools (Language Impairment Testing in Multilingual Settings, Armon-Lotem et

al., 2015) included Nonword Repetition Tasks (NWRT; Grimm et al., 2014) and

Sentence Repetition Tasks (SRT; Marinis & Armon-Lotem, 2015; Hamann et al.,

2013). Both these LITMUS-tools have recently been shown to identify DLD in

bilingual populations and proved to have reasonable to good (in combination even

excellent) diagnostic accuracy for the identification of DLD in 2L1 and eL2

children (Abed Ibrahim & Fekete, 2019; Armon-Lotem & Meir, 2016; Chiat &

Polišenská, 2016; Hamann & Abed Ibrahim, 2017; Tuller et al., 2018). This is an


encouraging result since the tasks are easy and fast to administer and could thus

serve as a first assessment in many contexts, including schools.

Repetition tasks have been associated with verbal working memory (VWM)

and verbal short-term memory (VSTM) in so far as NWRTs are usually taken to

measure VSTM (Archibald & Gathercole, 2006; Bishop, Adams & Norbury,

2006). Other authors have shown, however, that such tasks measure language

skills, especially when they address phonological or morphosyntactic complexity

(Gallon, Harris & van der Lely, 2007 for NWRT; Polišenská, Chiat & Roy, 2014;

Klem et al., 2014 and Meir, 2017 for SRT). Recently, Abed Ibrahim & Hamann

(2017) confirmed that linguistic complexity has a decisive influence on

performance on both German SRT and NWRT. Moreover, a detailed investigation

of the factors influencing performance of bilingual children on SRT in French has

shown that differences in performance of the typically developing (TD) group and

the group with DLD cannot be explained by differences in VSTM or VWM alone

but crucially depend on syntactic complexity (Zebib et al., 2019). In line with

these results from previous studies, it can be stated that the German LITMUS-

NWRT and SRT as well as the Arabic SRT, which was developed in parallel to

the LITMUS French SRT (see Henry et al., to appear), were constructed as

language measures and include structures that are known to be difficult for

children with DLD.

In work by Tuller et al. (2018), performance in these two tasks was evaluated

with respect to background information, gathered with the Questionnaire for

Parents of Bilingual Children (PaBiQ; Tuller, 2015). Such background

information allows exploring factors such as Early Development, socio-economic

status (SES), age of onset of L2 (AoO) and length of exposure to L2 (LoE), which

have been discussed in the literature as influencing L1 and L2 performance. Tuller

et al. (2018) established that Early Development, a risk factor for DLD including

age of first words and first sentences, and not language exposure and use,

generally explained more of the variance in the performance in the German

NWRT and SRT.

Clearly, in a refugee context, other factors such as interrupted education, living

and study conditions and trauma incurred during flight could influence

performance in language tasks. Thus, questions about schooling, access to

language courses, itinerary, means of transport and past and present living

conditions were added to the PaBiQ for the purposes of this study.

In this paper, the investigation includes a group of refugees. On the one hand,

we want to ensure that a child shows typical language acquisition and can

therefore be expected to encounter no extraordinary difficulties in her second

language development. On the other hand, we want to make sure that, if there is

an impairment, the child will be diagnosed and will have access to language

therapy. This is even more important as the German school system offers special

support for children with language and learning difficulties, and thus requires

language assessment of all at-risk children. This aim can be achieved by assessing

the home language of a child, which, in the case of refugees, should be the

dominant language and should show a possible impairment. Comparing L1


development in refugee and heritage children in more depth will enable us to

judge whether certain L1 tests are fair for both groups, and in particular for

heritage children, who might not (or no longer) be dominant in their L1 after

extensive L2-exposure in kindergarten or school. Heritage children might thus be

at a disadvantage in L1 tests given the problems with L1-maintenance in heritage

speakers (Montrul 2008). We will therefore analyze performance in a

standardized Arabic language assessment tool, as well as in a LITMUS SRT in

Arabic (Zebib et al., to appear).

As teachers or speech and language therapists capable of assessing the L1 in

bilinguals are rare, application of German (L2) assessment tools is inevitable.

Since the LITMUS-SRTs and NWRTs have shown promising results for heritage

children, this study asks whether the same tests render reliable and fair

assessments of language abilities also for refugee children and explores how much

language exposure is required to achieve fair results.

In addition, we want to know which factors influence variance of performance.

Since it has been discussed that performance in L2 repetition tasks can be

influenced by AoO, LoE, and SES, but crucially by morphological and lexical

knowledge (see Chiat & Polišenská, 2016 on properties of differently constructed

NWRTs and Tuller et al., 2018 for an investigation of SRTs), we want to

investigate to what extent such factors, but also quality of input, current use and WM

factors, might influence performance. In particular, we want to explore whether

the tests allow a fair and quick classification after an exposure of more than a year.

A final point of investigation is to what extent early and present language

exposure/use in L1 or L2 determine later language learning and lead to an

advantage in L2 development.

2. Method

2.1. Participants

This study compares 11 2L1 and eL2 children (5;6-9;0) with Arabic as L1

(Lebanese) to 11 refugee children from Syria (7;7-11;6) who all had their first

exposure to German in special language classes with accompanying formal and

literacy instruction in the L2 and had been attending such classes or schools for at

least 18 months. All participants scored above percentile rank 9 (IQ score ≥ 80

according to Wechsler’s IQ scale) in the German version of Raven’s Colored

Progressive Matrices (Bulheller & Häcker, 2002), and were not at risk of DLD.

The participants of the heritage group had previously been classified as typically

developing children using a comprehensive assessment procedure outlined in

Hamann and Abed Ibrahim (2017), including standardized tests in the L1 and L2,

respecting dominance effects on test performance (Thordardottir, 2015). For the

refugee participants, we do not base classification on L2 tests, but rely

on a standardized L1-test given their relatively short exposure to the L2.
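The correspondence between percentile rank 9 and an IQ score of 80 can be checked with a short calculation. The sketch below assumes the conventional Wechsler scaling (mean 100, SD 15) and normally distributed scores; the function name is illustrative and not taken from any test manual.

```python
from statistics import NormalDist

# Assumption: Wechsler-type IQ scale, normally distributed with mean 100 and SD 15.
WECHSLER = NormalDist(mu=100, sigma=15)

def iq_to_percentile(iq: float) -> float:
    """Return the percentile rank (0-100) of an IQ score under the assumed scale."""
    return 100 * WECHSLER.cdf(iq)

# An IQ of 80 lies 1.33 SD below the mean, i.e. at roughly the 9th percentile.
print(round(iq_to_percentile(80), 1))
```

Under these assumptions, an IQ of 80 falls at about the 9th percentile, which is the inclusion threshold stated above.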

Participants were recruited in kindergartens, schools, community associations

or places of worship from different federal states, representing a spectrum of

schooling in the L2. The refugee children in our study are typical for the diversity


of programs since they are resident in different parts of Germany. Bremen, for

example, offers a preparatory language support class before children switch to the

regular class. North Rhine-Westphalia offers language support classes and,

simultaneously, courses in regular education. Many primary school children

between 6 and 8 years of age attend regular classes from the beginning and receive

additional L2 support in separate courses. Table 1 gives a participant overview.

Table 1. Participant overview (mean (SD), range)

Measure                               Heritage speakers (n=11)    Syrian refugees (n=11)
Age at testing (months)               88.4 (12.8), 70-108         114 (19.5), 91-138
Age of Onset (months)                 39 (13.6), 24-75            91.5 (21.7), 66-120
Length of Exposure (months)           53.2 (19.7), 32-97          22.8 (3.7), 18-27
SES (mother's education in years)     13.7 (2.2), 9-18            15.3 (5.3), 8-22
Length of L1-schooling (months)       0.63 (0.5), 0-2             11.63 (11.5), 0-32
Gender                                3 M, 8 F                    7 M, 4 F
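Ages in the running text are given in years;months notation, whereas Table 1 reports months. A minimal sketch of the conversion follows; the function name is illustrative.

```python
def age_to_months(age: str) -> int:
    """Convert an age in 'years;months' notation (e.g. '7;7') to months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)

# The refugee age range 7;7-11;6 corresponds to 91-138 months in Table 1.
for label in ("7;7", "11;6", "9;0"):
    print(label, "->", age_to_months(label), "months")
```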

2.2. Procedures and assessment tools

In addition to standardized L1 and L2 tests, all children performed German

versions of the LITMUS NWRT and SRT, as well as the Arabic LITMUS-SRT.

Relevant background information was gathered with the Questionnaire for Parents

of Bilingual Children (PaBiQ; Tuller, 2015).

We also chose two working memory measures, Forward Digit Span (FDS,

WISC-IV, Petermann & Petermann, 2011), associated with VSTM, and

Backward Digit Span (BDS) involving storage and processing and assessing

VWM.

2.2.1. Standardized L1 and L2 tests

The lack of age adequate tests and standardized tools posed a serious problem

for background assessment for this investigation, which includes children up to

the age of 12 years. Nevertheless, the following standardized language tests were

used. For assessing Arabic, we used the ELO-L (Zebib et al., 2017), standardized

and normed in Lebanon on large samples of children growing up in the kind of

institutional bilingualism typical for Lebanon. It was adapted to other varieties of

Arabic by linguistically trained native speakers of these varieties including the

Syrian Arabic version used here. Since the age-range of the norming population

is 3;0-7;11, norm-referencing was not applicable to a subset of our sample.

We encountered similar difficulties for some standardized L2 tests. For

assessment of vocabulary, we used the WWT (Glück, 2011), which provides norms

for monolingual children (5;6-10;11). For morphosyntax, we used the LiSe-DaZ

(Schulz & Tracy, 2011). The test is normed for eL2 bilinguals as well as

monolinguals. Norms for bilinguals are available for the ages 3;0 till 7;11. Despite

the possibility of norm extension for bilingual children older than 7;11 (Grimm &

Schulz, 2014), norms cannot be adapted in any way for our population of refugees

given the minimum LoE required for the norms of each age range. For the older

group of refugees we also used the TROG-D (Test for Reception of Grammar –

German, Fox, 2009). The TROG-D can be reliably used as a test for language

development in the age range of 3;0 – 9;11. For L1, we expect ceiling effects in

the ELO-L for our older refugee population showing their basic familiarity with

Arabic whereas older heritage children may stagnate in their L1 due to intensified

exposure to German in school. For L2, it can be expected that, due to inadequate

exposure, tools normed for younger populations can still be challenging for older

refugees. In both cases, we consider it meaningful if we find that a child scores

lower than the dominance-adjusted norm in the oldest age range with an available

norm.

Since there is no norming sample matching our oldest children, raw values are

additionally used for meaningful group comparisons for the L1 and L2

standardized tests in line with other researchers (Montanari, 2017).

2.2.2. The German LITMUS NWRT

The NWRT chosen here (Grimm et al., 2014) relies on increasing

phonological complexity, not increasing numbers of syllables (dos Santos &

Ferré, 2018; Grimm & Schulz, 2020). Using the most common vowels and

consonants of the world’s languages, it presents one-, two- and three-syllable non-

words of different phonological complexity with complex onsets or codas and

combinations thereof (see Abed Ibrahim & Fekete, 2019; Grimm & Schulz, 2020

for more detailed descriptions and analysis of the task’s properties). The items are

presented to the child in an appealing PPT in pseudo-randomized order through

headphones. The task takes about 5-10 minutes to administer and is scored

according to whole item accuracy. Minimally different vowels and, crucially,

voicing of consonants are nonetheless not counted as errors in order not to

disadvantage bilingual children.
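The scoring rule just described (whole-item accuracy, with voicing differences tolerated) can be sketched as follows. This is only an illustration: the voicing pairs and the example items are hypothetical and not taken from the task materials, and the tolerance for minimally different vowels is not modeled.

```python
# Hypothetical voiced -> voiceless mapping; the actual scoring manual may
# define the tolerated substitutions differently.
VOICING_PAIRS = {"b": "p", "d": "t", "g": "k", "z": "s", "v": "f"}

def neutralize_voicing(segments):
    """Map each voiced consonant to its voiceless counterpart."""
    return [VOICING_PAIRS.get(seg, seg) for seg in segments]

def item_correct(target, response):
    """Whole-item accuracy: all segments must match once voicing is neutralized."""
    return neutralize_voicing(target) == neutralize_voicing(response)

def percent_correct(pairs):
    """Percentage of (target, response) pairs scored as correct."""
    scores = [item_correct(t, r) for t, r in pairs]
    return 100 * sum(scores) / len(scores)

# Invented items: a voicing substitution is accepted, a dropped segment is not.
items = [
    (["p", "l", "u"], ["b", "l", "u"]),
    (["k", "i", "p"], ["k", "i", "p"]),
    (["s", "p", "a"], ["s", "a"]),
]
print(percent_correct(items))
```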

2.2.3. The German and the Arabic LITMUS SRTs

The German SRT (Hamann et al., 2013) and the Lebanese Arabic SRT (Henry

et al., in press) target children between 5;6 and 9;0. This version of the Arabic SRT

was adapted (lexically and phonologically) to Syrian Arabic by Syrian speakers

and Arabic linguists at the University of Oldenburg.

Both SRTs include complex structures described as difficult for children with

DLD in the cross-linguistic literature. In addition to simple sentences varying in

agreement and/or tense, Marinis & Armon-Lotem (2015) recommend including

object (which) questions, passives, finite complement clauses, (object) relative


clauses and topicalizations, which are all included in the German version (for

more details and examples see Hamann & Abed Ibrahim, 2017). Not all these

structures are available in all languages, however, so that the version of the Arabic

SRT used in this study includes simple perfective and imperfective sentences,

object which-questions, non-finite and finite complement clauses as well as subject

and object relatives, but no passives or topicalizations (see Zebib et al. in press

for examples and a more detailed overview). The German SRT additionally

included the sentence bracket, which is considered a milestone in the acquisition

of German as early L2. An overview of the common structures in Arabic and

German LITMUS-SRTs is given in Table 2.

The current versions of the German and the Arabic SRT include 45 and

36 sentences respectively. The stimuli are presented via a child friendly PPT in

randomized order. Administration of the tasks takes 5-10 minutes. The tasks can

be rated by “identical repetition, SRT_Id” counting only exact repetition as

correct, disregarding only phonological errors, or it can be rated by “target

structure, SRT_Tar”, where mastery of the structure is counted as correct even if

the child substituted lexical items or used the incorrect gender or, in some

instances, incorrect case if it is not crucial for the target structure.
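The two rating schemes can be illustrated with a small sketch. It assumes that each response is rated on two binary criteria and that an identical repetition always counts as realizing the target structure; the class, field and function names are hypothetical, and the ratings are invented.

```python
from dataclasses import dataclass

@dataclass
class Rating:
    identical: bool         # exact repetition (phonological errors disregarded)
    target_structure: bool  # target structure realized, lexical substitutions allowed

def srt_scores(ratings):
    """Return (SRT_Id, SRT_Tar) as percent-correct scores."""
    n = len(ratings)
    srt_id = 100 * sum(r.identical for r in ratings) / n
    # Assumption: an identical repetition also realizes the target structure.
    srt_tar = 100 * sum(r.identical or r.target_structure for r in ratings) / n
    return srt_id, srt_tar

# Invented protocol for a 45-item task: 27 identical repetitions, 9 responses
# with the target structure only, 9 failures.
ratings = [Rating(True, True)] * 27 + [Rating(False, True)] * 9 + [Rating(False, False)] * 9
print(srt_scores(ratings))
```

By construction, SRT_Tar can never be lower than SRT_Id under this assumption.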

Table 2. Common structures in German and Arabic LITMUS-SRTs

Target structure       Less complex               More complex
Monoclausal (SVO)      present (imperfective)     past (perfective)
Object wh-question     Who-questions (German)     Which NP-questions
Clausal embedding      - finite embed.            + finite embed.
Relative clause        Subj. relatives            Obj. relatives

2.3. Data Analysis

All standardized tests, as well as the German LITMUS-NWRT and the

German and Arabic LITMUS-SRTs were administered as per description (cf.

Hamann & Abed Ibrahim, 2017; Tuller et al., 2018; Zebib et al., to appear). As

already indicated, norm-referencing was possible only for certain age groups in

the different standardized tests. Thus basic comparisons will be performed on raw

values (percent-correct scores) as well.

IBM SPSS 26 (IBM Corp., 2019) and RStudio (RStudio Team,

2015) were used to conduct the statistical analyses. To determine factors

influencing performance, we developed a series of linear mixed-effects (LME)

regression models (Pinheiro & Bates, 2000), employing the lme4 R-package

(Bates et al., 2015). An advantage of LME models is that they allow for the

comparison of significant predictors in terms of their contribution or weight in

explaining the dependent measure, just as in the case of a simple multiple

regression model. To that end, we report on the Beta-estimates per significant

predictor.


3. Results

3.1. Standardized L1-Test (ELO-L)

For the L1, the refugee group showed higher lexical skills, both for receptive

(raw scores: U = 12.0, p = .001, r = .685, Z-scores: U = 10.0, p < .001, r = .709)

and expressive vocabulary (raw scores: U = 17.5, p < .01, r = .604, Z-scores: U =

16.0, p < .01, r = .624). Furthermore, the refugee children outperformed their

heritage peers on morphosyntax production (raw scores: U = 6.00, p < .001, r =

.765, Z-scores: U = 17.0, p < .01, r = .610). Interestingly, both groups performed

on par in morphosyntax comprehension (raw scores: U = 55.0, p = .748, Z-scores:

U = 49.5, p = .478) and in phonology if raw scores (but not Z-scores) are

considered. Mean raw scores (% correct) and Z-scores for the ELO-L subtests are

given in Figure 1a and Figure 1b.
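The effect sizes r reported with the Mann-Whitney U statistics can be approximated from U and the group sizes. The sketch below assumes the common convention r = |Z| / sqrt(N) with a normal approximation and no tie or continuity correction; the paper does not state which formula was used, so small deviations from the reported values are to be expected.

```python
from math import sqrt

def mann_whitney_r(u: float, n1: int, n2: int) -> float:
    """Effect size r = |Z| / sqrt(N) derived from a Mann-Whitney U statistic.

    Uses the normal approximation without tie or continuity correction.
    """
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return abs(z) / sqrt(n1 + n2)

# Receptive vocabulary, raw scores: U = 12.0 with 11 children per group.
print(round(mann_whitney_r(12.0, 11, 11), 3))
```

For U = 12.0 this gives approximately .68, close to the reported r = .685; the remaining difference presumably reflects a tie correction in the statistics software.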

Fig. 1a. ELO-L (raw scores): % correct per subtest (LexR, LexP, MorphR, MorphP, Phon), Heritage vs. Refugee

Fig. 1b. ELO-L (Z-scores) per subtest (LexR, LexP, MorphR, MorphP, Phon), Heritage vs. Refugee

3.2. Standardized L2-Tests

With regard to L2-lexical abilities, both groups performed poorly in both

expressive and receptive subparts of the WWT. In fact, no significant group

differences emerged and both performed within the language impaired range if

standardized Z-scores are considered, even in receptive vocabulary (Heritage: M

= -2.83, SD = 1.42 vs. Refugee: M = -2.57, SD = 2.14).


Considering morphosyntax, while heritage children were able to reach age

appropriate norms in the LiSe-DaZ, more than half of the refugee children (6/11)

would be classified as at risk for DLD by the test with only children with more

than 24-27 months of L2-exposure reaching appropriate norms. Similar results

were observed for performance of the refugee group on TROG-D applying

dominance-adjusted cut-offs: all children with an LoE<24 months (5/11) were

identified as at risk of DLD.

3.3. LITMUS Repetition Tasks

3.3.1. Arabic LITMUS-SRT

As can be seen in Figure 2, heritage children were not disadvantaged by the Arabic

LITMUS-SRT, unlike in most of the standardized L1-measures of the ELO-L.

Both groups generally scored better when the measure “target structure” was

applied. Although the heritage group displayed more variance on the task,

especially on SRT_Ar_Tar (Heritage: M = 85.01, SD = 15.9; Refugee: M = 94.4,

SD = 10.9), no significant between-group differences were found for either

SRT_Ar_Id or SRT_Ar_Tar.

Fig. 2. LITMUS-Ar_SRT: %correct identical rep. (SRT_AR_Id) & target structure (SRT_AR_Tar), Heritage vs. Refugee

3.3.2. German LITMUS NWRT and LITMUS-SRT

No significant group differences emerged between heritage and refugee

children on LITMUS_NWRT. As for the German-SRT (Figure 3), the heritage

group performed generally better and showed less variance on both SRT_Id

(Heritage: M = 60.2, SD = 24.9; Refugee M = 49.5, SD = 31.3) and SRT_Tar

(Heritage: M = 78.9, SD = 17.8; Refugee M = 65.6, SD = 23.9). However, these

differences did not reach statistical significance. When children’s performance on

both SRT measures is plotted against LoE to the L2 (see Figure 4), it becomes

apparent that most refugee children with less than 24 months of exposure perform

below the cut-off scores separating typically developing from language impaired

children in Hamann and Abed Ibrahim (2017). Note that these cut-offs were

determined for a population of younger bilinguals (5;6-9;0) including a subset of

the present heritage sample.
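How such cut-offs might be applied in a first screening can be sketched as follows. The cut-off values are those shown in Fig. 4 (52.2% for SRT_Tar, 41.25% for SRT_Id); everything else is an assumption made for illustration: the function name, the treatment of a score exactly at the cut-off as passing, the use of 24 months as a caution threshold, and the example scores, which are not data from this study.

```python
# Cut-offs in % correct as shown in Fig. 4 (from Hamann and Abed Ibrahim, 2017).
CUTOFFS = {"SRT_Id": 41.25, "SRT_Tar": 52.2}

# Assumption for illustration: with less L2 exposure than this, a low score
# is treated as indicative only.
MIN_LOE_MONTHS = 24

def screen(measure: str, percent_correct: float, loe_months: float) -> str:
    """Label one score relative to its cut-off, flagging short L2 exposure."""
    if percent_correct >= CUTOFFS[measure]:
        return "at or above cut-off"
    if loe_months < MIN_LOE_MONTHS:
        return "below cut-off (short exposure, interpret with caution)"
    return "below cut-off"

# Hypothetical children, not participants of the study.
print(screen("SRT_Tar", 65.0, 20))
print(screen("SRT_Id", 35.0, 19))
print(screen("SRT_Id", 35.0, 40))
```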

Fig. 3. %correct: LITMUS-NWRT, SRT_Id & SRT_Tar (Heritage vs. Refugee)

Fig. 4. % correct SRT_Id & SRT_Tar vs. LoE_L2 (x-axis: LoE in months; y-axis: % correct; cut-off (SRT_Tar): 52.2%; cut-off (SRT_Id): 41.25%)

3.4. Predictors of performance on LITMUS repetition tasks

Given the small linguistic load in the NWRT compared to the SRTs and the

results from previous studies about robustness of this measure against exposure

variables (Abed Ibrahim & Fekete, 2019; Tuller et al., 2018), analysis of

individual variation in the current paper is limited to performance in the LITMUS-

SRTs.

Background, WM (FDS & BDS) and linguistic variables previously discussed

in the literature were probed as potential predictors using nonparametric

Spearman-correlations. Only variables yielding significant bivariate correlations

with the dependent performance variables SRT_Ar_Id, SRT_Ar_Tar, SRT_Id and


SRT_Tar were selected as fixed factors for a series of linear mixed-effects (LME)

regression models. Although mean age differed significantly between the two

groups, it did not show significant correlations with the dependent measures.

Thus, it was not assessed as a random factor. The variable SES was correlated

with several dependent measures. Therefore, it was entered as a random factor

into all models to partial out its effect. Further factors such as L1/L2 schooling

were also included as random factors. To avoid model overfitting, the number of

fixed factors per model was limited to three. For the sake of simplicity, we mainly

report on the models with significant fixed factors.

3.4.1. Predictors of performance on the Arabic LITMUS-SRT

As for non-linguistic background variables, length of L1-schooling, and

current L1 use as well as L1-linguistic richness significantly predicted

performance on both measures SRT_Ar_Id and SRT_Ar_Tar (LME models 3.1,

3.2, 3.3). The only linguistic variable that emerged as a significant predictor of

performance on both measures of the Arabic SRT was L1-receptive morphosyntax

(LME model 3.4). Notably, no significant effects were observed for AoO_L2,

LoE_L2, L1-vocabulary, morphosyntax in production or phonology.

Table 3. LMEMs analysis: Predictors of performance on Arabic-SRT

Dependent variables: SRT_Ar_Id, SRT_Ar_Tar

Background variables

Model 3.1: Fixed factors (LoE_L2, current L1 use & length_L1_schooling); Random factors: SES
  SRT_Ar_Id:   Length_L1_schooling: β = .777, SE = .360, T(13.5) = 2.15, p = .049*
  SRT_Ar_Tar:  Length_L1_schooling: β = .736, SE = .301, T(13.1) = 2.44, p = .029*

Model 3.2: Fixed factors (AoO_L2, current L1 use & linguistic richness L1); Random factors: SES+L1-schooling
  SRT_Ar_Id:   a. Linguistic richness L1: β = 5.04, SE = 1.89, T(15.9) = 2.66, p = .017*
               b. Current L1 use: β = -4.35, SE = 1.41, T(14.6) = -3.09, p = .007**
  SRT_Ar_Tar:  a. Linguistic richness L1: β = 3.58, SE = 1.62, T(16.0) = 2.20, p = .042*
               b. Current L1 use: β = -3.16, SE = 1.21, T(15.1) = -2.60, p = .019*

Model 3.3: Fixed factors (LoE_L2+current L1 use+linguistic richness L1); Random factors: SES+L1-schooling
  SRT_Ar_Id:   a. Linguistic richness L1: β = 5.30, SE = 1.89, T(15.9) = 2.79, p = .012*
               b. Current L1 use: β = -4.10, SE = 1.43, T(15.2) = -2.85, p = .011*
  SRT_Ar_Tar:  Current L1 use: β = -3.42, SE = 1.22, T(15.4) = -2.80, p = .013*

WM and linguistic variables

Model 3.4: Fixed factors (FDS_span+BDS_span+MorphR_ELO-L); Random factors: SES
  SRT_Ar_Id:   MorphR_ELO-L: β = 2.88, SE = 0.67, T(15.5) = 4.28, p < .001***
  SRT_Ar_Tar:  MorphR_ELO-L: β = 2.56, SE = 0.45, T(15.2) = 5.58, p < .001***

Significance codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


3.4.2. Predictors of performance in the German SRT

As can be seen in LME models 4.1-4.3, the only background variable that

emerged as a significant predictor for performance in both SRT_Id and SRT_Tar

was current L2 use.

As for WM and L2 language measures, expressive vocabulary skills emerged

as a significant predictor for performance on SRT_Id, while a combination of FDS

and L2 expressive vocabulary accounted for performance on SRT_Tar (LME

model 4.4). With respect to L2-morphosyntax, the performance in the subtest

assessing German sentence complexity level (ES “Entwicklungsstufe”, i.e.

developmental stage) combined with FDS was predictive of performance on

SRT_Tar but not on SRT_Id (LME model 4.5). Plotting performance in

SRT_Tar against the developmental level showed that only children who reached

developmental level 4, i.e. produced embedded clauses, were capable of

performing above the pathology cut-offs established for younger bilinguals in

Hamann and Abed Ibrahim (2017).

Table 4. LMEMs analysis: Predictors of performance on German-SRT

Dependent variables: SRT_Id, SRT_Tar

Random factor: SES

Model 4.1: Fixed factors (…)
  SRT_Id:   Current L2 use: β = 6.96, SE = 2.36, T(17.9) = 2.94, p = .008*
  SRT_Tar:  Current L2 use: β = 4.39, SE = 1.93, T(17.7) = 2.27, p = .036*

Model 4.2: Fixed factors (… L2_schooling_months)
  SRT_Id:   Current L2 use: β = 5.86, SE = 2.11, T(17.9) = 2.77, p = .012*
  SRT_Tar:  Current L2 use: β = 4.35, SE = 1.68, T(17.9) = 2.58, p = .018*

Model 4.3: Fixed factors (LoE_L2, current L2 use & …)
  SRT_Id:   Current L2 use: β = 5.34, SE = 2.29, T(17.97) = 2.32, p = .031*
  SRT_Tar:  n.s.

WM and linguistic variables

Model 4.4: Fixed factors (FDS_span, BDS_span & LexP_L2)
  SRT_Id:   LexP_L2: β = 2.48, SE = 1.07, T(12.6) = 2.32, p = .03*
  SRT_Tar:  a. FDS_span: β = 6.88, SE = 2.59, T(10.9) = 2.65, p = .022*
            b. LexP_L2: β = 2.40, SE = .72, T(13.0) = 3.32, p = .005**

Model 4.5: Fixed factors (FDS_span, BDS_span & ES)
  SRT_Id:   n.s.
  SRT_Tar:  a. LIS_ES: β = 21.603, SE = 9.84, T(14.8) = 2.19, p = .044*
            b. FDS_span: β = 10.3, SE = 3.39, T(14.1) = 3.04, p = .008**

Significance codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


As for TROG-D, all refugee children who were identified as at risk of DLD

applying dominance-adjusted cut-offs (5/11) performed below the respective

pathology cut-offs in SRT_Id and SRT_Tar.

4. Discussion and conclusion

This study confirms previous research showing that the German LITMUS-

NWRT is suitable for all groups of bilingual children (Grimm and Schulz, 2020;

Hamann & Abed Ibrahim, 2017) whereas the German LITMUS-SRT leads to

accurate results only after longer exposures. The German LITMUS-SRT thus very

reliably assesses heritage children (see also Abed Ibrahim & Fekete, 2019) but

can at best be indicative during the first 24 months of exposure in case of refugee

children. When other morphosyntactic measures were investigated, it turned out that

only children who were able to produce subordinate clauses in the subtest on

sentential complexity of the LiSe-DaZ, were capable of performing reasonably in

the L2 SRT. This is in line with previous studies showing that eL2 learners acquire

the German sentence structure in main clauses fast, but take longer to acquire verb

placement in subordinate clauses. These findings clearly demonstrate that refugee

children, during the first 24 months of exposure, will have difficulties with tests

that include such milestones of L2 acquisition (Clahsen et al., 1983). This is

unfortunate since assessing such structures allows individual and detailed

diagnosis and support, and the problem concerns not only this SRT but also the

LiSe-DaZ and TROG-D. It could be established that the same children identified

as at risk of DLD on the TROG-D performed poorly on the SRT. This, again, is no surprise, since

the TROG-D, like the SRT, requires comprehension of Wh, subordination,

relative clauses and passives. This aligns with previous findings, also on

spontaneous speech data, in that refugee children perform in the range of children

with DLD (Chilla, 2008; Håkansson & Nettelbladt, 1993; Schöler, Fromm &

Kany, 1998).

These findings are complemented by the results on L1 assessments. Here, the

refugee children have an advantage over the heritage children in most

standardized measures. They perform on par with the heritage children on the

Arabic LITMUS-SRT, however. Interestingly, the only linguistic measure

predicting performance on both measures, identical repetition and target structure,

is the performance in L1-receptive morphosyntax. Since performance was on par

for this measure in both groups, we can conclude that the Arabic SRT is fair, not

only for refugee children, but also for heritage children at school age. Our analyses

also showed that neither working memory nor L2-exposure variables (AoO_L2,

LoE_L2) predicted performance in the Arabic SRT. The latter results indicate that

heritage children are not disadvantaged and that the Arabic LITMUS-SRT can be

applied with both groups. This is encouraging as to the language competence not

only of the refugee but also of the heritage children - even though it seems to

demonstrate that reliable language assessment of bilinguals in the domain of

morphosyntax will have to include L1-measures, particularly in refugee

populations.

Using L2 and L1 measures for bilinguals has long been best practice in speech-

language pathology but is usually not feasible in schools. Using the German

LITMUS-NWRT for a first assessment may thus provide an easy and welcome

alternative. Combinations of the tests will therefore allow a fast estimate of typical

development. The German LITMUS-SRT also reliably measures morphosyntactic

competence, which, unfortunately, does not develop as fast as many

schooling and integration models presuppose. This, in particular, is a result which

needs to be discussed widely and which should be used to correct the often too

optimistic expectations of teachers and educators.

References

Abed Ibrahim, Lina, & Fekete, István. (2019). What Machine Learning Can Tell Us About

the Role of Language Dominance in the Diagnostic Accuracy of German LITMUS

Non-word and Sentence Repetition Tasks. Frontiers in Psychology, 9, 27-57.

Abed Ibrahim, Lina, & Hamann, Cornelia (2017). Bilingual Arabic-German and Turkish-

German children with and without specific language impairment: comparing

performance in sentence and nonword repetition tasks. In Proceedings of BUCLD 41,

1-17. Somerville: Cascadilla Press.

Ahrenholz, Bernt, Fuchs, Isabel, & Birnbaum, Theresa (2016). dann haben wir natürlich

gemerkt der übergang ist der knackpunkt. Modelle der Beschulung von

Seiteneinsteigern in der Praxis. BiSS Journal, 5.

Archibald, Lisa M. D., & Gathercole, Susan E. (2006). Short-term and working memory

in specific language impairment. International Journal of Language & Communication

Disorders, 41 (6), 675-693.

Armon-Lotem, Sharon, de Jong, Jan, & Meir, Natalia (Eds.) (2015). Assessing Multilingual

Children. Disentangling bilingualism from language impairment. Bristol: Multilingual

Matters.

Armon-Lotem, Sharon, & Meir, Natalia (2016). Diagnostic accuracy of repetition tasks for

the identification of specific language impairment (SLI) in bilingual children: evidence

from Russian and Hebrew. International Journal of Language & Communication

Disorders, 51 (6), 715-731.

Bishop, Dorothy V. M., Adams, Caroline V., & Norbury, Courtenay F. (2006). Distinct

genetic influences on grammar and phonological short-term memory deficits: evidence

from 6-year-old twins. Genes, Brain, and Behavior, 5(2), 158-169.

Bulheller, S., & Häcker, H. (2002). Coloured progressive matrices—Deutsche Bearbeitung

und Normierung [German adaptation and norms]. Frankfurt: Swets & Zeitlinger.

Chiat, Shula, & Polišenská, Kamila (2016). A framework for crosslinguistic nonword

repetition tests: effects of bilingualism and socioeconomic status on children’s

performance. JSLHR, 59(5), 1179-1189.

Chilla, Solveig (2008). Erstsprache, Zweitsprache, Spezifische Sprachentwicklungs-

störung? Eine Untersuchung der deutschen Hauptsatzstruktur durch sukzessiv-

bilinguale Kinder mit türkischer Erstsprache. Hamburg: Dr. Kovač

Chilla, Solveig, Krupp, Stephanie, & Wulff, Nadja (2019). Konzepte für den

Deutscherwerb neu zugewanderter Jugendlicher im deutschen Bildungssystem:

Lehrwerke oder adaptive Baukästen – Welchen Anforderungen müssen Lehr- und

Lernmaterialien genügen? In K. Kovács, & A.F. Dombi (Eds.), National and

international tendencies of education and training. Szeged: Szegedi Egyetemi Kiadó

Juhász Gyula Felsőoktatási Kiadó, 325-344.

Clahsen, Harald, Meisel, Jürgen M., & Pienemann, M. (1983). Deutsch als Zweitsprache:

Der Spracherwerb ausländischer Arbeiter. Tübingen: Narr.

Conti-Ramsden, Gina, Botting, Nicola, & Faragher, Brian (2001). Psycholinguistic

Markers for Specific Language Impairment. Journal of Child Psychology and

Psychiatry, 42, 741-748.

Dos Santos, Christophe, & Ferré, Sandrine (2018). A Nonword Repetition Task to assess

bilingual children’s phonology. Language Acquisition, 25(1), 1-14.

Fox, A. (2009). TROG-D. Test zur Überprüfung des Grammatikverständnisses. Idstein:

Schulz-Kirchner.

Gagarina, Natalia (2017). Monolingualer und bilingualer Erstspracherwerb des

Russischen: ein Überblick. In Nadja Wulff & Kai Witzlack-Makarevich (Eds.),

Handbuch des Russischen in Deutschland: Migration – Mehrsprachigkeit –

Spracherwerb (pp. 393-410). Berlin: Frank & Timme.

Gagarina, N., Klassert, A., & Topaj, N. (2010). Sprachstandstest Russisch für

mehrsprachige Kinder. ZAS Papers in Linguistics 54- Sonderheft. Berlin: ZAS.

Gallon, Nichola, Harris, John & van der Lely, Heather (2007). Non-word repetition: an

investigation of phonological complexity in children with Grammatical SLI. Clinical

Linguistics & Phonetics, 21(6), 445-455.

Gantefort, Christoph, & Roth, Hans-Joachim (2008). Ein Sturz und seine Folgen. Zur

Evaluation von Textkompetenz im narrativen Schreiben mit dem FÖRMIG-Instrument

‘Tulpenbeet’. Evaluation im Modellprogramm FÖRMIG. FÖRMIG Edition, 4, 29-50

Glück, Christian Wolfgang (2011). Wortschatz- und Wortfindungstest für 6- bis 10-

Jährige: WWT 6–10, 2nd Edn. Munich: Elsevier, Urban & Fischer.

Grimm, Angela, Ferré, Sandrine, dos Santos, Christophe, & Chiat, Shula (2014). Can

nonwords be language-independent? Cross-linguistic evidence from monolingual and

bilingual acquisition of French, German, and Lebanese. In Symposium Language

Impairment Testing in Multilingual Setting (LITMUS): Disentangling bilingualism and

SLI, July 13-19. Amsterdam: IASCL.

Grimm, Angela, & Schulz, Petra (2014). Specific language impairment and early second

language acquisition: the risk of over- and underdiagnosis. Child Indic. Res., 7, 821–

841. doi:10.1007/s12187-013-9230-6

Grimm, Angela & Schulz, Petra (2020). Phonology and semantics: markers of SLI in

bilingual children at age 6? In Grohmann, K. & Armon-Lotem, S. (eds.), COST in

Action. Amsterdam: Benjamins.

Håkansson, Gisela, & Nettelbladt, Ulrika (1993). Developmental sequences in L1 (normal

and impaired) and L2 acquisition of Swedish. International Journal of Applied

Linguistics, 3(2), 131-157.

Hamann, Cornelia & Abed Ibrahim, Lina (2017). Methods for identifying specific

language impairment in bilingual populations in Germany. Frontiers in

Communication 2(19), 1-19. doi:10.3389/fcomm.2017.00016.

Hamann, Cornelia, Chilla, Solveig, Ruigendijk, Esther, & Abed Ibrahim, Lina (2013). A

German Sentence Repetition Task: Testing Bilingual Russian/German Children. Poster

presented at the COST meeting in Krakow, May 2013.

Henry, Guillemette, Tuller, Laurice, Prévost, Phillipe, & Zebib, Rasha (to appear). Les

outils LITMUS-libanais: synthèse et regard croisé. In R. Zebib (Ed.), Plurilinguisme

et Troubles Spécifiques du Langage au Liban. Beirut: Presses universitaires de

l’Université Saint Joseph.

IBM Corp. Released (2019). IBM SPSS Statistics for Windows, Version 26.0. Armonk, NY:

IBM Corp.

Klem, Marianne, Melby-Lervåg, Monica, Hagtvet, Bente, Lyster, Solveig-Alma,

Gustafsson, Jan-Eric, & Hulme, Charles (2015). Sentence repetition is a measure of

children’s language skills rather than working memory limitations. Developmental

Science, 18(1), 146–154.

Marinis, Theo, & Armon-Lotem, Sharon (2015). Sentence Repetition. In S. Armon-Lotem,

J. de Jong, & N. Meir (Eds.), Methods for assessing multilingual children:

Disentangling bilingualism from language impairment; 95-125. Clevedon:

Multilingual Matters.

Massumi, Mona & von Dewitz, Nora (2015). Neu zugewanderte Kinder und Jugendliche

im deutschen Schulsystem. Mercator-Institut für Sprachförderung und Deutsch als

Zweitsprache und vom Zentrum für LehrerInnenbildung der Universität zu Köln. Köln.

Meir, Natalia (2017). Effects of Specific Language Impairment (SLI) and bilingualism on

verbal short-term memory. Linguistic Approaches to Bilingualism, 7(3), 301–330.

https://doi.org/10.1075/lab.15033.mei

Meisel, Jürgen (Ed.) (1990). Two First Languages. Early grammatical development in

bilingual children. Dordrecht: Foris.

Montanari, Elke (2017). Beschulung von neu in das niedersächsische Bildungssystem

zugewanderten Schülerinnen und Schülern der Sekundarstufe 1. Stiftung Universität

Hildesheim, Zentrum für Bildungsintegration, 1-40.

Montrul, Silvina (2008). Incomplete acquisition in bilingualism. Amsterdam: John

Benjamins.

Paradis, Johanne, & Jia, Ruiting (2016). Bilingual children’s long-term outcomes in

English as a second language: language environment factors shape individual

differences in catching up with monolinguals. Developmental Science, 20(1), 1-15.

doi:10.1111/desc.12433.

Petermann, Franz, & Petermann, Ulrike (2011). Wechsler Intelligence Scale for Children–

Fourth Edition. Frankfurt a. M.: Pearson Assessment.

Pinheiro, José, & Bates, Douglas (2000). Mixed-Effects Models in S and S-PLUS. New

York: Springer.

Polišenská, Kamila, Chiat, Shula, & Roy, Penny (2014). Sentence repetition: what does the

task measure? International J. of Lang. and Communication Disorders, 50, 106-118.

Rothweiler, Monika (2006). The acquisition of V2 and subordinate clauses in early

successive acquisition of German. In C. Lleó (Ed.), Interfaces in Multilingualism.

Acquisition and representation. Hamburg Studies on Multilingualism 4, 91-113.

Amsterdam: Benjamins.

Schöler, Hermann, Fromm, Waldemar, & Kany, Werner (Eds.) (1998). Spezifische

Sprachentwicklungsstörung und Sprachlernen. Erscheinungsformen, Verlauf,

Folgerungen für Diagnostik und Therapie. Heidelberg: C. Winter.

Schönenberger, Manuela, Rothweiler, Monika, & Sterner, Franziska (2012). Case marking

in child L1 and early child L2 German. In K. Braunmüller, & C. Gabriel (Eds.),

Multilingual Individuals and Multilingual Societies, 3-22.

Schulz, Petra & Tracy, Rosemarie (2011). Linguistische Sprachstandserhebung – Deutsch

als Zweitsprache (LiSe-DaZ). Göttingen: Hogrefe Verlag.

Thordardottir, Elin (2015). Proposed diagnostic procedures for use in bilingual and cross-

linguistic contexts, In Sharon Armon-Lotem, Jan de Jong, & Natalia Meir

(Eds.), Assessing Multilingual Children: Disentangling Bilingualism From Language

Impairment, 331-358. Bristol: Multilingual Matters.

doi:10.21832/9781783093137-014

Tuller, Laurice (2015). Clinical use of parental questionnaires in multilingual contexts. In

S. Armon-Lotem, J. de Jong, N. Meir, (Eds.), Assessing Multilingual Children:

Disentangling Bilingualism from Language Impairment, 301-330. Bristol:

Multilingual Matters.

Tuller, Laurice, Hamann, Cornelia, Chilla, Solveig, Ferré, Sandrine, Morin, Eléonore,

Prévost, Phillipe, dos Santos, Christophe, Abed Ibrahim, Lina, & Zebib, Rasha (2018).

Identifying language impairment in bilingual children in France and in Germany.

International Journal of Language and Communication Disorders, 53 (4), 888-904.

Von Maurice, Jutta, & Roßbach, Hans-Günther (2017). The educational system in

Germany. In Annette Korntheuer, Paul Pritchard, D. Maehler (Eds.), Structural

Context of Refugee Integration in Canada and Germany, 15, 49-53. GESIS, Köln:

Leibniz Institute for Social Sciences.

Zebib, Rasha, Henry, Guillemette, Khomsi, Abdelhamid, Messara, Camille, & Hreich,

Edith Kouba (2017). Batterie d’Evaluation du Langage Oral chez l’enfant libanais

(ELO-L). Kerserwan: LTE.

Zebib, Rasha, Prévost, Phillipe, Tuller, Laurice & Henry, Guillemette (eds.) (in

press). Plurilinguisme et Troubles Spécifiques du Langage au Liban. Beyrouth:

Presses universitaires de l’Université Saint Joseph.

Zebib, Rasha, Tuller, Laurice, Hamann, Cornelia, Abed Ibrahim, Lina & Prévost, Phillipe

(2019). Syntactic complexity and verbal working memory in bilingual children with

and without Specific Language Impairment. First Language, 1-24. doi:10.1177/01427

Proceedings of the 44th annual Boston University Conference on Language Development

edited by Megan M. Brown and Alexandra Kohut

Cascadilla Press Somerville, MA 2020

Copyright information

Proceedings of the 44th annual Boston University Conference on Language Development © 2020 Cascadilla Press. All rights reserved

Copyright notices are located at the bottom of the first page of each paper. Reprints for course packs can be authorized by Cascadilla Press.

ISSN 1080-692X, ISBN 978-1-57473-057-9 (2 volume set, paperback)
