An investigation into Corpus-based learning about language in the primary-school: CLLIP Corpus evidence of the features of children’s literature

Upload: bond

Post on 09-Jan-2016



Page 1: An investigation into Corpus-based learning about language in the primary-school: CLLIP

An investigation into Corpus-based learning about language in the primary-school: CLLIP

Corpus evidence of the features of children’s literature

Page 2

The CLLIP Project: Background

CLLIP: Corpus-based Learning about Language In the Primary-school

ESRC-funded project

Exploring potential for using corpus evidence with primary school children (9-11 year olds) for learning about language (L1)

Page 3

Linguistic analysis of CLLIP corpus

CLLIP corpus is a collection of the texts in the British National Corpus that were written for a child audience

The corpus contains imaginative fiction, factual prose and other texts

Linguistic analysis was conducted on the imaginative fiction texts only

Page 4

Project research question: 1

1. Does linguistic analysis of the corpus data confirm, extend or challenge the descriptions of English lexis and syntax which are identified as teaching targets in the National Curriculum and the National Literacy Strategy?

1a. Does any such analysis suggest a need for further research on the basis of a larger dedicated corpus of writing for children?

Page 5

Corpora: CLLIP and comparison

CLLIP corpus: imaginative fiction written for child audience, from the BNC

31 texts

Comparison corpus (hereafter ‘Comp’): imaginative fiction written for an adult audience, from the BNC

315 texts

Newspaper texts from the BNC

114 texts

Page 6

Purpose of the linguistic analysis

To determine the characteristic features of the language of imaginative fiction written for children

To compare and contrast the language of these texts with the language of imaginative fiction written for adults, and also the language of newspapers

Page 7

Questions

What is distinctive about the discourse of the CLLIP corpus?

What similarities and differences are there in the overall frequencies of words and of POSgrams in the three corpora?

Is there a difference in the uses of certain lexical items between the child and adult fiction corpora?

A POSgram is a sequence of parts of speech, such as an article followed by an adjective followed by another adjective then a noun (e.g. a bright red car; the last chocolate biscuit). In this study, we look at 6-grams (sequences of six parts of speech).
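As an illustration of the definition above, the following is a minimal Python sketch of how 6-POSgrams could be collected from a tagged text. It is not the project's own tooling: the tag labels (PREP, ART, NOUN, OF, PRON, VERB), the helper name `posgrams` and the hand-tagged example sentence are all assumptions made for the example, and the BNC uses its own tagset.

```python
from collections import Counter

def posgrams(tags, n=6):
    """Return every contiguous sequence of n POS tags as a tuple."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]

# Hypothetical hand-tagged sentence (labels are illustrative, not BNC tags).
tagged = [("in", "PREP"), ("the", "ART"), ("middle", "NOUN"), ("of", "OF"),
          ("the", "ART"), ("night", "NOUN"), ("she", "PRON"), ("woke", "VERB")]

tags = [tag for _, tag in tagged]
grams = posgrams(tags, n=6)

# Counting the tuples gives a frequency list of 6-POSgrams for the text.
counts = Counter(grams)
for gram, freq in counts.most_common():
    print(freq, " + ".join(gram))
```

An eight-word sentence yields three overlapping 6-grams; a text shorter than six tags yields none.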

Page 8

Frequency of Parts of Speech

Comparison of POS categories for 3 corpora (expressed in percentages)

Part of speech    CLLIP   Comparison   Newspaper
Article            7.53         7.82        9.73
Adjective          5.29         5.91        7.96
Adverb             7.95         7.68        4.45
Conjunction        5.62         5.72        4.83
Possessive         2.29         2.69        1.28
Determiner         2.63         2.74        2.28
Noun              15.29        16.60       23.15
Proper noun        4.53         3.89        7.57
Preposition        6.95         7.51        8.92
Pronoun            4.53         3.89        3.49
Infinitive to      1.59         1.68        1.81
Verb 'be'          4.23         4.23        3.64
Verb 'do'          1.03         0.88        0.27
Verb 'have'        1.56         1.87        1.32
Modal verb         1.74         1.65        1.35
Lexical verb      14.51        13.74       10.44

For each part of speech there are three columns: CLLIP (left), Comp (middle) and Newspaper (right). What is remarkable is how similar the first two are for most parts of speech. Proportionally, the Newspaper corpus has many more nouns (23.15% against 15.29% for CLLIP and 16.60% for Comp), while the fiction corpora have more lexical verbs (14.51% and 13.74% against 10.44%).
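Percentages of this kind can be derived from any POS-tagged text by dividing each tag's count by the total number of tags. The sketch below shows the arithmetic only; the function name `pos_percentages`, the tag labels and the two toy tag sequences are assumptions for illustration and are not CLLIP or BNC data.

```python
from collections import Counter

def pos_percentages(tags):
    """Map each POS tag to its share of all tags, as a percentage."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: round(100 * count / total, 2) for tag, count in counts.items()}

# Toy tag sequences standing in for two corpora (not real corpus data).
fiction_tags = ["PRON", "VERB", "ART", "NOUN", "VERB", "ADV", "VERB", "NOUN"]
news_tags = ["ART", "NOUN", "OF", "ART", "NOUN", "VERB", "NOUN", "NOUN"]

fiction = pos_percentages(fiction_tags)
news = pos_percentages(news_tags)

# Print the two distributions side by side, one row per tag.
for tag in sorted(set(fiction) | set(news)):
    print(f"{tag:5} {fiction.get(tag, 0.0):6.2f} {news.get(tag, 0.0):6.2f}")
```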

Page 9

Frequency data

CLLIP – 22.0%; Comp – 22.4%; News – 23.5%

The top ten most frequent tokens for the CLLIP and Comp corpora are remarkably similar, particularly the top 4. Note the greater frequency of ‘of’ in the News corpus, which is related to the higher number of nouns, in expressions such as ‘the resignation of’. The figures at the top show the percentage of all tokens that the top ten account for in each corpus.
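The coverage figures quoted above are the combined frequency of the top ten tokens divided by the total token count. A minimal sketch of that calculation follows, assuming simple whitespace tokenisation; the function name `top_n_coverage` and the toy sentence are hypothetical, and real corpus work would use the corpus's own tokenisation.

```python
from collections import Counter

def top_n_coverage(tokens, n=10):
    """Return the n most frequent tokens and the percentage of all
    tokens that they account for."""
    if not tokens:
        raise ValueError("token list is empty")
    top = Counter(tokens).most_common(n)
    covered = sum(count for _, count in top)
    return top, 100 * covered / len(tokens)

# Toy text (not corpus data), split on whitespace.
tokens = "the cat sat on the mat and the dog sat".split()
top, coverage = top_n_coverage(tokens, n=2)
print(top, f"{coverage:.1f}%")
```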

Page 10

Frequency - adjectives

CLLIP – 14.6%; Comp – 11.3%; News – 11.9%

Once again, a remarkable similarity exists between the top 11 adjectives for the fiction corpora, while the Newspaper corpus contains many adjectives that refer to social attributes. The figures at the top indicate that the top 11 adjectives in the CLLIP corpus do a larger amount of ‘work’ (14.6%) than those for the other two corpora (11.3% and 11.9%).

Page 11

Frequency - nouns

CLLIP – 8.3%; Comp – 7.8%; News – 6.7%

Page 12

POSgram information

This table shows the most frequent 6-POSgrams for each corpus. In every corpus, the sequence preposition + article + noun + of + article + noun is the most common. In the two fiction corpora it is followed by preposition + article + noun + preposition [not ‘of’] + article + noun.

Page 13

Prep+art+[ ]+of+art+noun

51%

This slide shows the nouns that most frequently fill the third slot in the preposition + article + noun + of + article + noun sequence. The sequence most commonly indicates spatial or temporal relations in the fiction corpora, while in the newspaper corpus it can also express causal relations. The top six nouns in the CLLIP corpus account for 51% of the 6-POSgrams of this sequence.
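The slot analysis described above can be sketched as follows: find every window of six tagged tokens matching the pattern, count the word in the third position, then work out what share of matches the most frequent fillers cover. This is an illustrative sketch only; the tag labels, the names `slot_fillers` and `top_share`, and the three toy phrases are assumptions rather than the project's method or data.

```python
from collections import Counter

# Hypothetical tag labels for preposition + article + noun + of + article + noun.
PATTERN = ("PREP", "ART", "NOUN", "OF", "ART", "NOUN")

def slot_fillers(tagged, pattern=PATTERN, slot=2):
    """Count the words found at index `slot` of every window that matches `pattern`."""
    n = len(pattern)
    fillers = Counter()
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        if tuple(tag for _, tag in window) == pattern:
            fillers[window[slot][0].lower()] += 1
    return fillers

def top_share(fillers, k=6):
    """Percentage of all matches accounted for by the k most frequent fillers."""
    total = sum(fillers.values())
    if total == 0:
        return 0.0
    return 100 * sum(count for _, count in fillers.most_common(k)) / total

# Toy data: three phrases, tagged by hand with the same six tags each.
phrases = ["in the middle of the night", "at the end of the day",
           "in the middle of the road"]
tagged = [(word, tag) for phrase in phrases
          for word, tag in zip(phrase.split(), PATTERN)]

fillers = slot_fillers(tagged)
print(fillers.most_common(), f"{top_share(fillers, k=1):.2f}%")
```

With a real corpus, `top_share(fillers, k=6)` would give a figure comparable to the 51% reported for CLLIP.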

Page 14

Body parts: NECK

Do nouns in the CLLIP corpus more typically refer to physical entities in the world than the equivalent nouns in the Comp corpus? The two right-hand columns show the percentage of uses of the word ‘neck’ that refer to part of a piece of clothing, or that are used in an idiomatic sense. The adult corpus contains only a marginally higher percentage of idiomatic uses.

Page 15

Neck

CLLIP:

‘stick your neck out’

Little physical contact

Intimacy with animals

Neck as site of pain

Comp:

‘breathing down your neck’

Lots of physical contact

Intimacy between humans

Neck as site of desire, tenderness, place for ornamentation

Page 16

Finger

CLLIP

Figurative – 13%

Jab, prod, lay, run, put

Accusing, admonishing

Used for drawing, for indicating the need for silence and for pulling triggers

Comp

Figurative – 19%

Put, raise, point, run, jab, wag

Furtive, tentative, negligent

Used for communicating, for feeling [contours & textures], for wearing rings

Page 17

in time – CLLIP

We looked at uses of ‘in time’ in the CLLIP corpus. The dominant meaning is immediate, and characters are concerned to accomplish something before the expiry of an implied deadline, externally imposed. A childly perspective seems often to imply staying on the right side of trouble or sanction.

Page 18

in time – Comp

‘In time’ in the Comp corpus is used in several senses.

i: ‘in the fullness of time’, time on a large scale, which the speaker can perceive from a distance

ii: ‘within an appropriate period of time’

iii: others, as in the last line, where ‘in’ and ‘time’ have more separate meanings than is usual in the phrase