
Ways of searching for the Zeitgeist of Modernity - a corpus-based approach to modern fiction

Ilina Doykova
Shumen University, Shumen (Bulgaria)

[email protected]


Statistical analysis

• Simple things may characterise different styles

– average sentence length
– average word length
– vocabulary richness
– vocabulary growth (homogeneity of text)

• More complex analyses give a more interesting picture

– specific syntactic structures
– degree of modification in NPs
– types of verbs (e.g. verbs of persuasion, speech verbs, action verbs, descriptive verbs)
– distribution of pronouns (1st/2nd/3rd person)
– themes, beliefs, etc.
– authorship

• Especially when used comparatively
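
The "simple things" listed above can be computed in a few lines. The following is a minimal sketch, not the procedure used by WordSmith or Wmatrix: the tokenizer and sentence splitter are deliberately naive assumptions, vocabulary richness is approximated by the type/token ratio, and the sample text and function names are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase word tokens (naive: letters plus an optional apostrophe part)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def split_sentences(text):
    """Naive split after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def simple_style_stats(text):
    """Average sentence length (in words), average word length, type/token ratio."""
    words = tokenize(text)
    sentences = split_sentences(text)
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),
    }

def vocabulary_growth(words, step):
    """Number of distinct word types seen after every `step` tokens."""
    seen, curve = set(), []
    for i, word in enumerate(words, start=1):
        seen.add(word)
        if i % step == 0:
            curve.append((i, len(seen)))
    return curve

# Hypothetical sample text, for illustration only.
sample = "She was alone. She was not afraid. The house was quiet!"
stats = simple_style_stats(sample)
growth = vocabulary_growth(tokenize(sample), step=4)
print(stats)
print(growth)
```

Because the type/token ratio falls as texts get longer, comparisons are only meaningful between samples of similar size; a flat vocabulary growth curve suggests a lexically homogeneous text.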


Linguistic Tools: WordSmith and Wmatrix

Useful features:

+ Tagging = identifies and labels parts of speech (PoS)
+ WordList = generates word-frequency lists
+ Concordance = lists occurrences of a word in context and its immediate environment, gives access to collocates

• Identify syntactic use of word
• Identify range of meanings
• Identify relative frequency of different uses/meanings

+ KeyWords (key word analysis) = identifies key words through comparison with a reference corpus
+ Word Clouds = semantic tagsets in 21 domains

• Listings can be customised to show what you want more clearly:
– sort according to next or previous word
– show more or less context
– highlight important information
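
A concordance with adjustable context and sorting by the next word can be sketched as follows. This is an illustrative assumption about how such a listing works, not the WordSmith or Wmatrix implementation; the function names and sample text are hypothetical, and the context window here ignores sentence boundaries.

```python
import re

def kwic(text, node, width=3):
    """Return (left context, node word, right context) for every hit of `node`."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append((left, tok, right))
    return lines

def sort_by_next_word(lines):
    """Sort concordance lines alphabetically by the first word to the right."""
    return sorted(lines, key=lambda line: line[2][0].lower() if line[2] else "")

def format_line(line):
    """Align the node word in a column and mark it with brackets."""
    left, node, right = line
    return f"{' '.join(left):>30}  [{node}]  {' '.join(right)}"

# Hypothetical sample text, for illustration only.
text = "She sat alone in the room. He was alone again. Alone at last, she slept."
hits = sort_by_next_word(kwic(text, "alone", width=2))
for hit in hits:
    print(format_line(hit))
```

Changing `width` shows more or less context; sorting on `line[0][-1]` instead would order the lines by the previous word.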


Methodology

Word Frequency List (Wmatrix)


WordSmith frequency list of predicative adjectives, Modern British Women Fiction Writers Corpus


Key words list and dispersion plot (ALONE in MBWFW corpus)

Consistency analysis indicates whether a word is found consistently across lots of different texts or only in a narrow set of texts, or a specific text
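
The idea behind consistency analysis can be sketched as counting the texts in which a word occurs and its relative frequency in each. The code below is a simplified illustration under stated assumptions, not the WordSmith procedure: the three-text corpus, the naive tokenizer, and the per-1,000-token normalisation are all hypothetical choices.

```python
import re

def tokenize(text):
    """Lowercase word tokens (naive tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def consistency(corpus, word):
    """Return (number of texts containing `word`, frequency per 1,000 tokens by text).

    `corpus` maps a text name to its raw text.
    """
    per_text = {}
    for name, text in corpus.items():
        tokens = tokenize(text)
        count = tokens.count(word)
        per_text[name] = 1000 * count / len(tokens) if tokens else 0.0
    texts_with_word = sum(1 for freq in per_text.values() if freq > 0)
    return texts_with_word, per_text

# Hypothetical mini-corpus, for illustration only.
corpus = {
    "story_a": "She was alone and she liked being alone.",
    "story_b": "The garden was full of people.",
    "story_c": "Alone, he walked home.",
}
texts_with_word, per_text = consistency(corpus, "alone")
print(texts_with_word, per_text)
```

A word found in most texts at a similar rate is consistent across the corpus; a word with a high overall count but a low number of texts is concentrated in a narrow set of texts.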


Lemmatized results for relational pairs

WordSmith and Wmatrix


Investigation of semantic domains through semantic tagging (Wmatrix)


Key Domain clouds (for Wmatrix only)

• The larger the word, the greater its “keyness” or uniqueness as compared to the BNC Written Sampler of imaginative texts.
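
Keyness is commonly measured with the log-likelihood statistic, which compares an item's frequency in the study corpus with its frequency in a reference corpus. The sketch below assumes that statistic; the slides do not state which measure produced the clouds, and the frequencies and corpus sizes used here are invented for illustration.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of an item in a study corpus vs. a reference corpus."""
    total = size_study + size_ref
    combined = freq_study + freq_ref
    # Expected frequencies if the item were equally common in both corpora.
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    ll = 0.0
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll

# Invented counts: an item occurring 50 times in 1,000 tokens of the study
# corpus and 10 times in 1,000 tokens of the reference corpus.
score = log_likelihood(50, 1000, 10, 1000)
print(score > 3.84)  # 3.84 is the chi-square critical value for p < 0.05, 1 d.f.
```

The statistic is zero when the relative frequencies are identical and grows with the size of the difference. It does not show direction, so the relative frequencies have to be compared separately to tell overuse from underuse.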


Comparison of linguistic software


Research and language learning

Word frequency: knowledge in present-day language textbooks (grammatical, collocational, semantic) is frequency-based;

Real usage: corpora represent actual, not prescribed, usage;

Translation: find the best equivalent;

Grammar: investigate word classes and specific syntactic structures;

Teaching collocations: ‘trouble and strife’, ‘the elephant in the room’, ‘blue murder’;

Decoding: specific content (sexist, racist, ideological, etc.);

Authorship: identification of true authorship;

Analysis of texts written in any language and any alphabet.


References

[1] Biber, D. et al. (1998). Corpus Linguistics: Investigating Language Structure and Use. Cambridge: CUP.
[2] Campbell, R. S., & Pennebaker, J. W. (2003). The secret life of pronouns: Flexibility in writing style and physical health. Psychological Science, 14, 60-65.
[3] Leech, G. N. and Short, M. (1981). Style in Fiction. London: Longman.
[4] Rayson, P. (2009). Wmatrix: A Web-based Corpus Processing Environment. Computing Department, Lancaster University.
[5] Rayson, P., Archer, D., Piao, S. L., McEnery, T. (2004). UCREL Semantic Analysis System (USAS). (http://ucrel.lancs.ac.uk/usas/)
[6] Scott, M. (2012). WordSmith Tools, Version 6. Liverpool: Lexical Analysis Software. (http://www.lexically.net/wordsmith/index.html)
[7] Seizova-Nankova, T. (2012). Primary school education and computer-based language study. BETA Papers.
[8] Seizova-Nankova, T. (in print). Developing collocational competence. A case study. 12th International Language, Literature and Stylistics Symposium, Edirne, Trakya University, Turkey.
[9] Semino, E. and Short, M. (2004). Corpus Stylistics: Speech, Writing and Thought Presentation in a Corpus of English Writing. Routledge.
[10] Sinclair, J. (2007). The Search for Units of Meaning. In Corpus Linguistics: Critical Concepts in Linguistics. Vol. 3. Routledge.
[11] Nishina, Y. (2007). "A Corpus-Driven Approach to Genre Analysis: The Reinvestigation of Academic, Newspaper and Literary Texts". ELR Journal, 1 (2). (http://ejournals.org.uk/ELR/article/2007/2, accessed 27 June 2013)
[12] UCREL Home Page, Lancaster, UK. 1993-2013. 23 April, 2013. (http://www.comp.lancs.ac.uk/research/)

Electronic text resources

• http://www.stylist.co.uk/books
• http://www.newyorker.com
• http://narrativemagazine.com
• http://www.one-story.com
• http://www.teachingenglish.org.uk/teaching-resources
• http://www.guardian.co.uk/books
• http://gutenberg.net.au/