€¦  · web view“alison macdonald had always loved cold falmouth with her long, dark trees. it...


Post on 06-Apr-2020


Thesis title thesis title thesis title title (Arial 28)

Candidate (Arial 12) Supervisor Name Surname Name Surname


Cloud technologies for Language Teaching and Learning. Tools and methods of Text Analysis (NLP) to assess assignments in adult English Writing courses.

FEATURES TESTED IN PYTHON (NLTK AND TEXTBLOB) FOR THE ANALYSIS OF TEXTS WRITTEN IN ENGLISH

POS DIVISION

DISTINCTIONS BETWEEN ADJECTIVES AND VERBS (NLTK WordNetLemmatizer)

SEARCH WORD MEANING (TEXTBLOB)

FIXING TYPING AND SPELLING ERRORS (WITH TEXTBLOB)

TRANSLATION FROM ENGLISH INTO ITALIAN -AND VICE VERSA- (TEXTBLOB)

WORD COUNT (NLTK AND TEXTBLOB)

WORD AND PHRASE DIVISION (TEXTBLOB)

OTHER TOOLS FOR THE AUTOMATIC (OR SEMI-AUTOMATIC) CORRECTION OF TEXTS:

CRITERIA AND TYPE OF ERRORS:

ESSAY ANALYSIS EXAMPLES (BASED ON LEVELS):

DATA ANALYSIS

RESULTS AND OBSTACLES FACED

INTERLANGUAGE INTERFERENCE AND AUTOMATIC CORRECTION


FEATURES TESTED IN PYTHON (NLTK AND TEXTBLOB) FOR THE ANALYSIS OF TEXTS WRITTEN IN ENGLISH

Is there a way to analyze English-language texts and evaluate them automatically in order to simplify the correction process? If one wanted, for example, to count repetitions, correct errors, calculate the improvement margins and all the other necessary procedures used to judge a written text, is it possible to automate all of these processes?

In order to take apart, analyze and correct a text written in English, we can take advantage of some of the Python libraries for Natural Language Processing (NLP), specifically NLTK and TextBlob.

The text used as a model is the following:

“Alison MacDonald had always loved cold Falmouth with her long, dark trees. It was a place where she felt safe. She was a Hopeful, cute, whiskey drinker with pretty legs and blonde hair. Her friends saw her as a caring and loving person. Once, she had even saved an owl that was stuck in a drain. That’s the sort of woman she was. Alison walked over to the window and reflected on her grey surroundings. The Drizzle calmed her down. Then she saw something in the distance, or Rather someone. It was the figure of Brad Ferguson. Brad was a Ruthless knight with strong arms and soft lips. Alison gulped. She was not prepared. As Alison stepped outside and Brad came closer, she could see the fierce glint in his eyes.”

At first glance, this passage is rich in adjectives, uses the Past Simple tense almost exclusively, contains short sentences and is free of grammatical errors. As a first step, it is necessary to break the text down into tokens, that is, into its individual parts. For this we simply use the NLTK command for the tokens:
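As a rough stand-in for this step, the sketch below imitates word-level tokenization with a regular expression from Python's standard library. It is not NLTK's tokenizer (in NLTK the usual call is `nltk.word_tokenize(text)`, which needs the library and its tokenizer data installed), and the function name `tokenize` is purely illustrative.

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation marks.

    Simplified stand-in for a real tokenizer: a word is a run of
    letters/digits, optionally with inner apostrophes (e.g. "That's").
    """
    return re.findall(r"\w+(?:['’]\w+)*|[^\w\s]", text)

sample = "Alison gulped. She was not prepared."
print(tokenize(sample))
```

Unlike this sketch, NLTK's tokenizer also splits contractions such as “That’s” into separate tokens, so the two outputs will not match on every sentence.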


As displayed by the image, the text is now divided into tokens (punctuation marks are also present).

POS DIVISION

To find out which category each of these tokens belongs to, that is, which kinds of words are present in the text (verbs, nouns, adjectives and so on), NLTK offers Part-of-Speech (POS) tagging.

Now the text is divided into parts of speech. Each word (or token) is followed by a tag, and each tag stands for a part of speech. The list of tags is the following:

CC coordinating conjunction
CD cardinal digit
DT determiner
EX existential there ("there is" or "there exists")
FW foreign word
IN preposition/subordinating conjunction
JJ adjective 'big'
JJR adjective, comparative 'bigger'
JJS adjective, superlative 'biggest'
LS list marker
MD modal 'could', 'will'
NN noun, singular 'desk'
NNS noun, plural 'desks'
NNP proper noun, singular 'Harrison'
NNPS proper noun, plural 'Americans'
PDT predeterminer 'all the kids'
POS possessive ending 'parent’s'
PRP personal pronoun 'I', 'he', 'she'
PRP$ possessive pronoun 'my', 'his', 'hers'
RB adverb 'very', 'silently'
RBR adverb, comparative 'better'
RBS adverb, superlative 'best'
RP particle 'give up'
TO to 'go to the store'
UH interjection 'errrrrrrrrrm'
VB verb, base form 'take'
VBD verb, past tense 'took'
VBG verb, gerund/present participle 'taking'
VBN verb, past participle 'taken'
VBP verb, sing. present, non-3rd 'take'
VBZ verb, 3rd person sing. present 'takes'
WDT wh-determiner 'which'
WP wh-pronoun 'who', 'what'
WP$ possessive wh-pronoun 'whose'
WRB wh-adverb 'where', 'when'

Now that the various parts of speech are clear, one can move on to the actual analysis of the text.
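To illustrate how tagged output can support an observation such as "rich in adjectives, almost exclusively Past Simple", here is a minimal sketch that counts tag families. The `(token, tag)` pairs are written by hand in the shape that `nltk.pos_tag` returns (a list of tuples); they are an assumption for illustration, not real tagger output, and `tag_profile` is a hypothetical helper name.

```python
from collections import Counter

# Hand-tagged sentence from the model text (tags assigned by hand, as an assumption).
TAGGED = [
    ("Brad", "NNP"), ("was", "VBD"), ("a", "DT"), ("Ruthless", "JJ"),
    ("knight", "NN"), ("with", "IN"), ("strong", "JJ"), ("arms", "NNS"),
    ("and", "CC"), ("soft", "JJ"), ("lips", "NNS"), (".", "."),
]

def tag_profile(tagged):
    """Count adjectives, nouns and verb forms in a list of (token, tag) pairs."""
    profile = Counter()
    for _, tag in tagged:
        if tag.startswith("JJ"):
            profile["adjectives"] += 1
        elif tag.startswith("NN"):
            profile["nouns"] += 1
        elif tag == "VBD":
            profile["past-tense verbs"] += 1
        elif tag.startswith("VB"):
            profile["other verb forms"] += 1
    return profile

profile = tag_profile(TAGGED)
print(profile)
```

Run over the whole passage with real tagger output, a profile like this would give a quick numerical check of the first-glance impression described earlier.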

DISTINCTIONS BETWEEN ADJECTIVES AND VERBS (NLTK WordNetLemmatizer)

If, for example, one wanted to analyze the word “loving”, what would the result be? An adjective, translated into Italian as “amorevole”, or the verb “to love” in its continuous -ing form?

The WordNetLemmatizer of NLTK helps with exactly this: it can go back to the base form of a word, and in our case it shows how the word “loving” is handled when it is treated as a verb rather than as another part of speech.


When the word “loving” is passed to the WordNetLemmatizer with no further argument, it comes back unchanged as “loving”. By default the lemmatizer treats its input as a noun, so an unchanged result only shows that no shorter noun form was found; in the text the word is in fact used as an adjective.

To go back to the base form of the verb, it is enough to pass “v” (for verb) as a second argument after the word, and the verb “love” is returned:

This procedure works with any verb, for example the Past Simple form “gave”, which, once analyzed, goes back to the base form “give”:
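The role of the part-of-speech argument can be illustrated with a toy lemmatizer. This is only a sketch in the spirit of dictionary-based lemmatization (an exception list plus suffix rules checked against known words); the vocabulary, rules and function below are hypothetical and far smaller than what NLTK uses. In NLTK itself the corresponding calls are `WordNetLemmatizer().lemmatize("loving")` and `WordNetLemmatizer().lemmatize("loving", "v")`.

```python
# Toy data standing in for a real dictionary (hypothetical, for illustration only).
VERB_EXCEPTIONS = {"gave": "give", "took": "take", "saw": "see"}
KNOWN = {
    "n": {"tree", "owl", "drain"},
    "v": {"love", "walk", "give", "take", "see"},
}
SUFFIX_RULES = {
    "n": [("ies", "y"), ("es", ""), ("s", "")],
    "v": [("ies", "y"), ("es", "e"), ("es", ""), ("ed", "e"),
          ("ed", ""), ("ing", "e"), ("ing", ""), ("s", "")],
}

def lemmatize(word, pos="n"):
    """Return the base form of `word` for the given part of speech.

    Irregular verbs come from the exception list; otherwise each suffix
    rule is tried and accepted only if the result is a known word.
    If nothing matches, the word is returned unchanged.
    """
    if pos == "v" and word in VERB_EXCEPTIONS:
        return VERB_EXCEPTIONS[word]
    for suffix, replacement in SUFFIX_RULES[pos]:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            if candidate in KNOWN[pos]:
                return candidate
    return word

print(lemmatize("loving"))       # treated as a noun by default
print(lemmatize("loving", "v"))  # treated as a verb
print(lemmatize("gave", "v"))
```

The default part of speech is what makes “loving” come back unchanged in the first call; only the explicit “v” triggers the verb rules.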

SEARCH WORD MEANING (TEXTBLOB)

As for TextBlob, the tested features are very interesting. It is possible to take a word from the text and know its meaning immediately. By analyzing the word “whiskey” using Word(“chosen word”).definitions, the following meaning will appear: “a liquor made from fermented mash of grain”:

Or again, taking an adjective like “Ruthless”, the definition returned is “without mercy or pity”:

So the conclusion is that, when in doubt, this functionality is extremely useful to understand whether a term exists or whether it is appropriate for the context or not.


FIXING TYPING AND SPELLING ERRORS (WITH TEXTBLOB)

If a text contains many typing errors, which make the correction process much more difficult, TextBlob offers a specific feature for fixing them:

As seen in the image above, a phrase deliberately entered with several typos has been corrected automatically. The only word corrected differently than expected is “manu”, meant as “many” but corrected to “man”.

The tool is, however, less precise than expected, as the example below shows. The phrase “even though I knew nothing about philosophy, I had to research some great philosophers to understand human thought” was entered with many typos; not all of them were corrected, and some were not corrected in the way intended.

● “Eeven” was correctly changed to “even”

● “Thhough” was changed to “through” and not to “though”

● “Noting” was not changed to “nothing”, probably because the word already exists as a form of the verb “to note”

● “abouut” was correctly changed to “about”

● “Philosofy” was not corrected to “philosophy”

● “reserch” was correctly changed to “research”

● “Greaat” was correctly changed to “great”

● “Filosophers” was correctly changed to “philosophers”

● “Understind” was correctly changed to “understand”

● “Hooman” was changed to “Woman” and not to “Human”

● “Thouhgt” was correctly changed to “thought”


Of course, we cannot expect the tool “to know” what the writer meant in the first place. It is clear, for example, that the word “noting” was not changed to “nothing” because “noting” is itself an existing word; here only a human reader can tell which term was intended.

Most typos have been corrected well, but it is up to the reader to judge whether a word matches the meaning of the text or not.
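The behaviour observed above is what one would expect from a frequency-based corrector. TextBlob's documentation describes its spelling correction as based on Peter Norvig's "How to Write a Spelling Corrector"; the sketch below is a reduced version of that idea, not TextBlob's code, and the word counts in it are invented for illustration. An unknown word is replaced by the most frequent known word one edit away, and a word that already exists is left alone.

```python
from collections import Counter

# Hypothetical word frequencies; a real corrector learns them from a large corpus.
WORD_COUNTS = Counter({
    "though": 50, "through": 120, "many": 80, "man": 200,
    "noting": 3, "nothing": 90,
})
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one deletion, transposition, replacement or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)

def correct(word):
    """Return the most frequent known word within one edit, or the word itself."""
    word = word.lower()
    if word in WORD_COUNTS:
        return word  # existing words are never changed
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word
    return max(candidates, key=WORD_COUNTS.__getitem__)

for typo in ["thhough", "manu", "noting"]:
    print(typo, "->", correct(typo))
```

With these invented counts, “thhough” has two candidates at the same distance (“though” and “through”) and the more frequent one wins, which may be what happened with “through” and “man” in the tests above; “noting” is a known word, so it is returned as is.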

TRANSLATION FROM ENGLISH INTO ITALIAN -AND VICE VERSA- (TEXTBLOB)

TextBlob also provides automatic translation between languages, since it relies on the Google translator tool. This functionality is particularly useful if one wants to find the equivalent of a word in one's native language automatically, without having to open a separate tool or window with an external translator.

If a sentence is taken from the text and translated into Italian, the result is as follows:

The translation is fairly accurate. The concerns found are the following:

● In English the term “friends” is generic and not gender-specific, yet in Italian it has been translated with “amiche”, the feminine form of “friends”.

● The word “stuck” would have found a better match in “incastrato” but “bloccato” is sufficiently close to the original meaning.

● The translation of the term “drain” into “scarico” is not completely wrong, just slightly out of context.

It is also possible to translate from Italian into English. The example used below has been translated correctly by the tool tested in TextBlob.


WORD COUNT (NLTK AND TEXTBLOB)

If, while correcting a written text, a certain term appears to be repeated and it becomes necessary to count how many times it has been used, both TextBlob and NLTK offer this functionality.

In NLTK the word search is case-sensitive: to count the occurrences of the word “we” in the analyzed sentence, the first letter must be given in the right case. This precision can slow down the whole process. In the example below, “We” with an uppercase initial finds 6 matches, and the lowercase form just 1.

TextBlob does not differentiate between upper and lower case, speeding up the process of searching for the term:

This tool is also very useful for the analysis of a text written in English because it allows the search to be customized according to the desired vocabulary.
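The difference between the two counting behaviours can be reproduced with the standard library alone. The sketch below is an illustration rather than the NLTK or TextBlob implementation; the helper names are hypothetical and the sample is a shortened excerpt of the model text.

```python
import re

text = ("It was a place where she felt safe. She was a Hopeful, cute, "
        "whiskey drinker. She was not prepared.")
tokens = re.findall(r"\w+", text)

def count_exact(tokens, word):
    """Case-sensitive count: 'She' and 'she' are different words."""
    return sum(1 for t in tokens if t == word)

def count_any_case(tokens, word):
    """Case-insensitive count: compare lowercased forms."""
    return sum(1 for t in tokens if t.lower() == word.lower())

print(count_exact(tokens, "She"), count_exact(tokens, "she"))
print(count_any_case(tokens, "she"))
```

The case-sensitive search needs two queries to find every occurrence, while the case-insensitive one finds them all at once, which is the speed-up described above.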

WORD AND PHRASE DIVISION (TEXTBLOB)

In order to analyze a text in the best possible way, it is always useful to divide it into words (which has already been done through tokens and POS tags) and phrases (sentences).

The division into words and phrases can be done with TextBlob, taking care with the spacing inside the quotation marks that enclose the text, so that it recognizes both the single words and the single phrases separated by full stops:
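As a minimal sketch of this division, assuming that sentences end with “.”, “!” or “?” followed by whitespace, the standard library is enough. This is a simplification and not TextBlob's own splitter (in TextBlob the corresponding attributes are `TextBlob(text).sentences` and `TextBlob(text).words`); the function names are illustrative.

```python
import re

def split_sentences(text):
    """Split after ., ! or ? when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def split_words(sentence):
    """Return the words of a sentence, without punctuation."""
    return re.findall(r"\w+(?:['’]\w+)*", sentence)

text = ("Alison gulped. She was not prepared. As Alison stepped outside and "
        "Brad came closer, she could see the fierce glint in his eyes.")

sentences = split_sentences(text)
for s in sentences:
    print(len(split_words(s)), "words:", s)
```

The sketch also shows why spacing matters: without a space after the full stop, the lookbehind pattern finds no boundary and two sentences are kept together.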


OTHER TOOLS FOR THE AUTOMATIC (OR SEMI-AUTOMATIC) CORRECTION OF TEXTS:

For the purposes of this analysis, other tools were used to build a comparison of correction patterns. I wanted to understand how different correction tools work on different issues. The tools analyzed are the following:

1. Voyant [1]: Voyant Tools is a web-based text reading and analysis environment. It is a scholarly project that is designed to facilitate reading and interpretive practices for digital humanities students and scholars as well as for the general public.

What can be done with Voyant:

❖ Use it to learn how computer-assisted analysis works.

❖ Use it to study texts found on the web or texts that have been carefully edited.

❖ Use it to add functionality to online collections, journals, blogs or websites so others can see through texts with analytical tools.

❖ Use it to add interactive evidence to essays that will be published online. Add interactive panels right into the research essays (if they can be published online) so readers can recapitulate results.

❖ Use it to develop tools using functionality and code.

2. spaCy [2]: spaCy is a free, open-source library for advanced Natural Language Processing (NLP) in Python.

If you’re working with a lot of text, you’ll eventually want to know more about it. For example, what’s it about? What do the words mean in context? Who is doing what to whom? What companies and products are mentioned? Which texts are similar to each other? spaCy is designed specifically for production use and helps you build applications that process and “understand” large volumes of text. It can be used to build information extraction or natural language understanding systems, or to pre-process text for deep learning.

❖ Tokenization: Segmenting text into words, punctuation marks etc.

❖ Part-of-speech (POS) Tagging: Assigning word types to tokens, like verb or noun.

❖ Dependency Parsing: Assigning syntactic dependency labels, describing the relations between individual tokens, like subject or object.

❖ Lemmatization: Assigning the base forms of words. For example, the lemma of “was” is “be”, and the lemma of “rats” is “rat”.

❖ Sentence Boundary Detection (SBD): Finding and segmenting individual sentences.

❖ Named Entity Recognition (NER): Labelling named “real-world” objects, like persons, companies or locations.

❖ Entity Linking (EL): Disambiguating textual entities to unique identifiers in a Knowledge Base.

❖ Similarity: Comparing words, text spans and documents and how similar they are to each other.

❖ Text Classification: Assigning categories or labels to a whole document, or parts of a document.

❖ Rule-based Matching: Finding sequences of tokens based on their texts and linguistic annotations, similar to regular expressions.

❖ Training: Updating and improving a statistical model’s predictions.

❖ Serialization: Saving objects to files or byte strings.

[1] Official Voyant Tools website: https://voyant-tools.org/docs/#!/guide/about

[2] Official spaCy website: https://spacy.io/usage/spacy-101

3. Grammarly [3]: Grammarly’s AI-powered products help people communicate more effectively. Millions of users rely on Grammarly every day to make their messages, documents, and social media posts clear, mistake-free, and impactful. Grammarly is an Inc. 500 company with offices in San Francisco, New York, Kyiv, and Vancouver.

As you type, Grammarly checks your text for hundreds of common and advanced writing issues. The checks include common grammatical errors, such as subject-verb agreement, article use, and modifier placement, in addition to contextual spelling mistakes, phonetic spelling mistakes, and irregular verb conjugations. Grammarly also provides synonym suggestions to make your writing more readable and precise.

Grammarly automatically detects grammar, spelling, punctuation, word choice and style mistakes in your writing. It’s easy to use:

Copy and paste any English text into Grammarly’s Editor or install Grammarly’s free browser extension for Chrome, Safari, Firefox, and Edge, and Grammarly will help you write correctly on nearly every site on the web. Grammarly’s algorithms flag potential issues in the text and suggest context-specific corrections for grammar, spelling and usage, wordiness, style, punctuation, and even plagiarism. Grammarly explains the reasoning behind each correction, so you can make an informed decision about whether, and how, to correct an issue.

[3] Official Grammarly website: grammarly.com

CRITERIA AND TYPE OF ERRORS:

Let’s now focus on the practical and dynamic aspects of this analysis. In order to correct a text automatically and simplify all the processes, it is necessary to establish the criteria that are considered essential for the success of the correction:

• The input that has to be given to the students for the creation of written text.

• The initial level of the student creating the text.

• Errors that need to be avoided for that particular level.

• Final verification and evaluation.

When teaching English, another criterion that is important to remember is a generic Grammar/Vocabulary/Functional Language benchmark that establishes which parameters are expected at which level. The benchmark would look something like this:

A1.1 Absolute Beginner: doesn’t know the language at all.

A1.2 Upper Beginner - the following criteria are expected:

Grammar: to be; there is / there are; to have; plurals; pronouns; present simple (3rd, negative); wh- questions; order of adjectives and nouns
Vocab: numbers 1-20; nationality; dates, time
Functional language: greetings; tell the time; asking about personal info; daily routine

A2.1 Lower Elementary - the following criteria are expected:

Grammar: present continuous (present meaning); past simple (also of to be); imperative form; can / can’t
Vocab: numbers 21-infinite
Functional language: introducing yourself and what you do; talking about past experiences; describing people

A2.2 Upper Elementary - the following criteria are expected:

Grammar: to have / to have got; adverbs of frequency; can / could; want / would like; present continuous (future meaning); past simple (irregular verbs); prepositions (time & place); will
Vocab: weather; transport
Functional language: ability / inability; inviting / refusing; likes / dislikes; requesting / offering; promises / offers

B1.1 Lower Intermediate - the following criteria are expected:

Grammar: comparatives & superlatives; quantifiers; present perfect; future tenses; must & have to & should; need
Functional language: describing an object (goods); making comparisons; talking about future arrangements, timetables, intentions, forecast; giving / justifying personal opinions; talking about obligation/need; talking about necessity (objective / subjective)

B1.2 Intermediate - the following criteria are expected:

Grammar: relative clauses; present perfect simple and continuous; if clauses (zero, 1st); may / might / could
Vocab: job titles
Functional language: describing something you don’t know the name of; talking about probability / possibility

B2.1 Upper Intermediate - the following criteria are expected:

Grammar: past perfect simple; must vs might / may / could; passive form; used to / be used to / get used to
Vocab: Anglo-Saxon verbs (get, set, let, keep); phrasal verbs?
Functional language: travelling (getting around the city); deductions & speculations

B2.2 Lower Proficient - the following criteria are expected:

Grammar: if clauses (2nd); prepositions (adjective / verb + prep); past perfect simple and continuous
Functional language: writing emails (formal vs informal); travelling (airport/station)/restaurant/shopping; express possibility & uncertainty

B2.3 Proficient - the following criteria are expected:

Grammar: if clauses (3rd); reported speech; either / neither / so; connectors (contrast/addition/cause and effect)
Functional language: language interference; talking about impossibility; cause and effect language; paraphrasing

C1.1 Advanced - the following criteria are expected:

Grammar: would + infinitive (past habits); if clauses (mixed); would rather; wish; had better; past modals; verb patterns
Functional language: arranging meetings & appointments (phrasal verbs); expressing annoyance; expressing present & past regrets
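For automatic correction, a benchmark of this kind has to be available as data. The sketch below encodes part of the grammar column for the first four taught levels and answers the question "which grammar points should a student at this level already master?". It assumes that levels are cumulative (each level includes the previous ones), which the benchmark does not state explicitly; the names `LEVEL_ORDER`, `GRAMMAR_BY_LEVEL` and `expected_grammar` are hypothetical.

```python
# Levels in teaching order (subset of the benchmark, for illustration).
LEVEL_ORDER = ["A1.2", "A2.1", "A2.2", "B1.1"]

# A selection of grammar points per level, taken from the benchmark.
GRAMMAR_BY_LEVEL = {
    "A1.2": ["to be", "there is / there are", "plurals", "present simple"],
    "A2.1": ["present continuous", "past simple", "imperative form"],
    "A2.2": ["adverbs of frequency", "prepositions (time & place)", "will"],
    "B1.1": ["comparatives & superlatives", "quantifiers", "present perfect"],
}

def expected_grammar(level):
    """Grammar points expected at `level`, assuming levels are cumulative."""
    position = LEVEL_ORDER.index(level)
    points = []
    for name in LEVEL_ORDER[: position + 1]:
        points.extend(GRAMMAR_BY_LEVEL[name])
    return points

def should_be_mastered(topic, level):
    """True if an error on `topic` is one the student should already avoid."""
    return topic in expected_grammar(level)

print(expected_grammar("A2.1"))
print(should_be_mastered("present perfect", "A2.2"))
```

A correction tool could use such a lookup to weigh errors differently, for example treating a present perfect error as serious at B1.1 but as expected at A2.2.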


Error correction can be seen as a form of feedback given to students on their practical use of English as a second language. No teacher can deny that correcting the mistakes students make when writing is one of the most difficult tasks in language teaching. Every teacher should therefore consider the following problems when correcting errors: the difference between an oversight and a mistake; how many errors should actually be corrected; at what stage the teacher should correct the error, and how. A crucial point is knowing the nature of foreign or second language learning, that is, how a second language is learned. It is therefore essential to investigate what happens in the human mind during the mental process leading to the acquisition of a language.

When native speakers make mistakes, they can identify and correct them immediately because they have an almost complete knowledge and understanding of the structure of their mother tongue. Students who are at a certain stage of learning English as a second language not only make mistakes, but also have an incomplete understanding of the target language and are not always able to correct the mistakes they make. Learning errors therefore often reflect a lack of basic language knowledge.

Mistakes are necessary to the learning process, but why do students make mistakes, and why do they find it so difficult to correct them? Researchers in second-language acquisition agree that one of the main causes of errors and mistakes is direct transfer from one’s native language to the target language (Italian and English in this case).

In a nutshell, it is easier for a student who is learning English as a second language to translate a thought mechanically from Italian into English. But Italian and English are two very different languages.

Mistakes can be made for different reasons. Here are a few:

● Interlanguage interference: errors are caused by the interference of a person’s mother tongue. This type of error occurs during the second-language learning process, at a stage where students have not yet acquired the knowledge needed to carry a structure correctly from one language to the other (e.g. in “My book is in the library” the student means “bookshelf”: the Italian “libreria” is a false friend of “library”, which corresponds to “biblioteca”).

● Generalization: simplifying and generalizing language structures. A thought is formed in the student’s mind and then has to be put into written language; in translating it, the student simplifies (e.g. “having nightmares is a very common phenomenon that occurs in every individual’s lifetime” becomes “have nightmare is very common for a person in his life”).

● Induced errors: the teacher has not been sufficiently clear in explaining a language rule and the student is led to make mistakes.

● Ignorance of rule exceptions: many grammatical or syntactic rules are not immutable or fixed in every context. Often a rule that holds in most cases has one or more exceptions the student is not aware of.

● Incomplete application of the rules: the meaning of the sentence may be correct, but the student has not yet learned enough to use a fully developed structure (e.g. “Have you a pen?” instead of “Do you have a pen?”).

● Incorrect hypothesis: not knowing the rule, or not remembering how to apply it, leaves the student not fully aware of the mistake made, because it rests on a wrong hypothesis.

During the language learning process, students benefit from the mistakes they make by getting feedback so that they can make new attempts that later will lead them to achieve the desired goals.

Examining a corpus of 12 essays written by 12 Italian students (researchers at the Tor Vergata University of Rome) with a B1 starting level, the following errors occur: singular/plural form, choice of words, verb agreement, prepositions, false friends, order and structure of sentences. The errors found are further classified into grammatical, semantic, lexical and structural. The results show that most of the students' mistakes are due to transfer from Italian: the students rely on their native language to express their ideas. Evaluation processes show that participants understand the different types of errors.

ESSAY ANALYSIS EXAMPLES (BASED ON LEVELS):

Let’s now proceed with the actual analysis of four of the 10 texts written by the students.

These are four students, three males and a female (age 25-30+), who are researchers at the Tor Vergata University of Rome. Their starting level is quite high, B1.1, and after 40+ hours of English course a great improvement was made.


They were asked to write a brief description of who was, in their opinion, the most important person in the world today (100 word composition).

The texts are the following:

In the first example no grammatical errors are found. There are Generalization and Interlanguage Interference: some phrases and constructions are clearly translated directly from a thought conceived in Italian:

● “In generale” is used often in Italian to express that something is generically universal; “in general” is used less in English. “In general” is used twice in this text. Interlanguage Interference and Repetition of constructions

● The student used “my own”, then corrected it into “my personal most important person”. This is considered redundant and incorrect. Construction error

● The same mistake (own-personal) is made again at the end of the text.


Example 1: female

Example 2: male (30+)


In Example 2 several errors are found: Generalization, some Interlanguage Interference, some construction imperfections, and grammar mistakes.

● “Actually” is better positioned after the subject “I” or followed by a comma. Interlanguage and/or punctuation

● In “of the world”, “of” was changed into “in”. Interlanguage Interference

● “For this reason” is translated directly from an Italian thought (mother tongue interference). Interlanguage Interference

● “infact” was corrected with “in fact”. Spelling error

● “[...] made strong decisions about many themes...” would be best as “[...] made strong decisions on many themes [...]”. Grammar mistake

● “climate changes” is plural, therefore it doesn’t require the article “the”. Grammar mistake

● “LGBT people” was corrected into “LGBT community”. Interlanguage

● “reason” in “for this reason” was corrected with the plural. Grammar
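Once each correction carries a category label, the labels can be tallied automatically. The sketch below does this for the eight annotations of Example 2; the short labels are a simplification introduced here (for instance, “Interlanguage and/or punctuation” is counted as interlanguage), and the variable names are illustrative.

```python
from collections import Counter

# (corrected item, simplified category) pairs for Example 2.
ANNOTATIONS = [
    ("Actually", "interlanguage"),
    ("of the world", "interlanguage"),
    ("For this reason", "interlanguage"),
    ("infact", "spelling"),
    ("decisions about", "grammar"),
    ("the climate changes", "grammar"),
    ("LGBT people", "interlanguage"),
    ("this reason (plural)", "grammar"),
]

tally = Counter(category for _, category in ANNOTATIONS)

# Print categories from most to least frequent.
for category, count in tally.most_common():
    print(f"{category}: {count}")
```

The same tally, repeated for every essay in a corpus, gives the per-category totals needed to compare students or levels.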

In Example 3 there are some Interlanguage Interference and Generalization errors and a couple of Grammar mistakes:

● In “An important person is one who have [...]”, “have” was corrected into “has” for third-person agreement in the Present Simple tense. Grammar mistake

● “take” decisions is best with “make”. Grammar and Interlanguage mistake

● “[...]President of United States[...]” was missing all the definite articles.

● “button” is written with a U and not an O. Grammar mistake


Example 3: male


● “[...]Could start a World War just pushing a button[...]” was a direct translation of an Italian thought construction. “by” was added during correction. Interlanguage Interference

● “[...]People which have[...]” is not an ideal construction, “who” would be more appropriate. Grammar mistake

● “[...]to share own ideas[...]” was corrected into “to share their own ideas”. Grammar mistake

● In the phrase “[...]One of the most important person[...]” the word “person” was corrected with “people”. Grammar mistake

● “[...]Climate problem[...]” is very simplified and generic and doesn’t work as well as other constructions such as “Climate changes” , “The problems concerning climate changes” or “Issues deriving from Climate changes” Interlanguage Interference

Example number 4 shows a few problems with Generalization, Interlanguage Interference and one or two imprecisions with grammar. Overall the product doesn’t show many errors:

● “[...]I don’t think this[...]” was corrected into “I don’t think so”. Interlanguage Interference

● “Became” was corrected with “Become”. Spelling error

● “[...]Thanks to Internet[...]” was corrected with “Thanks to the Internet”. Grammar mistake (with Interlanguage)

● The construction “[...]Take inspiration on what he[...]” wants a different preposition: “from” instead of “on”. Grammar mistake

● “[...]what he do[...]” was corrected into “what he does”. Grammar mistake


Example 4: male


● “[...]In and out the green field[...]” would work best corrected into “Inside and outside of the green field” Interlanguage Interference

● “a hero” not “an hero”. Grammar mistake (with Interlanguage)

These four examples were taken from real tests taken by real students who decided to improve their level of English. The texts were written after approximately 20+ hours of English course, and an improvement was noted in every single one of them: there weren’t many grammar mistakes, and almost all the errors made fall under the “mental constructions” scheme.

It’s also important to remember that the starting level of these students was already very high; a B1.1 level requires a general understanding of how grammar works, and common mistakes are not expected at that point.

What would the analysis look like at a different, lower level?

Let’s take a corpus of six essays sent to me (this time by email, as PDF files) by a group of space engineers (age 40+). Their level is A1.2 (Upper Elementary), and the task was to write a 100-word essay on what their typical day is like. This corpus was collected at the beginning of their English course, after approximately 6-7 hours of lessons.

Example number 5 (written by a male, age 40+) shows a variety of different mistakes: Grammar, Sentence Structure, Punctuation, Spelling and Interlanguage Interference:

● “My typical day is like this”. Interlanguage Interference

● “I wash myself” was corrected to “I wash up”. Grammar mistake (with Interlanguage)

● “I go at job” was corrected to “I go to work”. Grammar mistake (with Interlanguage)

● “In Rome there is”. Punctuation error

● “very much traffic” was corrected to “a lot of traffic”. Grammar mistake (with Interlanguage)

● “arrive at job” was corrected to “get to work”. Grammar mistake (with Interlanguage)

● “collegues” was corrected to “colleagues”. Spelling error

● “I came” was corrected to “I come”. Spelling error

● “to home” was corrected to “home”. Grammar mistake (with Interlanguage)

● “I have a shower” was corrected to “I take a shower”. Construction error

The following text (Example 6) was written by a female (age 40+):

It contains a variety of different mistakes, mainly Grammar, Construction and Interlanguage:

● “the lunch” doesn’t require the definite article. Grammar mistake (with Interlanguage)

● “children that go to school”. Construction error

● “I accompany my children” is too formal and was corrected to “I take my children”. Interlanguage Interference

● “I do it very happy” was corrected to “happily”. Grammar and Construction error

● “I talk at the phone” was corrected to “on”. Grammar mistake

● “Come back at home” was corrected to just “home”. Construction error with Interlanguage Interference

● “with my car” was corrected to “by car” or “on my car”. Construction error

● “I take my children from sports” is a generic, incorrect sentence. It was corrected to “I get my children from their afternoon activities”. Grammar error and Construction error

● “I and my husband” was corrected to “My husband and I” or “Me and my husband”. Construction error

Example 7, below, was written by a male (age 40+):

It shows different types of mistakes, mainly Spelling, Grammar, Construction and Interlanguage Interference.

● “repetetive”. Spelling error

● “every thing” was corrected to “everything”. Spelling error

● “very quick” was corrected to “quickly”. Grammar error

● “I dress” was corrected to “I get dressed”. Construction error caused by Interlanguage

● “envoices” was corrected to “invoices”. Spelling error

● “After the job” was corrected to “after work”. Interlanguage Interference

● “the dinner” doesn’t require an article in this instance. Construction error (this error was made twice).

● “I feel sleep” was corrected to “sleepy”. Construction error caused by Interlanguage

Example 8 is a text written by a Female (age 40+):

The example shows different mistakes:

● “in the morning I eat breakfast”: a comma is missing. Punctuation error

● “Everyday” was corrected to “every day”. Spelling error

● “my mother house” was corrected to “my mother’s house”. Grammar error

● “she make” was corrected to “she makes”. Grammar error

● “I have not a car” was corrected to “I don’t have a car”. Interlanguage Interference

● “it’s almost 2 hours until I arrive at work” was corrected to “It takes me almost 2 hours to get to work”. Construction error

● “Sometime” was corrected to “Sometimes”. Spelling error

● “that live” was corrected to “that lives”. Grammar error

● “or I play on my phone or I read a book” was corrected to “I either play on my phone or I read a book”. Construction error caused by Interlanguage

Example 9 was written by a male (age 40+):

The text above shows a variety of very interesting errors, mainly Grammar, Construction and Interlanguage Interference, which are listed below:

● “After” was corrected to “Then” three times. Interlanguage Interference

● “Attack at work” was corrected to “I start working”. Interlanguage Interference

● “A poor dish” was corrected to “a frugal meal”. Construction error

● “I listen to the music” doesn’t need an article and was corrected to “I listen to music”. Grammar error

● “at the 4:36 pm” doesn’t require an article. Probable distraction

● “I come at home” was corrected by removing the preposition “at”. Grammar error

● “After either I” is better expressed by changing the subject-conjunction position: “Then I either”. Construction error

● “I see with my girlfriend” doesn’t need the preposition “with”. Grammar error

● “I play the my guitar” doesn’t need the article “the”. Grammar error with Interlanguage Interference

Finally, Example number 10 was written by a male (age 40+):

The text above shows Grammar, Punctuation, Construction, Interlanguage Interference and Spelling errors:

● “Everyday” was corrected to “Every day”. Spelling error

● “I finish to work” was corrected to “working”. Grammar error

● “For breakfast” is missing a comma. Punctuation error

● “Fette biscottate” was not translated. Generic Error

● “With jam homemade with strawberries” was corrected to the proper construction “With homemade strawberry jam”. Construction error

● “For lunch depend” was corrected to “For lunch, it depends”. Punctuation and Interlanguage Interference

● The construction “or-or” doesn’t exist in English, so it was corrected to “either-or”. Interlanguage Interference

● “Collegues” was corrected to “colleagues”. Spelling error

● “Take my lunch from home” was corrected to “I bring my own lunch”. Interlanguage Interference

● “For dinner” is missing a comma. Punctuation error

● “The vegetables” doesn’t need the article. Grammar error

● “Browse Internet” is missing the article “the”. Grammar error
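The annotations listed above for Example 10 can be tallied by category to obtain a row of counts like the one used in the charts. The following is only a minimal sketch: the data structure, the function name `tally` and the way each label is mapped onto a chart column (for instance, “Generic Error” counted as Other, and a double label counted in both columns) are assumptions made for illustration, not the procedure actually used for this analysis.

```python
from collections import Counter

CATEGORIES = ["Construction", "Grammar", "Punctuation", "Spelling", "Other"]

# Example 10 annotations as (student phrase, chart categories).
# The mapping of each label onto a chart column is an assumed reading.
ANNOTATIONS = [
    ("Everyday", ["Spelling"]),
    ("I finish to work", ["Grammar"]),
    ("For breakfast", ["Punctuation"]),
    ("Fette biscottate", ["Other"]),  # generic error, not translated
    ("With jam homemade with strawberries", ["Construction"]),
    ("For lunch depend", ["Punctuation", "Other"]),  # double label
    ("or-or", ["Other"]),
    ("Collegues", ["Spelling"]),
    ("Take my lunch from home", ["Other"]),
    ("For dinner", ["Punctuation"]),
    ("The vegetables", ["Grammar"]),
    ("Browse Internet", ["Grammar"]),
]


def tally(annotations):
    """Count how many annotations fall into each chart category."""
    counts = Counter()
    for _phrase, categories in annotations:
        counts.update(categories)
    return [counts[category] for category in CATEGORIES]


print(dict(zip(CATEGORIES, tally(ANNOTATIONS))))
```

Under this reading, the tally gives 1 Construction, 3 Grammar, 3 Punctuation, 2 Spelling and 4 Other, which matches the counts reported for Example 10.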

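DATA ANALYSIS

Now let’s look at the data analysis of these examples. The following charts were made in order to show visually how each correction tool (NLTK/TextBlob, VOYANT, SPACY, GRAMMARLY) behaves once presented with the errors detected in the texts. In each chart, the first row of figures gives the number of errors of each type; ✅ marks a type of error the tool detected, X a type it missed, and - a type that does not occur in the text.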

Example 1       Construction  Grammar  Punctuation  Spelling  Other
Errors found    2             0        0            0         1
NLTK/TextBlob   X             -        -            -         X
VOYANT          X             -        -            -         X
SPACY           X             -        -            -         X
GRAMMARLY       X             -        -            -         X

Example 2       Construction  Grammar  Punctuation  Spelling  Other
Errors found    0             3        1            1         4
NLTK/TextBlob   -             ✅       X            ✅        X
VOYANT          -             ✅       X            X         X
SPACY           -             ✅       X            ✅        X
GRAMMARLY       -             ✅       ✅           ✅        X

Example 3       Construction  Grammar  Punctuation  Spelling  Other
Errors found    0             6        0            0         3
NLTK/TextBlob   -             ✅       -            -         X
VOYANT          -             ✅       -            -         X
SPACY           -             ✅       -            -         X
GRAMMARLY       -             ✅       -            -         X

Example 4       Construction  Grammar  Punctuation  Spelling  Other
Errors found    0             4        0            1         4
NLTK/TextBlob   -             ✅       -            ✅        X
VOYANT          -             ✅       -            ✅        X
SPACY           -             ✅       -            ✅        X
GRAMMARLY       -             ✅       -            ✅        X

Example 5       Construction  Grammar  Punctuation  Spelling  Other
Errors found    1             5        1            2         6
NLTK/TextBlob   X             ✅       X            ✅        X
VOYANT          X             ✅       X            ✅        X
SPACY           X             ✅       X            ✅        X
GRAMMARLY       X             ✅       ✅           ✅        X

Example 6       Construction  Grammar  Punctuation  Spelling  Other
Errors found    6             4        0            0         3
NLTK/TextBlob   X             ✅       -            -         X
VOYANT          X             ✅       -            -         X
SPACY           X             ✅       -            -         X
GRAMMARLY       X             ✅       -            -         X

Example 7       Construction  Grammar  Punctuation  Spelling  Other
Errors found    3             1        0            3         3
NLTK/TextBlob   X             ✅       -            ✅        X
VOYANT          X             ✅       -            ✅        X
SPACY           X             ✅       -            ✅        X
GRAMMARLY       X             ✅       -            ✅        X

Example 8       Construction  Grammar  Punctuation  Spelling  Other
Errors found    2             3        1            2         2
NLTK/TextBlob   X             ✅       X            ✅        X
VOYANT          X             X        X            ✅        X
SPACY           X             ✅       X            ✅        X
GRAMMARLY       X             ✅       ✅           ✅        X

Example 9       Construction  Grammar  Punctuation  Spelling  Other
Errors found    2             4        0            0         3
NLTK/TextBlob   X             ✅       -            -         X
VOYANT          X             ✅       -            -         X
SPACY           X             ✅       -            -         X
GRAMMARLY       X             ✅       -            -         X

Example 10      Construction  Grammar  Punctuation  Spelling  Other
Errors found    1             3        3            2         4
NLTK/TextBlob   X             ✅       X            ✅        X
VOYANT          X             ✅       X            ✅        X
SPACY           X             ✅       X            ✅        X
GRAMMARLY       X             ✅       ✅           ✅        X

The chart for Example 1 shows three mistakes: two Construction errors and one Other (Interlanguage Interference).⁴

Even though the text in Example 1 is fairly well written, these mistakes are still very noticeable to an English teacher, and none of the tools was able to detect them.

The chart for Example 2 shows three Grammar mistakes, one Punctuation mistake, one Spelling mistake and four Other (Interlanguage Interference). All the tools recognized the Grammar errors; only Grammarly was able to detect the Punctuation error; every tool except VOYANT detected the Spelling mistake; none of the tools was able to detect the four Interlanguage Interferences.

The chart for Example 3 shows six Grammar mistakes and three Other (Interlanguage Interference). All the tools were able to correct the grammar mistakes but none of them were able to correct the Interlanguage Interferences.

The chart for Example 4 shows four Grammar mistakes, one Spelling mistake and four Other (Interlanguage Interference). All of the tools were able to find and correct the Grammar mistakes and the Spelling one, but none of them was able to find or correct the four Interlanguage Interferences.

Examples 5 to 10 refer to the texts written by the second group of students, those who appear to have more difficulties with the language due to a lower starting Level (A1.2).

The chart for Example 5 shows one Construction error, five Grammar errors, one Punctuation error, two Spelling errors and six Other (Interlanguage Interferences). The Grammar mistakes were found and corrected by all of the tools under examination; the Punctuation mistake was corrected only by Grammarly; the Spelling mistakes were found and corrected by all tools. The Construction error and the six Interlanguage Interferences were neither found nor corrected by the tools.

The chart for Example 6 shows six Construction errors, four Grammar errors and three Other (Interlanguage Interference). When the text was run through the automatic correction tools, all the Grammar mistakes were detected and corrected, while none of the Construction and Interlanguage ones were.

The chart for Example 7 shows three Construction errors, one Grammar error, three Spelling errors and three Other (Interlanguage Interference). All the correction tools easily corrected the Grammar mistake as well as the Spelling ones. The Construction and the Interlanguage mistakes were neither found nor corrected by any of the correction tools under examination.

⁴ See chapter “Essay Analysis” to find exact mistake references in the text.

The chart for Example 8 shows two Construction errors, three Grammar mistakes, one Punctuation error, two Spelling errors and two Other (Interlanguage Interferences). None of the Construction mistakes were found or corrected; the three Grammar mistakes were found and corrected by all of the tools except VOYANT; the Punctuation mistake was corrected by Grammarly only; the Spelling mistakes were corrected by all of the tools; and the Interlanguage Interferences were neither found nor corrected by any of the tools.

The chart for Example 9 shows two Construction errors, four Grammar errors and three Other (two Interlanguage Interferences and one Probable Distraction mistake). None of the Construction mistakes were corrected by the correction tools, while the four Grammar mistakes were easily found and corrected by all of them. The two Interlanguage Interferences were neither found nor corrected by the tools, and the Probable Distraction mistake was corrected by the tools but not recognized as a distraction error.

The chart for Example 10 shows one Construction error, three Grammar mistakes, three Punctuation mistakes, two Spelling mistakes and four Other (Interlanguage Interferences). The Construction mistake was not corrected by any of the tools, which were instead able to detect and correct the Grammar errors. The Punctuation errors were corrected by Grammarly only; the Spelling errors were corrected by all of the tools; and the four Interlanguage Interferences were neither found nor corrected by any of the tools.
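The ten charts can also be summarised per tool and per category. The sketch below is only an illustration: it transcribes the charts into strings (the encoding with “Y”, “X” and “-” and the function name `detection_summary` are assumptions introduced here) and counts, for each tool, in how many examples a type of error was detected out of the examples in which that type occurs.

```python
CATEGORIES = ["Construction", "Grammar", "Punctuation", "Spelling", "Other"]

# One string per example (1 to 10), one character per category:
# "Y" = detected (the check mark in the charts), "X" = missed,
# "-" = no errors of that type in the text.
MARKS = {
    "NLTK/TextBlob": ["X---X", "-YXYX", "-Y--X", "-Y-YX", "XYXYX",
                      "XY--X", "XY-YX", "XYXYX", "XY--X", "XYXYX"],
    "VOYANT":        ["X---X", "-YXXX", "-Y--X", "-Y-YX", "XYXYX",
                      "XY--X", "XY-YX", "XXXYX", "XY--X", "XYXYX"],
    "SPACY":         ["X---X", "-YXYX", "-Y--X", "-Y-YX", "XYXYX",
                      "XY--X", "XY-YX", "XYXYX", "XY--X", "XYXYX"],
    "GRAMMARLY":     ["X---X", "-YYYX", "-Y--X", "-Y-YX", "XYYYX",
                      "XY--X", "XY-YX", "XYYYX", "XY--X", "XYYYX"],
}


def detection_summary(marks):
    """Map tool -> category -> (examples detected, examples with such errors)."""
    summary = {}
    for tool, rows in marks.items():
        per_category = {}
        for index, category in enumerate(CATEGORIES):
            column = [row[index] for row in rows]
            detected = column.count("Y")
            applicable = detected + column.count("X")
            per_category[category] = (detected, applicable)
        summary[tool] = per_category
    return summary


for tool, per_category in detection_summary(MARKS).items():
    print(tool, per_category)
```

With this encoding, Grammarly is the only tool with any detections in the Punctuation column, and no tool has a detection in the Construction or Other columns, which is the pattern described in the paragraphs above.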

RESULTS AND OBSTACLES FACED

Now that the data analysis of the texts has been reviewed and there’s a clear understanding of how the automatic correction tools work to detect, analyse and correct the mistakes, it’s necessary to set out the results and obstacles an observer encounters along the way.

When a human teacher corrects a written text, there are many variables that must be taken into consideration: which mistakes are absolutely unacceptable at that person’s level? Which mistakes can be forgiven because they are probably due to distraction? If a sentence is grammatically correct but constructed in a strange way, or is an obvious case of Interlanguage Interference, does it count as a “mistake”, or does its grammatical correctness exonerate the student?

These questions are the key point of this investigation. The human mind is very complex, and a text written in English by an Italian student is full of contextual factors. A programmed correction tool cannot detect mistakes that are not in its database, and for now it’s very difficult to program a tool that is able to “think” like a human teacher and has the same awareness a human being has.

Let’s take Example 9 again, because it embodies these obstacles well:

A teacher reading this text is faced with the following scenario: the student (a male, age 40+), in describing his typical day, often made direct, immediate translations of the thoughts and constructions that came into his mind in Italian, something that is quite obvious to someone who corrects the text and knows both languages.

For example, the phrase “Attack at work at 7:45” is a literal translation of the Italian construction “Attacco a lavoro / a lavorare alle 7:45”, which cannot be translated literally into English. “Attaccare a lavorare” is a great example of Interlanguage Interference, because the student used the same construction for both languages, thinking it would be correct. Grammatically speaking, there’s nothing wrong with this sentence, so none of the correction tools were able to detect the mistake. But if a native English speaker or an English teacher reads this construction, it won’t make any sense to them: who attacks what? What does “attacking” mean in this context, and why is it used when describing work activities? Is someone fighting someone else?

Some of the correction tools used in this analysis behaved well when confronted with Grammar mistakes or Spelling errors. All of them have a programmed “ability” to detect simple Grammar mistakes or Spelling errors⁵. But take, for example, the sentence in Example 9 that reads “My working day ends at the 4:36 pm”: all of the tools registered “the 4:36 pm” as a mistake. The error is undoubtedly there, since there’s no need to use an article before a time, but this is the third time the student mentions a time (“I wake up at 6:30”; “Attack at work at 7:45”), and he didn’t make the same mistake in the other two occurrences. This suggests that it is merely a distraction error: the student didn’t make the mistake every single time, but only once, at the end of the text.

⁵ See chapter “Fixing Typing and Spelling Errors (with TEXTBLOB)”.

If the whole correction were to be automated, the tool used would detect this distraction error as a Grammar mistake and would potentially deduct points from the essay. A teacher could do the same if they wanted to, but whether to count it as a real mistake or a distraction one is up to their sensitivity and awareness.
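The reasoning applied to “at the 4:36 pm” can be approximated in code by comparing how often the student uses the wrong form and the right form of the same construction. The sketch below is only an illustration under stated assumptions: the two regular expressions, the function name `classify_article_before_time` and the decision rule (correct uses outnumbering wrong ones) are hypothetical choices, not a feature of any of the tools examined.

```python
import re

# Assumed patterns: a clock time such as "6:30", with or without the
# superfluous article "the" after the preposition "at".
TIME_WITH_ARTICLE = re.compile(r"\bat the \d{1,2}:\d{2}\b", re.IGNORECASE)
TIME_WITHOUT_ARTICLE = re.compile(r"\bat \d{1,2}:\d{2}\b", re.IGNORECASE)


def classify_article_before_time(text):
    """Label the 'at the <time>' error as absent, a distraction, or systematic."""
    wrong = len(TIME_WITH_ARTICLE.findall(text))
    right = len(TIME_WITHOUT_ARTICLE.findall(text))
    if wrong == 0:
        return "no error"
    # Hypothetical rule: if correct uses outnumber wrong ones, the student
    # probably knows the rule and slipped once.
    return "probable distraction" if right > wrong else "systematic error"


sample = (
    "I wake up at 6:30. Attack at work at 7:45. "
    "My working day ends at the 4:36 pm."
)
print(classify_article_before_time(sample))
```

On the three sentences quoted from Example 9, the sketch finds two correct uses and one wrong one, so it labels the error a probable distraction. A real grading tool would need one such rule per construction, and the threshold would remain a matter of the teacher’s judgment.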

While the different correction tools were being analyzed, a message from Grammarly appeared:

This message suggests that, if the Premium Version of Grammarly is purchased, the correction tool would also flag three additional writing issues (Word Choice). It is very unlikely that these three additional issues are the ones discussed so far, mainly because Word Choice suggestions are unlikely to change the structure of whole sentences. They are more likely suggestions for better synonyms.

For this analysis the Premium Version of Grammarly was not purchased but it will be further analyzed to determine whether or not the issues with Interlanguage or Construction can be solved through this tool.

INTERLANGUAGE INTERFERENCE AND AUTOMATIC CORRECTION

As previously discussed, Interlanguage is a clear example of how automatic correction fails when it has to identify and change a sentence that is grammatically correct but looks strange or makes little sense, if any.

Interlanguage can be broadly divided into three phases⁶:

Initial or pre-basic phase: the learner starts to produce output using a few keywords and an elementary grammatical structure, consisting of a few functional elements with pragmatic value: assertion, negation, high-frequency conjunctions and adverbs, and a few personal pronouns. Inflectional endings are used unsystematically, and there is no real awareness of the word, which is simply repeated without being analysed. Statements have a nominal organisation (without a verb); they are short and elementary, and revolve around a few keywords, some references to the situational context, and prosody.

Basic phase: the learner has a basic variety that is still characterized by pragmatism. The verb begins to appear as the nucleus of the sentence, along with a richer vocabulary, but function words (articles, prepositions) and verbal or nominal morphology are still lacking. Word order places the most informative part at the end of the sentence. Subordinate clauses are rare.

Post-basic phase: it is characterized by verbal inflection and by the appearance of morphology. In this phase, more complex interlanguages develop: there are intermediate varieties, showing fragility in the more marked areas of the second language; advanced varieties, which show ever fewer deviations from the target language; and finally varieties that could be considered “almost” native.

Now, in order for a correction tool to be able to detect and correct these kinds of issues, a programmer should be aware of the different variations and supply them as input for the tool to work with.
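One very simple way of supplying such variations as input is a hand-written table of known interference patterns. The following is only a minimal sketch: the rule table, the function name `find_interference` and the choice of regular expressions are hypothetical, and the rules merely reuse phrases and corrections quoted in the essay analysis above.

```python
import re

# Hypothetical, hand-written rules built from errors quoted in this analysis.
# Each rule pairs a pattern typical of Italian interference with a suggestion.
INTERFERENCE_RULES = [
    (re.compile(r"\battack at work\b", re.IGNORECASE), "start working"),
    (re.compile(r"\bgo at job\b", re.IGNORECASE), "go to work"),
    (re.compile(r"\bhave not a\b", re.IGNORECASE), "don't have a"),
    (re.compile(r"\bafter the job\b", re.IGNORECASE), "after work"),
    (re.compile(r"\bvery much traffic\b", re.IGNORECASE), "a lot of traffic"),
]


def find_interference(text):
    """Return (matched phrase, suggestion) pairs for every rule that fires."""
    hits = []
    for pattern, suggestion in INTERFERENCE_RULES:
        for match in pattern.finditer(text):
            hits.append((match.group(0), suggestion))
    return hits


print(find_interference("Attack at work at 7:45. I have not a car."))
```

Such a table only catches the exact phrases someone has already listed, which is the limitation described earlier: a tool cannot detect mistakes that are not in its database. Covering the phases of Interlanguage described above would require far more general rules than this.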

CONCLUSIONS

⁶ “Interlanguage”, Selinker, 1972.

BIBLIOGRAPHIC AND WEB REFERENCES
