Language Assessment TSL3123 Notes


Upload: peterus-balan

Post on 19-Jan-2017


Page 1: Language assessment tsl3123 notes

TOPIC 1 – OVERVIEW OF ASSESSMENT (CONTEXT, ISSUES AND TRENDS)

DEFINITIONS OF TERMS

1. Test

A subset of ASSESSMENT intended to measure a test-taker's language proficiency, knowledge, performance or skills.

A technique of assessment.

2. Assessment

A comprehensive process of planning, collecting, analyzing, reporting and

using information on students over time.

Appraising the level of magnitude of a person’s attributes.

Consists of tests, projects, portfolios, anecdotal information and student self-

reflection.

3. Evaluation

Evaluation ≠ testing

Involves the interpretation of information.

Involves converting marks or test results into a performance report for test-takers.

4. Measurement

Assigning numbers or values to certain attributes of objects, events or people

according to a rule-governed system.

Measurement must be conducted according to explicit rules and procedures

e.g. test specifications, criteria and marking procedures.

Process of quantifying the observable performance of classroom learners.

Test scores are measurements. Conveying the meaning of those scores is evaluation.

Evaluation, however, can also take place WITHOUT measurement (e.g. learners' oral responses in the classroom).

Trends and Issues – read Module.

TOPIC 2 – ROLE AND PURPOSES OF ASSESSMENT IN TEACHING AND LEARNING

REASONS/PURPOSE OF ASSESSMENT

To determine the effective teaching strategies to be used in the classroom.

To improve classroom practice and instruction.

To provide information to children, parents and administrators.

To measure students’ achievement – identify students’ strengths and weaknesses.

To identify difficult topics in a given unit that need to be retaught.

Used to measure proficiency, place students into levels of course, diagnose students’

strengths and weaknesses.

ASSESSMENT OF LEARNING @ SUMMATIVE ASSESSMENT

A summary of learning.

Measure, record and report on students' level of achievement with regard to specific learning expectations.

Tells teacher the current status of the students’ learning.

Provide focus to improve on students’ achievement.

ASSESSMENT FOR LEARNING @ FORMATIVE ASSESSMENT/DIAGNOSTIC ASSESSMENTS

Conducted to improve students' learning during the teaching and learning process.

Find out alternative teaching strategies.

Find out students’ understanding of the instructions.

ASSESSMENT TYPES

FORMAL

o Public exams (UPSR, PMR, SPM, STPM)

o Year-end exams / semester exams

o Monthly tests

INFORMAL

o Projects

o Question and answer

o Quizzes

o Students' self-reflection


SIX TYPES OF INFORMATION TESTS PROVIDE ABOUT STUDENTS (HENNING, 1987)

Diagnosis and feedback

Screening and selection

Placement

Program evaluation

Providing research criteria

Assessment of attitudes and socio-psychological differences

CLASSIFICATION OF TESTS (ALDERSON, CLAPHAM & WALL, 1995)

Proficiency Tests

Not based on a particular curriculum or language program.

Assess overall language ability of students at varying levels.

Describes what students are capable of doing in a language.

E.g. TOEFL (ETS), Cambridge ESOL examinations, IELTS.

Achievement Tests

Similar to progress tests.

Administered at the mid-point and end of a semester or academic year.

Content of achievement tests is generally based on the specific course content.

Diagnostic Tests

Seek to identify those language areas in which students need further help.

Information gained is crucial for further course activities and providing students

with remediation.

Aptitude Tests

Measure general ability or capacity to learn a foreign language a priori.

Predicts success in that undertaking.

Progress Test

Measure the progress the students are making towards defined course or program

goals.

Teacher-produced, with a narrower focus.

Cover a smaller amount of material and assess fewer objectives.

Placement Tests

Assess students’ level of language ability for placement in an appropriate course or

class.

Indicates the level at which a student will learn most effectively.

Creates groups that are homogeneous in level.


TOPIC 3 – BASIC TERMINOLOGY IN TESTING

NORM-REFERENCED TEST (NRT)

In NRTs, an individual test-taker's score is interpreted in relation to:

o a mean (average score),

o median (middle score),

o standard deviation (extent of variance in scores)

o and/or percentile rank

Purpose: to place test-takers along a mathematical continuum in rank order.

Scores are commonly reported back to the test-taker in the form of a numerical score.

Administered to compare an individual's performance with that of his or her peers or a group.

In SBE, NRTs are used for summative evaluation e.g. year-end examination for streaming

and selection.
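The NRT statistics above (mean, median, standard deviation, percentile rank) can be illustrated with a short Python sketch; the class scores and the percentile-rank helper below are hypothetical, for illustration only:

```python
import statistics

def percentile_rank(scores, score):
    """Percentage of scores in the group that fall below the given score."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical class scores on a year-end examination.
scores = [45, 52, 58, 60, 63, 67, 70, 74, 81, 90]

mean = statistics.mean(scores)      # average score
median = statistics.median(scores)  # middle score
sd = statistics.stdev(scores)       # extent of variance in scores

# A test-taker who scored 74 outranks 70% of this group.
print(mean, median, round(sd, 1), percentile_rank(scores, 74))
```

Reporting a score as "70th percentile" places the test-taker along the rank-order continuum, which is exactly what an NRT sets out to do.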

CRITERION-REFERENCED TEST (CRT)

A collection of information about student progress or achievement in relation to a

specified criterion.

Standards serve as the criteria or yardstick for measurement.

Provide:

o Feedback to test-takers mostly in the form of grades, on specific course or

lesson objectives.

o Information on students’ mastery based on outcomes or objectives as

specified in the syllabus.

Advantage: allows test-takers to make inferences about how much language proficiency

they originally had and their successive gains over time.

NRT vs CRT

Definition / Purpose

o NRT: A test that measures student achievement as compared to others; determines performance differences among individuals and groups.

o CRT: An approach that provides information on a student's mastery based on a specified criterion; determines learning mastery against that criterion.

Test items / Frequency

o NRT: Items ordered from easy to difficult; discriminates examinees' ability.

o CRT: Items guided by minimum achievement in the related objectives; continuous assessment in the classroom.

Appropriateness / Examples

o NRT: Summative evaluation, e.g. UPSR, PMR, SPM, STPM.

o CRT: Formative evaluation, e.g. monthly tests, coursework, projects.

OBJECTIVE TEST

A test that consists of items with right or wrong answers or responses.

Can be marked objectively.

Popular because it is easy to prepare and quick to mark, providing a quantifiable and concrete result.

Focuses more on specific facts than general ideas and concepts.

Types:

o MCQ

o True/false items

o Matching items

o Fill-in-the-blank items.

MCQ

o Limit beneficial washback.

o Enable cheating

o Challenging to write successful items.

o Strictly limits what can be tested.

o Test only recognition knowledge.

o Encourage guessing.

SUBJECTIVE TEST

Evaluated by giving an opinion based on agreed criteria.

Include essay, short-answer questions, vocabulary tests, and take-home tests.

Provides more opportunity for test-takers to demonstrate their understanding and in-depth knowledge and skills in the subject matter.

Test-takers might provide acceptable answers that were not predicted.

Enables students to be more creative and critical.

E.g. extended-response items, restricted-response items, essay.


TOPIC 4 – BASIC PRINCIPLES OF ASSESSMENT

RELIABILITY

The degree to which an assessment tool produces stable and consistent results.

A reliable test is stable, consistent and dependable.

The same test, given to the same test-takers on two occasions, should yield the same results.

Lack of reliability threatens the test's validity.

A reliable test:

o Is consistent in its conditions across two or more administrations.

o Gives clear directions for scoring / evaluation.

o Has uniform rubrics for scoring / evaluation.

o Is unambiguous to test-takers.

Two types of reliability

o Rater reliability (markers)

- Inter-rater reliability – degree of agreement between two raters who score independently, without influencing one another.

- Intra-rater reliability – consistency of a single rater across scripts and occasions.

o Test administration reliability

- Deals with interference and conditions during the test administration.

- Outside interference – noise, temperature, variations in

photocopying, lighting, conditions of chairs and tables.
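Inter-rater reliability is often estimated as the correlation between two raters' independent scores. A minimal Python sketch, using a hand-rolled Pearson correlation and hypothetical marks for six scripts:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two raters' score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Hypothetical marks awarded independently by two raters to the same six scripts.
rater_a = [12, 15, 9, 18, 14, 11]
rater_b = [13, 14, 10, 19, 13, 12]

# A coefficient near 1.0 suggests the two raters rank the scripts consistently.
r = pearson(rater_a, rater_b)
print(round(r, 2))
```

A low coefficient would signal that the raters are applying the rubric differently and need standardization before marking continues.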

Factors affecting the reliability of a test:

o Test length factor

- Longer tests are more reliable – they reduce the effect of guessing.

o Teacher – student factor

- Teacher’s encouragement.

- Familiarity of test formats.

o Environmental factors

- Lighting

- Ventilation

- Comfortable setting

o Test administration factor

- Clear instruction

- Enough time

o Marking factors

- Markers’ objectivity and perception of the task of marking itself.

VALIDITY

Whether the assessment really measures what it claims to measure.

Types of validity

Face validity

o A test should look right so that students take it seriously.

o Should appear to measure the knowledge or skills it is made to measure.

o Affects how the students see and respond to the test (seriously or otherwise).

Content validity

o Is all the content learnt represented in the test?

o Can be checked against a table of test specifications (skills, levels, topics) covering all the items.

Construct validity

o Does the test measure the construct it claims to?

o Linguistic constructs – fluency, proficiency.

o E.g. an oral test measuring speed and rhythm.

Concurrent validity

o Administer another reputable test and compare it with the test set to measure students' performance.

Predictive validity

o Predicts the result on a later measure, e.g. a public exam from the trial paper (Pre-SPM to SPM).

o High predictive validity = predictable results on the latter measure.

PRACTICALITY

Is the assessment tool practical to use, and can it be set up within available means?

If carried out, would the effort of setting it up be worth the objectives of having the test in the first place?

Could the results be easily interpreted?

OBJECTIVITY

Refers to the ability of the examiners to be objective, impartial and unbiased.

Scores must not be awarded under the influence of examiners' emotions, opinions or individual marking styles.


WASHBACK EFFECT

Impact that tests have on teaching and learning.

Students’ mistakes can indicate areas of improvement.

Compliment students’ achievements.

Higher-stakes tests have greater washback effects.

AUTHENTICITY

Assessment should reflect the actual features of the target language task.

Students are motivated when tasks involve real-world situations and contexts.

INTERPRETABILITY

Involves how meaning is assigned to the scores.

Take into consideration these factors:

o Reliability.

o Validity

o Scores, norms and related technical features.

o Administration and scoring variation.

TOPIC 5 – DESIGNING CLASSROOM LANGUAGE TEST

Stages of Test Planning

Determining

o Who is to be tested?

o What is the purpose of testing?

o What is to be tested?

o What is the scope of the test?

o How detailed and accurate must the results be?

Planning

o Set a specification for the test (Table of Specifications, etc.).

o Include information on content, format and timing, criteria, and the level of performance, and determine them.

Writing

o Construct test items.

o Collaborate with colleagues to find faults in the test paper.

Preparing

o Prepare the test items based on principles and techniques.

Reviewing

o The test should not be reviewed straight away after construction.

o Other teachers / native speakers need to review the prepared test paper.

Pre-testing

o Try the test out on a similar target group to analyse individual items as well as the whole test.

Validating

o Validate the items using the formulae:

  IF (item facility) = Σc (no. of correct answers) / N (no. of students)

  ID (item difficulty) = Σw (no. of wrong answers) / N (no. of students)

o Values range from 0 to 1 (< 0.37 – difficult, > 0.63 – easy).
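The item facility calculation and the 0.37 / 0.63 cut-offs can be sketched in a few lines of Python; the item counts below are hypothetical:

```python
def item_facility(correct, total):
    """IF = number of correct answers / number of students (range 0 to 1)."""
    return correct / total

def classify(if_value):
    """Interpret IF against the cut-offs used in these notes."""
    if if_value < 0.37:
        return "difficult"
    if if_value > 0.63:
        return "easy"
    return "moderate"

# Hypothetical item: 30 of 40 students answered correctly.
iv = item_facility(30, 40)
print(iv, classify(iv))

# Item difficulty is the complement: wrong answers / students, i.e. 1 - IF.
print(1 - iv)
```

Items that come out "easy" or "difficult" at pre-testing can then be revised or replaced before the final paper is assembled.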


PREPARING A TEST SPECIFICATION / BLUEPRINT

Outlines the test construct and what the test will look like.

Covers:

o Content description

o Item types (MCQ, cloze, etc.)

o Tasks (essay, short passage)

o Skills included

o Scoring

o Reporting
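The blueprint elements above can be sketched as a simple data structure; the content description, item types and mark allocations below are hypothetical examples, not a prescribed format:

```python
# A minimal test-blueprint sketch covering content, item types, skills and scoring.
blueprint = {
    "content": "Unit 3: Narrative reading comprehension",  # hypothetical unit
    "items": [
        {"type": "MCQ",   "skill": "reading", "count": 20, "marks_each": 1},
        {"type": "cloze", "skill": "grammar", "count": 10, "marks_each": 1},
        {"type": "essay", "skill": "writing", "count": 1,  "marks_each": 20},
    ],
    "reporting": "raw score out of total, converted to a grade",
}

# The total mark allocation falls out of the item specification directly.
total = sum(i["count"] * i["marks_each"] for i in blueprint["items"])
print(total)
```

Writing the blueprint down this explicitly makes it easy to check that the planned skills, item types and weightings actually match the course objectives before any items are written.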

BLOOM’S TAXONOMY

Cognitive Dimension

Definition Key-words

Remembering Recalling information

Retrieving knowledge from long term memory

Recognizing Naming Listing

Describing

Understanding Explaining ideas or concepts

Construct meaning from instructional messages.

Interpreting Summarizing

Classifying Explaining

Applying Using information in another familiar situation.

Apply a procedure to a familiar task.

Implementing Carrying out

Executing

Analyzing Break materials into parts to explore understandings and relationships.

Comparing Organizing

Deconstructing

Evaluating Justifying a decision or making judgments based on criteria and standards.

Checking Hypothesizing

Critiquing Experimenting

Evaluating

Creating Generating new ideas, putting elements together to form coherent whole or new pattern.

Designing Constructing

Planning

SOLO (STRUCTURE OF THE OBSERVED LEARNING OUTCOME) TAXONOMY

Describes how a learner’s performance develops from simple to complex levels.

5 stages:

o Quantitative phase

Pre-structural

Uni-structural

Multi-structural

o Qualitative phase

Relational

Extended abstract

Prestructural

o Description: Incompetence; misses the point.

o Sample verbs: fail; incompetent.

Unistructural

o Description: One relevant aspect.

o Sample verbs: identify, name, follow simple procedure.

Multistructural

o Description: Several relevant, independent aspects.

o Sample verbs: combine, describe, enumerate, perform serial skills, list.

Relational

o Description: Aspects integrated into a structure.

o Sample verbs: analyse, apply, argue / compare, criticize, relate.

Extended abstract

o Description: Generalized into a new domain.

o Sample verbs: create, formulate, generate, hypothesize.


GUIDELINES FOR CONSTRUCTING TEST ITEMS

Constructing test items must take into consideration the following:

Aims of the test

Range of the topics to be tested

Range of skills to be tested.

Test format

Level of difficulty

International and cultural considerations

TEST FORMATS

Depends on the test itself

UPSR - paper 1 and paper 2

TOEFL – internet based test and paper based test

IELTS – four language skills

MUET – similar to IELTS

TOPIC 6 – ASSESSING LANGUAGE SKILLS CONTENT


OBJECTIVE AND SUBJECTIVE TEST

Select type tests – objective (MCQ, TRUE/FALSE)

Supply type tests – subjective (ESSAY, DICTATION)

Types of tests according to students’ expected response

o Selected response

True false

Matching

Multiple choice

o Constructed response

Fill in

Short answer

Performance test

o Personal response

Conferences

Portfolios

Self and peer assessments

TYPES OF TEST ITEMS TO ASSESS LANGUAGE CONTENT

Discrete Point Test

o Examine one element at a time

o Language is seen to be made up of smaller units, and it may be possible to test language by testing one unit at a time.

o Example – MCQ

Integrative Test

o Requires the candidates to combine many language elements in the

completion of a task.

o A simultaneous measure of knowledge and ability of a variety of language

features, modes or skills.

o Example – essays

Communicative Test

o Testing involves many aspects that revolve around communicative elements and meaningful content.

o Involves performance, authenticity and scored on real-life outcomes (usually

behavioral)

TOPIC 7 – SCORING, GRADING AND ASSESSMENT CRITERIA

APPROACHES TO SCORING

1. OBJECTIVE APPROACH / METHOD

Relies on quantified methods of evaluating students’ writing

o Establish standardization – limit length of assessment (word count)

o Identify elements to be assessed – identify mistakes (spelling error,

grammar mistakes, vocabulary, etc)

o Operationalize the assessment – assign a weighted score to each error, according to how much it distorts readability or flow.

o Quantify the assessment using the correctness score:

  Correctness score = word limit / sum of error scores
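The correctness score (word limit divided by the sum of weighted error scores) can be computed directly; the word limit and error weights below are hypothetical:

```python
def correctness_score(word_limit, error_scores):
    """Correctness score = word limit / sum of weighted error scores."""
    return word_limit / sum(error_scores)

# Hypothetical 200-word task; each error was assigned a weight when marking
# (e.g. spelling errors weighted 1, grammar errors weighted 2).
errors = [1, 1, 2, 2, 2, 1, 1]
score = correctness_score(200, errors)
print(score)
```

A higher score means fewer (or less disruptive) weighted errors per word of the set task, which is what makes the objective approach quantifiable and comparable across scripts.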

2. HOLISTIC APPROACH

Reader reacts to the students’ compositions as a whole and a single score is

awarded to the writing.

Each score will be accompanied with a general descriptor of abilities (rubric)

Primary trait scoring – particular functional focus is selected which is based on the

purpose of the writing and grading is based on how well the student is able to

express that function.

Emphasizes functional and communicative ability rather than formal accuracy.

3. ANALYTIC APPROACH

A familiar approach for many evaluators.

Raters assess students’ performance on a variety of categories which are

hypothesized to make up the skill of writing

Example:

Component – Weight

o Content – 30%

o Organization – 40%

o Vocabulary – 30%

The points assigned to each component reflect the importance of that component.
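The analytic score follows directly from the component weights in the example above; the component marks below are hypothetical (each out of 100):

```python
# Weights in percent, as in the example table; marks are hypothetical.
weights = {"content": 30, "organization": 40, "vocabulary": 30}
marks = {"content": 70, "organization": 60, "vocabulary": 80}

# The overall score is the weighted sum of the component marks.
overall = sum(marks[c] * w for c, w in weights.items()) / 100
print(overall)
```

Because each component is scored separately before weighting, the analytic approach also yields per-component feedback (here: organization is the weakest area) that a single holistic score would hide.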


COMPARING THE THREE APPROACHES

Holistic

o Advantages:

- Quickly graded.

- Provides a public standard that is understood by teachers and students alike.

- Relatively higher degree of rater reliability.

- Applicable to the assessment of many different topics.

- Emphasizes students' strengths rather than weaknesses.

o Disadvantages:

- A single score may actually mask differences across individual compositions.

- Does not provide a lot of diagnostic feedback.

Analytic

o Advantages:

- Provides clear guidelines for grading in the form of the various components.

- Allows graders to consciously address important aspects of writing.

o Disadvantages:

- Writing ability is unnaturally split up into components.

Objective

o Advantages:

- Emphasizes the students' strengths rather than their weaknesses.

o Disadvantages:

- Still some degree of subjectivity involved.

- Accentuates negative aspects of learners' writing without giving credit for what they can do well.

TOPIC 9 – REPORTING OF ASSESSMENT DATA

PURPOSES OF REPORTING

To obtain information concerning a particular behaviour or characteristic.

Based on this information, several types of decisions can be made.

o Instructional decisions

o Grading decisions

o Diagnostic decisions

o Selection decisions

o Placement decisions

o Counselling & guidance decisions

o Program or curriculum decisions

o Administrative decisions

REPORTING METHODS

Student achievement and progress can be reported by comparing:

Norm-referenced assessment and reporting

o Assessing and reporting students’ achievement and progress in comparison to

other students.

Criterion-referenced assessment and reporting

o Assessing and reporting in comparison to predetermined criteria.

Outcome-based approach

o Provide information that students can be working towards syllabus outcomes

anywhere along the continuum of learning.

PRINCIPLES OF EFFECTIVE AND INFORMATIVE ASSESSMENT AND REPORTING

Has clear, direct links with outcomes

Is integral to teaching and learning

Is balanced, comprehensive and varied.

Is valid.

Is fair.

Engages the learner.

Values teacher judgment.

Is time-efficient and manageable.


Recognizes individual achievement and progress

Involves a whole school approach

Actively involves parents

Conveys meaningful and useful information.

TOPIC 10 – ISSUES AND CONCERNS RELATED TO ASSESSMENT IN MALAYSIAN

PRIMARY SCHOOLS

EXAM ORIENTED SYSTEM

The current education system is too examination-oriented and over-emphasizes rote learning, with institutions of higher learning fast becoming diploma mills.

Too focused on public examination results as important determinants of students' progression to higher levels of education or occupational opportunities.

PTS (defunct) > UPSR > PMR > SPM / STAM / STPM

Summative national examinations themselves should not have any negative impact on students – the issue is that they do not currently test the full range of skills that the education system aspires to produce.

LP (Lembaga Peperiksaan, the Examinations Syndicate) has started some reforms – PBS (SCHOOL-BASED ASSESSMENT)

o School assessment – assesses students learning. Test questions and marking

schemes are developed, administered, scored and reported by school teachers

based on LP guidance.

o Central assessment – developed wholly by LP itself.

o Psychometric assessment – aptitude tests and personality inventory to assess

students’ skills, interests, aptitude and attitude and personality. Assess

students’ innate ability and acquired abilities.

o Physical, sports and co-curricular activities assessment – refers to

assessments of student performance and participation in physical and health

education, sports, uniformed bodies, etc.

o This new format enables a broader assessment range over a longer period of time.

o Teacher can focus more on delivering meaningful learning as stipulated in the

curriculum.

SCHOOL-BASED ASSESSMENT

Traditional assessment methods are no longer adequate in today's setting.

School-based assessments have immense potential in terms of flexibility and validity, but a potential drawback in terms of reliability, quality control and assurance.

Recommended because of the gains in validity when students' performance on assessed tasks can be judged across a greater range of tasks and contexts than is possible in time-limited tests.


Malaysian SBA context

o Standard-referenced assessment

o Holistic

o Integrated

o Balanced

o Robust

Components

o Academic

School assessment (Performance Standards)

Centralized Assessment

o Non-academic

Physical activities, co-curricular activities

Psychometric tests

ALTERNATIVE ASSESSMENTS

Assessment procedures that differ from traditional notions of testing.

PORTFOLIOS

Contain:

o Introductory section

o Academic works section

o Personal section

o Assessment section

SELF ASSESSMENT AND PEER ASSESSMENT

Remind learners that they are not working in isolation.

Help create a community of learners

Improve the product

Improve the process

Help learners to be reflective

Stimulate metacognition

TOPIC 8 – ITEM ANALYSIS AND INTERPRETATION
