Data in empirical research Some fundamental issues Daniel Gile [email protected] www.cirinandgile.com 1 D Gile Data in empir res

Upload: christopher-sparks

Post on 08-Jan-2018


Page 1: Data in empirical research Some fundamental issues Daniel Gile  1D Gile Data in empir res

Data in empirical research Some fundamental issues

Daniel Gile [email protected] www.cirinandgile.com

1D Gile Data in empir res

Page 2

Reminder: Data, the foundation of progress in CSA (1)

In HSA, scholars can observe reality, and then speculate and theorize with much freedom

The norms of caution and rigorous inferencing make this impossible in CSA

In CSA, theoretical speculation is acceptable
- As a starting point for further empirical exploration
- As a basis for theory construction, but the theory will need to be tested empirically
- As tentative ideas to explain findings

But unlike the situation in HSA, in CSA all progress is by definition based on data and their analysis

Page 3

Reminder: Data, the foundation of progress in CSA (2)

So the quality of research is limited by the quantity and quality of the data on which it is based

In many cases, it is difficult to:
- Collect valid, relevant data
- Measure the data in a way that will help advance towards finding an answer to the research question(s)
- Extrapolate from the data that can be collected on part of the environment or population to the whole environment or population to which the research question(s) apply

If the data are not valid or representative of the population, no reliable inferences on the population can be made

If you cannot measure them adequately, they are of limited use

Page 4

Collecting data – Access and indicators

Access to the data is often problematic: cost, confidentiality, difficulty of detection…
- Cost and complexity of technical equipment
- Physical access to the location
- Permission to observe/record…

But more fundamentally
- How do you gain access to the content of dreams?
- How do you gain access to mental processes?
- How do you gain access to skills for observation?

You cannot observe them directly
What you generally observe (and measure) are indicators

In other words, data are not the phenomenon itself, but an indicator of the phenomenon – more later

Page 5

Collecting data – Identifying target data

When collecting data on a phenomenon or an indicator, it is not always easy to distinguish the target data from other information picked up

When studying language skills and using errors and infelicities as an indicator,
how do you identify errors and infelicities in linguistic data?

When studying translation tactics (decisions made when confronting a problem),
how do you distinguish between the result of a tactic and the result of insufficient skills? (e.g. omissions, small semantic changes)

Page 6

Problems with data validity (1)

Reminder: Research explores various phenomena in Reality
Generally, data are not the phenomena themselves, but something believed to ‘correspond’ to them in some way

For instance, when studying voting behavior, the data used, e.g. the number of ballots cast in favor of a certain candidate, are not the voting behavior itself. They are something that reflects voting behavior.

One could say that generally, data are indicators
Though the term ‘indicator’ tends to be used for ‘something’ that is even more remotely connected to the reality it is supposed to represent

Data are said to be valid if they correspond strongly to what they are supposed to correspond to.

Page 7

Problems with data validity (2)

Data are valid if one or some of their features correspond strongly to what they are supposed to correspond to in the object of study.

Such correspondence may be required for detection only
i.e. if and only if a particular feature of the object of study exists, the data take on a particular feature and vice-versa
(the presence of particular objects on archaeological sites is valid data to indicate skills/beliefs/rituals in the population which lived in these particular sites)

Quantitative correspondence may be required in other cases
(e.g. measuring the amount of radioactivity, of a particular chemical substance etc…)

Page 8

Data validity – uncertain correspondence (1)

Voting statistics are a valid indicator of voting behavior
What about voting intentions as stated in interviews? Are they valid as an indicator of voting behavior?

They say something about voting behavior, but that something is not enough to determine how people are going to vote

Because:
- Some people may change their mind
- Some people do not speak the truth

[Diagram: Data, Phenomenon]

Page 9

Data validity – uncertain correspondence (2)

One frequent problem with data validity is the uncertain correspondence between the data and the target phenomenon

e.g.
- Native speakers’ assessment of a non-native speaker’s mastery of their language
  (How sensitive are they to errors and infelicities? What are their personal norms? What are their expectations?…)
- Students’ assessment of their teachers
  (Personal bias, political correctness…)

Problems because of interference from affective factors + (often subconscious) desire to preserve self-image

Ex.: In Translation Studies, relative weight of quality components

This problem is particularly frequent in behavioral sciences

Page 10

Data validity – partial correspondence (1)

Are police reports about sexual assaults a valid indicator of actual sexual assault activity in a given city?

Most police reports about sexual assaults probably report genuine sexual assaults, but there are many assaults which are never reported because the victims are afraid or ashamed to report them

So the data are valid for one part of the phenomenon only

[Diagram: Data, Phenomenon?]
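The arithmetic behind this point can be sketched in a few lines of Python. This is only an illustration: the count of 200 reports and the three reporting rates are invented, and `implied_actual_cases` is a hypothetical helper, not something taken from the slides. The sketch shows that the same count of reports is compatible with very different sizes of the underlying phenomenon, depending on a reporting rate that the data themselves do not reveal.

```python
# Hypothetical figure, invented for illustration only.
reported_cases = 200


def implied_actual_cases(reported, reporting_rate):
    """Number of actual cases implied by a report count and an ASSUMED reporting rate."""
    if not 0 < reporting_rate <= 1:
        raise ValueError("reporting_rate must be in (0, 1]")
    return reported / reporting_rate


# The reporting rate is unknown, so each value below is only a guess.
for rate in (0.8, 0.5, 0.1):
    actual = implied_actual_cases(reported_cases, rate)
    print(f"assumed reporting rate {rate:.0%}: about {actual:.0f} actual cases")
```

Unless the reporting rate can be estimated from another source, the reports say little about the size of the whole phenomenon.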

Page 11

Data validity – partial correspondence (2)

When data are valid for one part of the phenomenon only, whereas exploration of the whole phenomenon is sought

How safe is it to extrapolate from info on part of the phenomenon only?

(This is distinct from the issue of representativeness, taken up later)

Example: A single test to test language proficiency?
Language proficiency is multi-dimensional
(declarative knowledge, procedural knowledge, distinct skills like pronunciation, fluency, reading ability, listening comprehension ability, flexibility in using various registers…)

Page 12

Validity of other research environment components

The validity of the data/the indicator chosen is not the only validity issue in empirical research

As will be seen later, especially in experimental research, ecological validity can be an issue
- Task
- Environment
- Participants

Page 13

Measurable data

Often, advancing towards an answer to the research question(s) requires some kind of measurement of data
(intensity, magnitude, amount, frequency…)

In some cases, this is rather easy
(thermometer, number of ballots cast, money/time spent…)

In other cases, it is difficult
(intensity of feelings, ‘amount’ of deviation from a norm…)

Page 14

Representative data (1)

Generally, it is not possible to have data on the whole object of study
(cost, time [including future], physical access…)

You can only access data on part of it
They may be valid and measurable, but are they representative of the whole object of study? Or of part of it only?

[Diagram: Phenomenon, Data]

Page 15

Representative data (2)

- If the phenomenon is very homogeneous
- If the accessible part has the same relevant features as the whole

The data are said to be representative
If not, you cannot legitimately make inferences from your sample on the whole

[Diagram: Phenomenon, Data]

Page 16

Validity and Representativeness

They are not the same:
Data can be valid, that is, provide reliable indications on part of a phenomenon/object of study
(for instance, on a sample of people from a population)

Without being representative
Because it is possible that the characteristics of the sample are different from the characteristics of the population
(for instance, the average height of a population, if the sample of people used has a high proportion of basketball players)
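The basketball example can be made concrete with a small sketch. All heights below are invented, and the make-up of the population (five players among 505 people) is an assumption chosen only to make the contrast visible. Every individual measurement is taken to be accurate, so the data are valid, yet the sample mean is a poor estimate of the population mean because players are over-represented in the sample.

```python
from statistics import mean

# Hypothetical heights in cm, invented for illustration.
general_public = [158, 162, 165, 168, 170, 172, 175, 178, 180, 183]
basketball_players = [196, 198, 201, 203, 206]

# Assumed population: players are rare.
population = general_public * 50 + basketball_players
# Assumed sample: players are heavily over-represented.
biased_sample = general_public[:3] + basketball_players

population_mean = mean(population)
biased_mean = mean(biased_sample)

print(f"population mean height:    {population_mean:.1f} cm")
print(f"biased sample mean height: {biased_mean:.1f} cm")
```

The gap between the two means comes entirely from the composition of the sample, not from any measurement error.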

Page 17

Priorities and strategies

Validity is particularly important
- Scientifically legitimate inferences on a phenomenon can only be made if the data are valid

Representativeness is less of a problem
- Provided no generalization is asserted

Measurability can be important
- If only to measure the actual impact of a particular factor or feature on the object of study
- Sometimes, measurability can be constructed (scales)
- But limited measurability does not mean nothing can be learned about the object of study → Qualitative research

Page 18

The effects of variability

One other important issue in empirical research is variability

Variability can be intrinsic to the phenomenon
(for instance in meteorological phenomena)

It can also be a feature of the data collected, due to
- Intrinsic variability in the phenomenon and/or
- Heterogeneity in the phenomenon and/or
- Variability in the collection procedures

Its effects can be very large

Page 19

CASE STUDY (FICTION): THE EFFECT OF EXPERIENCE ON TRANSLATION QUALITY

• Suppose you want to investigate the effect of experience on translation quality

• Suppose that in reality, on average, there is a fast progression along the learning curve during the first 5 years, and over the next decades, translators continue to improve, but at a lower and lower speed

Page 20

“REAL” AVERAGE PERFORMANCE VS. EXPERIENCE

As measured by some valid indicator on a scale from 1 to 10

Exper.   0 yrs   5 yrs   10 yrs   15 yrs   20 yrs   25 yrs
Qual.    1       5       7        8        8.5      8.8

Page 21

“Real” average learning curve

[Figure: “Real” average learning curve; vertical axis: Quality, from 0 to 10]

Page 22

Effects of attitude

- The translators’ attitude towards translation may influence the quality of their work.

- Attitudes may change over time

- Suppose that attitudes are very positive in the beginning, that they become negative after a while because translators are disappointed with market conditions, and that they gradually become more positive when they adapt to the situation.

Page 23

Experience vs. Attitude

Very positive to very negative to positive

Exp.     0 yrs   5 yrs   10 yrs   15 yrs   20 yrs   25 yrs
Attit.   +++     ++      ---      -        +        +

Page 24

The effect of attitude: two scenarios

Exp.           0 yrs   5 yrs   10 yrs   15 yrs   20 yrs   25 yrs
Large influ.   +3      +2      -3       -1       +1       +1
Small influ.   +0.3    +0.2    -0.3     -0.1     +0.1     +0.1

Page 25

The effect of attitude

[Figure: quality (axis from 0 to 12) against years of experience (0 to 25); three series: Real, Large influ., Small influ.]

Page 26

The effect of attitude

- In the small influence scenario, the output pattern is only changed marginally

- In the large influence scenario, it is changed considerably. In particular, real improvement seems to occur only after 10 years of experience.
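The two scenarios can be worked through numerically. The sketch below uses the quality values and attitude effects given in the case study; the one assumption is that the attitude effect simply adds to the “real” quality score, which is how the signed values read but is not stated explicitly. The helper name `observed` is hypothetical.

```python
years = [0, 5, 10, 15, 20, 25]
real = [1, 5, 7, 8, 8.5, 8.8]                    # "real" average quality
large_influence = [3, 2, -3, -1, 1, 1]           # attitude effect, large scenario
small_influence = [0.3, 0.2, -0.3, -0.1, 0.1, 0.1]  # attitude effect, small scenario


def observed(real_quality, attitude_effect):
    """Observed quality, ASSUMING the attitude effect is additive."""
    return [round(q + e, 1) for q, e in zip(real_quality, attitude_effect)]


observed_large = observed(real, large_influence)
observed_small = observed(real, small_influence)

for y, r, lg, sm in zip(years, real, observed_large, observed_small):
    print(f"{y:>2} yrs  real={r:<4} large={lg:<4} small={sm}")
```

Under this assumption, the large-influence series shows the same observed quality at 10 years as at 0 years, which is why improvement appears to start only after year 10, while the small-influence series stays close to the real curve.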

Page 27

Controllability

- Experimenters may be able to control attitude, for instance by telling participants that the quality of their output is important, or that they will be assessed by peers, etc.

- But it is not realistic to assume they can control everything – the participants’ personality, fatigue, biorhythm, likes or dislikes of certain types of texts, themes, etc.

Page 28

The effect of uncontrolled variability

Assume a variability of up to ±30%, either intrinsic or from uncontrolled factors:

Exp.   0 yrs   5 yrs   10 yrs   15 yrs   20 yrs   25 yrs
Var.   +30%    -30%    -30%     +30%     -20%     -30%

Page 29

The effect of uncontrolled variability

[Figure: quality (axis from 0 to 12) against years of experience (0 to 25); two series: Real, w/ variab.]

Page 30

The effect of variability

- With such variability, very common in empirical studies in translation and interpreting (actually, in such studies variability is often of several hundred percent), the underlying “true” pattern is severely distorted

- In particular, from the data, it seems that improvement occurs for 15 years, after which there is a steady decline in the quality of the translation output.
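The distortion can be checked by applying the stated variability to the “real” curve. The sketch below assumes that each percentage is relative to the real value at that point (for instance, -30% at 5 years means 70% of the real score); the slides give the percentages but not the formula.

```python
years = [0, 5, 10, 15, 20, 25]
real = [1, 5, 7, 8, 8.5, 8.8]                         # "real" average quality
variation = [0.30, -0.30, -0.30, 0.30, -0.20, -0.30]  # uncontrolled variability

# ASSUMPTION: variability acts multiplicatively on the real value.
measured = [round(q * (1 + v), 2) for q, v in zip(real, variation)]

peak_year = years[measured.index(max(measured))]

for y, r, m in zip(years, real, measured):
    print(f"{y:>2} yrs  real={r:<4} measured={m}")
print(f"apparent peak at {peak_year} years")
```

The measured series rises up to year 15 and then falls, although the real curve rises throughout.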

Page 31

Consequences and conclusion (1)

Variability is a major enemy of research, in that it is likely to hide ‘true’ trends and suggest false trends.

In experiments, some variability is counter-balanced by control over relevant variables, both in sampling and in the control of environmental and independent variables

Variability is further reduced by strict design and implementation of the experimental procedure

Replications also reduce the effects of variability by providing data for different constellations of parameters
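One way to see why replications help is a small simulation. This is a sketch under strong assumptions that are not in the slides: each replication is modelled as the “real” curve disturbed by independent, uniformly distributed variability of up to ±30%, and the replications are simply averaged. The function names, the seed, and the number of replications are all hypothetical choices.

```python
import random
from statistics import mean

real = [1, 5, 7, 8, 8.5, 8.8]  # "real" average quality from the case study


def run_study(rng, max_variation=0.30):
    """One simulated study: real values disturbed by uniform noise (an assumption)."""
    return [q * (1 + rng.uniform(-max_variation, max_variation)) for q in real]


def average_over_replications(n, seed=0):
    """Average the measured values, point by point, over n simulated replications."""
    rng = random.Random(seed)
    runs = [run_study(rng) for _ in range(n)]
    return [mean(column) for column in zip(*runs)]


single = average_over_replications(1)
pooled = average_over_replications(400)

for r, s, p in zip(real, single, pooled):
    print(f"real={r:<4} one study={s:.2f}  mean of 400={p:.2f}")
```

Real replications differ from each other in more ways than random noise, so this only illustrates the averaging effect, not how replications work in practice.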

Page 32

Consequences and conclusion (2)

But in behavioral sciences, residual variability is often very large

If you plan to do experimental research, expect to find high variability, and do not be disappointed if this happens.

Unless you need to arrive at a ‘clear-cut result’, results that are not clear-cut can also be of interest

They may show for instance that there is no regular, clear ‘superiority’ of one method or one condition over another

So don’t let the probability of not reaching ‘significance’ stop you from doing the research.

Page 33

The sensitivity of indicators/tools (1)

The concepts of ‘signal’ and ‘noise’ (from radio transmission):

In empirical research, when seeking to collect data, you need tools with a certain sensitivity

For instance, casual listeners will not necessarily spot traces of foreign accent or infelicities in a non-native speaker
Their sensitivity to these phenomena may be too low
And they will miss the ‘signal’ which is supposed to be detected

Other listeners may be too sensitive and mistake ‘native’ deviations from norms for signs of non-native language use
(certain violations of rules of grammar, false cognates…)

Page 34

Sensitivity of data collection tools (2)

[Diagram: three levels, a, b and c, along a ‘Sensitivity’ axis]

- At level a: Not sensitive enough. Does not pick up the signal, or picks up part of it only
- At level b: Appropriate sensitivity. Picks up the signal, not the noise
- At level c: Too sensitive. Picks up the signal and the noise
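The three levels can be illustrated with a toy detector. Everything in this sketch is invented for illustration: the observations, their strengths on an arbitrary 0 to 10 scale, and the idea that a tool's sensitivity can be reduced to a single minimum strength it registers (a lower minimum means a more sensitive tool).

```python
# Hypothetical observations: (label, strength on an arbitrary scale, genuine signal?)
observations = [
    ("marked foreign accent", 9, True),
    ("recurring grammatical error", 6, True),
    ("slight infelicity", 5, True),
    ("native speaker's slip of the tongue", 3, False),
    ("native regional usage", 2, False),
]


def picked_up(min_strength):
    """Labels registered by a tool that detects anything at or above min_strength."""
    return [label for label, strength, _ in observations if strength >= min_strength]


# Assumed thresholds for the three sensitivity levels.
levels = {"a (not sensitive enough)": 8, "b (appropriate)": 4, "c (too sensitive)": 1}

signal = [label for label, _, is_signal in observations if is_signal]

for name, threshold in levels.items():
    found = picked_up(threshold)
    missed = [s for s in signal if s not in found]
    noise = [f for f in found if f not in signal]
    print(f"level {name}: missed signal={missed}, noise picked up={noise}")
```

In practice signal and noise often overlap in strength, so no single threshold separates them as cleanly as in this toy case.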

Page 35

The sensitivity of indicators/tools (3)

Very high sensitivity which may pick up the ‘noise’ (i.e. non-signal) is all right if it is then possible to filter out the noise from the signal

But often, this is not possible,
because the noise is very similar to the signal

Other tactics may help
One is triangulation, i.e. using a different method to throw a different light on the phenomenon/data, including qualitative methods