VALIDITY AND RELIABILITY OF RESEARCH INSTRUMENT
JEFFERSON S. VALDEZ, MAEd - Educ. Mgt.

Posted on 03-Oct-2015


  • Measuring instruments are the devices used to gather or collect data; they are important because the success or failure of a study rests on the data gathered.

  • The significance of any research paper, or its entirety for that matter, can be put to waste if the instrumentation is questionable.

    As a researcher, you are thus cautioned to exercise extra care in designing the data-collection procedures you will employ in your research, especially in choosing or constructing research instruments.

  • CRITERIA OF A GOOD INSTRUMENT

    VALIDITY (truthfulness)

    RELIABILITY (consistency & accuracy)

  • VALIDITY - the degree to which a test or measuring instrument measures what it intends to measure.

    It has to do with soundness: what the test or questionnaire measures and how effectively it measures it.

    It is the degree to which a test measures what it really purports to measure.

  • Validity deals with the relationship of the data obtained to the nature of the variable being studied.

    The degree of validity is determined through indirect measures.

  • TYPES OF VALIDITY

    Content validity
    Concurrent validity
    Criterion-related validity
    Construct validity
    Predictive validity

  • 1. CONTENT VALIDITY - the degree to which an instrument measures an intended content area.

    It is the degree to which the test represents the essence of the topics it is designed to measure.

    It is reported in terms of non-numerical data, unlike the other types of validity.

  • Ways to achieve a high degree of content validity:

    Documentary analysis or pre-survey
    Development of a Table of Specifications (TOS)
    Consultation with experts
    Item writing

  • Documentary analysis or pre-survey

    At this stage, the researcher becomes familiar with the theoretical constructs directly related to the test. The review of related literature and studies provides comprehensive knowledge of the nature of the test criterion.

    Focus on the tests used, the purposes of those tests, the areas covered, the format, scaling techniques, etc.

  • A pre-survey may therefore start the development phase of the instrument that you are constructing.

  • Developing an Evaluation Scale for a School System

    Administering a two-item questionnaire to a small group of respondents can generate areas and items for the test you will construct. In developing an evaluation scale for a school system, for instance, the researcher can ask these two questions of a group of teachers and/or school heads:

    1. What conditions in your school enhance children's learning?
    2. What conditions in your school reduce pupils' learning?

    The lists of conditions generated can become a rich source for constructing the test items.

  • Development of a Table of Specifications (TOS)

    A detailed TOS includes the areas or concepts, objectives, number of items, and percentage or proportion of items in each area.

    It is advisable to make a 50 to 100% allowance in the construction of items.

  • Consultation with experts

    The researcher's competence in judging the instrument is limited; at this point, it is advisable to consult the thesis adviser or authorities who have the expertise to judge the representativeness or relevance of the entries made in the TOS.

  • Item writing

    At this stage, the writer's creative talent comes into play in deciding what types of items to construct, the type of instrument, scoring techniques, etc.

    The quality of the test items therefore depends to a considerable extent upon the researcher's ability to produce ideas and translate them into items that satisfy the TOS.

  • 2. CONCURRENT VALIDITY - the degree to which the test agrees or correlates with a criterion set up as an acceptable measure. The criterion is always available at the time of testing.

    It is the correspondence of one measure of a phenomenon with another measure of the same construct, administered at the same time.

    Two tools are used to measure the same concept, and a correlation analysis is then performed. The tool that has already been demonstrated to be valid is the gold standard with which the other measure is correlated.

  • Example: A researcher wishes to validate a Biology achievement test he has constructed. He administers this test to a group of Biology students. The result is correlated with an accepted Biology test that has previously been proven valid. If the correlation is high, the Biology test he has constructed is valid.
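    The correlation step in this example can be sketched in Python. The score lists below are invented illustrative data (not from the source), and the Pearson coefficient is computed directly rather than with a statistics library.

    ```python
    # Concurrent-validity sketch: correlate scores on a newly constructed
    # Biology test with scores on an already-validated "gold standard" test.
    # Both score lists are made-up illustrative data.

    def pearson_r(x, y):
        """Pearson product-moment correlation between two paired score lists."""
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        sxx = sum((a - mean_x) ** 2 for a in x)
        syy = sum((b - mean_y) ** 2 for b in y)
        return sxy / (sxx * syy) ** 0.5

    new_test = [78, 85, 62, 90, 71, 88, 66, 80]    # researcher's Biology test
    gold_test = [75, 88, 60, 93, 70, 85, 64, 83]   # previously validated test

    print(f"r = {pearson_r(new_test, gold_test):.2f}")
    ```

    A high r here would support concurrent validity; a library routine such as `scipy.stats.pearsonr` gives the same coefficient plus a significance test.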

  • 3. CRITERION-RELATED VALIDITY - characterized by prediction of an outside criterion and by checking a measuring instrument, either now or in the future, against some outcome or measure.

    The difficulty usually met in this type of validity is in selecting or judging which criterion should be used to validate the measure at hand.

  • Example: If the criterion set for professionalism in nursing is belonging to nursing organizations and reading nursing journals, then couldn't we count memberships and subscriptions to come up with a professionalism score?

  • 4. PREDICTIVE VALIDITY - determined by showing how well predictions made from the test are confirmed by evidence gathered at some subsequent time.

    The criterion measure against which this type of validity is checked is important because the outcome for the subjects is being predicted.

    It is the ability of one measure to predict another, future measure of the same concept.

  • Example: The researcher wants to estimate how well a student may be able to do in graduate school courses on the basis of how well he has done on tests he took in undergraduate courses.

    The criterion measure against which the test scores are validated becomes available only after a long interval.

  • Example: If IQ predicts SAT and SAT predicts QPA, then shouldn't IQ predict QPA (couldn't we skip SATs for admission decisions)?

    If scores on a parenthood readiness scale indicate levels of integrity, trust, intimacy, and identity, couldn't this test be used to predict successful achievement of the developmental tasks of adulthood?

  • 5. CONSTRUCT VALIDITY - sometimes called concept validity.

    It is the extent to which a test measures a theoretical construct or trait. This involves constructs such as understanding, appreciation, and interpretation of data.

  • The main concern lies in the property being measured or the meaning of the test rather than the test itself.

    Examples are intelligence and mechanical aptitude tests.

  • Example: A researcher wishes to establish the validity of an IQ (Intelligence Quotient) test using the Wechsler Adult Intelligence Scale (WAIS). He hypothesizes that students with high IQ also have high achievement and those with low IQ, low achievement. He therefore administers both the WAIS and achievement tests to two groups of students, with high and low IQ respectively. If the results agree with the hypothesis, the test is valid.

  • RELIABILITY - the test is dependable, self-consistent, and stable.

    It is concerned with the consistency of responses from moment to moment.

    The instrument yields the same results over repeated measures and subjects.

  • Four methods of estimating the reliability of a good research instrument:

    1. Test-retest method
    2. Parallel-forms method
    3. Split-half method
    4. Internal consistency method

  • 1. TEST-RETEST METHOD - the research instrument is administered twice to the same group of subjects and the correlation coefficient is determined.

    The limitations of this method are:

    1. When the time interval is short, the subjects may recall their previous responses, and this tends to make the correlation high.
    2. When the time interval is long, such factors as unlearning and forgetting, among others, may occur and may result in a low correlation.

  • 3. Regardless of the time interval separating the two administrations, other varying conditions such as noise, temperature, lighting, and other factors may affect the correlation coefficient of the research instrument.

    The Spearman rank correlation coefficient, or Spearman rho, is a statistic used to measure the relationship between paired ranks assigned to individual scores on two variables.

  • Formula: rs = 1 - (6ΣD²) / (N³ - N)

    Where:
    rs = Spearman rho
    ΣD² = sum of the squared differences between ranks
    N = total number of cases
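    The formula can be applied directly in Python; the ranks below are hypothetical data for two administrations of a test.

    ```python
    # Spearman rho as defined above: rs = 1 - (6 * sum(D^2)) / (N^3 - N),
    # where D is the difference between each pair of ranks.

    def spearman_rho(ranks_x, ranks_y):
        n = len(ranks_x)
        d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
        return 1 - (6 * d_squared) / (n ** 3 - n)

    # Hypothetical ranks of six subjects on two administrations of a test.
    first_admin = [1, 2, 3, 4, 5, 6]
    second_admin = [2, 1, 3, 5, 4, 6]

    print(spearman_rho(first_admin, second_admin))  # ≈ 0.886
    ```

    Note that this shortcut formula assumes no tied ranks; with ties, a rank-based Pearson correlation (e.g. `scipy.stats.spearmanr`) is used instead.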

  • Interpretation of Correlation Coefficient Value

    Correlation value (r)    Interpretation
    0.00 - 0.20              Negligible
    0.21 - 0.40              Low / slight
    0.41 - 0.70              Marked / moderate
    0.71 - 0.90              High
    0.91 - 0.99              Very high
    1.00                     Perfect
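    The interpretation table can be encoded as a small lookup helper. The band boundaries follow the table; treating negative coefficients by their magnitude is an added assumption, since the table lists only positive values.

    ```python
    # Map a correlation coefficient to the interpretation bands above.

    def interpret_r(r):
        r = abs(r)  # assumption: interpret negative coefficients by magnitude
        if r <= 0.20:
            return "Negligible"
        if r <= 0.40:
            return "Low / slight"
        if r <= 0.70:
            return "Marked / moderate"
        if r <= 0.90:
            return "High"
        if r < 1.00:
            return "Very high"
        return "Perfect"

    print(interpret_r(0.886))  # → High
    ```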

  • 2. PARALLEL-FORMS METHOD - two forms of the test are administered to the same group of subjects and the paired observations are correlated.

    The two forms of the test must be constructed so that the content, type of item, difficulty, instructions for administration, and so on are similar but not identical.

  • The correlation between the scores obtained on paired observations of these two forms represents the reliability coefficient of the test.

    If the correlation coefficient (r) obtained is high, the research instrument is reliable.

    The higher the reliability coefficient, the lower the variance (.70 or higher = acceptable).

  • Example: The item "Convert 7,000 grams to kilograms" in Form A is parallel to "Convert 7 kilograms to grams" in Form B. Moreover, these forms should have approximately the same average and variability of scores.

    Form A: I am able to tell my partner how I feel.

    Form B: My partner tries to understand my feelings.

  • Assessment of depression

    Version A: During the past 4 weeks, I have felt downhearted: 1 - every day, 2 - some days, 3 - never

    Version B: During the past 4 weeks, I have felt downhearted: 1 - never, 2 - some days, 3 - every day

  • Assessment of loneliness

    Version A: How often in the past month have you felt alone in the world?
    1 - every day
    2 - some days
    3 - occasionally
    4 - never

    Version B: During the past 4 weeks, how often have you felt a sense of loneliness?
    1 - all of the time
    2 - sometimes
    3 - from time to time
    4 - never

  • Equivalent or non-equivalent rewording?

    Version 1: When your boss blames you for something you did not do, how often do you stick up for yourself?
    1 - always
    2 - sometimes
    3 - never

    Version 2: When presented with difficult professional situations where a superior censures you for an act for which you are not responsible, how frequently do you respond in an assertive way?
    1 - always
    2 - sometimes
    3 - never

  • 3. SPLIT-HALF METHOD - the test is administered once, but the test items are divided into two halves. The common procedure is to divide the test into odd and even items.

    The two halves of the test must be similar but not identical in content, number of items, difficulty, means, and standard deviations.

  • Each student obtains two scores, one on the odd items and the other on the even items of the test.

    The result is a reliability coefficient for a half test; the Spearman-Brown formula is then used to estimate the reliability of the whole test.

  • Formula: rwt = 2rht / (1 + rht)

    Where:
    rwt = reliability of the whole test
    rht = reliability of the half test
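    A one-line Python rendering of the step-up formula, with a made-up half-test coefficient:

    ```python
    # Spearman-Brown step-up: rwt = 2 * rht / (1 + rht)

    def spearman_brown(r_half):
        """Estimate whole-test reliability from a half-test coefficient."""
        return (2 * r_half) / (1 + r_half)

    # A hypothetical half-test reliability of .60 steps up to .75 for the whole test.
    print(round(spearman_brown(0.60), 2))  # → 0.75
    ```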


  • 4. INTERNAL CONSISTENCY METHOD - this method is used with psychological tests that consist of dichotomously scored items.

    The examinee either passes or fails an item.

  • A rating of 1 (one) is assigned for a pass and 0 (zero) for a failure.

    The reliability is obtained with the Kuder-Richardson Formula 20 (KR-20), which measures the internal consistency or homogeneity of the measuring instrument.

  • Formula: rxx = [N / (N - 1)] × [(SD² - Σpiqi) / SD²]

    Where:
    N = number of items
    SD² = variance of the test scores, defined as Σ(X - Xm)² / (N - 1)
    piqi = product of the proportions passing and failing item i
    pi = proportion of individuals passing item i
    qi = proportion of individuals failing item i (qi = 1 - pi)
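    The KR-20 computation can be sketched from a matrix of dichotomous item responses. The response matrix below is invented illustrative data, and SD² is computed with n - 1 in the denominator, following the definition above.

    ```python
    # KR-20 sketch: rxx = [N / (N - 1)] * [(SD^2 - sum(p_i * q_i)) / SD^2],
    # computed from dichotomous responses (1 = pass, 0 = fail);
    # rows are examinees, columns are items.

    def kr20(responses):
        n_items = len(responses[0])
        n_examinees = len(responses)
        totals = [sum(row) for row in responses]       # total score per examinee
        mean = sum(totals) / n_examinees
        # Variance of the total scores, with n - 1 per the definition above.
        sd_squared = sum((t - mean) ** 2 for t in totals) / (n_examinees - 1)
        sum_pq = 0.0
        for i in range(n_items):
            p = sum(row[i] for row in responses) / n_examinees  # proportion passing item i
            sum_pq += p * (1 - p)                               # p_i * q_i
        return (n_items / (n_items - 1)) * ((sd_squared - sum_pq) / sd_squared)

    responses = [
        [1, 1, 1, 0, 1],
        [1, 1, 0, 0, 1],
        [1, 0, 1, 1, 1],
        [0, 1, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 0, 1],
    ]
    print(round(kr20(responses), 3))  # ≈ 0.779
    ```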
