r_m_handout

Upload: gaurav-gangwar-surya

Post on 14-Apr-2018

  • 7/30/2019 R_M_Handout

    1/13

    Research Methodology Handout

    Parametric Tests

    Parametric tests are based on assumptions about population distributions and parameters. The

    assumptions for parametric tests are:

    - Interval or ratio level data (SPSS Scale data)

    - Normal distribution, or closely so

    - Homogeneity of variance: the variance (standard deviation squared) should be similar in each group

    - Samples randomly drawn from the population

    t-Test

    The t-test is a parametric test used for comparing sample means to see if there is sufficient

    evidence to infer that the means of the corresponding population distributions also differ.

    Independent-Samples t-Test

    The independent-samples t-test compares the means of two different samples. The two samples share

    some variable of interest in common, but there is no overlap between the memberships of the two

    groups.

    Example: the difference between males and females on an exam score.

    Activity: To test whether there is a difference between males and females on total points earned

    (Data File used: grades.sav).
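
    As a rough illustration of the comparison described above, the sketch below computes the

    pooled-variance t statistic by hand. The scores are made up for illustration and are not taken from

    grades.sav; the function name is hypothetical. Pooling assumes homogeneity of variance, as listed

    under the parametric assumptions.

```python
import math
import statistics


def independent_t(sample_a, sample_b):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Hypothetical total points for two non-overlapping groups.
females = [98, 102, 105, 110, 95]
males = [90, 100, 97, 92, 96]
t_stat, df = independent_t(females, males)
print(f"t = {t_stat:.3f}, df = {df}")
```

    The t value is then compared with the t distribution on df degrees of freedom, which SPSS reports

    as the significance level.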

    Paired-Samples t-Test

    The paired-samples t-test is usually based on groups of individuals who experience both conditions of

    the variable of interest.

    Example: students' scores on the first quiz versus the same students' scores on the second quiz.

    Activity: To compare the distribution of scores on quiz1 with scores on quiz2 (Data File used:

    grades.sav).
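
    A minimal sketch of the paired computation, assuming made-up quiz scores for five students (not

    values from grades.sav): the test works on the per-student differences rather than on the two

    columns separately.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on matched observations."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_error, n - 1


# Hypothetical scores: the same five students on two quizzes.
quiz1 = [7, 8, 6, 9, 5]
quiz2 = [8, 8, 7, 10, 7]
t_stat, df = paired_t(quiz1, quiz2)
print(f"t = {t_stat:.3f}, df = {df}")
```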

    One-Sample t-Test

    A one-sample t-test allows us to test whether a sample mean (of a normally distributed interval

    variable) significantly differs from a hypothesized (preset) value.

    Example: Does a course offered to college seniors result in a GRE score greater than or equal to 1200?

    Activity: To determine if the percent values for the entire class differed significantly from 85 (Data

    File used: grades.sav).
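
    The one-sample statistic can be sketched as follows, using the preset value 85 from the activity

    and a handful of made-up percent values (assumptions for illustration, not the class data).

```python
import math
import statistics


def one_sample_t(values, hypothesized_mean):
    """Return (t, df) comparing a sample mean with a preset value."""
    n = len(values)
    std_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - hypothesized_mean) / std_error, n - 1


# Hypothetical percent values for five students.
percent = [80, 85, 90, 95, 100]
t_stat, df = one_sample_t(percent, 85)
print(f"t = {t_stat:.3f}, df = {df}")
```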

    Non-Parametric Tests

    Tests that make no assumptions about population parameters or distributions.

    Mann-Whitney U test (also called the Mann-Whitney-Wilcoxon (MWW) or Wilcoxon rank-sum test)

    It is a non-parametric statistical hypothesis test for assessing whether one of two independent

    samples of observations tends to have larger values than the other. It is one of the most well-known

    non-parametric significance tests.

    The Mann-Whitney U test accomplishes essentially what a t-test does when the distributions of the two

    samples deviate significantly from normal. If the distributions do not differ significantly from

    normal, the t-test should be used because it has greater power.

    Assumptions:

    1. All the observations from both groups are independent of each other.

    2. The responses are ordinal or continuous measurements.

    Activity: To determine whether women score higher than men on final exam (Data File used:

    grades.sav).
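
    One way to see what the U statistic measures is to count, over every cross-group pair, how often a

    value from one group exceeds a value from the other. The sketch below does this directly; the scores

    and the function name are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) by pairwise counting; a tied pair counts one half."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always sum to the number of cross-group pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical final exam scores.
women = [70, 65, 72, 68]
men = [60, 66, 59, 63]
u_women, u_men = mann_whitney_u(women, men)
print(u_women, u_men)
```

    The smaller of the two U values is the one usually compared with a critical value or converted to

    a z score for larger samples.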

    The Sign Test

    The sign test uses pairwise comparisons of two distributions to record which member of each pair is

    larger, and from this information it determines whether the two distributions differ significantly

    from each other.

    The sign test can be used to test the hypothesis that there is "no difference in medians" between the

    continuous distributions of two random variables X and Y, in the situation when we can draw paired

    samples from X and Y.

    Assumptions:

    Let Zi = Yi - Xi for i = 1, ..., n.

    1. The differences Zi are assumed to be independent.

    2. Each Zi comes from the same continuous population.

    3. The values of Xi and Yi are ordered (at least on an ordinal scale), so the comparisons

    "greater than", "less than", and "equal to" are meaningful.

    Activity: To determine whether the scores on quiz1 were significantly higher than the scores on

    quiz2 (Data File used: grades.sav).
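
    A minimal sketch of the sign test, assuming made-up paired quiz scores: only the sign of each

    difference is kept, tied pairs are dropped, and the p-value comes from the binomial distribution

    with probability 0.5. The two-sided form shown here is one common convention.

```python
import math


def sign_test(first, second):
    """Exact two-sided sign test; returns (n_plus, n_minus, p_value)."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # tied pairs dropped
    n = len(diffs)
    n_plus = sum(1 for d in diffs if d > 0)
    n_minus = n - n_plus
    k = min(n_plus, n_minus)
    # Probability of k or fewer of the rarer sign under Binomial(n, 0.5).
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_plus, n_minus, min(1.0, 2 * tail)


# Hypothetical paired quiz scores for eight students.
quiz1 = [8, 7, 9, 6, 8, 7, 9, 5]
quiz2 = [7, 6, 8, 6, 7, 8, 7, 4]
n_plus, n_minus, p_value = sign_test(quiz1, quiz2)
print(n_plus, n_minus, p_value)
```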

    Wilcoxon Matched-Pairs Signed-Ranks Test

    The Wilcoxon test is a nonparametric test that compares two paired groups. It calculates the difference

    between each set of pairs, and analyzes that list of differences.

    The difficulty with the sign test is that a difference between paired quizzes of 10 (10 on one, 0 on

    the other) and a difference of 1 (e.g. 6 on one, 5 on the other) will be coded identically, as a

    single positive or negative sign, so the magnitude of the difference is lost.

    The Wilcoxon matched-pairs signed-ranks test incorporates information about the magnitude of the

    differences between paired values.

    Activity: To compare scores on quiz1 with scores on quiz2 (Data File used: grades.sav).
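
    The sketch below shows how ranking the absolute differences brings magnitude into the test. The

    scores are hypothetical and include the 10-versus-0 and 6-versus-5 pairs mentioned above; the

    function name is an assumption for illustration.

```python
def wilcoxon_signed_rank(first, second):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # zero differences dropped
    ordered = sorted(diffs, key=abs)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Tied absolute differences share the average of ranks i+1 .. j.
        ranks[abs(ordered[i])] = (i + 1 + j) / 2
        i = j
    w_plus = sum(ranks[abs(d)] for d in diffs if d > 0)
    w_minus = sum(ranks[abs(d)] for d in diffs if d < 0)
    return w_plus, w_minus


# Hypothetical paired quiz scores.
quiz1 = [10, 6, 8, 7, 9]
quiz2 = [0, 5, 10, 4, 5]
w_plus, w_minus = wilcoxon_signed_rank(quiz1, quiz2)
print(w_plus, w_minus)
```

    Here the difference of 10 receives the highest rank and the difference of 1 the lowest, whereas the

    sign test would count both simply as one positive sign each.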

    The Runs Test

    The runs test is used to see if the elements of a particular data set are randomly distributed. If the

    sequence HHTHTTHTTHTHTTTTHTH resulted from flipping a coin, does this sequence differ

    significantly from randomness? In other words, are we flipping a biased coin? Unfortunately, this

    procedure works only with dichotomous data; it is not possible to test, for instance, whether we are

    rolling a loaded die.

    The runs test is a non-parametric statistical test that checks a randomness hypothesis for a two-valued

    data sequence. More precisely, it can be used to test the hypothesis that the elements of the sequence

    are mutually independent.

    Runs tests can be used to test:

    - The randomness of a distribution, by taking the data in the given order and marking with + the data

    greater than the median and with - the data less than the median (numbers equaling the median are

    omitted).

    - Whether a function fits well to a data set, by marking the data exceeding the function value with +

    and the other data with -. For this use, the runs test, which takes into account the signs but not the

    distances, is complementary to the chi square test, which takes into account the distances but not the

    signs.

    Activity: To see if males and females are distributed randomly (Data File used: grades.sav).
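
    As a sketch under stated assumptions, the coin sequence quoted above can be run through the

    Wald-Wolfowitz form of the runs test with a normal approximation. This is one standard formulation

    and may differ in detail (for example, continuity correction) from what SPSS reports.

```python
import math
from statistics import NormalDist


def runs_test(sequence):
    """Wald-Wolfowitz runs test for a two-valued sequence (normal approximation)."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("the runs test needs exactly two distinct values")
    n1 = sum(1 for s in sequence if s == symbols[0])
    n2 = len(sequence) - n1
    # A new run starts at every change of symbol.
    runs = 1 + sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (
        (n1 + n2) ** 2 * (n1 + n2 - 1)
    )
    z = (runs - expected) / math.sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return runs, n1, n2, z, p_value


coin = "HHTHTTHTTHTHTTTTHTH"  # sequence quoted in the text
runs, n_heads, n_tails, z, p_value = runs_test(coin)
print(runs, n_heads, n_tails, round(z, 3))
```

    Too many runs (rapid alternation) or too few runs (long streaks) both push z away from zero.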

    Kolmogorov-Smirnov One-Sample Test

    The Kolmogorov-Smirnov (KS) test is a nonparametric test for the equality of continuous,

    one-dimensional probability distributions. It can be used to compare a sample with a reference

    probability distribution (one-sample KS test) or to compare two samples (two-sample KS test). The

    one-sample test is designed to measure whether a particular distribution differs significantly from

    a normal distribution (skewness and kurtosis equal to zero), a uniform distribution (values

    distributed evenly, such as the numbers 1-100 consecutively), a Poisson distribution (the parameter

    λ equals both the mean and the variance of the distribution; as λ becomes large, the distribution

    approximates normality), or an exponential distribution.

    Activity: To test whether the final variable deviates significantly from normal (Data File used:

    grades.sav).
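
    The KS statistic itself is the largest vertical gap between the sample's empirical distribution and

    the reference distribution. The sketch below computes it against a normal distribution fitted to the

    sample mean and standard deviation; the scores are made up, and fitting the parameters from the same

    sample is an assumption that makes standard KS critical values only approximate.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(values):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    data = sorted(values)
    n = len(data)
    reference = NormalDist(mean(data), stdev(data))
    d = 0.0
    for i, x in enumerate(data):
        cdf = reference.cdf(x)
        # Check the gap just after and just before the step at x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


# Hypothetical final exam scores.
final = [55, 60, 62, 65, 68, 70, 75]
d_stat = ks_statistic_normal(final)
print(f"D = {d_stat:.3f}")
```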

    ANOVA

    Analysis of variance is a procedure used for comparing sample means to see if there is sufficient

    evidence to infer that the means of the corresponding population distributions also differ. Whereas

    t-tests compare only two distributions, analysis of variance is able to compare many.

    For example, if we want to see whether the scores of any of five ethnic groups differ significantly

    from each other on the same quiz, a one-way analysis of variance is required.

    One Way ANOVA

    Using the One Way ANOVA command, we may have exactly one dependent variable (always

    continuous) and exactly one independent variable (always categorical).

    Activity: To conduct a one-way analysis of variance to see if any of four ethnic groups differ on

    their quiz4 scores (Data File used: grades.sav).
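
    A minimal sketch of the one-way ANOVA F ratio, using three small made-up groups rather than the

    ethnic groups in grades.sav: variation between group means is compared with variation within

    groups.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical quiz scores for three groups (one categorical independent variable).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_between, df_within = one_way_anova(groups)
print(f_stat, df_between, df_within)
```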

    Multivariate Analysis

    Multiple Regressions

    Multiple regression is a technique for estimating the value of the criterion variable (Y) from values

    on two or more predictor variables (Xs). It employs the same rationale as simple regression, and the

    formula is a logical extension of that for simple linear regression:

    Y = b0 + b1X1 + b2X2 + b3X3 + ...

    Types of Multiple Regressions

    1. Standard Multiple Regression

    2. Hierarchical Multiple Regression

    3. Stepwise Multiple Regression

    In stepwise regression, SPSS decides the order of entry for predictors solely on statistical criteria.

    Forward entry involves the entry of the IVs one at a time as selected by SPSS. Backward selection is

    the reverse of forward entry: it begins with all IVs inserted, and SPSS successively deletes those

    that fail to meet certain critical significance values. SPSS allows you to choose either forward or

    backward entry by clicking on the Method box.

    Key Terminology

    Predictor variable: A variable (IV) from which a value is used to estimate a value on another

    variable (DV)

    Criterion variable: A variable (DV) a value of which is estimated from a value of the predictor

    variable (IV)

    Coefficient of Determination (R Square): This represents the proportion of variation in the criterion

    variable (Y) which is explained (or accounted for) by variation in the predictor variable (X).

    Adjusted R Square: It is the modified version of R square which takes into account the number of

    independent variables in the equation and the sample size.

    Multicollinearity: High correlation among the predictor variables. To check for it, create a

    correlation matrix and inspect for correlations of 0.90 and above; these imply that the two variables

    are measuring the same variance and will over-inflate R, so only one of the two variables is needed.
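
    The adjustment described under Adjusted R Square is commonly written as

    1 - (1 - R²)(n - 1)/(n - k - 1), where n is the sample size and k the number of independent

    variables. The sketch below applies it with the 81 subjects of helping1.sav and four predictors; the

    R Square value of 0.50 is a made-up figure for illustration.

```python
def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Shrink R squared to account for sample size and number of predictors."""
    return 1 - (1 - r_squared) * (n_cases - 1) / (n_cases - n_predictors - 1)


# R squared of 0.50 is hypothetical; 81 cases and 4 predictors follow helping1.sav.
adj = adjusted_r_squared(0.50, n_cases=81, n_predictors=4)
print(f"adjusted R squared = {adj:.4f}")
```

    Adding predictors always raises R Square a little, so the adjusted value is the safer figure when

    comparing models with different numbers of predictors.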

    Activity: To run the regression procedure with a dependent variable of zhelp and independent

    variables of sympathy, severity, empatend and anger, using the Stepwise method (Data File used:

    helping1.sav).
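
    Before running such a regression, the predictors can be screened for multicollinearity with the

    0.90 rule described under Key Terminology. The sketch below uses predictor names from the activity

    with made-up ratings on the 1 to 7 scale; the helper names are hypothetical.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def flag_collinear(predictors, threshold=0.90):
    """List predictor pairs whose absolute correlation reaches the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical ratings for five subjects.
predictors = {
    "sympathy": [1, 2, 3, 4, 5],
    "severity": [2, 3, 4, 5, 6],
    "anger": [5, 3, 4, 1, 2],
}
flagged = flag_collinear(predictors)
print(flagged)
```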

    Factor Analysis

    Factor analysis originated in psychometrics, and is used in behavioral sciences, social sciences,

    marketing, product management, operations research, and other applied sciences that deal with large

    quantities of data. The primary objective of factor analysis is data reduction in order to simplify and

    reduce noise by identifying basic underlying latent factors or components that explain a large portion

    of the variation in the data set parsimoniously.

    Types of factor analysis

    - Exploratory Factor Analysis (EFA) aims to reduce a large number of variables into a smaller

    number of factors and thereby identify the factor structure or model. This type is exploratory in

    nature.

    - Confirmatory Factor Analysis (CFA) aims to confirm theoretical predictions by testing whether a

    specified set of constructs influences responses in a predicted way. It provides a way of

    confirming that the factor structure or model obtained through an EFA study is robust.

    Types of factoring

    Principal component analysis (PCA): This is the most common form of factor analysis. It seeks a

    linear combination of variables such that the maximum variance is extracted from the variables. It

    then removes this variance and seeks a second linear combination which explains the maximum

    proportion of the remaining variance, and so on.

    Canonical factor analysis: It is also known as Rao's canonical factoring, and is a different

    method of computing the same model as PCA. It seeks factors which have the highest canonical

    correlation with the observed variables, and it is unaffected by arbitrary rescaling of the data.

    Common factor analysis: It seeks the least number of factors which can account for the common

    variance (correlation) of a set of variables. It is also called principal factor analysis (PFA) or

    principal axis factoring (PAF).

    Image factoring: It is based on the correlation matrix of predicted variables rather than actual

    variables, where each variable is predicted from the others using multiple regression.

    Alpha factoring: It is based on maximizing the reliability of factors, assuming variables are

    randomly sampled from a universe of variables. All other methods assume cases to be sampled and

    variables fixed.

    Key Terminology

    Factor loadings: The factor loadings are also known as component loadings and are the

    correlation coefficients between the individual variables and their respective factors.

    Communality: The communality measures the percent of variance in a given variable

    explained by all the factors jointly and may be interpreted as the reliability of the indicator.

    Eigenvalues/Characteristic roots: The eigenvalue for a given factor measures the variance in

    all the variables which is accounted for by that factor. Eigenvalues measure the amount of variation in

    the total sample accounted for by each factor.

    Factor scores: Also called component scores in PCA, factor scores are the scores of each case

    on each factor. It is a composite measure for each observation on each factor extracted in factor

    analysis. Factor scores may be used as variables in subsequent modeling.

    Kaiser criterion: The Kaiser rule is to drop all components with eigenvalues under 1.0.

    Scree plot: The scree test says to drop all further components after the one starting the elbow.

    Varimax rotation: It is an orthogonal rotation of the factor axes to maximize the variance of

    the squared loadings of a factor on all the variables in a factor matrix. Each factor will tend to have

    either large or small loadings of any particular variable. A varimax solution yields results which make

    it as easy as possible to identify each variable with a single factor. This is the most common rotation option.

    Quartimax rotation: It is an orthogonal alternative which minimizes the number of factors

    needed to explain each variable. This type of rotation often generates a general factor on which most

    variables are loaded to a high or medium degree. Such a factor structure is usually not helpful to the

    research purpose.

    Equimax rotation: It is a compromise between Varimax and Quartimax criteria.

    Direct oblimin rotation: It is the standard method when one wishes a non-orthogonal

    (oblique) solution, that is, one in which the factors are allowed to be correlated.

    Activity: To conduct a factor analysis with the fifteen efficacy items (effic1 to effic15) with all the

    default options (plus the varimax rotation). (Data File used: helping2.sav).
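
    The Kaiser criterion from the Key Terminology can be sketched as a simple filter on the

    eigenvalues. The fifteen eigenvalues below are invented for illustration (they sum to 15, as they

    would when fifteen standardized items are factored from a correlation matrix) and are not output

    from helping2.sav.

```python
def kaiser_retained(eigenvalues, cutoff=1.0):
    """Keep components whose eigenvalue is at least the cutoff (Kaiser rule)."""
    return [ev for ev in eigenvalues if ev >= cutoff]


# Hypothetical eigenvalues for 15 items, largest first.
eigenvalues = [5.2, 2.1, 1.4, 0.9, 0.8, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3, 0.3]
retained = kaiser_retained(eigenvalues)
# With a correlation matrix, each item contributes one unit of variance.
explained = sum(retained) / len(eigenvalues)
print(len(retained), f"{explained:.2f}")
```

    A scree plot of the same eigenvalues would be inspected for the elbow as a second opinion on how

    many components to keep.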

    Chi Square

    Chi Square is the most common and simple non-parametric test of significance investigating

    associations between categories of nominal variables where observations can be classified into

    discrete categories and treated as frequencies.

    For Example:

    - Is there a significant preference for one of three brands of toothpaste among a sample of

    children;

    - Is there a significant association between membership or not of a trade union among full-time

    and part-time employees;

    - Are there gender preferences for various types of investment category?

    USE OF CHI SQUARE

    Chi Square tests hypotheses about the independence (or association) of frequency counts in

    various categories. The hypotheses are:

    H0: the variables are statistically independent (there is no statistical association), and

    H1: the variables are statistically dependent (associated).

    For example, H0 would state that there is no significant association between your gender and

    which toothpaste you prefer, or that union membership is independent of (not associated with)

    type of employment, i.e. that the cross-categories from each variable are independent of each

    other.

    TWO FORMS OF CHI SQUARE

    There are two forms:

    1. Goodness-of-Fit Chi Square

    2. Cross-tabulations (contingency tables)

    Whichever of these uses chi square is put to, the general principle remains the same.

    We compare the observed frequencies in a sample with the expected frequencies and apply the

    chi-square test to determine whether the difference between observed and expected

    frequencies is likely to be a function of sampling error (non-significant: retain the null

    hypothesis H0) or unlikely to be a function of sampling error (significant association: reject

    the null hypothesis and support the alternate hypothesis H1).

    GOODNESS OF FIT

    A goodness-of-fit test asks how well an observed distribution fits a hypothesized or theoretical

    distribution, for example:

    - are some brands of frozen peas chosen by consumers more than others?

    - is absence through sickness evenly distributed through the working week, or is sick

    leave more frequent on some days than on others?

    - are choices on a survey item with a three-point response scale of yes, no opinion, no, equally

    divided, or is there a significant preference for one choice?

    The formula for chi square is the summation over each cell:

    $$\chi^2 = \sum \frac{(O - E)^2}{E}$$

    Where:

    O = observed frequency - the data observed in our research/survey

    E = expected frequency, and

    $\sum$ = the summation over all the cells in the table
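
    As a worked sketch of the formula, the function below computes a goodness-of-fit chi square for

    made-up brand counts, assuming equal expected frequencies when none are supplied. The function name

    and the counts are hypothetical.

```python
def chi_square_gof(observed, expected=None):
    """Return (chi_square, df) for a goodness-of-fit test on frequency counts."""
    total = sum(observed)
    if expected is None:
        # No preference: every category is expected equally often.
        expected = [total / len(observed)] * len(observed)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Hypothetical counts of children choosing each of three toothpaste brands.
observed = [30, 20, 10]
chi2, df = chi_square_gof(observed)
print(chi2, df)
```

    With 60 children and three brands, each expected frequency is 20, so the statistic is

    100/20 + 0/20 + 100/20 = 10 on 2 degrees of freedom.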

    CROSS-TABULATION

    This is a two-dimensional table showing frequencies in each combination of categories for two

    nominal variables, each of which can be divided into two or more sub-categories, e.g.

    - preference for type of music (classical, jazz, country and western, rock) against age

    group (below 21; 21-45; above 45)

    - length of service in year groupings against job position level

    CONTINGENCY AND CROSS-TABULATION TABLES

    1. The 2 x 2 contingency table has two variables each divided into two categories only organized

    by rows and columns, i.e. 4 cells.

    2. Cross-tabulation tables have more than two rows and two columns, e.g. are investment types

    associated with age groups. But with increasing rows and columns, interpretation of results becomes

    more complex and sample sizes must be larger so that sufficient observed counts occur in each cell.
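
    For a contingency table, the expected frequency of each cell under independence is

    (row total x column total) / grand total. The sketch below applies this to a made-up 2 x 2 table;

    it omits the continuity correction that some packages also report for 2 x 2 tables.

```python
def chi_square_independence(table):
    """Return (chi_square, df) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts: rows = gender, columns = smokes / does not smoke.
table = [[10, 20], [20, 10]]
chi2, df = chi_square_independence(table)
print(round(chi2, 3), df)
```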

    RESTRICTIONS IN THE USE OF THE CHI SQUARE

    - chi square is only appropriate for data that are classified as frequency of occurrence (counts)

    within categories (nominal data)

    - it must only be used on frequencies, never on percentages

    - categories must be mutually exclusive: each response can be classified into only one cell

    - larger samples are needed when there are many categories within each variable. A rule-of-thumb

    is that the expected frequency in every cell should be at least 5.

    - fusing of categories is not really desirable, since it involves a reduction in the amount

    of information available.

    Activity: (a) To test the hypothesis that the distribution of choices for soft drink is random, that is,

    there is no significant preference for any specific drink (Data File used: chi square).

    (b) To test the hypothesis that there is no significant relationship between gender and whether the

    person smokes or not (Data File used: chi square).

    Data Files

    Grades.sav

    The data file is the raw data for calculating the grades in a particular class.

    Variable Description

    ID: six-digit student ID number

    Last Name: the last name of the student

    First Name: the first name of the student

    Gender: the gender of the student: 1=F, 2=M

    Ethnic: the ethnicity of the student: 1=Native, 2=Asian, 3=Black, 4=White, 5=Hispanic

    Year: year in school: 1=1st year, 2=2nd year, 3=3rd year, 4=4th year

    Lowup: lower or upper division student: 1=lower, 2=upper

    Section: section of the class (1-3)

    GPA: cumulative GPA at the beginning of the course

    EXTCR: whether or not the student did the extra credit project: 1=no, 2=yes

    Review: whether or not the student attended the review sessions: 1=no, 2=yes

    Quiz1 to Quiz5: scores out of 10 points on 5 quizzes throughout the term

    Final: final exam worth 75 points.

    Helping1.sav

    This data file is related to a study of helping behavior; it is real data derived from a sample of 81

    subjects. The following variables will be used in the description; all variables except zhelp (a z

    score between -3 and +3) are measured on a little (1) to much (7) scale.

    Variable Description

    zhelp: The dependent variable (a measure of total amount of time spent helping a friend with a

    problem)

    Sympathy: The feelings of sympathy aroused in the helper by the friend's need.

    Anger: The feelings of anger or irritation aroused in the helper by the friend's need.

    Efficacy: Self-efficacy of the helper in relation to the friend's need.

    Severity: Helper's rating of how severe the friend's problem was.

    Empatend: Empathic tendency of the helper as measured by a personality test.

    Helping2.sav

    In the helping2 file, self-efficacy (belief that one has the ability to help effectively) was measured by

    15 questions, each paired with an "amount of help" question that measured a particular type of

    helping. An example of one of the paired questions follows:

    a. Time spent expressing sympathy, empathy or understanding

    None   0-15 minutes   15-30 minutes   30-60 minutes   1-2 hours   2-5 hours   --- hours

    b. Did you believe you were capable of expressing sympathy, empathy or understanding to your

    friend?

    1 2 3 4 5 6 7

    Not at all some very much so

    There were three categories of help represented in the 15 questions: 6 were intended to measure

    empathic types of helping, 4 questions were intended to measure informational types of helping, 4

    questions were intended to measure instrumental types of helping, and the 15th question was

    open-ended to allow any additional type of help given to be inserted. Factor analysis was conducted on

    the 15 self-efficacy questions to see if the results would yield the three categories of self-efficacy

    that were originally intended.

    Variable Description

    effic1 efficacy for encourage reassure

    effic2 efficacy for tasks or services

    effic3 efficacy for appraise/clarify

    effic4 efficacy for validate affirm

    effic5 efficacy for loaning materials

    effic6 efficacy for information advice

    effic7 efficacy for express willingness to help

    effic8 efficacy for participate in activities

    effic9 efficacy for find someone to help

    effic10 efficacy for express sympathy empathy concern

    effic11 efficacy for reduce tension tell jokes

    effic12 efficacy for teach to do better

    effic13 efficacy for empathic listening

    effic14 efficacy for relieve of self blame

    effic15 efficacy for open-ended question

    Graduate.sav

    A wealth of information is collected about each applicant prior to acceptance, and department records

    indicate whether that student was successful in completing the course. Our example uses the

    information collected prior to acceptance to predict successful completion of a graduate program. The

    file is called graduate.sav and consists of 50 students admitted into the program between 7 and 11

    years ago. The dependent variable is category (1=finished the PhD, 2=did not finish), and 27 predictor

    variables are utilized to predict category membership in one of these two groups:

    Variable Description

    Gender: 1= female, 2=male

    Age: age in years at the time of application

    Marital: 1=M, 2=S

    GPA: overall undergraduate GPA

    Area GPA: GPA in the area of specialty

    Grearea: score on the major area section of the GRE

    Grequent: score on the quantitative section of the GRE

    Greverbal: score on the verbal section of the GRE

    Letter1: first of the three recommendation letters (rated 1=weak through 9=strong)

    Letter2: second of the three recommendation letters (rated 1=weak through 9=strong)

    Letter3: third of the three recommendation letters

    Motive: applicant's level of motivation (1=low to 9=high)

    Stable: applicant's emotional stability (same scale for this and all that follow)

    Resource: financial resources and support system in place

    Interact: applicant's ability to interact comfortably with peers and superiors

    Hostile: applicant's level of inner hostility

    Impress: impression of selectors who conducted an interview