young children job satisfaction

Upload: lee-hou-yew

Post on 02-Jun-2018


  • 8/10/2019 Young Children Job Satisfaction


    Slide 2

    Stage One: Define the Research Problem

    In this stage, the following issues are addressed:

    Relationship to be analyzed

    Specifying the dependent and independent variables

    Method for including independent variables

    Young Children and Job Satisfaction

    Relationship to be analyzed

    "We are interested in examining the effect of young children on the job satisfaction of men and women involved in a variety of work and family roles to see how the presence of family responsibilities affects their happiness at work. The research is comparative. It involves contrasts between men and women in different work and marital statuses as several points in time." (page 800)


    Slide 3

    Specifying the dependent and independent variables

    The dependent variable is job satisfaction, measured on a four-category Likert scale: 1 = Very Satisfied, 2 = Moderately Satisfied, 3 = A Little Dissatisfied, and 4 = Very Dissatisfied. Because the data do not follow a normal distribution (see pages 803-804), the authors recoded the variable to a dichotomous variable where 1 = Very Satisfied and 0 = Moderately Satisfied to Very Dissatisfied. The purpose of the analysis, then, is to determine what factors contribute to a high level of job satisfaction versus some other level of job satisfaction. With a dichotomous dependent variable, logistic regression becomes the analytic technique of choice.
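    The recoding of the four-category scale into a dichotomy can be sketched in Python. This is an illustration only, not the authors' SPSS procedure; the function name recode_job_satisfaction is hypothetical, and missing responses are assumed to be represented as None.

```python
def recode_job_satisfaction(score):
    """Collapse the four-category scale into a dichotomy.

    1 (Very Satisfied) becomes 1; 2, 3 and 4 become 0.
    Anything else is treated as missing (an assumption of this sketch).
    """
    if score == 1:
        return 1
    if score in (2, 3, 4):
        return 0
    return None


# Hypothetical responses, including one missing value.
responses = [1, 2, 4, None, 3, 1]
recoded = [recode_job_satisfaction(r) for r in responses]
print(recoded)
```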

    The independent variables are grouped into two categories:

    1. Individual and family characteristics (age, race, education, spouse's work status, prestige of spouse's occupation, number of children, presence of young children, general happiness, and satisfaction with family)

    2. Job characteristics (income, job prestige, job authority, job autonomy, convenience (number of hours worked per week), and past work experience).

    The variable presence of young children is important to answering the main question of the article.

    Other variables, which could have been included as independent variables, were used to divide the sample into subgroups which were compared with each other to answer the research questions. For example, Sex and Work Status were combined to form a composite variable WORK_SEX. We will use these variables with the SPSS "Select Cases" command to produce the results for different groups.


    Slide 4

    Method for including independent variables

    With a dichotomous dependent variable and a variety of independent variables, the statistical technique to use is logistic regression. While we could structure the analysis to do hierarchical entry of variables (individual, family characteristics, and job characteristics in block 1 and the presence of young children in block 2), we will use direct entry of all variables in a single step to conform to the authors' analysis.


    Slide 5

    Stage 2: Develop the Analysis Plan: Sample Size Issues

    In this stage, the following issues are addressed:

    Missing data analysis

    Minimum sample size requirement: 15-20 cases per independent variable


    Slide 6

    Missing data analysis

    In the missing data analysis, we are looking for a pattern or process whereby the pattern of missing data could influence the results of the statistical analysis.

    The data set for this problem is used for a large number of analyses in the article. Not all variables and cases are used in each analysis, so it makes sense to conduct the missing data analysis on the cases and variables to be included in the problem in this exercise.

    We will compute the logistic regression model for 1976-77 married, full-time males as presented in table 2 on page 807. (Note: this analysis does not include the independent variables SPOCCUP 'Spouses Occupation' and EVWORK 'Ever Work as Long as One Year').

    First, we will exclude the cases not used in this exercise and then we will examine missing data for the variables used in this exercise.

    Slide 7

    Specify the Cases to Include in this Analysis

    Slide 8

    Enter the Selection Criterion

    Slide 9

    Run the MissingDataCheck Script

    Slide 10

    Complete the 'Check for Missing Data' Dialog Box

    Slide 11

    Number of Valid and Missing Cases per Variable

    Two independent variables have relatively large numbers of missing cases: JCINCOME 'Job Characteristic - Income' and AUTHORIT 'Job Characteristic - Authority'.

    However, all variables have valid data for 90% or more of cases, so no variables will be excluded for an excessive number of missing cases.

    Slide 12

    Frequency of Cases that are Missing Variables

    Next, we examine the number of missing variables per case. Of the possible 14 variables in the analysis (13 independent variables and 1 dependent variable), one case was missing half of the variables (7) and should be excluded from the remaining analyses.
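    The two missing-data checks used so far (share of valid cases per variable, and number of missing variables per case) can be sketched in Python. This is a minimal illustration, not the MissingDataCheck script itself; the toy records, the lower-case variable names, and the helper names are hypothetical, and missing values are assumed to be stored as None.

```python
def missing_per_case(case):
    """Count the variables that have no valid value (None) for one case."""
    return sum(1 for value in case.values() if value is None)


def valid_share_per_variable(cases):
    """Return, for each variable, the fraction of cases with a valid value."""
    variables = cases[0].keys()
    return {
        var: sum(1 for c in cases if c[var] is not None) / len(cases)
        for var in variables
    }


# Hypothetical toy data; the real analysis uses 14 variables.
cases = [
    {"age": 34, "jcincome": 12, "authorit": 1, "genhappy": 2},
    {"age": 51, "jcincome": None, "authorit": None, "genhappy": 1},
    {"age": 28, "jcincome": 9, "authorit": 0, "genhappy": 3},
    {"age": 45, "jcincome": 10, "authorit": 1, "genhappy": None},
]

n_variables = len(cases[0])
# Flag cases that are missing half or more of their variables.
flagged = [i for i, c in enumerate(cases) if missing_per_case(c) >= n_variables / 2]
shares = valid_share_per_variable(cases)
print(flagged, shares)
```

    In this sketch a case is flagged when it is missing at least half of its variables, mirroring the rule applied to the one case above.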


    Slide 14

    Correlation Matrix of Valid/Missing Dichotomous Variables

    The largest correlation in the matrix of valid/missing data (not shown) is 0.363. None of the correlations for missing data values are above the weak level, so we can delete missing cases without fear that we are distorting the solution.


    Slide 15

    Minimum sample size requirement: 15-20 cases per independent variable

    If we accept the SPSS default of listwise deletion of missing data, we will have 538 cases in the analysis. The ratio of cases to independent variables is 538/13 or 41 to 1. We meet this requirement.
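    The ratio check is simple arithmetic; a short Python sketch using the figures from this slide (538 cases, 13 independent variables) makes it explicit. The function name is hypothetical.

```python
def cases_per_variable(n_cases, n_independent):
    """Ratio of usable cases to independent variables."""
    return n_cases / n_independent


ratio = cases_per_variable(538, 13)
meets_minimum = ratio >= 15    # lower bound of the 15-20 guideline
meets_preferred = ratio >= 20  # upper bound of the 15-20 guideline
print(f"{ratio:.1f} cases per independent variable")
```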


    Slide 16

    Stage 2: Develop the Analysis Plan: Measurement Issues:

    In this stage, the following issues are addressed:

    Incorporating nonmetric data with dummy variables

    Representing Curvilinear Effects with Polynomials

    Representing Interaction or Moderator Effects


    Incorporating Nonmetric Data with Dummy Variables

    All of the nonmetric variables have been recoded into dichotomous dummy-coded variables.

    Representing Curvilinear Effects with Polynomials

    We do not have any evidence of curvilinear effects at this point in the analysis.

    Representing Interaction or Moderator Effects

    We do not have any evidence at this point in the analysis that we should add interaction or moderator variables.


    Slide 17

    Stage 3: Evaluate Underlying Assumptions

    In this stage, the following issues are addressed:

    Nonmetric dependent variable with two groups

    Metric or dummy-coded independent variables


    Nonmetric dependent variable having two groups

    The dependent variable 'Job satisfaction' was recoded into dichotomous categories.

    Metric or dummy-coded independent variables

    Marital status, race, spouse's work status, presence of young children, job authority, job autonomy, and ever worked as long as one year are all coded as dichotomous variables.

    Age of respondent, highest year of school completed, prestige of spouse's occupation, number of children, general happiness, satisfaction with family, income, job prestige, hours worked (convenience), and year of the survey can be treated as metric variables.


    Slide 18

    Stage 4: Estimation of Logistic Regression and Assessing Overall Fit: Model Estimation

    In this stage, the following issues are addressed:

    Compute logistic regression model


    Compute the logistic regression

    The steps to obtain a logistic regression analysis are detailed on the following screens.

    If the cases to be included in this analysis were not selected in the missing data analysis, the selection needs to be completed before proceeding.


    Slide 20

    Specifying the Dependent Variable


    Slide 22

    Specify the method for entering variables


    Slide 23

    Specifying Options to Include in the Output


    Slide 24

    Specifying the New Variables to Save


    Slide 25

    Complete the Logistic Regression Request


    Slide 26

    Stage 4: Estimation of Logistic Regression and Assessing Overall Fit: Assessing Model Fit

    In this stage, the following issues are addressed:

    Significance test of the model log likelihood (Change in -2LL)

    Measures Analogous to R²: Cox and Snell R² and Nagelkerke R²

    Hosmer-Lemeshow Goodness-of-fit

    Classification matrices

    Check for Numerical Problems

    Presence of outliers


    Slide 27

    Initial statistics before independent variables are included

    The Initial Log Likelihood Function (-2 Log Likelihood or -2LL) is a statistical measure like total sums of squares in regression. If our independent variables have a relationship to the dependent variable, we will improve our ability to predict the dependent variable accurately, and the log likelihood value will decrease. The initial -2LL value is 742.850 on step 0, before any variables have been added to the model.


    Slide 28

    Significance test of the model log likelihood

    The difference between the initial -2LL and the -2LL of the final model is the model chi-square value (57.153 = 742.850 - 685.697) that is tested for statistical significance. This test is analogous to the F-test for R² or change in R² value in multiple regression, which tests whether or not the improvement in the model associated with the additional variables is statistically significant.

    In this problem the model chi-square value of 57.153 has a significance of 0.000, less than 0.05, so we conclude that there is a significant relationship between the dependent variable and the set of independent variables.
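    The model chi-square and its significance can be reproduced by hand. The Python sketch below uses only the standard library; the helper chi2_sf is a hypothetical name for a closed-form upper-tail probability that is valid for integer degrees of freedom, and the choice of 13 degrees of freedom (one per independent variable) is an assumption based on the variable count given earlier, not a figure read from the SPSS output.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in odd powers of sqrt(x).
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


initial_neg2ll = 742.850  # step 0, no predictors
final_neg2ll = 685.697    # after entering all predictors
model_chi_square = initial_neg2ll - final_neg2ll
p_value = chi2_sf(model_chi_square, df=13)  # df = 13 is an assumption
print(round(model_chi_square, 3), p_value < 0.05)
```

    A p-value this small is displayed as 0.000 when rounded to three decimals, which is consistent with the significance reported above.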


    Slide 29

    Measures Analogous to R²

    The next SPSS outputs indicate the strength of the relationship between the dependent variable and the independent variables, analogous to the R² measures in multiple regression.

    The Cox and Snell R² measure operates like R², with higher values indicating greater model fit. However, this measure is limited in that it cannot reach the maximum value of 1, so Nagelkerke proposed a modification that has a range from 0 to 1. We will rely upon Nagelkerke's measure as indicating the strength of the relationship.

    Based on the interpretive criteria, we would characterize this model as weak.
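    Both measures can be computed from the -2LL values. The Python sketch below applies the standard Cox and Snell and Nagelkerke formulas to the -2LL figures quoted in this deck (742.850 and 685.697) with n = 538; the function names are hypothetical, and the results are a hand calculation rather than values copied from the SPSS output, which is not reproduced in this transcript.

```python
import math


def cox_snell_r2(neg2ll_null, neg2ll_model, n):
    """Cox and Snell R-squared: 1 - exp(-(model chi-square) / n)."""
    return 1.0 - math.exp(-(neg2ll_null - neg2ll_model) / n)


def nagelkerke_r2(neg2ll_null, neg2ll_model, n):
    """Nagelkerke R-squared: Cox and Snell divided by its maximum possible value."""
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)
    return cox_snell_r2(neg2ll_null, neg2ll_model, n) / max_cox_snell


cs = cox_snell_r2(742.850, 685.697, 538)
nk = nagelkerke_r2(742.850, 685.697, 538)
print(f"Cox and Snell: {cs:.3f}, Nagelkerke: {nk:.3f}")
```

    The Nagelkerke value is always at least as large as the Cox and Snell value, because the divisor is less than 1.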


    Slide 30

    Correspondence of Actual and Predicted Values of the Dependent Variable

    The final measure of model fit is the Hosmer and Lemeshow goodness-of-fit statistic, which measures the correspondence between the actual and predicted values of the dependent variable. In this case, better model fit is indicated by a smaller difference in the observed and predicted classification. A good model fit is indicated by a nonsignificant chi-square value.

    The goodness-of-fit measure has a value of 5.678, which has the desirable outcome of nonsignificance.


    Slide 31

    The Classification Matrices

    The classification matrices in logistic regression serve the same function as the classification matrices in discriminant analysis, i.e. evaluating the accuracy of the model.

    To evaluate the accuracy of the model, we compute the proportional by chance accuracy rate and the maximum by chance accuracy rate, if appropriate. Since the sizes of the groups in this problem are equal to 46% and 54%, the proportional accuracy criterion is appropriate because we do not have a dominant group.

    The proportional by chance accuracy rate is equal to 0.503 (0.463^2 + 0.537^2). A 25% increase over the by chance accuracy rate would equal 0.628.

    Our model accuracy rate of 63.2% meets this criterion.
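    The by chance accuracy criteria can be checked with a few lines of Python. This sketch uses the group proportions and the model accuracy quoted on this slide; the function names are hypothetical.

```python
def proportional_chance_accuracy(group_shares):
    """Sum of squared group proportions."""
    return sum(p * p for p in group_shares)


def maximum_chance_accuracy(group_shares):
    """Proportion of cases in the largest group."""
    return max(group_shares)


shares = [0.463, 0.537]
chance = proportional_chance_accuracy(shares)
criterion = 1.25 * chance  # 25% improvement over chance
model_accuracy = 0.632

print(f"chance={chance:.3f} criterion={criterion:.3f} "
      f"meets={model_accuracy >= criterion}")
```

    The maximum by chance rate (here 0.537) would be the relevant baseline only if one group dominated the sample.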


    Slide 33

    Check for Numerical Problems

    There are several numerical problems that can occur in logistic regression that are not detected by SPSS or other statistical packages: multicollinearity among the independent variables, zero cells for a dummy-coded independent variable because all of the subjects have the same value for the variable, and "complete separation" whereby the two groups in the dependent event variable can be perfectly separated by scores on one of the independent variables.

    All of these problems produce large standard errors (over 2) for the variables included in the analysis and very often produce very large B coefficients as well. If we encounter large standard errors for the predictor variables, we should examine frequency tables, one-way ANOVAs, and correlations for the variables involved to try to identify the source of the problem.

    The standard errors and B coefficients are not excessively large, so there is no evidence of a numeric problem with this analysis.


    Slide 34

    Presence of outliers

    There are two outputs to alert us to outliers that we might consider excluding from the analysis: listing of residuals and saving Cook's distance scores to the data set.

    SPSS provides a casewise list of residuals that identifies cases whose residual is above or below a certain number of standard deviation units. As in multiple regression, there are a variety of ways to compute the residual. In logistic regression, the residual is the difference between the observed probability of the dependent variable event and the predicted probability based on the model. The standardized residual is the residual divided by an estimate of its standard deviation. The deviance is calculated by taking the square root of -2 x the log of the predicted probability for the observed group and attaching a negative sign if the event did not occur for that case. Large values for deviance indicate that the model does not fit the case well. The studentized residual for a case is the change in the model deviance if the case is excluded. Discrepancies between the deviance and the studentized residual may identify unusual cases. (See the SPSS chapter on Logistic Regression Analysis for additional details).

    In the output for our problem, SPSS listed one case that may be considered an outlier, with a studentized residual greater than 2.
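    The deviance calculation for a single case, as described on this slide, can be sketched in Python. The function name and the example probabilities are hypothetical; predicted_prob is assumed to be the model's predicted probability that the event (very satisfied) occurs.

```python
import math


def deviance_residual(observed, predicted_prob):
    """Deviance for one case.

    observed: 1 if the event occurred, 0 otherwise.
    predicted_prob: model probability of the event for this case.
    """
    # Probability the model assigns to the group the case actually belongs to.
    p_observed_group = predicted_prob if observed == 1 else 1.0 - predicted_prob
    magnitude = math.sqrt(-2.0 * math.log(p_observed_group))
    # Negative sign when the event did not occur.
    return magnitude if observed == 1 else -magnitude


well_fit = deviance_residual(1, 0.9)    # event occurred and was predicted
poorly_fit = deviance_residual(0, 0.9)  # event predicted but did not occur
print(round(well_fit, 3), round(poorly_fit, 3))
```

    The poorly fitted case has the larger absolute deviance, which is the pattern the casewise listing is meant to surface.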


    Slide 35

    Cook's Distance

    SPSS has an option to compute Cook's distance as a measure of influential cases and add the score to the data editor. I am not aware of a precise formula for determining what cutoff value should be used, so we will rely on the more traditional method for interpreting Cook's distance, which is to identify cases that either have a score of 1.0 or higher, or cases which have a Cook's distance substantially different from the others. The prescribed method for detecting unusually large Cook's distance scores is to create a scatterplot of Cook's distance scores versus case id.
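    Once Cook's distance scores have been saved, the screening rule can be applied programmatically as well as visually. The Python sketch below is an illustration under stated assumptions: the scores and case ids are invented, the function name is hypothetical, and the second cutoff stands in for a value chosen by eye from a scatterplot.

```python
def flag_influential(cooks_by_case, cutoff=1.0):
    """Return the ids of cases whose Cook's distance is at or above the cutoff."""
    return sorted(cid for cid, d in cooks_by_case.items() if d >= cutoff)


# Hypothetical scores keyed by case id (not the values from this data set).
scores = {1: 0.02, 2: 0.01, 3: 0.19, 4: 0.03, 5: 0.21, 6: 0.02}

by_rule = flag_influential(scores)               # conventional 1.0 rule
by_gap = flag_influential(scores, cutoff=0.175)  # cutoff read off a scatterplot
print(by_rule, by_gap)
```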


    Slide 37

    Specifying the Variables for the Scatterplot


    Slide 38

    The Scatterplot of Cook's Distances

    Horizontal gridlines were added to the scatterplot to aid interpretation. Based on the gridlines, we can identify four cases with Cook's distances about 0.175 as influential cases.

    After sorting the data set by the Cook's distance variable, we identify the four cases as having id numbers: 99, 1807, 1833, and 1953. None of these cases were included on the casewise listing for large studentized residuals.

    Based on these outputs, we identify five cases out of 538 that are potential outliers. Since the number of outliers represents less than 1% of the sample and none of the outliers are really extreme, I will opt to retain them in the analysis.


    Slide 39

    Stage 5: Interpret the Results

    In this section, we address the following issues:

    Identifying the statistically significant predictor variables

    Direction of relationship and contribution to dependent variable


    Slide 40

    Identifying the statistically significant predictor variables

    The table of variables in the equation identifies for us the predictor variables that have a statistically significant individual relationship to the dependent variable. Scanning the 'Sig' column, we identify four variables that have a significance level less than 0.05: GENHAPPY 'How Happy Generally', PRESTIGE 'Job Characteristic - Prestige', CONVENIE 'Job Characteristic - Convenience', and YEAR 'GSS Year for Respondent'.


    Slide 41

    Direction of relationship and contribution to dependent variable - 1

    The sign of the B coefficients indicates whether the predictor variable increased or decreased the likelihood of belonging to the group of respondents who were very satisfied with their jobs.


    Slide 42

    Direction of relationship and contribution to dependent variable - 2

    The coefficient signs for the variables GENHAPPY 'How Happy Generally', PRESTIGE 'Job Characteristic - Prestige', and CONVENIE 'Job Characteristic - Convenience' were all positive, indicating that a higher score on these variables enhanced the likelihood of belonging to the group that was very satisfied with their jobs. The coefficient for YEAR was negative, indicating that job satisfaction has been declining in later years of the survey.

    The magnitude of change associated with each independent variable is given in the odds ratio column labeled 'Exp (B)'. This column indicates the increased or decreased odds of belonging to the group that was very satisfied with their jobs.

    For each unit increment on the measure of overall happiness, the odds that a respondent was very satisfied with his or her job were multiplied by 1.76. For each unit increment in job prestige, the odds were multiplied by 1.02. For each unit increment in job convenience (or hours worked), the odds were multiplied by 1.02. Finally, for each increase in year, the odds were multiplied by 0.65, i.e. the respondent was less likely to be very satisfied.
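    Because Exp(B) multiplies the odds rather than the probability, it can help to see the conversion worked through. The Python sketch below uses the Exp(B) value of 1.76 reported for overall happiness; the starting probability of 0.5 and the function names are hypothetical, chosen only for illustration.

```python
import math


def odds_ratio_for_change(exp_b, units=1):
    """Odds multiplier for a change of several units on the predictor."""
    return exp_b ** units


def apply_odds_ratio(probability, odds_ratio):
    """Convert a probability to odds, scale the odds, and convert back."""
    odds = probability / (1.0 - probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


exp_b = 1.76                # Exp(B) for overall happiness, from the slide
b = math.log(exp_b)         # the corresponding B coefficient
two_units = odds_ratio_for_change(exp_b, units=2)
new_probability = apply_odds_ratio(0.5, exp_b)  # 0.5 is a hypothetical baseline
print(round(b, 3), round(two_units, 3), round(new_probability, 3))
```

    Starting from even odds, multiplying the odds by 1.76 raises the probability to roughly 0.64, not to 1.76 times the starting probability.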

    Important to the research question raised by the authors is the finding that CHILDLT6 'Presence of Young Children' did not have a statistically significant impact on job satisfaction.


    Slide 44

    Set the Starting Point for Random Number Generation


    Slide 46

    Specify the Cases to Include in the First Screening Sample


    Specify the Value of the Selection Variable


    Slide 48

    Specify the Value of the Selection Variable for the Second Validation Analysis


    Generalizability of the Logistic Regression Model

    Only one predictor variable, CONVENIE 'Job Characteristic - Convenience', has a stable, statistically significant relationship to the dependent variable, Job Satisfaction.

    In addition, the accuracy that we should evaluate in assessing our model is in the 56% to 59% range rather than in the 63% to 72% range. At this accuracy rate, the model does not represent a 25% increase over the proportional by chance accuracy rate.

    In sum, we do find a relationship between one of the independent variables and job satisfaction. Our findings should be regarded as tentative or exploratory rather than definitive because we would not meet the classification accuracy rate required for a usable model.

    Full Model

    Split=0

    Split=1

    Model Chi-Square

    57.153, p=.0000

    54.386, p