Predicting truck driver turnover



Transportation Research Part E 45 (2009) 538–550

journal homepage: www.elsevier.com/locate/tre

Predicting truck driver turnover

Yoshinori Suzuki a,*, Michael R. Crum a, Gregory R. Pautsch b

a Department of Logistics, Operations, and Management Information Systems, College of Business, Iowa State University, 2340 Gerdin Business Building, Ames, IA 50011-1350, United States
b Department of Economics, College of Business and Public Administration, Drake University, 344 Aliber Hall, Des Moines, IA 50311-4505, United States

Article info

Article history: Received 16 May 2008; received in revised form 19 November 2008; accepted 24 January 2009

Keywords: Motor carrier; Truck driver turnover rate; Survival model; Decision support model

1366-5545/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.tre.2009.01.008

* Corresponding author. Tel.: +1 515 294 5577; fax: +1 515 294 2534. E-mail address: [email protected] (Y. Suzuki).

Abstract

We propose a decision tool for truckload carriers that can help control driver turnover rates. Our approach is to use an existing econometric method, along with the drivers’ work data, to predict the quit probability of each driver on a weekly basis, so that carriers can identify a subset of drivers who are “about to quit” in a timely manner. Empirical results from two case studies indicate that our approach does a nice job of predicting driver exits, and that it may become a useful management decision tool. Our method was recently adopted by two US truckload carriers.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

US truckload (TL) carriers are plagued with extremely high driver turnover rates. The American Trucking Association reported a first-quarter 2008 turnover rate of 103% for large TL carriers (over $300 million in annual revenue) and 80% for smaller TL carriers (Transport Topics, 2008). Industry turnover rates have persisted at or above these levels for more than a decade. These rates suggest that the expected employment duration of an average truck driver may be less than 12 months for large TL carriers, and less than 15 months for smaller TL carriers. This phenomenon is caused mainly by: (1) the shortage of heavy-duty truck drivers (see, e.g., US Department of Transportation (2007) for selected statistics), and (2) the very nature of the occupation (a mobile and varied workplace inherently facilitates job shifting).

From the carriers’ viewpoint, there are several disadvantages associated with high driver turnover rates. Perhaps the most damaging one is the high cost of replacing drivers. Past studies report that the cost of driver replacement (per replacement) can range from $2200 to $21,000, depending on situations (we will review these studies later).1 This means that, even if we assume a rather conservative replacement cost of $8000 (Rodriguez et al., 2000), a carrier with 1000 drivers (medium size) and 100% turnover rate (industry average) may incur an annual expense of $8 million just to replace lost drivers. Other disadvantages include: (1) loss of driver skills and experiences, (2) reduced customer service quality (e.g., on-time delivery rate), and (3) worsened road safety (see, e.g., Curtis and Wright (2001) and Corsi and Fanara (1988) for further details).

1 Driver replacement cost typically includes such items as tractor repositioning cost, drug screening cost for new drivers, road testing and training costs for new drivers, and opportunity cost associated with replacement (lost profit). See Truckload Carriers Association (2004) for the detailed itemization of driver replacement costs.

Carriers have attempted to reduce the driver turnover rate by using several methods. These methods include: (1) providing good wages, (2) providing good fringe benefits, (3) giving monetary rewards for long stays, and (4) using newer trucks (see, e.g., Min and Lambert, 2002). These methods, however, did not fully solve the problem (Min and Eman, 2003). Perhaps the main reason is that the above methods give only “short-run” solutions to carriers. For example, if a carrier raises its pay (wage per mile) to drivers, it may temporarily attract and retain many drivers, but once other carriers match its pay (which almost always happens for competitive reasons) the carrier in question will no longer attract or retain drivers (Belman and Monaco, 2005). Thus, increasing wages or fringe benefits may only induce more “game playing” by truck drivers to “job hop” from one carrier to another in search of better pay (Staplin and Gish, 2005).

To find better ways to retain drivers, researchers have empirically analyzed drivers’ turnover behaviors. These studies have successfully identified several factors that affect drivers’ turnover decisions (we will review these studies later). The past studies, however, share two limitations. First, they are primarily cross-sectional (static) studies. Thus, they offer little or no insight into: (1) how a driver’s likelihood of quitting changes over time and why, and (2) how long the exogenous variables will affect the drivers’ turnover decisions into the future. Note that, from the carriers’ viewpoint, these are important questions that must be addressed to fully understand (and predict) the stochastic driver turnover behavior (Kammeryer-Muller et al., 2005). Second, the past studies mainly used survey data. While survey data are suitable for measuring drivers’ attitudes, perceptions, or intents, they do not accurately measure the operational work variables, such as weekly miles and weekly home time, many of which are believed to be good predictors of driver turnover (e.g., see Min and Lambert, 2002). These conditions suggest that the driver turnover literature lacks studies that combine time-series econometric approaches with operational work variables.

From the practical standpoint, a model that employs a time-series method along with the operational work variables (data) has several advantages over the conventional survey-based static approach. First, unlike the survey data (which represent the “perceived” or “stated” data), the operational work variables represent the “actual” or “revealed” data, so that the estimation results are expected to be more cognitively congruent with the actual behavior (Econometrics Laboratory, 2000). Second, from the data collection standpoint, the operational work data are preferred to the survey data because the former are readily available to carriers (most carriers maintain an internal record of operational work data due to regulatory requirements). Third, by combining the operational work data with a time-series model, we can overcome the limitations of previous studies, such as the inability to capture the dynamic effects of exogenous variables on driver turnover decisions. Fourth, if used in conjunction with the actual work data, a time-series econometric model can be used as a practical decision tool to predict (forecast) the quit probability of each driver on a regular basis (by using the latest work data as predictors).2

The above discussion implies that an approach that combines a time-series model with operational work variables is attractive. Such a model not only overcomes the limitations of previous studies, but can also become a useful decision tool that allows managers to identify a subset of drivers who are “about to quit” in a timely manner (by forecasting each driver’s quit probability on a regular basis). By using the model, carriers can therefore focus their attention on preventing the exits of “potential leavers”. Given such features, several TL carriers have expressed interest in providing us their data to investigate how well the time-series model with operational work variables can explain or predict the actual exits of truck drivers. In this paper we report the results of two case studies in which we applied a survival analysis technique to the operational work data of heavy-duty truck drivers, and discuss the effectiveness of our approach from a variety of perspectives.

The goal of this paper is to explore the following questions: (1) are the operational work variables good predictors of driver turnover, (2) how well do the operational work variables predict driver turnover relative to the demographic variables such as age and gender (which have been used widely in past studies), and (3) can the model estimated with operational work data become a useful decision tool for carriers? If the study results indicate that our approach can predict driver exits reasonably well such that it can be used as a decision tool, this study will make an important contribution to the driver turnover literature by giving a new method of solving the key industry problem; i.e., controlling driver turnover by continuously monitoring the predicted quit probability of each driver. It should be noted that our method does not induce “game playing” (job hopping) by drivers, because it focuses on preventing driver exits through improved job satisfaction (by closely monitoring their quit probabilities, which reflect the extent to which they are not satisfied with work conditions) rather than attracting drivers from other carriers. Thus, unlike the traditional methods (e.g., pay raise), our method may give a “long-run” solution to carriers that reduces one’s turnover rate without increasing those of others.

2. Literature review

Our review of the literature indicates that the truck driver turnover studies consist of two types. The first is the set of economic studies that estimated the cost of driver replacement to assess the impact (or magnitude) of the driver turnover problem within the industry. The second is the set of studies that used statistical techniques to identify the determinants of driver turnover.

2 Although the driver turnover prediction (on a regular basis) can also be performed by using the survey data, it may be impractical to do so, as this method requires frequent (e.g., weekly) collections of survey data (which is not only expensive, but also inadequate because the data may suffer from possible low response rates).

Table 1
Summary of selected driver turnover studies.

Reference                     Data collect.  Data type      Dependent     Key factors identified
                              method                        variable (a)
----------------------------------------------------------------------------------------------------------------------------
Beilock and Capelle (1990)    Interview      Cross-section  Intention     Driver demographics (e.g., education, age, union status)
Rodriguez and Griffen (1990)  Survey         Cross-section  Perception    Work environment (e.g., pay, home time)
Taylor and LeMay (1991)       Survey         Cross-section  Intention     Recruiting method (e.g., driver referral, magazine adv.)
LeMay et al. (1993)           Survey         Cross-section  Perception    Work environment (e.g., pay, total miles)
Richard et al. (1995)         Survey         Cross-section  Intention     Manager quality (e.g., driver attitude toward dispatcher)
Stephenson and Fox (1996)     Survey         Cross-section  Intention     Work environment (e.g., pay, empty miles, job security)
Shaw et al. (1998)            Survey         Cross-section  Actual exit   Driver demographics, work environment (e.g., age, union status, pay, home time)
Keller and Ozment (1999a)     Survey         Cross-section  Perception    Manager quality (e.g., driver attitude toward dispatcher)
Keller and Ozment (1999b)     Survey         Cross-section  Perception    Manager quality (e.g., driver attitude toward dispatcher)
Keller (2002)                 Survey         Cross-section  Perception    Work environment, manager quality (e.g., pay, home time, dispatcher quality)
Min and Lambert (2002)        Survey         Cross-section  Perception    Recruiting method, work environment (e.g., referral, advancement opportunity)
Min and Eman (2003)           Survey         Cross-section  Perception    Driver demographics, firm size (e.g., job category, driving experience)
De Croon et al. (2004)        Survey         Longitudinal   Actual exit   Work environment (e.g., job strain)
Morrow et al. (2005)          Tel. survey    Cross-section  Actual exit   Manager quality, work environment (e.g., attitude toward dispatcher)

(a) Intention = drivers’ intention of company exit; perception = managers’ (dispatchers’) perception of driver exits.

2.1. Cost of driver replacement

Several studies have estimated the cost of replacing a truck driver. Rather conservative estimates have been reported in the studies conducted by practitioners. For example, Truckload Carriers Association (2004) reports that the driver replacement cost is typically around $3000 per replacement. These estimates, however, may not be accurate, as they considered only the direct (observable) cost of driver replacement. More precise estimates that include the indirect costs of driver replacement have been provided by academic studies, which typically claim that the cost is roughly $8000 per replacement, with the range being from $2200 to almost $21,000 (see, e.g., Richard et al., 1994; Rodriguez et al., 2000). The overall cost of driver replacement in the industry has been estimated to be as high as 3 billion dollars annually (Keller and Ozment, 1999b).
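To make the scale of these per-replacement figures concrete, the short sketch below multiplies them by the yearly number of exits for a hypothetical carrier. The fleet size (1000 drivers) and turnover rate (100%) follow the example given in the Introduction; the function name and the simplification that yearly exits equal fleet size times turnover rate are our own assumptions, not part of the cited studies.

```python
def annual_replacement_cost(num_drivers, turnover_rate, cost_per_replacement):
    """Approximate yearly replacement expense in US dollars.

    Assumes (for illustration only) that the number of exits per year
    equals num_drivers * turnover_rate.
    """
    exits_per_year = num_drivers * turnover_rate
    return exits_per_year * cost_per_replacement


# Per-replacement cost figures quoted in the text (US dollars).
cost_estimates = {
    "practitioner estimate (direct cost only)": 3000,
    "academic estimate (typical)": 8000,
    "academic estimate (low end)": 2200,
    "academic estimate (high end)": 21000,
}

for label, unit_cost in cost_estimates.items():
    total = annual_replacement_cost(1000, 1.00, unit_cost)
    print(f"{label}: ${total:,.0f} per year")
```

Under these assumptions the typical academic estimate reproduces the $8 million annual figure mentioned in the Introduction.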

2.2. Determinants of driver turnover

Another type of driver turnover study analyzed the determinants of driver turnover empirically. Most of these studies used survey instruments to collect the necessary data, and measured the dependent variable by either the drivers’ intent to exit the company (i.e., stated preference) or the supervisors’ perception of the frequency of driver exits (see Table 1).

Early works of this type focused on investigating the effect of driver demographics and perceived work environments on turnover. These studies found that drivers’ turnover behavior is affected by such factors as driver age, educational level, union status, job category, prior driving experience, past income, pay rate, number of miles driven, home time, empty (dead haul) miles, job strain, job security, and job advancement opportunity within the company (see e.g., Beilock and Capelle, 1990; Rodriguez and Griffen, 1990; LeMay et al., 1993; Richard et al., 1995; Stephenson and Fox, 1996; Shaw et al., 1998; Keller, 2002; Min and Lambert, 2002). Some studies (e.g., Taylor and LeMay, 1991) investigated the effect of recruiting method on driver turnover, and found that the expected employment duration of drivers who are hired via certain recruiting methods (e.g., word-of-mouth referral through drivers) may be longer than that of those who are hired via other methods (e.g., TV and radio advertisement).

While these early works have made important contributions, they investigated only the effect of directly observable variables on driver turnover; i.e., they did not consider the effect of more implicit variables that reflect the drivers’ psychological factors. Such implicit effects are examined by more recent works. These studies investigated the effect of drivers’ attitude toward their supervisors (dispatchers) or the quality of relationship with supervisors on turnover decisions (while controlling for the effects of directly observable variables). Works that belong to this category include Keller and Ozment (1999a,b), Keller (2002), and Morrow et al. (2005). These studies found that drivers’ turnover decisions are often affected by such factors as the supervisors’ friendliness, work experience, willingness to listen to drivers’ complaints and requests, as well as the drivers’ perceived satisfaction with supervision.


Table 1 shows that there is at least one study which used the longitudinal (time-series) approach to analyze driver turnover (De Croon et al., 2004). This study explicitly considered the dynamic nature of turnover decisions by using repeated measures of predictor variables (i.e., multiple surveys were sent to the same set of drivers over time). An important aspect of this study is that it investigated not only whether a driver will quit (change job) or not, but also when the driver will quit. This type of study should provide important behavioral implications to both carrier managers and transportation researchers. The study, however, used only two “waves” of measures (i.e., surveys were sent twice to the respondents) to capture the dynamic effect of predictor variables on turnover, which perhaps was not frequent enough to correctly capture the true dynamics involved in the human decision-making processes.

2.3. Summary

In conclusion, our literature review indicates two things. First, previous studies have shown that the driver turnover behavior is primarily affected by the following four factors: (1) driver demographics, (2) work environment, (3) supervisors (dispatchers), and (4) recruiting methods. Second, while many driver turnover studies exist in the literature, no study has: (1) utilized operational work variables, (2) employed a time-series econometric approach with a sufficiently large number of time-series observations, or (3) attempted to develop a day-to-day operational tool that allows carriers to predict the quit probability of each truck driver on a regular basis.

3. Model

3.1. Model and assumptions

We adopt the survival model proposed by Petersen (1986) for its ability to capture the dynamic effects of time-varying covariates. Let $T_i$ be a non-negative random variable measuring the duration of time a truck driver ($i$) has stayed with a carrier at the time when either the driver exits the carrier or the observation period ends (i.e., when right-censoring occurs). $T_i$ is divided into $k$ non-overlapping but adjacent intervals, which need not be of the same length (the number and length of intervals may vary from one driver to another). The end points of these $k$ intervals are given by $t_0, t_1, \ldots, t_j, \ldots, t_{k-1}, t_k$, where $t_0$ is the hire date, $t_k$ is the time when either the exit or right-censoring occurs, and $t_0 < t_1 < \cdots < t_{k-1} < t_k$. We assume, due to the nature of our data (weekly), that all the time-varying (non-stationary) variables are step-function covariates such that their values stay constant within each interval, but may change from one interval to the next. This condition indicates that the hazard rate (quit probability in an interval) of each driver is also modeled as a step function, such that its value is held constant within each of the $k$ intervals.
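The step-function assumption can be illustrated with a small lookup routine. This is a minimal sketch under our own conventions: time is measured in days since hire, each value is taken to hold on the half-open interval from just after one end point up to and including the next, and the function name and sample figures are hypothetical.

```python
from bisect import bisect_left


def step_value(end_points, values, u):
    """Value of a step-function covariate at time u.

    end_points = [t_0, t_1, ..., t_k] is strictly increasing, and
    values[j - 1] is the covariate value held constant on the j-th
    interval, i.e. for t_{j-1} < u <= t_j (an assumed convention).
    """
    if len(values) != len(end_points) - 1:
        raise ValueError("need exactly one value per interval")
    if not end_points[0] < u <= end_points[-1]:
        raise ValueError("u must lie after t_0 and no later than t_k")
    j = bisect_left(end_points, u)  # smallest j with t_j >= u
    return values[j - 1]


# Hypothetical driver observed for three weekly intervals.
end_points = [0, 7, 14, 21]            # days since hire
weekly_paid_miles = [2500, 1800, 2100]  # one value per interval

print(step_value(end_points, weekly_paid_miles, 10))  # day 10 falls in week 2: prints 1800
```

Because the hazard rate is a function of such covariates, it inherits the same piecewise-constant shape.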

3.2. Specifying hazard rate

Let $h_i(t_j)$ be the hazard rate of driver $i$ during the interval from time $t_{j-1}$ to $t_j$. This hazard rate can be modeled as a function of certain predictor variables (that are observed prior to time $t_j$) to capture the effect of exogenous variables on drivers’ turnover decisions. Following Petersen (1986), we specify $h_i(t_j)$ by using a fully parametric approach. Specifically, we assume that the drivers’ employment durations follow the exponential distribution. This distribution is chosen for this study for the following reasons: (1) it is theoretically sound (time durations are typically exponentially distributed), (2) its functional form is simple so that the model estimation becomes straightforward, and (3) other distributions behaved poorly during the estimation (we tested the Weibull, log-logistic, and Gompertz distributions, in addition to the exponential distribution).

Under this assumption, we can write $h_i(t_j)$ as follows (suppressing the subscript for drivers on the right-hand side):

$$h_i(t_j) = \exp(-\Omega_j), \tag{1}$$

$$\Omega_j = \alpha + \sum_{m} \beta_m X_m + \sum_{r} \phi_r Y_{r t_j}, \tag{2}$$

where $X_m$ is the $m$th stationary (time-invariant) predictor variable affecting the exit behavior of driver $i$ over the entire duration of employment (e.g., gender), $Y_{r t_j}$ is the $r$th non-stationary (time-varying) predictor variable affecting the exit behavior of driver $i$ during the interval from $t_{j-1}$ to $t_j$ (e.g., miles earned during the interval), and $\alpha$, $\beta$, $\phi$ are the parameters to be estimated empirically.
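Eqs. (1) and (2) translate directly into code. The sketch below is illustrative only: the function names are ours, `omega` denotes the linear index on the right-hand side of Eq. (2), and every parameter value and covariate in the example is made up rather than taken from the estimates reported in the paper.

```python
import math


def linear_index(alpha, beta, x_stationary, phi, y_interval):
    """omega_j = alpha + sum_m beta_m * X_m + sum_r phi_r * Y_{r,t_j} (Eq. 2)."""
    stationary_part = sum(b * x for b, x in zip(beta, x_stationary))
    time_varying_part = sum(p * y for p, y in zip(phi, y_interval))
    return alpha + stationary_part + time_varying_part


def hazard_rate(alpha, beta, x_stationary, phi, y_interval):
    """h(t_j) = exp(-omega_j) (Eq. 1)."""
    omega = linear_index(alpha, beta, x_stationary, phi, y_interval)
    return math.exp(-omega)


# Hypothetical parameters: one stationary covariate (a gender dummy) and
# one time-varying covariate (paid miles in the interval).
alpha, beta, phi = 3.0, [0.2], [0.0004]

busy_week = hazard_rate(alpha, beta, [1], phi, [2500])
slow_week = hazard_rate(alpha, beta, [1], phi, [1500])
print(f"hazard with 2500 miles: {busy_week:.4f}")
print(f"hazard with 1500 miles: {slow_week:.4f}")
```

Note the sign convention implied by Eq. (1): because the index enters with a minus sign, a positive coefficient lowers the hazard rate as the covariate increases.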

3.3. Parameter estimation

We estimate the model by using the standard maximum likelihood method. Since our data include both the drivers with actual exits (non-censored data; leavers) and those with no observed exits (right-censored data; stayers), we specify the log-likelihood function as follows:

$$LL = \sum_{i=1}^{n} \left[ \delta_i \ln h_i(t_k) - \sum_{j=1}^{k} \int_{t_{j-1}}^{t_j} h_i(u)\,du \right], \tag{3}$$

where $\delta_i$ is a binary variable that is coded 1 for the actual exit and 0 for the right-censored data, and $n$ is the number of drivers included in the sample. Observe that we maximize only the survivor function for the right-censored individuals, while we maximize the whole density function for the non-censored individuals. It should be noted that the maximization of this log-likelihood function can be easily performed by using, for example, an automated routine available from the LIMDEP software (Econometrics Software). Readers who are interested in learning further details of the survival model used in this study should see Petersen (1986).
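Because the hazard rate is held constant within each interval, the integral over interval $j$ in Eq. (3) reduces to the hazard in that interval multiplied by the interval length. The sketch below evaluates the log-likelihood on that basis. It is not the LIMDEP routine used in the study; the function names and the toy data are our own, and the hazards are passed in directly rather than computed from covariates.

```python
import math


def driver_log_likelihood(hazards, lengths, exited):
    """Contribution of one driver to the log-likelihood in Eq. (3).

    hazards[j] is the constant hazard in the driver's j-th interval,
    lengths[j] is that interval's duration, and exited is 1 for an
    observed exit or 0 for a right-censored driver.
    """
    # Integral of a step function: sum of hazard * interval length.
    cumulative_hazard = sum(h * d for h, d in zip(hazards, lengths))
    # The log-hazard of the final interval enters only for observed exits.
    event_term = math.log(hazards[-1]) if exited else 0.0
    return event_term - cumulative_hazard


def log_likelihood(drivers):
    """Sum the contributions over (hazards, lengths, exited) triples."""
    return sum(driver_log_likelihood(h, d, e) for h, d, e in drivers)


# Toy sample: one leaver and one stayer, each observed for two weeks.
sample = [
    ([0.02, 0.05], [1.0, 1.0], 1),  # leaver
    ([0.01, 0.01], [1.0, 1.0], 0),  # stayer (right-censored)
]
print(log_likelihood(sample))
```

In an actual estimation the hazards would be recomputed from Eqs. (1) and (2) at each trial parameter vector, and a numerical optimizer would search for the parameters that maximize this sum.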

3.4. Using the model as a decision tool

The above model can be used as a decision tool to generate helpful information. Notice that we can predict (given the parameter estimates and covariate data) the hazard rate of any driver during any time interval (past or present) by evaluating Eq. (1). This condition suggests the following. First, the model can identify a subset of drivers who are “about to quit” (potential leavers) in a timely manner by: (1) calculating the hazard rate for each driver on a regular basis, and (2) pooling the drivers whose predicted hazard rates are higher than a threshold level. Second, for each potential leaver, the model can uncover the reason(s) why the driver’s hazard rate is higher than that of other drivers by contrasting the covariates of this driver with those of others (i.e., identifying those covariates in which the driver’s values are substantially different from those of others in the direction that increases the hazard rate). Third, the model can reveal the quit probability trend (increasing or decreasing with time) of each driver by tracing the driver’s hazard rate over time. All of the above are important features for carrier managers.
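The three uses described above can be sketched as three small helpers. Everything here is an illustrative assumption rather than the authors' implementation: the function names, the threshold, the driver IDs, the coefficients, and the choice to compare each driver with fleet means and to judge the trend by comparing the latest hazard with the earliest one.

```python
def flag_potential_leavers(hazard_by_driver, threshold):
    """Return IDs of drivers whose predicted hazard exceeds the threshold."""
    return sorted(d for d, h in hazard_by_driver.items() if h > threshold)


def hazard_raising_covariates(coefficients, driver_values, fleet_means):
    """Covariates on which a driver differs from the fleet in a hazard-raising way.

    With h = exp(-omega), a covariate pushes the hazard up when its
    contribution to omega, coefficient * (driver value - fleet mean),
    is negative. Results are ordered from most to least hazard-raising.
    """
    gaps = {
        name: coefficients[name] * (driver_values[name] - fleet_means[name])
        for name in coefficients
    }
    return sorted((n for n, g in gaps.items() if g < 0), key=lambda n: gaps[n])


def hazard_trend(hazard_history):
    """Crude trend label comparing the latest hazard with the earliest one."""
    if hazard_history[-1] > hazard_history[0]:
        return "increasing"
    if hazard_history[-1] < hazard_history[0]:
        return "decreasing"
    return "flat"


# Hypothetical weekly output for three drivers.
hazards = {"D001": 0.012, "D002": 0.045, "D003": 0.030}
print(flag_potential_leavers(hazards, threshold=0.025))

coefficients = {"weekly_miles": 0.0004, "home_hours": 0.01}
driver = {"weekly_miles": 1800, "home_hours": 40}
fleet = {"weekly_miles": 2400, "home_hours": 30}
print(hazard_raising_covariates(coefficients, driver, fleet))

print(hazard_trend([0.010, 0.020, 0.045]))
```

In practice the threshold would have to be chosen by the carrier, for example by trading off the cost of intervening with a driver against the cost of a missed exit.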

4. Empirical analyses

4.1. Company information

Two TL carriers agreed to participate in our study by providing their driver data. One is a medium-sized carrier based in a mid-western state of the US that has more than 500 truck drivers, and the other is one of the largest TL carriers in the US with more than 5000 truck drivers. We will refer to these two carriers as carriers A and B, respectively, in the rest of this paper (actual carrier names are not used for confidentiality reasons). At the time of the study, the driver turnover rate of carrier A was about 80%, while that of carrier B was roughly 150%.

4.2. Basic approach

We employ the following framework and assumptions during empirical analyses. First, based on the findings from the literature review, we posit that drivers’ turnover decisions are affected mainly by driver demographics, work environment, dispatchers, and recruiting method. Given this proposition, we construct our model by using the set of covariates that encompass these four factors. Second, based on the theories of organizational behavior, we use two types of operational work variables. The theory of temporal turnover (e.g., Kammeryer-Mueller et al., 2005) states that a worker’s turnover can be predicted better by using the covariates that are measured closely in time (i.e., the closer the time of variable measurement to time t, the more accurate the turnover prediction of the worker at time t). The emotional exhaustion theory (e.g., Maslach, 1982), on the other hand, states that the effect of covariates on turnover decisions can accumulate over time; e.g., the longer a worker is exposed to undesirable conditions, the more likely the worker will quit. Given these two theories, we posit that the drivers’ operational work variables consist of two types: i.e., (1) those that have immediate impacts on driver exits, and (2) those that have lagged (cumulative) effects on driver exits.3 Based on this proposition, we divide the operational work variables into two types, and test their effects separately.

4.3. Data

We obtained the operational work data, as well as the demographic data of drivers, from the two carriers. The operational work data of carrier A consist of weekly observations of 971 drivers over a 9-month time period, with a total sample size of 28,838 (usable samples only). The operational work data of carrier B consist of weekly observations of 5106 drivers over a 12-month time period, with a total sample size of 117,874 (usable samples only). Table 2 shows the sample covariates used in this study, along with their summary statistics (selected variables).
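A quick calculation on these sample sizes shows how many weekly records each driver contributes on average. The sketch assumes that each usable sample is one driver-week record, which the text implies but does not state outright; the function name is ours.

```python
def average_weeks_observed(total_driver_weeks, n_drivers):
    """Mean number of weekly records per driver."""
    return total_driver_weeks / n_drivers


# (usable samples, drivers) for each carrier, as reported in the text.
carriers = {"A": (28838, 971), "B": (117874, 5106)}

for name, (samples, drivers) in carriers.items():
    weeks = average_weeks_observed(samples, drivers)
    print(f"Carrier {name}: about {weeks:.1f} weekly records per driver")
```

Both averages fall below the roughly 39 and 52 weeks spanned by the two observation periods, which would be consistent with a panel in which many drivers are hired or exit partway through the period.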

As expected, the covariates included in the data were not identical between the two carriers. The variables that were common to both carriers include: driver age, gender, ethnicity, marital status, date of hire, employment termination date (for actual exits), job category (e.g., over-the-road driver), assigned manager (dispatcher), wage per mile, weekly miles (paid miles), weekly home times, weekly dwell (detention) times, and weekend job assignments. The two carriers, however, had different sets of variables for the rest of their data. For this reason, we calibrate the survival model separately for each carrier.4 We use Microsoft Access, along with Visual Basic for Applications (VBA), to arrange and transform the raw data into usable formats.

3 For example, the dispatcher assignment of a driver belongs to the first type, because what matters to a driver is the quality of the current dispatcher, not the quality of the previous dispatcher(s). The weekly paid miles, on the other hand, belong to the second type, as a driver’s exit behavior in week j should not be affected by the amount of miles he/she gained during week j (because the total miles of a driver during week j is unknown to the driver until week j + 1), but rather by the total miles the driver had accumulated during the past few weeks.

Table 2
Selected covariates and summary statistics.

                                                                         Summary statistics (a)
Covariates                                    Op. work   Stationary      Metric       Carrier A  Carrier B
                                              variable?  variable?
------------------------------------------------------------------------------------------------------------
Driver demographics
Driving history (past criminal/DWI history)   No         Yes             % bad hist.  –          30.6
Ethnicity (white, black, hispanic, etc.)      No         Yes             % white      91.3       71.0
Gender                                        No         Yes             % male       93.1       96.9
Job switch (num. of past switches)            No         Yes             Average      –          5.0
Duration of last employment (days)            No         Yes             Average      –          560.8
Whether the previous quit was voluntary       No         Yes             % voluntary  –          34.6
Work exp. (years worked as a driver)          No         Yes             Average      –          4.6
Age of truck driver                           No         No (immediate)  Average      44.4       44.7
Number of people dependent on driver          No         No (immediate)  Average      1.3        –
Marital status                                No         No (immediate)  % married    65.7       52.1
Previously hired by the carrier               No         Yes             % hired      13.0       28.1

Recruiting sources
Personal referral                             No         Yes             % hired      22.4       8.1
Magazine                                      No         Yes             % hired      5.4        –
Newspaper                                     No         Yes             % hired      1.5        –
Internet                                      No         Yes             % hired      0.8        –
Trailer advertisement                         No         Yes             % hired      0.5        –

Work environments
Frequency of freight loading (per week)       Yes        No (lagged)     –            –          –
Frequency of hazardous loads (per week)       Yes        No (lagged)     –            –          –
Time spent at home (per week)                 Yes        No (lagged)     –            –          –
Paid miles (per week)                         Yes        No (lagged)     –            –          –
Wage per mile                                 Yes        No (lagged)     –            –          –
Avg. wait time at consignee (per week)        Yes        No (lagged)     –            –          –
Avg. wait time at shipper (per week)          Yes        No (lagged)     –            –          –
Industry average pay rate (per mile)          Yes        No (immediate)  –            –          –
Whether a driver works on weekends            Yes        No (immediate)  –            –          –

Dispatchers
The dispatcher to which a driver is assigned  Yes        No (immediate)  –            –          –

(a) Due to confidentiality reasons, we report only the summary statistics of selected demographic variables.

4.4. Incorporating lagged effects

As mentioned previously, we use two types of operational work variables (those with immediate impacts and those with lagged impacts). We measure the effects of these two variable types separately by using the following procedure. First, we ask carriers A and B to classify all the operational work variables into either the ones having immediate impacts or the ones having lagged (cumulative) impacts by using their experiences and judgments. Second, for those variables that are identified as having lagged effects, we create the lagged operators (lagged variables) of various orders. Third, we use the following form to specify Xj, in lieu of Eq. (2):

$$X_j = \alpha + \sum_{m} \beta_m X_m + \sum_{r=1}^{P} \phi_r Y^{(I)}_{r t_j} + \sum_{r=P+1}^{R} \lambda_r \left[ \frac{\sum_{q=1}^{Q} Y^{(L)}_{r t_{j-q}}}{Q} \right], \qquad (4)$$

where $Y^{(I)}_{r t_j}$ is the rth non-stationary variable having immediate impacts (r = 1, 2, ..., P), $Y^{(L)}_{r t_j}$ is the rth non-stationary variable having lagged effects (r = P + 1, P + 2, ..., R), the $\lambda$’s are parameters, and Q is the duration of time over which the $Y^{(L)}_{r t_j}$ variables can have impacts on drivers’ exit behaviors.

Observe in Eq. (4) that the effects of the $Y^{(L)}_{r t_j}$ variables are captured by using the Qth order moving average operators. Eq. (4), therefore, implies that a driver’s hazard rate in week j is a function of: (1) values of stationary variables representing driver characteristics, (2) values of non-stationary variables (during week j) that have immediate impacts on driver exits, and (3) moving average values of non-stationary variables (weeks j − Q to j − 1) that have lagged (or cumulative) effects on driver exits. Since the value of Q is unknown, we estimate this coefficient empirically by testing a variety of values and selecting the one that gives the best fit to the data.

4 We do not calibrate a single model by using only the “common” variables for the following reasons. First, removal of “uncommon” variables can result in model misspecification. Second, many “common” variables are measured differently by the two carriers (e.g., the two carriers use different definitions for detention time and job categories). Third, the two carriers use different pay schemes and operational policies, so that the effects of many “common” variables are expected to be different between the two carriers.
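The moving-average term in Eq. (4) can be illustrated with a short sketch. The function name, the 0-based week indexing, and the sample mileage figures below are assumptions made for illustration only; they are not taken from the carriers’ data or from the authors’ VBA code.

```python
def lagged_moving_average(weekly_values, j, q):
    """Average of the q observations before week j (weeks j-q .. j-1).

    weekly_values[k] is the value observed in week k (0-based, assumed).
    Returns None when fewer than q earlier weeks exist.
    """
    if q < 1:
        raise ValueError("q must be a positive integer")
    if j - q < 0:
        return None  # not enough history to form the Q-week window
    window = weekly_values[j - q:j]
    return sum(window) / q


# Hypothetical weekly paid miles for one driver (weeks 0..4).
paid_miles = [2400, 2100, 1800, 2600, 2500]
print(lagged_moving_average(paid_miles, j=3, q=3))  # uses weeks 0, 1 and 2
```

Returning None for weeks without a full window mirrors the point made in Section 4.6 that the first Q observations of each driver cannot be used.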

4.5. Missing-value treatment

For both carriers A and B, the data contained considerable amounts of missing values. This is a serious problem from the model calibration standpoint, because if a driver’s data for a specific week is missing, the driver’s lagged operators cannot be calculated for several weeks after that week. While we can delete all the data (drivers) with missing values, we do not use this approach, as it results in removing the majority of drivers (nearly all drivers had missing values because of vacations, etc.). To resolve this problem, we employ the following procedure.

First, data of drivers with too many missing values are deleted. Specifically, if a driver’s data are missing in at least one variable for more than half of the entire employment period, the driver’s data are removed (this action resulted in deleting roughly 5.8% of samples from carrier A, and about 12.3% from carrier B). Second, for the remaining data, the missing values are filled by using one of the following methods: (1) simply enter “0”, (2) use the same value from the previous week, or (3) use the average value of all the drivers.5 The first method is deemed appropriate for those variables that represent “rare events”, such as accidents. The second method is deemed appropriate for the variables in which certain degrees of consistency across time periods are expected (e.g., assigned dispatcher). This method is also used to fill the missing values for those weeks in which drivers were on vacation. The third method is deemed appropriate for those stationary variables in which similarities across drivers are expected, such as the unemployment duration prior to the current job (drivers have similar unemployment periods between two consecutive jobs, as they can quickly find the next job). For each variable, the method of missing-value treatment is determined by consulting with the carriers (A and B).
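The two-step treatment above can be sketched as follows. This is a minimal illustration, not the authors’ implementation: the function names, the use of None to mark a missing value, and the reading of the deletion rule as “more than half of the weeks have at least one missing variable” are all assumptions.

```python
def should_drop(weekly_records):
    """True if more than half of a driver's weeks have a missing variable.

    weekly_records: list of dicts (one per week); None marks a missing value.
    """
    weeks_with_gap = sum(
        1 for record in weekly_records
        if any(value is None for value in record.values())
    )
    return weeks_with_gap > len(weekly_records) / 2


def fill_missing(series, method, population_mean=None):
    """Fill None entries in one variable's weekly series for one driver."""
    filled = []
    for value in series:
        if value is not None:
            filled.append(value)
        elif method == "zero":        # rare events, e.g. accidents
            filled.append(0)
        elif method == "previous":    # carry the last known value forward
            filled.append(filled[-1] if filled else None)
        elif method == "mean":        # average over all drivers
            filled.append(population_mean)
        else:
            raise ValueError(f"unknown method: {method}")
    return filled


print(fill_missing([12, None, None, 7], "previous"))
```

Which method applies to which variable is a judgment call; in the study it was settled by consulting the carriers.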

4.6. Defining survival distribution

In general, the survival distribution is defined over the range [0, +∞). This condition, however, is violated in our study for two reasons. First, our model uses lagged variables, so that for each driver the first Q observations must be discarded. This condition indicates that the lower bound of $T_i$ ($\forall i$) is Q, not 0. Second, our data include those drivers who were hired before the “starting date of the measurement period” (denoted $t_s$). This pattern implies that, for these “left-censored” drivers, the lower bound of the survival distribution is even higher than Q.

In theory, we should accommodate this situation by truncating the survival distribution such that the distribution range is adjusted appropriately for each individual driver. This approach, however, is computationally challenging. The use of this approach (for models with non-stationary covariates) makes the already-complex estimation procedure even more complex, especially with large samples. Because of its computational difficulties, this procedure is supported by none of the commercial software packages available today (to the best of our knowledge).

Thus, we employ a heuristic approach instead, and define t1 (end of the first interval) as:

$$t_1 = t_0 + E + Q + 1, \qquad (5)$$

where E is the number of weeks that elapsed between the times $t_0$ and $t_s$ (E = 0 if $t_0 \geq t_s$). Eq. (5) indicates that, for every driver i, the first interval is defined as the time between $t_0$ (hire date) and the end of the first week for which the full data of predictor variables are available for the driver.

The major advantage of this approach is that each driver’s survival distribution is always defined over the range [0, +∞), so that the model can be calibrated in a standard way without truncating the distribution. A limitation of this approach, however, is that we must treat all the non-stationary covariates as “constants” within the entire duration of the first interval ($t_0$ to $t_0 + E + Q + 1$) (which is a strong assumption). Given, however, the unavailability of covariate data (from $t_0$ to $t_s$) and the computational challenges of truncating the distribution, our approach may be considered a reasonable heuristic.6 Note that this approach has been utilized (implicitly) by several empirical studies of survival analyses (see Morita et al. (1993) and Steel (2002) for details).
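As a worked illustration of Eq. (5), the sketch below computes the end of the first interval for a driver hired before the measurement period and for one hired after it. Measuring all dates as integer week numbers is an assumption made here for simplicity, and the function name is hypothetical.

```python
def first_interval_end(t0, ts, q):
    """End of the first interval, t1 = t0 + E + Q + 1 (Eq. 5).

    t0: hire week, ts: first week of the measurement period,
    q: moving-average duration Q. All are integer week numbers (assumed).
    """
    elapsed = max(0, ts - t0)  # E = 0 when the driver is hired at or after ts
    return t0 + elapsed + q + 1


print(first_interval_end(t0=0, ts=6, q=1))   # hired six weeks before ts
print(first_interval_end(t0=10, ts=4, q=3))  # hired after ts, so E = 0
```

In the first case the interval runs from week 0 to week 8, so every covariate is treated as constant over those weeks; in the second it runs from week 10 to week 14.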

4.7. Variable selection

Since the number of variables used in our analysis is large (over 70 per carrier), and since a simpler model is preferred from the practical standpoint, we use only a subset of the available covariates in our final models. We select the final set of variables as follows. First, we remove those variables that are unlikely to have significant impact on driver turnover based on past empirical findings and theoretical expectations. Second, we identify the variables that appear to be measuring the same or similar effects within each factor (i.e., those that are mutually correlated) based on our experience and judgment,7 and remove these collinear variables. Third, once we obtain the “prescreened” covariates by performing the above two steps, we (1) test run the survival model and obtain t-statistics for all the variables, (2) identify the variable that has the smallest t-value, and (3) remove this variable from the model. We repeat this elimination process iteratively until all of the retained variables attain the minimum threshold t-statistic.8 The above “backward elimination” procedure is widely used in empirical studies where the purpose of regression is the prediction of future observations (Chatterjee and Hadi, 2006).

5 We do not use regression imputations because we do not know of any imputation method appropriate for survival models with time-varying covariates. It is well known that the imputation can produce biased results unless the imputation model is compatible with the analysis model (i.e., the two models must have identical or similar functional forms). See Grover and Vriens (2006, p. 182) for more detailed discussions of this issue.

6 While it is possible to calibrate the model by using only the samples (drivers) who are hired at or after $t_s$ to ensure that the range of survival distribution is identical across all drivers, we do not use this approach. This action (using only the samples with limited range of dependent variable) is known to bring biases (see, e.g., Morita et al., 1993).

Table 3. Goodness-of-fit by moving average durations.

Moving average duration | Q = 1 | Q = 2 | Q = 3 | Q = 4 | Q = 5

Carrier A
Log-likelihood | −2042.4 | −2125.2 | −2135.0 | −2147.6 | −2153.7
Parameters | 19 | 18 | 17 | 18 | 17
AIC (a) | 4.246 | 4.414 | 4.433 | 4.461 | 4.471
Pseudo R2 (b) | 0.213 | 0.182 | 0.179 | 0.173 | 0.172

Carrier B
Log-likelihood | −9913.5 | −9893.2 | −9868.9 | −9897.3 | −9903.4
Parameters | 29 | 31 | 33 | 31 | 29
AIC (a) | 3.894 | 3.887 | 3.879 | 3.889 | 3.890
Pseudo R2 (b) | 0.314 | 0.315 | 0.317 | 0.315 | 0.314

a AIC (Akaike Information Criterion) = −2(LL − K)/n, where LL = log-likelihood, K = number of parameters, and n = number of drivers. This statistic reflects the goodness of model fit adjusted for the number of estimated parameters (the smaller the value the better the fit).

b Pseudo R2 = 1 − (AIC/AIC0), where AIC0 is the AIC of the model with only the intercept. This fit statistic is conceptually similar to the adjusted R2 of the standard linear regression model.
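The fit statistics defined in the notes to Table 3 can be checked with a few lines of arithmetic. The sketch below uses the carrier A figures reported in Tables 3 to 5 (log-likelihoods, parameter counts, and 971 drivers); the function names are hypothetical, and choosing Q by the smallest AIC is one plausible reading of “best fit”.

```python
def aic_per_driver(log_likelihood, n_params, n_drivers):
    """AIC as defined in note a of Table 3: -2(LL - K)/n."""
    return -2.0 * (log_likelihood - n_params) / n_drivers


def pseudo_r2(aic_model, aic_intercept_only):
    """Pseudo R2 as defined in note b of Table 3: 1 - AIC/AIC0."""
    return 1.0 - aic_model / aic_intercept_only


# Carrier A: model with Q = 1, and the intercept-only model from Table 5.
aic_a = aic_per_driver(-2042.4, 19, 971)
aic0_a = aic_per_driver(-2619.1, 1, 971)
print(round(aic_a, 3), round(pseudo_r2(aic_a, aic0_a), 3))

# Reported AIC values by moving-average duration Q (carrier A, Table 3).
aic_by_q = {1: 4.246, 2: 4.414, 3: 4.433, 4: 4.461, 5: 4.471}
best_q = min(aic_by_q, key=aic_by_q.get)  # smallest AIC = best fit
print(best_q)
```

Both statistics penalize extra parameters, which is why models with different numbers of retained variables can be compared across Q values.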

5. Estimation results

Table 3 shows the goodness-of-fit of the two models (carriers A and B) by the length of moving average durations (Q = 1 to 5). Results indicate that, for both carriers, the best fit is achieved by relatively small Q values. Observe that for carrier A the model with Q = 1 (1-week moving average) gives the best fit, while for carrier B the model with Q = 3 (3-week moving average) gives the best fit. Therefore, results imply that the non-stationary predictor variables (with lagged effects) may affect the drivers’ turnover decisions only for a short period of time (most likely 3 weeks or less). Given these results we will assume, in the discussions that follow, that Q = 1 for carrier A and Q = 3 for carrier B. Parameter estimates (selected) of the two models are shown in Table 4. Descriptions of the predictor variables are given in the Appendix.

Major findings from the carrier A results are as follows. First, drivers tend to stay with the carrier for longer durations of time if they: (1) are older drivers, (2) have many dependents (e.g., children), (3) are hired through personal referrals (by other drivers), and (4) are given the jobs with higher miles per leg (per load). Second, drivers tend to quit the company if they: (1) are re-hires (those with records of voluntary exits from carrier A in the past), (2) are hired via magazine advertisement, (3) are assigned to certain “bad” dispatchers (whose driver turnover rates are higher than those of others), and (4) are given the weekend jobs with less than 700 miles (according to carrier A, drivers demand large miles if they have to work on weekends).

Major findings from the carrier B results are as follows. First, drivers tend to stay longer with the carrier if they: (1) are older drivers, (2) work on weekends (which implies extra income), (3) are assigned to experienced (highly ranked) dispatchers, and (4) are earning good wages per mile. Second, drivers tend to quit the carrier if they: (1) had frequently switched jobs (carriers) prior to joining carrier B (possible job hoppers), (2) had voluntarily left the previous carrier, (3) are relatively experienced drivers, (4) are involved in accident(s) while employed by carrier B, (5) drive fewer miles than most other drivers assigned to the same dispatcher, and (6) observe large discrepancies between the miles they are paid for and the actual odometer miles.

By pooling the findings that are common to both carriers, we can draw several implications. First, there seems to be a significant dispatcher effect. For both carriers A and B our results imply that a driver’s hazard rate is affected significantly by the quality of the assigned dispatcher. This finding is consistent with that of Keller and Ozment (1999a). Second, the driver age may be an important predictor of the employment duration. Results of both carriers imply that the higher the age of a driver, the longer the duration of time the driver is expected to stay with a carrier. Third, the drivers’ hazard rates may be affected by the total pay. Results of carrier A indicate that drivers with higher paid miles (per leg) are less likely to quit, while those of carrier B indicate that drivers with higher wages per mile are less likely to quit. These two results, in combination, imply that drivers with higher total pay (earnings) are less likely to quit.

7 We do not calculate the correlation matrix of covariates for two reasons. First, a considerable proportion of our covariates (over 30%) are dummy variables. Second, our data are panel (cross-section time-series) data, so that the calculation of simple correlation matrix (which does not account for any individual effect at all) is inappropriate.

8 We use the minimum threshold t-value of 1.960 (significant at the 95% confidence level) for carrier A, and 2.575 (significant at the 99% confidence level) for carrier B. Note that we use a higher confidence level for carrier B than for carrier A, because the sample size for the former is considerably larger than that for the latter.

Table 4. Estimation results (selected variables). (a)

Variable | Factor (b) | Carrier A Coeff. (c) | Carrier A t-Stat. | Carrier A p-Value | Carrier B Coeff. (c) | Carrier B t-Stat. | Carrier B p-Value

Demographic variables
Driver age | 1 | 0.046 | 10.18 | 0.000 | 0.009 | 4.99 | 0.000
Number of dependents | 1 | 0.111 | 3.03 | 0.002 | – | – | –
Previously hired | 1 | −0.912 | 7.43 | 0.000 | – | – | –
Hire by competition | 1 | – | – | – | 0.102 | 2.81 | 0.005
Job switch | 1 | – | – | – | −0.070 | 16.77 | 0.000
Voluntary leave | 1 | – | – | – | −0.202 | 5.81 | 0.000
Work experience | 1 | – | – | – | −0.029 | 7.02 | 0.000
Personal referral | 2 | 0.307 | 2.66 | 0.008 | – | – | –
Magazine | 2 | −0.660 | 4.20 | 0.000 | – | – | –

Operational work variables
Miles per leg | 3 | 0.004 | 5.17 | 0.000 | – | – | –
Weekend LT700 | 3 | −0.002 | 16.14 | 0.000 | – | – | –
Weekend both | 3 | – | – | – | 3.308 | 52.88 | 0.000
Weekend either | 3 | – | – | – | 1.686 | 28.25 | 0.000
Dispatch wait | 3 | – | – | – | 0.038 | 3.56 | 0.000
Rate per mile | 3 | – | – | – | 32.088 | 58.81 | 0.000
Accidents | 5 | – | – | – | −2.357 | 12.56 | 0.000
Hazmat | 3 | – | – | – | −0.394 | 3.31 | 0.001
Home time | 3 | – | – | – | −0.053 | 4.64 | 0.000
Team average | 3 | – | – | – | −0.001 | 11.81 | 0.000
Variance | 3 | – | – | – | −1.397 | 5.52 | 0.000
Dispatcher #02 | 4 | −1.884 | 3.68 | 0.000 | – | – | –
Dispatcher #12 | 4 | −1.794 | 5.27 | 0.000 | – | – | –
Dispatcher #13 | 4 | −1.043 | 6.74 | 0.000 | – | – | –
Dispatcher #14 | 4 | −1.925 | 5.63 | 0.000 | – | – | –
Dispatcher #17 | 4 | −2.642 | 4.41 | 0.000 | – | – | –
Dispatcher #20 | 4 | −0.931 | 5.48 | 0.000 | – | – | –
Dispatcher #21 | 4 | −0.624 | 3.31 | 0.001 | – | – | –
Dispatcher #22 | 4 | −0.512 | 2.75 | 0.006 | – | – | –
Dispatcher rank | 4 | – | – | – | 0.006 | 9.99 | 0.000

Summary | Carrier A | Carrier B
Log-likelihood | −2042.4 | −9868.9
Number of drivers in sample | 971 | 5106
Total sample for estimation | 28,838 | 117,874

a See the Appendix for the description of variables.

b Descriptions of factors: 1 = driver demographics, 2 = recruiting method, 3 = work environment, 4 = dispatchers.

c The model is calibrated by first calculating the hazard rate of each driver for each time period t (by expressing the hazard rate as an inverse function of the exogenous variables that are observed at or before time t), and then fitting the survivor function (or density function for non-censored data) to the actual employment (duration) data. Thus the coefficients shown in this table represent the (inverse) effects of exogenous variables on driver hazard rates within time periods (i.e., how a unit increase in an exogenous variable decreases the quit probability of drivers).

Table 5 shows how much data variance is explained by the operational work variables, the demographic variables, and the model intercept. The table compares the goodness-of-fit of the four models that are estimated with: (1) all predictor variables, (2) operational work variables only, (3) demographic variables only, and (4) intercept only. Two implications may be obtained from the table. First, both the demographic variables and operational work variables contribute significantly to improving the model fit. Results indicate that, for both carriers, the goodness of model fit deteriorates significantly if either the demographic variables or the operational work variables are removed (p-value <0.0001). This pattern implies that the operational work variables are good predictors of hazard rates. Second, the model estimated with only the operational work variables may fit the data better than that estimated with only the demographic variables. The results show that, for both carriers, the former model attains better AIC and pseudo R2 values than the latter model, which means that the former fits the data better than the latter after adjusting for the number of estimated parameters. Thus, in our data, the operational work variables may do a better job of explaining driver exits than the demographic variables.
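The likelihood ratio comparisons in Table 5 rest on a standard statistic that is easy to reproduce. The sketch below assumes the usual form, twice the difference in log-likelihoods between the full and the restricted (nested) model, with degrees of freedom equal to the difference in parameter counts; the paper reports only p-values, so the exact computation used by the authors is an assumption here, and the function name is hypothetical.

```python
def likelihood_ratio(ll_full, k_full, ll_restricted, k_restricted):
    """Return (LR statistic, degrees of freedom) for two nested models."""
    statistic = 2.0 * (ll_full - ll_restricted)
    degrees_of_freedom = k_full - k_restricted
    return statistic, degrees_of_freedom


# Carrier A values from Table 5: full model vs. operational-work variables only.
stat, df = likelihood_ratio(-2042.4, 19, -2115.4, 14)
print(round(stat, 1), df)
```

The statistic is then compared with a chi-square distribution with the stated degrees of freedom; a value this large relative to 5 degrees of freedom is consistent with the very small p-values reported in the table.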

6. Model validation

In this section we test the validity of our model (approach) from a variety of perspectives. Our goal is to show that: (1) the model estimation results are robust (ensure that the variables retained in the final models are truly good predictors of driver turnover – i.e., they are not chosen by coincidence), and (2) the model can be used as a practical decision tool by carrier managers.

Table 5. Comparisons of demographic and operational work variables.

Statistic | Full model (a) | Operational-work variables only (a) | Demographic variables only (a) | Intercept only

Carrier A
Log-likelihood | −2042.4 | −2115.4 | −2210.8 | −2619.1
Number of parameters | 19 | 14 | 6 | 1
AIC | 4.246 | 4.386 | 4.566 | 5.397
Pseudo R2 | 0.213 | 0.187 | 0.154 | –
Likelihood ratio test (b) | – | <0.0001 | <0.0001 | <0.0001

Carrier B
Log-likelihood | −9868.9 | −10,003.7 | −13,530.7 | −14,486.8
Number of parameters | 33 | 23 | 11 | 1
AIC | 3.879 | 3.927 | 5.304 | 5.675
Pseudo R2 | 0.317 | 0.308 | 0.065 | –
Likelihood ratio test (b) | – | <0.0001 | <0.0001 | <0.0001

a Model intercepts are estimated for these models.

b Compares the fit of the model in the column against that of the full model (p-values).

We validate the model by using five methods. First, we test the face validity of our model by using carrier inputs. Specifically, we show the model estimation results to the two carriers (A and B), and ask if they see any “surprise” or “strange” findings. (If, for example, a carrier strongly believes that a variable is a good predictor of the drivers’ hazard rates but the results do not support this view, it is considered a “surprise” to the carrier.) Since both carriers confirmed that there are no surprising results, the model is regarded as having face validity.

Second, we test the robustness of parameter estimates by conducting an experiment similar to the “Jackknife” procedure. The test is performed as follows. First, for both carriers A and B, we randomly extract roughly 80% of samples (drivers) from the data (using a PC random number generator). Second, we estimate the model parameters by using only the extracted samples, and examine whether the variables shown in Table 4 (those that were found to be significant when the model was estimated with full data) achieve the statistical significance. We repeat the above procedure (experiment) three times for each carrier. The results (not reported) indicate that, for both carriers, all of the variables shown in Table 4 are found to be statistically significant in every experiment. Hence, our parameter estimates are judged as reasonably robust.

Third, we test the macro-level prediction accuracy of the model. Using the parameter estimates given in Table 4, we predict the annual driver turnover rates of the two carriers, and compare them with the actual rates. If the model is valid, it should produce the turnover rate predictions similar to the actual rates. We predict the driver turnover rates by: (1) calculating the summed hazard rates of all drivers by week, (2) taking the average of these figures across weeks, (3) multiplying this average by 52, and (4) dividing the resulting figure by the average number of drivers in our sample (per week). The results indicate that the predicted rates are 83% and 142% for carriers A and B, respectively, while the actual rates are 80% and 150%, respectively. Given these results, the macro-level prediction accuracy of the model is deemed reasonable.

Fourth, we test the micro-level prediction accuracy of the model. For both carriers A and B, we divide the drivers into two groups; namely, the “actual-exit” and “right-censored” groups. We then calculate the predicted (fitted) hazard rate of each driver during the “terminal” interval ($t_{k-1}$ to $t_k$), and compare the predicted rates of the “actual exits” and “right-censored drivers” by using a t-test. If valid, the model should give higher predicted hazard rates for actual exits than for right-censored data. The results (see Table 6) indicate that, for both carriers, the predicted hazard rates are significantly higher for drivers with actual exits than for right-censored drivers. Given this pattern, the micro-level prediction accuracy of the model is deemed reasonable.

Fifth, we test the external validity of the model; i.e., we analyze whether the model can accurately forecast the hazard rates of those drivers who are not included in the estimation sample. The test is performed as follows. First, for both carriers, we divide the drivers into two groups; i.e., the calibration group and the forecasting group, by using a PC random number generator (roughly 80% of drivers are assigned to calibration sample, and 20% to forecasting sample). Next, we calibrate the model by using only the calibration sample, and use the estimated model to forecast the hazard rates of the drivers in the forecasting sample (during the terminal intervals). We then compare the forecasted hazard rates of the “actual exits” and “right-censored drivers” by using a t-test. The above procedure (experiment) is repeated three times for each carrier. The results (shown in Table 7) indicate that, for both carriers, the model forecasts significantly higher hazard rates for “actual exits” than for “right-censored drivers” in all of the experiments. Thus, the model’s external forecasting accuracy is judged reasonable.
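The four-step turnover calculation used in the macro-level test can be written out as below. This is a sketch under stated assumptions: the nested-list layout (one inner list of per-driver hazard rates for each week), the function name, and the sample hazard values are hypothetical and serve only to show the arithmetic.

```python
def predicted_annual_turnover(weekly_hazards):
    """Annual turnover rate implied by weekly per-driver hazard rates.

    weekly_hazards: list of weeks; each week is a list holding the
    predicted hazard rate of every driver employed in that week.
    """
    # (1) summed hazard rates of all drivers, by week
    weekly_sums = [sum(week) for week in weekly_hazards]
    # (2) average of these sums across weeks
    mean_weekly_exits = sum(weekly_sums) / len(weekly_sums)
    # (4) average number of drivers per week
    mean_headcount = sum(len(week) for week in weekly_hazards) / len(weekly_hazards)
    # (3) annualize with 52 weeks, then divide by the headcount
    return 52 * mean_weekly_exits / mean_headcount


# Two hypothetical weeks with three and two drivers respectively.
rate = predicted_annual_turnover([[0.02, 0.01, 0.03], [0.02, 0.02]])
print(f"{rate:.0%}")
```

A rate above 100% simply means that the expected number of exits in a year exceeds the average headcount, as is the case for carrier B.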

Table 6. Predicted hazard rates of censored and non-censored data.

Data type | Average hazard rate | t-Statistic | p-Value

Carrier A
Act. exit | 0.0208 (n = 338) | 6.672 | 0.000
Censored | 0.0143 (n = 633) | |

Carrier B
Act. exit | 0.0681 (n = 2952) | 28.415 | 0.000
Censored | 0.0078 (n = 2154) | |
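The comparison summarized in Table 6 is a two-sample t-test on predicted hazard rates. The paper does not say which variant was used, so the sketch below assumes Welch’s unequal-variance form; the function name and the hazard values are hypothetical and do not come from the study’s data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for the difference in means of two samples."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical predicted hazard rates in the terminal interval.
actual_exits = [0.03, 0.02, 0.04, 0.03]
right_censored = [0.01, 0.02, 0.01, 0.02]
t_stat = welch_t(actual_exits, right_censored)
print(round(t_stat, 3))
```

A positive statistic indicates that drivers who actually left were assigned higher hazard rates than those still employed at the end of the observation period.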


Table 7. External validity.

Experiment | Calibration sample (n) | Forecasting sample (w) | Data type | Forecasted hazard rates | t-Statistic | p-Value

Carrier A
Experiment 1 | 778 | 193 | Act. exit | 0.021 (n = 107) | 3.194 | 0.002
 | | | Censored | 0.013 (n = 86) | |
Experiment 2 | 795 | 176 | Act. exit | 0.023 (n = 95) | 2.864 | 0.005
 | | | Censored | 0.014 (n = 81) | |
Experiment 3 | 768 | 203 | Act. exit | 0.019 (n = 117) | 3.493 | 0.001
 | | | Censored | 0.012 (n = 86) | |

Carrier B
Experiment 1 | 4047 | 1059 | Act. exit | 0.076 (n = 543) | 9.418 | 0.000
 | | | Censored | 0.011 (n = 516) | |
Experiment 2 | 4179 | 927 | Act. exit | 0.064 (n = 559) | 10.008 | 0.000
 | | | Censored | 0.010 (n = 368) | |
Experiment 3 | 4144 | 962 | Act. exit | 0.064 (n = 539) | 9.188 | 0.000
 | | | Censored | 0.012 (n = 423) | |
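The calibration/forecasting split behind Table 7 can be sketched as a random partition of drivers. The function name, the fixed seed, and the use of integer driver IDs are assumptions for illustration; the study itself only states that a PC random number generator assigned roughly 80% of drivers to calibration.

```python
import random


def split_drivers(driver_ids, calibration_share=0.8, seed=0):
    """Randomly split drivers into calibration and forecasting groups."""
    rng = random.Random(seed)      # fixed seed keeps the split repeatable
    shuffled = list(driver_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * calibration_share)
    return shuffled[:cut], shuffled[cut:]


# 971 hypothetical driver IDs, matching the carrier A sample size.
calibration, forecasting = split_drivers(range(971))
print(len(calibration), len(forecasting))
```

Splitting by driver rather than by driver-week keeps all weekly records of a given driver in one group, so the forecasting sample contains only drivers the model has never seen. Repeating the split with different seeds corresponds to the three experiments per carrier.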


7. Model implementation

7.1. Carrier A

Carrier A has partially implemented the proposed model. Specifically, the carrier has modified its information system after the study such that the new system can give dispatchers the weekly information on how many weekend loads with less than 700 miles (unattractive loads) they have assigned to their drivers during the previous week. This enhanced system allows dispatchers to use the “unattractive load” information in a timely manner to intelligently assess the job satisfaction levels of their drivers. Additionally, the carrier has taken the following two actions based on our study findings. First, the carrier has created the driver referral team. The mission of this team is to increase the number of personal referral driver recruits by: (1) developing good recruiting materials, and (2) recognizing the top 15 drivers who achieved the highest personal referral recruits every month (with monetary rewards). Second, the carrier has initiated an incentive program for dispatchers. Under this new program, a dispatcher’s pay is based on his/her driver retention rate, such that the higher the retention rate, the higher the pay.

The above actions have helped the carrier improve its driver turnover rate. In roughly 2 years, the carrier has successfully reduced the turnover rate from about 80% to 70%. This reduction of 10 percentage points should be viewed as a considerable achievement, because during the same 2 years the industry average driver turnover rate has increased from about 100% to over 120%. The carrier is currently considering the development of a decision support system that utilizes the proposed model to predict the hazard rate of each truck driver on a weekly basis.

7.2. Carrier B

After spending some time validating the effectiveness of the model, the carrier is convinced that the model can give reasonably accurate predictions of drivers’ hazard rates, and has decided to adopt the model as a driver-management decision tool. The carrier has recently completed its work on developing a decision support system that: (1) estimates the driver survival model on a regular basis (to update the parameters), and (2) forecasts the hazard rate of each driver on a weekly basis (to identify a set of drivers who are about to quit). Its ultimate goal is to integrate the hazard-rate forecasting system (survival model) with the carrier’s load coordination and driver dispatch systems to include the drivers’ hazard rates in load assignment decisions. By developing such a system the carrier wishes to better control driver turnover rates.

It is worth noting that, prior to this study, carrier B was already working on internally developing an AI-based algorithm for predicting driver exits (by using the operational work variables as predictors). After seeing the results of this study, however, the carrier realized that our method can predict the driver exits in a more cost-effective manner than the AI-based method without sacrificing the accuracy of predictions significantly. This “cost effectiveness” was the key factor that convinced carrier B to adopt our method, instead of the AI-based method.

8. Conclusion and limitation

This study has made three important contributions to the truck driver turnover literature. First, we have empirically shown that the operational work variables are good predictors of truck driver hazard rates. Our results indicate that drivers’ turnover behaviors can be explained significantly better by incorporating certain operational work variables, such as weekly miles, weekend job assignments, and wage per mile, into the survival model. The results also show that the operational work variables may do a better job of explaining drivers’ turnover decisions than the traditional demographic variables. These results are found consistently for both carriers.


Second, we have shown that the Petersen model (1986), if used in conjunction with the operational work data, may become an effective decision tool for carriers. We have empirically tested the model’s capability to predict the drivers’ turnover decisions, and showed that it can give reasonably accurate predictions of driver hazard rates. This condition implies that carriers can: (1) use the model to identify a smaller set of drivers who are about to quit in a timely manner (by monitoring the predicted hazard rate of each driver continuously), and (2) focus their attention on preventing the exits of these fewer drivers, rather than monitoring all drivers.

Third, our study has provided interesting implications on how the past events may affect the drivers’ turnover decisions. Our results indicate that a driver’s hazard rate during a given time period may be affected only by those events that took place rather recently (within 3 weeks or less). This condition implies that when a driver experiences a “bad” event that reduces his/her job satisfaction (e.g., low miles), this event may increase the driver’s hazard rate for up to 3 weeks, but may not affect the hazard rate beyond 3 weeks. While not generalizable, this finding may give interesting implications to carriers and researchers with respect to the dynamic nature of the relationship between driver turnover decisions and time-varying predictor variables.

Our study has limitations. First, the operational work data are subject to data collection errors. Although we believe that the data used in this study are reasonably reliable, readers should understand that: (1) our study results rest on the accuracy of the data, and (2) we cannot exclude the possibility that our results may contain certain degrees of estimation bias due to data collection errors. Second, our model does not account for the unobserved heterogeneity across drivers. Although in theory we can use either the fixed-effect or the random-effect method to capture the heterogeneity, neither would work for this study. The former is inappropriate because it not only requires the estimation of a large number of parameters, but also prohibits the estimated model from predicting driver exits beyond the estimation sample (which is one of our goals) (Chintagunta, 1993). The latter is available primarily for survival models with cross-sectional data (see, e.g., Greene, 2008, p. 938). Thus, in our model, the unobserved heterogeneity is captured only by the error terms. If proper methods of controlling individual heterogeneity are developed in the future (for survival models with time-varying covariates), researchers may re-estimate our models and examine the robustness of the empirical results reported in this paper.

Appendix. Predictor variables (those shown in Table 4)

Description

Stationary?

Driver demographics

Driver age Age of driver No (immediate) Number of dependents Number of dependents No (immediate) Previously hired 1 = Previously hired by the carrier, 0 = not previously hired Yes Hire by competition 1 = Previously worked for the carrier’s competitor, 0 = otherwise Yes Job switch Previous employment occasions (number of carriers worked for) Yes Voluntary leave 1 = Voluntarily left previous employer, 0 = involuntary leave Yes Work experience Prior work experience (in years) Yes Personal referral 1 = If hire source is referral, 0 = otherwise Yes Magazine 1 = If hire source is magazine advertisement, 0 = otherwise Yes

Operational work variables

Miles per leg Miles per leg (per load) during the week No (lagged) Weekend LT700 Number of legs less than 700 miles during weekend No (lagged) Weekend both 1 = Driving on both Saturday and Sunday, 0 = otherwise No (immediate) Weekend either 1 = Driving on Saturday or Sunday, 0 = otherwise No (immediate) Dispatch wait Average wait time for next dispatch (in hours) No (lagged) Rate per mile Base pay rate per mile No (lagged) Accidents Number of preventable accidents No (lagged) Hazmat Number of loads with hazardous materials No (lagged) Home time Weekly home time (in days) No (lagged) Team average Average paid miles of all drivers assigned to the same dispatcher No (lagged) Variance Percentage difference between miles paid and miles on odometera No (lagged) Dispatcher #1–#30 1 = If driver is assigned to the dispatcher, 0 = otherwise No (immediate) Dispatcher rank Ratio-scale index based on experience (higher value = higher rank) No (immediate)

a Drivers are paid by “billed miles” (miles based on shortest routes). Thus their “paid miles” are generally lower than “actual miles” (odometer miles), as they must occasionally deviate from the shortest routes (for fueling, etc.).
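
As an illustration of how the "immediate" and "lagged" labels in the table above might be applied when assembling weekly driver records, the following is a minimal sketch and not the procedure used in this study. It assumes that an immediate variable enters with its current-week value and that a lagged variable enters with its value from one week earlier; the one-week lag, the function name build_person_weeks, and the field names (driver, week, home_time, weekend_both) are all hypothetical.

```python
from collections import defaultdict


def build_person_weeks(records, lagged_vars, immediate_vars, lag=1):
    """Return one row per driver-week with lagged and immediate covariates.

    records: list of dicts with keys 'driver' and 'week' plus covariates.
    A lagged variable takes its value from `lag` weeks earlier (None when
    that week is not observed); an immediate variable takes the current
    week's value. The lag length is an assumption for illustration only.
    """
    # Index each driver's records by week number.
    by_driver = defaultdict(dict)
    for rec in records:
        by_driver[rec["driver"]][rec["week"]] = rec

    rows = []
    for driver, weeks in by_driver.items():
        for week in sorted(weeks):
            current = weeks[week]
            earlier = weeks.get(week - lag)
            row = {"driver": driver, "week": week}
            for var in immediate_vars:
                row[var] = current[var]
            for var in lagged_vars:
                row[var + "_lag"] = earlier[var] if earlier else None
            rows.append(row)
    return rows


# Hypothetical three-week history for a single driver.
records = [
    {"driver": "D1", "week": 1, "home_time": 2.0, "weekend_both": 0},
    {"driver": "D1", "week": 2, "home_time": 0.5, "weekend_both": 1},
    {"driver": "D1", "week": 3, "home_time": 1.0, "weekend_both": 1},
]
rows = build_person_weeks(records, ["home_time"], ["weekend_both"])
```

In this sketch the first observed week of each driver has no lagged value, so such rows would need to be dropped or handled separately before estimation.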

References

Beilock, R., Capelle, R.B., 1990. Occupational loyalties among truck drivers. Transportation Journal 29, 20–29.
Belman, D., Monaco, K., 2005. Are truck drivers underpaid? Applied Economics Letters 12, 13–18.

Chatterjee, S., Hadi, A.S., 2006. Regression Analysis by Example, fourth ed. John Wiley & Sons, Hoboken, NJ.
Chintagunta, P.K., 1993. Investigating purchase incidents, brand choice and purchase quantity decisions of households. Marketing Science 12, 184–208.
Corsi, T., Fanara, P., 1988. Driver management policies and motor carrier safety. The Logistics and Transportation Review 24, 153–163.
Curtis, S., Wright, D., 2001. Retaining employees – the fast track to commitment. Management Research News 24, 56–60.
De Croon, E.M., Sluiter, J.K., Broersen, J.P.J., Blonk, R.W.B., Frings-Dresen, M.H.W., 2004. Stressful work, psychological job strain, and turnover: a 2-year prospective cohort study of truck drivers. Journal of Applied Psychology 89, 442–454.
Econometrics Laboratory, 2000. Combining Revealed and Stated Preference Data. University of California, Berkeley, CA.
Greene, W.H., 2008. Econometric Analysis, sixth ed. Pearson/Prentice Hall, Upper Saddle River, NJ.
Grover, R., Vriens, M., 2006. The Handbook of Marketing Research: Uses, Misuses and Future Advances. Sage, Thousand Oaks, CA.
Kammeryer-Mueller, J.D., Wanberg, C.R., Blomb, T.M., Ahlburg, D., 2005. The role of temporal shifts in turnover processes. Journal of Applied Psychology 90, 644–658.
Keller, S.B., 2002. Driver relationship with customers and driver turnover: key mediating variables affecting driver performance in the field. Journal of Business Logistics 23, 39–65.
Keller, S.B., Ozment, J., 1999a. Exploring dispatcher characteristics and their effect on driver retention. Transportation Journal 39, 20–34.
Keller, S.B., Ozment, J., 1999b. Managing driver retention: effects of the dispatcher. Journal of Business Logistics 20, 97–120.
LeMay, S.A., Taylor, S.G., Turner, G.B., 1993. Driver turnover and management policy: a survey of truckload irregular route motor carriers. Transportation Journal 33, 15–20.
Maslach, C., 1982. Understanding burnout: definitional issues in complex phenomenon. In: Paine, W.S. (Ed.), Job Stress and Burnout. Sage, Beverly Hills, CA, pp. 29–40.
Min, H., Lambert, T., 2002. Truck driver shortage revisited. Transportation Journal 42, 5–17.
Min, H., Eman, A., 2003. Developing the profiles of truck drivers for their successful recruitment and retention: a data mining approach. International Journal of Physical Distribution and Logistics Management 33, 149–162.
Morita, J.G., Lee, T.W., Mowday, R.T., 1993. The regression-analog to survival analysis: a selected application to turnover research. Academy of Management Journal 36, 1430–1464.
Morrow, P.C., Suzuki, Y., Crum, M.R., Ruben, R., Pautsch, G., 2005. The role of leader-member exchange in high turnover work environments. Journal of Managerial Psychology 20, 681–694.
Petersen, T., 1986. Fitting parametric survival models with time-dependent covariates. Applied Statistics 35, 281–288.
Richard, M.D., LeMay, A.S.A., Taylor, G.S., Turner, G.B., 1994. A canonical correlation analysis of extrinsic satisfaction in a transportation setting. The Logistics and Transportation Review 30, 327–338.
Richard, M.D., Lemay, S.A., Taylor, S.G., 1995. A factor-analytic logit approach to truck driver turnover. Journal of Business Logistics 16, 281–299.
Rodriguez, J.M., Griffen, G.C., 1990. The determinants of job satisfaction of professional drivers. Journal of the Transportation Research Forum 30, 453–464.
Rodriguez, J., Kosir, M., Lantz, B., Griffen, G., Glatt, J., 2000. The Costs of Truckload Driver Turnover. Upper Great Plains Transportation Institute, North Dakota State University, Fargo, ND.
Shaw, J.D., Delery, J.E., Jenkins Jr., D.G., Gupta, N., 1998. An organization-level analysis of voluntary and involuntary turnover. Academy of Management Journal 41, 511–525.
Staplin, L., Gish, K.W., 2005. Job change rate as a crash predictor for interstate truck drivers. Accident Analysis and Prevention 37, 1035–1039.
Steel, R.P., 2002. Turnover theory at the empirical interface: problems of fit and function. Academy of Management Review 27, 346–360.
Stephenson, F.J., Fox, R.J., 1996. Driver retention solutions: strategies for for-hire truckload (TL) employee drivers. Transportation Journal 35 (4), 12–25.
Taylor, G.S., LeMay, S.A., 1991. A causal relationship between recruiting techniques and driver turnover in the truckload sector. Transportation Practitioners Journal 59, 56–66.
Transport Topics, 2008. Driver Turnover Rates Decline, But Trucking Expects Reversal.
Truckload Carriers Association, 2004. How to Recruit and Retain Drivers. <http://www.smallcarrieruniversity.com/ccu_driver_recruitment.shtml>.
US Department of Transportation, Federal Motor Carrier Safety Administration, 2007. Driver Issues: Commercial Motor Vehicle Safety Literature Review. Washington, DC.