
Shirking, Sorting, and Reelection in Congress

Bruce Bender Timothy C. Haas Sunwoong Kim

University of Wisconsin-Milwaukee

March 9, 2004

Abstract

In the literature on shirking with respect to constituent interests by legislators, the attempts to identify shirking have relied on the proxy of voting score indexes and have been shown to suffer from conceptual or econometric problems. None of these attempts actually estimated the degree of shirking. The objective of this paper is to estimate shirking by members of the U.S. House of Representatives. We accomplish this by developing a stochastic dynamic model of the incumbent's optimization problem while also allowing for a self-selected population of challengers and turnover of congressional seats. The parameter values of the model are numerically estimated using the method of minimum chi-squared. Estimated parameter values are consistent with voters' disciplining shirking incumbents at the ballot box and with challengers self-selecting on the basis of the degree to which their interests and the voters’ interests coincide. Simulations based on the estimated parameter values indicate that the voters’ punishment of shirking congressmen induces incumbents to shirk little in absolute terms and considerably less than if unconstrained by the voters. Furthermore, estimates of mean shirking by terms-served categories are consistent with the empirical findings of no significant last-period problem with respect to shirking.

Shirking, Sorting, and Reelection in Congress*

1. INTRODUCTION

There is a large literature on shirking by congressmen, where shirking is the failure of the

congressman to faithfully serve the interests of his constituents.1 Unfortunately, shirking by congressmen

is not directly measurable. The literature has focused, however, on two ways to identify the existence of

shirking. The first approach has been to determine the degree to which the congressman’s personal

ideology influences his voting on legislation to the detriment of his constituents’ interests. This approach

reached its pinnacle with the innovative two-stage methodologies of Carson and Oppenheimer (1984) and

Kalt and Zupan (1984) that spawned over the succeeding years an enormous number of articles

implementing this methodology. Conceptual problems (Lott and Davis [1992], Goff and Grier [1993],

and Bender [1994]) and econometric problems (Jackson and Kingdon [1992]) eventually led to its

abandonment. The second approach has been an examination of the last-period problem in politics. This

approach has been concerned more with the voters’ ability to sort out of office shirking politicians than

with an actual measurement of shirking. Specifically, if voters can successfully perform this sorting

function early enough in a shirking politician’s career, then those surviving politicians will be the ones

whose interests coincide with their constituents’ interests and therefore the ones who will have no

incentive to shirk in their last term in office when the reelection constraint no longer exists. While the

evidence of no significant change in voting records for congressmen in their last terms indicates no

significant last-period problem (Lott [1987; 1990], Van Beek [1991], and Lott and Bronars [1993]), this

approach does not explicitly measure the degree of shirking in the last term and certainly not in preceding

terms in office.

While the literature recognizes the linkage between shirking and the likelihood of reelection, the

lack of a quantifiable estimate of the degree of shirking is a direct result of the lack of a model of the

*The authors benefited from comments on earlier drafts by George Tolley, Roy Gilbert, Ed Lopez, John Patty, Sol Shalit, and seminar participants at the University of Chicago. The authors thank Michael Kim for helping with the data collection.

reelection process that lends itself to the estimation of shirking. The objectives of this paper are to

develop and estimate a stochastic dynamic model of the reelection process that includes optimal behavior

in office by incumbents, a self-selected (in the sense of the degree to which the interests of the challengers

and their potential constituents coincide) population of challengers, and turnover of the congress so that

the degree of shirking can be directly calculated from the estimated parameter values. Because the model

accounts for both optimal shirking by incumbents and self-selection by challengers, shirking is the outcome of

both moral hazard and adverse selection problems. Furthermore, the estimation of shirking is done

without resorting to the use of an empirical proxy for shirking despite the fact that the degree of shirking

is not directly measurable. The model does, however, abstract from considerations of party and the

seniority system as independent influences on a congressman's behavior in office in order to focus on the

relationship between constituent and congressman.2

Our treatment of the incumbent’s constrained maximization problem is similar in approach to

Lott and Reed’s (1989).3 The utility per congressional term received by the incumbent is assumed to

depend upon the degree to which the incumbent acts in office to serve his own interests. The probability

that the incumbent continues in office is a direct function of the degree to which the incumbent serves the

interests of his constituents. Accordingly the incumbent’s optimal time path of shirking is determined by

a constrained dynamic maximization of expected discounted utility. Furthermore, a self-selected

population of challengers stands ready to replace the incumbent should he leave office. Aggregation of

the results of the constrained choices of the individual incumbents determines the behavior and tenure

structure of the congress over time.

The model is quite parsimonious, consisting of a relatively small number of behavioral equations

and requiring a relatively small number of parameters to be estimated. Each parameter has an explicit and

1 See Bender and Lott (1996) for a critical overview of the shirking literature.

2 We do not imply that the impact of party and the seniority system on a congressman's behavior in office is necessarily trivial. We do feel, however, that constituency preference is the primary influence on such behavior and leave the introduction of party and seniority to future work.

3 Unlike Lott and Reed, however, we explicitly solve, albeit numerically rather than analytically, the incumbent’s problem.

precise interpretation regarding the voters’ sorting function or the self-selection of challengers. The actual

estimation of parameter values is not accomplished by typical regression techniques. Instead, the

parameter values of the model are numerically estimated by the method of minimum chi-squared first

suggested by Neyman (1949).

Specifically, we simulate the retirement of incumbents and then the dynamic optimal choices of

the incumbents who run for reelection for a given set of parameter values for thirteen successive

congresses. Congressmen who retire or who are defeated are replaced by challengers from a population

whose distribution of preferences is characterized by a set of parameter values. We then compare the

terms served of defeated simulated congressmen over the thirteen congresses with the terms served of

defeated actual congressmen over the thirteen congresses from 1974 to 1998. Parameter values are then

perturbed and the process repeated until there is convergence to a set of parameter values that maximizes

goodness of fit as measured by Pearson’s chi-squared statistic. Given these estimated parameter values,

we can then simulate incumbent shirking.
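As a rough illustration of the goodness-of-fit criterion just described, the sketch below computes Pearson's chi-squared statistic for counts of defeated congressmen grouped by terms-served category. The function name and the counts are hypothetical and serve only as an example. Treating the simulated counts as the expected frequencies is also an assumption, since the formula is not written out at this point in the text.

```python
def pearson_chi_squared(actual_counts, simulated_counts):
    """Pearson's chi-squared statistic: sum over categories of (O - E)^2 / E.

    Assumption: actual counts play the role of observed frequencies (O)
    and simulated counts the role of expected frequencies (E).
    """
    if len(actual_counts) != len(simulated_counts):
        raise ValueError("both inputs must cover the same terms-served categories")
    return sum((o - e) ** 2 / e for o, e in zip(actual_counts, simulated_counts))


# Made-up counts of defeated incumbents in three terms-served categories.
actual = [30, 20, 10]
simulated = [25, 25, 10]
stat = pearson_chi_squared(actual, simulated)
print(stat)  # 2.0
```

An estimation routine along the lines described above would perturb the parameter values, regenerate the simulated counts, and keep the parameter set that yields the smallest value of this statistic.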

The null hypothesis that the simulated and actual numbers of defeated congressmen by terms-

served categories are generated by the same process cannot be statistically rejected. Estimated parameter

values are consistent with voters punishing at the ballot box congressmen who shirk and with

congressional challengers self-selecting so that those challengers with a lower propensity to shirk are

more likely to run for office. Note that such self-selection can rationally occur only if voters can sort

incumbents according to their degree of faithfulness to the voters’ interests. Sorting therefore not only

directly mitigates the moral hazard problem of shirking by incumbent congressmen but also indirectly

mitigates the adverse selection problem.

The calculated mean level of shirking based on the estimated parameter values is quite low in

absolute terms and considerably lower than the mean level of shirking that would exist if congressmen’s

behavior were unconstrained by the voters. Furthermore, even though an individual congressman’s

optimal shirking increases with terms served, the mean level of shirking over all congressmen is at first

fairly constant and then decreases as the number of terms served increases.

This is consistent with voters constraining the shirking of incumbent congressmen and over

time sorting out of office those congressmen who are inclined to shirk so that the congressmen with

greater tenure are the ones inclined not to shirk. It also explains why researchers have found no significant

last-period shirking problem.

The next section develops the dynamic model of incumbent behavior. Section 3 considers the

impact of sorting by the voters on self-selection by potential candidates and on the mean level of shirking

for the congress. Section 4 provides a model of shirking and turnover in the congress over time. Section 5

describes the methodology for the estimation of the model’s parameter values, and section 6 presents and

interprets these estimated values. Section 7 simulates and analyzes congressional shirking over time based

on the estimated parameter values. Finally, section 8 presents conclusions.

2. MODEL OF INDIVIDUAL INCUMBENT SHIRKING AND TENURE

Consider a congressman as an agent elected by his constituents to serve their interests. Who is the

principal that the congressman represents? We certainly do not imply that it is the median voter.4 It is

more reasonable to view the faithful congressman as serving Fenno's (1978) reelection constituency or

taking Peltzman's (1984) electoral support-maximizing position. For our purposes we can assume the

existence of a set of positions that the faithful congressman will support and refer to this as the set of

positions desired by the constituent-principal.

We also recognize that a congressman may receive utility from taking other positions not

consistent with the interests of his constituents. Taking such positions may be the result of the

congressman's ideology or interests not being consistent with the interests of his constituents, direct

bribes, bribes in the form of promises of future employment, or any other inducement. This is not to say

that the faithful congressman is morally superior. While he may take the positions desired by his

constituents because he truly receives utility from serving their interests, he may also desire to take those

positions because they are consistent with his ideology and interests. The latter is probably closer to

reality. Nevertheless, in either case we will refer to such a congressman as one with a “preference” for

serving the interests of his constituents.

Define a continuous variable, η∈(0,1), as a preference parameter indexing the congressman’s

utility from serving constituent interests. η=0 indicates that the congressman’s preference is strongest for

serving his constituents’ interests, whereas η=1 indicates that it is weakest.5 The congressman’s value of

η is not directly observable by his constituents.

Define a continuous variable, x∈(0,1), that indexes a congressman’s behavior according to the

degree to which the congressman faithfully serves the interests of his constituents. A congressman who

behaves as a perfectly faithful agent for his constituents is characterized as choosing a value of x=0,

whereas a congressman who behaves as a completely unfaithful agent is characterized as choosing a value

of x=1 (see footnote 6). Equivalently, x=0 indicates zero shirking with respect to constituent interests, and x=1 indicates

maximum shirking. The continuous nature of x allows for the existence of a continuum of congressmen

with the degree of shirking increasing as x increases.7 The variable x is clearly a choice variable for the

congressman. A congressman might prefer to choose a value of x greater than zero because such a value

would maximize his utility given his value of η.

The utility function chosen for congressmen should have two properties. First, given

the definitions of η and x, the weaker the congressman’s preference for serving constituents’ interests, the

more the congressman will shirk in order to obtain his unconstrained maximum possible utility (i.e., in the

absence of the reelection constraint). Second, for the purpose of avoiding the introduction of bias into the

4 For critiques of the median voter model of elections see, for example, Stigler (1972), Fiorina (1974), Peltzman (1984), Romer and Rosenthal (1984), Goff and Grier (1993), Bender (1994), and Jung, Kenny, and Lott (1994). Bender and Lott (1996) provide an overview of the literature.

5 Although η is defined on the open interval (0,1), we refer to η taking the benchmark values of 0 and 1 for expositional purposes.

6 As in the case of η, we refer to x taking the benchmark values of 0 and 1 for expositional purposes even though x is defined on the open interval (0,1).

7 Our characterization of behavior by the variable x is in the spirit of Sutter's (1998) characterization of congressmen as “angels” and “knaves” except that we allow for a continuum of behavior.

model, the unconstrained maximum utility should be the same for each congressman. This standardizes

the potential payoff or maximum possible utility across congressmen.

Let the utility received by a congressman serving term t in office, ut(x(t)), be defined as

(1) $u_t(x(t)) = \dfrac{x(t)^{\eta}\,(1 - x(t))^{1-\eta}}{\eta^{\eta}\,(1-\eta)^{1-\eta}}$.

This utility function has both of the desired properties above. The unconstrained maximum utility for a

congressman is obtained when the congressman serves with a degree of shirking of x(t)=η. The

unconstrained maximum utility for each congressman is equal to one regardless of his value of η.8
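The two properties can be checked numerically. The sketch below assumes that equation (1) has the form x^η (1-x)^(1-η) divided by η^η (1-η)^(1-η), which is the form consistent with the properties stated above. The function name, the example value η = 0.3, and the grid search are illustrative choices rather than part of the model.

```python
def utility(x: float, eta: float) -> float:
    """Per-term utility, assuming the form of equation (1) described above."""
    if not (0.0 < x < 1.0 and 0.0 < eta < 1.0):
        raise ValueError("x and eta must lie in the open interval (0, 1)")
    numerator = x ** eta * (1.0 - x) ** (1.0 - eta)
    denominator = eta ** eta * (1.0 - eta) ** (1.0 - eta)
    return numerator / denominator


eta = 0.3  # hypothetical preference parameter
grid = [i / 1000 for i in range(1, 1000)]  # candidate shirking levels in (0, 1)
best_x = max(grid, key=lambda x: utility(x, eta))

print(best_x)                 # shirking level with the highest utility on the grid
print(utility(best_x, eta))   # unconstrained maximum utility
```

With these inputs the grid maximizer coincides with η and the maximum utility equals one, as the text states for every value of η.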

We assume that each incumbent who runs for reelection receives greater expected utility from

holding office than from being a private citizen. We also assume that each incumbent has a planning

horizon or maximum number of possible terms in office over which he chooses his optimal dynamic

behavior. Although the length of the planning horizon may vary across congressmen for purely

idiosyncratic reasons, there is no way of discerning those reasons. We therefore treat all congressmen

identically by constructing the planning horizon for each congressman using an assumed mandatory

retirement age of R. If we denote Ac as the age at which the incumbent was first elected (i.e., the age at

which he became a successful challenger, hence the subscript c), then the planning horizon or maximum

number of two-year terms that the congressman can serve is denoted as T=(R-Ac)/2 (see footnote 9). The assumption that

each congressman running for reelection receives greater expected utility as a congressman than as a

private citizen coupled with the assumed mandatory retirement age implies that T is indeed a planning

horizon. If the congressman has already served m terms, then the planning horizon shrinks to T-m terms.

8 Note that η has been defined to take values only in the open interval (0,1) because the right-hand side of equation (1) is undefined for the benchmark values of 0 and 1 used for expositional purposes. Further note that the benchmark values of 0 and 1 for x imply that ut(x(t))=0, which is why x also takes values only in (0,1).

9 If (R-Ac)/2 is not an integer, then its value will be rounded up to the next integer.

The value of T is congressman-specific. Finally, we assume that voters vote retrospectively.10
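The construction of the planning horizon can be written out in a few lines. In the sketch below, the mandatory retirement age of 75 and the entry ages are hypothetical values used only for illustration, and the function names are ours. The rounding rule is the one stated in footnote 9.

```python
import math


def planning_horizon(retirement_age: int, age_first_elected: int) -> int:
    """T = (R - Ac) / 2, rounded up to the next integer when not whole."""
    return math.ceil((retirement_age - age_first_elected) / 2)


def remaining_horizon(T: int, terms_served: int) -> int:
    """Planning horizon after m terms have already been served: T - m."""
    return T - terms_served


R = 75  # hypothetical mandatory retirement age
T = planning_horizon(R, age_first_elected=42)
print(T)                        # 17, since (75 - 42) / 2 = 16.5 rounds up
print(remaining_horizon(T, 5))  # 12
```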

Setting a maximum of T terms for an incumbent does not necessarily mean that the incumbent

will serve T terms. Defeat at the polls is possible. Even if the incumbent could be continuously reelected

for the T terms, death and morbidity are stochastic, as are, at least to some degree, other reasons for

leaving office, such as retirement in the sense of leaving the labor force, the opportunity to take a cabinet

position or judgeship, and the opportunity to run for governor or senator.

With respect to the incumbent’s bid for reelection, we presume that the challenger cannot win the

election but that the incumbent can lose the election. This is not a play on words. If the incumbent has

served the interests of his constituents while in office, then his campaign promises to do so will have

more credibility than the challenger's precisely because the incumbent has established such a record in

office.11 If the incumbent has served less well, then his credibility and consequently his reelection

probability will be reduced. Although other factors may affect a voter's decision whether to vote for the

incumbent, we model that decision as a function of how well the incumbent has served the interests of his

constituents in his most recent term.

Our specific probability of reelection function (i.e., voters’ sorting function) is

(2) $P_{t+1}(x(t)) = \alpha \exp(-\lambda x(t))$,

where Pt+1(x(t)) is the probability that the incumbent congressman is reelected for term t+1, x(t) is the

10 There is considerable evidence that voters appear to use a retrospective voting strategy in voting for incumbents. See Francis, Kenny, Morton, and Schmidt (1994).

11 Bernhardt and Ingberman (1985) and Ingberman (1989) treat risk-averse voters as viewing a campaign promise as the expected value of the position that will be taken by the candidate if elected but as viewing the variance around that expected value as smaller for the incumbent if the incumbent's promise is consistent with the past position taken by the incumbent in office. Dougan and Munger (1989) treat the incumbent's reputation for ideological commitment to positions the voters agree with as serving the function in political markets that brand name serves in economic markets. The η value of the challenger, even a potentially faithful challenger with an η value of zero, and the η value of the incumbent are not directly observable by the voters. Only if the behavior of the incumbent were unconstrained (i.e., if there were no future reelection campaign) would the incumbent's η value be directly observable. However, voters have direct observations of the incumbent's behavior in office, x(t), but no such direct observations of the challenger's.

degree to which the congressman shirked with respect to the interests of his constituents in term t, the

parameter α is the probability that a perfectly faithful incumbent (i.e., x(t)=0) is reelected,12 and the

parameter λ>0 indicates the reduction of reelection probability caused by the degree to which the

incumbent shirked. Note that the probability of reelection function is a negative exponential function13

and α is the vertical axis intercept of that function. As an illustration of equation (2), if α=0.9, λ=2, and

x(t)=0.1, then the congressman’s probability of continuing for term t+1 is 0.74.
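The illustration above can be reproduced directly from equation (2). The function name below is ours, and the parameter values are the ones used in the text's example, not estimates.

```python
import math


def reelection_probability(x: float, alpha: float, lam: float) -> float:
    """Voters' sorting function from equation (2): alpha * exp(-lambda * x)."""
    return alpha * math.exp(-lam * x)


p = reelection_probability(0.1, alpha=0.9, lam=2.0)
print(round(p, 2))  # 0.74

# A perfectly faithful incumbent (x = 0) is reelected with probability alpha.
print(reelection_probability(0.0, alpha=0.9, lam=2.0))  # 0.9
```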

If the incumbent were to lose the reelection bid, then he obviously would be replaced in office by

the victorious challenger. We treat the challenger as being randomly drawn from a population of potential

challengers characterized by a joint distribution of η and Ac that will be specified later.

The objective of an incumbent serving term m in office is to maximize the expected utility

received over a career of T-m+1 potential terms in office given the reelection constraint. The incumbent

does so by his behavior in office over time or his choices of the values of x(t), t=m,m+1,…,T. Note that

even though there is a maximum of T-m+1 remaining terms in office, the voters’ sorting function (2)

makes the actual number of terms that will be served stochastic.

Each incumbent running for reelection must receive greater expected utility from holding office

than from being a private citizen. Otherwise the incumbent would not run for reelection. Denote the

utility per time period of a private citizen as H, where 0≤H<1. Foregone utility per time period as a

private citizen is the opportunity cost of being a congressman and therefore is the utility that a

congressman will receive if he fails to be reelected. The larger is that opportunity cost, the less faithfully

will the congressman serve his constituents ceteris paribus.

Unfortunately, H is not observable. Even if we had data on the wage rates of congressmen prior

to obtaining office, this would not be equivalent to their utilities as private citizens. It is not all that

12 It is not necessary that α be equal to one, although clearly 0≤α≤1. If voters perceive x(t)=0 with some noise, then α will be less than one. The precise value of α is an empirical matter.

13 Even though (2) relates the probability of the incumbent's winning reelection for term t+1 to the degree of shirking by the incumbent during term t, it is not a probability density function and there is no reason that it should be.

uncommon for people to take pay cuts when becoming congressmen. Furthermore, the relevant

opportunity cost, H, is not the utility received prior to becoming a congressman but the utility that would

be received as a private citizen after leaving office. This is even more problematic because the

opportunity set facing a congressman after leaving office is certain to be larger than the opportunity set

before taking office.

We therefore assume that the opportunity cost of the utility that would be received as a private

citizen is the same for all incumbents and, more specifically, set that opportunity cost equal to zero. This

assumption is less damaging to the extent that there is no reason to believe that congressmen’s

preferences (η values) are systematically related to their opportunity costs. However, even if the

correlation of preference and opportunity cost were zero for the population of congressmen, this would

not imply that all congressmen must have the same opportunity cost. Our assuming away these

idiosyncratic differences in opportunity cost, thereby ignoring H in the congressman’s dynamic

optimization problem, will have the likely effect of reducing the ability of the model to fit the data.

Denote the maximum lifetime expected discounted utility of an incumbent in term t (for the

period between t and T) as Ue(t) and the comparable utility for a private citizen as Up(t).14 In the last time

period (t=T) the incumbent will have an unconstrained maximization problem because there are no more

future elections (i.e., the probability of continuing for another term, PT+1, is zero regardless of the value of

x(T) that the incumbent chooses). That is, he will choose x*(T) in order to maximize the utility function

defined in (1), where x* denotes the value of x that maximizes utility. His solution is x*(T) = η, and his

utility for the period is one. For a non-terminal period (1≤t<T), the lifetime utilities are recursively

determined by choosing the values of x*(t), x*(t+1), x*(t+2), …, x*(T-1), and x*(T) that satisfy

(3) $U_e(t) = \max_{x(t)} \left[ u_t(x(t)) + \delta \left\{ P_{t+1}(x(t))\, U_e(t+1) + \left(1 - P_{t+1}(x(t))\right) U_p(t+1) \right\} \right]$,

where $\delta = (1+r)^{-2}$ is the real discount factor for one term (two years) and r is the real one-year opportunity

cost.

The process of solving for the values of x*(t) or the values of optimal shirking over time can be

performed recursively backwards from T to the first period. For example, consider the next to the last

period (t=T-1). The incumbent will choose x*(T-1) to maximize expected utility for the periods T-1 and

T. His choice of x*(T-1) determines his utility in T-1 and probability of reelection for period T by

equation (2). If reelected, he will receive Ue(T)=1 in period T by choosing x*(T)=η in period T. If not

reelected, then he will receive Up(T)=H as a private citizen. As previously noted, when solving (3) for the

x*(t) values we set H=0. Similarly, the incumbent in period t=T-2 chooses the optimal x*(T-2). For any

given pair of values of the parameters, α and λ, and any given value of the incumbent's preference

parameter, η, the incumbent's optimal degree of shirking, x*(t), will increase monotonically as t increases

until x*(T) = η, the incumbent's unconstrained degree of shirking, in the final period of the incumbent's

time horizon. Finally, note that even though T, the maximum number of terms that the incumbent can

serve, is exogenous to the model, the actual number of terms served is endogenous because it is

dependent upon the incumbent's sequential choice of the x(t) values. Equation (3) can be solved for the

optimal values of x(t) by numerical methods.15
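The backward recursion described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the numerical procedure actually used in the paper: the grid search, the discount rate r = 0.03, the function names, and the example parameter values are all hypothetical, and the utility function is assumed to have the form x^η (1-x)^(1-η) divided by η^η (1-η)^(1-η). As in the text, H = 0, so the private-citizen branch of equation (3) contributes nothing.

```python
import math


def utility(x, eta):
    """Per-term utility, assuming the form of equation (1) described above."""
    return (x ** eta * (1 - x) ** (1 - eta)) / (eta ** eta * (1 - eta) ** (1 - eta))


def reelection_probability(x, alpha, lam):
    """Voters' sorting function, equation (2)."""
    return alpha * math.exp(-lam * x)


def optimal_shirking_path(eta, T, alpha, lam, r=0.03, grid_size=999):
    """Return [x*(1), ..., x*(T)] by backward induction on equation (3), H = 0.

    r and the grid resolution are illustrative assumptions.
    """
    delta = (1 + r) ** -2                 # discount factor for a two-year term
    grid = [i / (grid_size + 1) for i in range(1, grid_size + 1)]
    path = [eta] * T                      # x*(T) = eta; earlier terms overwritten
    U_e_next = 1.0                        # U_e(T) = u_T(eta) = 1
    for t in range(T - 1, 0, -1):         # t = T-1, T-2, ..., 1
        def value(x):
            # U_p(t+1) = 0 because H = 0, so only the reelected branch counts.
            return utility(x, eta) + delta * reelection_probability(x, alpha, lam) * U_e_next
        x_star = max(grid, key=value)     # grid search for the optimal x(t)
        path[t - 1] = x_star
        U_e_next = value(x_star)          # becomes U_e(t) for the next step back
    return path


path = optimal_shirking_path(eta=0.3, T=5, alpha=0.9, lam=2.0)
print([round(x, 3) for x in path])
```

For these example inputs the computed path rises with t and reaches η in the final term, which matches the monotonicity property stated above.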

3. IMPACT OF SORTING ON SELF-SELECTION AND THE MEAN LEVEL OF SHIRKING

For any degree of electoral discipline imposed by the voters (i.e., the specific values of α and λ in

the voters' sorting function (2)), optimizing behavior by the individual incumbent will result in

congressmen with higher values of the shirking preference parameter, η, having higher time paths of

14 The subscripts e and p refer to the incumbent (e for elected congressman) and the private citizen (p for private citizen).

15 It is interesting to contrast our optimization problem by the candidate to those in the “citizen-candidate models” by Osborne and Slivinski (1996) and Besley and Coate (1997). In their one-term models the candidate implements his policy preference if elected, just like the incumbent in our model in the last term of his time horizon. The key decision variable in their models is whether to run for office based on maximizing expected utility given the candidate’s policy preference. In our dynamic model, the key decision variable is how much the incumbent will

optimal shirking than congressmen with lower values of η. It follows that congressmen with higher values

of η will have a smaller expected number of terms served than will congressmen with lower values of η.

Consequently, sorting by the voters not only constrains incumbents' shirking in the earlier terms of their

careers but also reduces shirking by more quickly removing from office those incumbents with greater

preference to shirk.

There also is an indirect way in which sorting affects the overall level of shirking. Sorting by the

voters implies that a successful challenger with a lower η value can expect a greater number of terms in

office and a larger utility in each term than would a successful challenger with a higher η value. Since

running for office, especially the first time, is not costless, we can treat this cost as equivalent to an entry

fee or initial outlay for an investment. A challenger with a lower η value therefore can expect a higher

return on investment than a challenger with a higher η value16 and therefore would be more likely to run

for office. Consequently, the distribution of η for the population of challengers should be one

for which smaller values of η are more likely to be observed than are larger values of η. This self-

selection toward challengers with a smaller preference for shirking is a result of sorting by the voters.17

Electoral discipline imposed by the voters not only reduces shirking by incumbents but also induces

challengers to self-select in terms of preference for shirking.

Choosing a specific distribution of η consistent with the above reasoning still has some degree of

arbitrariness because η is not directly observable. We greatly reduce this arbitrariness factor by selecting

a family of distributions. We let the distribution of η for challengers be a beta distribution described

below:

shirk (i.e., will deviate from the constituency’s preference) over time so as to maximize the expected present value of utility over a sequence of potential terms in office given the incumbent’s preferences.

16 Indeed, if the initial outlay is sufficiently large, then there is no guarantee that the rate of return on investment will even be positive for a challenger with a high η value.

17 To see this starkly, suppose that sorting of incumbents did not occur – i.e., incumbents continued in office according to a random process. This would imply that all incumbents would have the same expected tenure in office and all would receive the same utility each term because each would choose x(t)=η regardless of their η values. There would be no incentive for self-selection by challengers. In this case the best guess for the distribution of η for the population of challengers would be a uniform distribution.

(4) $f(\eta) = B(a,b)^{-1}\, \eta^{a-1} (1-\eta)^{b-1}$ for $0 \le \eta \le 1$, and $f(\eta) = 0$ otherwise,

where a and b are parameters with a>0 and b>0 and $B(a,b) = \int_0^1 \eta^{a-1}(1-\eta)^{b-1}\, d\eta$. This distribution is

defined on (0,1), as is η, and is quite flexible. The expected value is a/(a+b); a=b=1 yields a uniform

distribution; a>1 and b>1 yields a unimodal distribution; and a value of (a/b) that is less than one (greater than one) yields a

distribution whose mass is concentrated toward the left (right), that is, toward smaller (larger) values of η. A sufficiently small value of (a/b) will produce a distribution that is

close to one that is monotonically decreasing.
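To make the role of a and b concrete, the sketch below draws η values for a hypothetical challenger population from a beta distribution using Python's standard library. The values a = 1 and b = 3 are chosen only for illustration and are not estimates from the paper. With these values the density is 3(1-η)^2, which is monotonically decreasing, so small values of η are the most likely.

```python
import random


def beta_mean(a: float, b: float) -> float:
    """Expected value of a beta(a, b) distribution: a / (a + b)."""
    return a / (a + b)


rng = random.Random(0)   # fixed seed so the draws are reproducible
a, b = 1.0, 3.0          # hypothetical shape parameters, not estimates
draws = [rng.betavariate(a, b) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
share_below_half = sum(d < 0.5 for d in draws) / len(draws)

print(beta_mean(a, b))               # 0.25
print(round(sample_mean, 3))         # close to the theoretical mean
print(round(share_below_half, 3))    # most challengers have eta below 0.5
```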

Since our objective is to estimate the overall level of shirking in the U. S. House of

Representatives over time, we need to specify an initial congress. This initial congress is simply a sitting

congress that is designated as the initial congress in our series of congresses. We expect that the

distribution of η for the incumbents in this initial congress is one for which smaller values of η are more

likely to be observed than are larger values of η because congressmen with low η values are likely to

have survived longer in office than the congressmen with the high η values given sorting by the voters.

As was the case when looking at the population of challengers, we let the distribution of η for the

congressmen in this initial congress be described by the beta distribution below:

(5) $g(\eta) = B(c,d)^{-1}\, \eta^{c-1} (1-\eta)^{d-1}$ for $0 \le \eta \le 1$, and $g(\eta) = 0$ otherwise.

Estimation of the values of the parameters, α and λ, of the voters' sorting function in equation (2),

the values of the parameters, a and b, of the distribution of η for the challengers population in equation

(4), and the values of the parameters, c and d, of the distribution of η for the congressmen in the initial

congress in equation (5) is necessary to allow the estimation of the mean level of shirking in the congress

over time. The next section develops an aggregated model of shirking and turnover for the congress over

time that is amenable to the estimation of the parameter values. The succeeding section describes the

estimation procedure.

4. MODEL OF SHIRKING AND TURNOVER IN THE CONGRESS OVER TIME

We start with an initial congress with each incumbent congressman characterized by number of

terms served, preference for shirking or η value, and age. Given the incumbent's utility function in

equation (1) and the voters' sorting function with its parameters, α and λ, in equation (2), the incumbent

dynamically maximizes expected discounted utility over the terms remaining until the mandatory

retirement at age R by choosing the optimal time path of shirking over those terms according to equation

(3). Shirking in any term t determines the probability that the incumbent is reelected to term t+1. Should

the incumbent be defeated for reelection to another term, then the incumbent is replaced in office by a

challenger with a specific age and η value. While the incumbent in our model is automatically retired at

age R, it is possible for the incumbent to retire before age R for stochastic reasons. A retired incumbent is

replaced by a challenger with a specific age and η value. The next congress is composed of those

incumbents who were reelected and the challengers who replaced either defeated or retired incumbents.

Each incumbent in this new congress is characterized by number of terms served, η value, and age. Each

incumbent optimally shirks in the new term, and defeated or retired incumbents are replaced by a new set

of challengers in the succeeding congress. Over time we construct a series of congresses consisting of

incumbents with specific values of terms served, η, and age who optimally shirk in each congress. We

can then address questions such as the mean level of shirking in congress, the relationship between

shirking and terms served, and the last-period shirking problem.

At this stage, however, equations (1), (2), and (3) can only determine the probability of reelection

of an incumbent with a specific number of terms served, η value, and age. Whether the incumbent is

actually reelected or defeated must be determined stochastically in light of the reelection probability.

While equation (4) specifies a marginal distribution of η for the challenger population, the actual η value

and age of the challenger succeeding a defeated or retired incumbent must be stochastically determined in

light of a joint probability distribution of η and age. While equation (5) specifies a marginal distribution

of η for the incumbents in the initial congress, the actual number of terms served, η values, and ages of

the incumbents in the initial congress must be stochastically determined in light of a joint probability

distribution of terms served, η, and age. Furthermore, in order for the model to generate quantitative

results it is necessary to estimate the parameter values, α and λ, of the voters' sorting function in (2), the

parameter values, a and b, of the marginal distribution of η for challengers in (4), and the parameter

values, c and d, of the marginal distribution of η for the sitting incumbents in the initial congress in (5).

The next section constructs the joint probability distributions in the course of describing the methodology

for estimating the values of the parameters.

5. METHODOLOGY FOR ESTIMATING PARAMETER VALUES

5.1. Overview of the estimation methodology

We start with an initial congress of 435 congressmen because there are 435 members of the U.S.

House of Representatives. Given an initial set of parameter values we simulate the optimal shirking of the

incumbents over a series of thirteen successive congresses. After each congress, defeated and retired

incumbents are replaced by successors randomly drawn from the population of challengers. We then

perturb the parameter values and rerun the simulations of the thirteen successive congresses. The

parameter values are perturbed and the simulations rerun until we have observations of the outcomes of

the optimal behavior of the incumbents in the simulated congresses which, when compared to the

observations of the outcomes of the behavior of the incumbents in the actual congresses, minimize the

value of the chi-squared statistic. The obvious first question is what outcome of incumbent behavior to

choose.

Our choice is the number of terms that an incumbent serves until he is defeated for reelection. It

is not unreasonable to ask a model of electoral behavior to make this prediction. We therefore create the

categories labeled “ts terms served at the time of electoral defeat,” where ts=1,2,3,…,21, and compare the

number of actual and expected incumbents in each category.18 Note that since each observation is an

individual incumbent who, by the nature of the categories, can only be counted once, the chi-squared

distribution’s assumption of the independence of observations is not violated.19 The data for terms served

at the time of defeat in each electoral category is compiled for the thirteen congresses from 1974 through

1998 that comprise our sample.20

Pearson's chi-squared statistic is

(6) $\chi^2 = \sum_{i=1}^{m} (\text{observed}_i - \text{expected}_i)^2 / \text{expected}_i$,

where m is the number of cells or “ts terms served at the time of electoral defeat” categories.21 The

methodology for calculating the expected number of congressmen in each cell will be described in the

next subsection. Note that minimizing the value of the chi-squared statistic is equivalent to minimizing

the normalized sum of the squared deviations of the actual from the expected number of congressmen in

the cells.
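As a minimal sketch of equation (6), the statistic can be computed directly from the observed and expected cell counts. The function name and the three-cell counts are hypothetical and are not taken from our data.

```python
def pearson_chi_squared(observed, expected):
    """Pearson's statistic: sum over cells of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three cells (illustration only, not the paper's data).
observed = [30, 14, 6]
expected = [25, 15, 10]
print(pearson_chi_squared(observed, expected))
```

Each cell's term is its squared deviation scaled by the expected count, which is the sense in which the sum of squared deviations is normalized.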

18 The last category is “21 terms served at the time of electoral defeat” because this is the largest number of terms that any defeated incumbent in our data set served.

19 The assumption of independence of observations of the chi-squared distribution rules out other seemingly attractive alternative implications of the electoral model. One such alternative is the number of incumbents in each terms-served cohort – i.e., the tenure structure of the congress. This alternative violates the independence of observations assumption because any incumbent who has served n terms must have served n-1 terms in the preceding congress, n-2 terms in the congress before that one, and so on. Only if each observation is a single individual and each individual can only constitute a single observation will the independence of observations assumption be satisfied.

20 The data sources for all our data are the 1976 through 2000 editions of the Almanac of American Politics and the 2000 edition of the Congressional Biographical Directory.

21 Note that by comparing the actual defeated incumbents and the expected defeated incumbents in each of the terms-served categories for the whole thirteen-election period, we are really testing the long-run explanatory ability of our model. The noise associated with the data of defeated incumbents in any single election year makes a

5.2. The estimation methodology

This subsection puts the flesh on the bones described in the overview. Specifically, we describe

the following in detail: handling the retirement of congressmen; generating the joint distribution of η and

age for challengers; calculating the expected number of congressmen in each cell of the chi-squared

distribution; generating the joint distribution of terms served, η, and age of the congressmen in the initial

congress; and the numerical methodology used for the parameter value search.

5.2.1. Retirement

We define the term “retirement” quite broadly in that it refers to voluntarily not running for

reelection for any one of several reasons. A congressman may leave office for normal retirement, personal

reasons, death, or morbidity. A second category of reasons includes running for higher office or accepting

a position such as a judgeship or cabinet secretary. A third category is that the congressman loses his seat

because of redistricting. While in a sense this last category may not be purely voluntary, we treat leaving

office because of redistricting as retirement because it does constitute a decision not to run for

reelection.22

Although we can generically conceptualize the calculus of the incumbent’s retirement decision

(Schansberg [1994]), there is no theoretical model of the retirement decision that relates the probability of

retirement to the number of terms served. The different causes of retirement, as we broadly define it,

virtually guarantee this. However, it is necessary that we have a method for retiring simulated incumbents

statistically acceptable explanation of defeated incumbents by terms-served categories on an election-by-election basis virtually impossible. It is the long-run behavior of congressmen, however, that is of interest.

22 During our sample period 45 seats were eliminated because of redistricting. Many other districts were geographically altered – most in a minor manner, some substantially. It is likely that some incumbents chose not to run for reelection as a result of redistricting, but it is possible only to conjecture which incumbents retired for this reason. There were actually 13 incumbents whose seats disappeared because of redistricting but who still ran for reelection in the new congressional district. Twelve of the 13 ran in either a primary or general election against another incumbent. The thirteenth incumbent changed his address so that he could avoid running against another incumbent. Instead he ran against (and lost to) a nonincumbent in a newly created district. We treat all thirteen of these incumbents as if they retired. In the case of the first twelve, it is a certainty that one of the two incumbents in each of the reelection contests would have to lose even if both had been perfectly faithful incumbents. In the case of the thirteenth it is difficult to refer to him as an incumbent running for reelection because he intentionally chose to run in a district that he had never before represented.

when simulating congresses over time. In each congress we therefore stochastically retire incumbents by

terms-served category by randomly drawing from an empirical distribution of retirement by terms-served

categories.

In each congress there are 435 actual incumbents. Of these, 435-k choose to run for reelection and

k choose to retire, where the value of k varies from congress to congress. We retire k members of the

congress of 435 simulated incumbents.23 We calculate from the data the number of incumbents who

retired after exactly ts terms served, ts=1,2,…,27, and the number of incumbents who served at least ts

terms.24 Dividing the first number by the second yields an empirical frequency rate or probability of

retirement by terms-served category denoted as r(ts).25 At the beginning of each congress k simulated

incumbents are retired as follows. Incumbents are randomly drawn without replacement. If the first

incumbent has, for example, 6 terms served, then his probability of retirement is r(6). Given the value of

r(6) a random number generating process will determine whether or not this incumbent is retired.26

Random draws without replacement continue until the required number of k incumbents is retired. If after

drawing all 435 members of the simulated congress fewer than k incumbents have been retired, then the

random drawing without replacement begins again from the population of remaining incumbents.
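The retirement procedure can be sketched as follows. This is a minimal illustration and not our actual program: the function name, the guard against an unreachable k, and the small example congress are assumptions, while the r(ts) values for one through six terms are the empirical rates reported in footnote 25.

```python
import random

# Empirical retirement probabilities r(ts) for ts = 1..6, from footnote 25.
R_PROBS = {1: 0.02403, 2: 0.06093, 3: 0.09371,
           4: 0.10671, 5: 0.14400, 6: 0.14558}


def retire_incumbents(terms_served, r, k, rng):
    """Return the indices of exactly k retired incumbents.

    terms_served: list of ts values, one per simulated incumbent
    r: dict mapping ts to the empirical retirement probability r(ts)
    """
    eligible = sum(1 for ts in terms_served if r.get(ts, 0.0) > 0.0)
    if k > eligible:
        raise ValueError("not enough incumbents with a positive retirement probability")
    remaining = list(range(len(terms_served)))
    retired = []
    while len(retired) < k:
        rng.shuffle(remaining)  # one full pass draws each incumbent once, in random order
        still_in_office = []
        for i in remaining:
            if len(retired) < k and rng.random() < r.get(terms_served[i], 0.0):
                retired.append(i)
            else:
                still_in_office.append(i)
        remaining = still_in_office  # a further pass uses only the remaining incumbents
    return retired


# Hypothetical congress of 60 members with one to six terms served.
example = retire_incumbents([1, 2, 3, 4, 5, 6] * 10, R_PROBS, 7, random.Random(0))
print(len(example))
```

Comparing a uniform draw on [0, 1) with r(ts) plays the same role as the draw from 1 to 100,000 described in footnote 26.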

We further impose a restriction for each congressman of a maximum career of T=(R-Ac)/2 two-

year terms, where Ac is the age of the incumbent when first elected to office or, equivalently, the

incumbent's age when he became a successful challenger. Any congressman who will exceed this

maximum number of terms by the end of his current term cannot run for reelection. He is automatically

23 Note that all we are doing is guaranteeing that in each congress the number of simulated incumbents running for reelection is equal to the number of actual incumbents running for reelection.

24 Note that with respect to retirement there are 27 categories of terms served rather than only 21 because in the data set the incumbent with the longest tenure retired after 27 terms.

25 Note that it is not necessary that the sum of all the r(ts) equals one. The calculated retirement probabilities are: r(1)=.02403, r(2)=.06093, r(3)=.09371, r(4)=.10671, r(5)=.14400, r(6)=.14558, r(7)=.12286, r(8)=.15534, r(9)=.17749, r(10)=.18644, r(11)=.08511, r(12)=.26549, r(13)=.15116, r(14)=.15942, r(15)=.26786, r(16)=.29730, r(17)=.20690, r(18)=.35000, r(19)=.33333, r(20)=.20000, r(21)=.25000, r(22)=.60000, r(23)=.00000, r(24)=.50000, r(25)=.00000, r(26)=.00000, and r(27)=1.00000.

26 The empirical value of r(6) is .14558. A random number is drawn from 1 to 100,000. If the number drawn is less than or equal to 14,558, then the incumbent is retired; if the number is greater than 14,558, then the incumbent is not retired.

retired at the end of the term and is replaced by a new congressman in the next term. This implies that the

simulated planning horizons of congressmen vary across congressmen according to their ages at the time

they first were elected. For calculating T we choose a mandatory retirement age of R=80.27

5.2.2. Joint distribution of preference parameter and age for the challenger population

In order to draw a successor for a retired or defeated congressman we need to specify a joint

distribution of the unobservable preference parameter, η, and the age, Ac, for the population of

challengers. Assuming that the age of the challenger and his preference for serving constituent interests

are not related, the joint distribution of (η,Ac) is therefore obtained by Pr(η,Ac) = Pr(η)Pr(Ac). The

marginal distribution of η is given in (4). The marginal distribution of Ac is an empirical distribution. It is

constructed from data for the entering age of each of the freshmen congressmen from 1974 through 1998.

This marginal distribution has a mean and standard deviation of 43.5 and 8.7, has minimum and

maximum values of 25 and 71, and is skewed to the left. Challengers who replace defeated or retired

incumbents are randomly drawn from the population of challengers described by the joint distribution of

(η,Ac).
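Because η and Ac are assumed independent, a challenger can be drawn by sampling each component separately. The sketch below is illustrative only: the function name is hypothetical, and the list of entering ages is a made-up placeholder for the empirical age distribution, which is not reproduced here.

```python
import random

# Hypothetical sample of entering ages; the empirical distribution is not reproduced here.
ENTERING_AGES = [31, 36, 38, 41, 43, 44, 47, 52, 58, 66]


def draw_challenger(a, b, entering_ages, rng):
    """Draw (eta, Ac) from Pr(eta, Ac) = Pr(eta) * Pr(Ac)."""
    eta = rng.betavariate(a, b)      # eta ~ Beta(a, b), the marginal in equation (4)
    ac = rng.choice(entering_ages)   # Ac resampled from the empirical entering ages
    return eta, ac


eta, ac = draw_challenger(2.0, 5.0, ENTERING_AGES, random.Random(0))
print(eta, ac)
```

Resampling from the observed ages is one simple way to draw from an empirical distribution; a binned frequency table would serve equally well.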

5.2.3. Expected number of congressmen with ts terms served at the time of defeat

We first choose initial values of the parameters of the model. The initial congress has 435

27 The number of congressmen in our data set who did not run for reelection (i.e., “retired”) is 621. The mean age and standard deviation are 56.2 and 11.4, and the minimum and maximum are 32 and 89. The use of the term “retirement” is misleading to some extent as is the relatively low mean of 56.2. Some of those “retiring” congressmen did not run for reelection because they ran for higher office while others took cabinet positions or judgeships. Some died or developed health problems. Some were induced to retire prior to 1992 by legislation eliminating a congressman’s right to keep his unused campaign funds as personal funds unless he had served in Congress before 1980 and retired before 1992 (Burnett, Paul, and Wilhite (1997)). Redistricting involuntarily retired 45 congressmen by eliminating their districts, while an unknown number of other congressmen might have chosen not to run because redistricting radically altered their districts. Taking into account the above causes of “retirement” and the at least partially stochastic nature of these causes, it appears that the job of congressman is one that the jobholders often desire to hold to an old age. Of the 621 congressmen who retired, there were 11 congressmen who retired at age 80 or greater. Of the 957 incumbents in our data set 52 were actually elected for the first time between the ages of 60 and 71. It should be kept in mind, however, that the choice of this endpoint is arbitrary. It should also

representatives. Each congressman is characterized by a value of η, a value of ts (terms served), and a

value of As, where As=(Ac+2ts) is the age of the sitting congressman. These values of η, ts, and As for

each congressman in the initial congress are randomly drawn from a joint distribution of η, ts, and As

whose construction is described in the next subsection. By the process described earlier k congressmen

are retired leaving 435-k to run for reelection. During the congressional term each congressman chooses

his optimal degree of shirking, x*(t).28 In the light of the values of α and λ, this choice of x*(t) determines

the probability that the congressman will be reelected to serve in the next term. Given this probability, a

random number generating procedure determines whether the congressman will actually serve another

term. If so, then the congressman's number of terms served is augmented by one and his age as a sitting

congressman, As, is augmented by two.29 If not, then the congressman is replaced in office by a successor

randomly drawn from the pool of challengers characterized by the joint distribution of (η,Ac). When

simulated for all congressmen, the values of η, ts, and As are generated for the members of the next

congress. This process continues for a sequence of thirteen consecutive simulated congresses including

the original simulated congress. We compute the number of defeated simulated congressmen in each

terms-served (ts) category over the thirteen elections. The number of defeated simulated incumbents in

any specified ts category as a fraction of the total number of simulated defeated incumbents for all the ts

categories is the estimated probability that a defeated incumbent will have that specified number of terms

served.
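One election cycle and the tally of defeats by terms served can be sketched as follows. This is a simplified illustration under stated assumptions: retirement and the age cap are omitted, and `reelect_prob` and `draw_challenger` are hypothetical callables standing in for the reelection probability implied by optimal shirking and for the challenger draw.

```python
import random
from collections import Counter


def run_election(congress, reelect_prob, draw_challenger, rng, defeats):
    """Advance a simulated congress by one election.

    congress: list of dicts with keys "eta", "ts", "age" (age is As)
    reelect_prob: hypothetical callable giving a member's reelection probability
    draw_challenger: hypothetical callable returning (eta, Ac) for a successor
    defeats: Counter of defeats keyed by terms served at the time of defeat
    """
    next_congress = []
    for member in congress:
        if rng.random() < reelect_prob(member):
            # Reelected: one more term served, two more years of age.
            next_congress.append({"eta": member["eta"],
                                  "ts": member["ts"] + 1,
                                  "age": member["age"] + 2})
        else:
            defeats[member["ts"]] += 1
            eta, ac = draw_challenger(rng)
            # A successor starts with ts = 1, so As = Ac + 2*ts = Ac + 2.
            next_congress.append({"eta": eta, "ts": 1, "age": ac + 2})
    return next_congress


def defeat_fractions(defeats):
    """Share of all simulated defeats falling in each terms-served category."""
    total = sum(defeats.values())
    return {ts: n / total for ts, n in sorted(defeats.items())}
```

Calling `run_election` thirteen times with the same `defeats` counter and then `defeat_fractions` gives the estimated probabilities for one simulated run.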

Our model, however, is a stochastic, not deterministic, model. The parameter values determine

probabilities of outcomes at every stage of the model, but the actual outcomes are determined by a

random number generating process in conjunction with those probabilities. Therefore, the estimated

be kept in mind that this is an endpoint for the decision-making time horizon of the individual congressman and not the congressman’s actual retirement age, which can only be known ex post.

28 When optimizing according to equation (3) we use a one-year real opportunity cost, r, of .05, which yields a two-year (one term) real discount rate of $\delta = (1+r)^{-2} = .907$. This reflects investment in a congressional career as more risky than investment in Treasury bonds.

29 If the augmented value of As equals or exceeds the mandatory retirement age, R, then the congressman will not run for any more future reelections and will be replaced by a challenger at the next election.

probability that a defeated incumbent will have a specified number of terms served based on just one

simulated run of the model is a random draw from a probability distribution of such estimated

probabilities. However, as the number of replications of the simulations of the thirteen congresses for the

set of specific parameter values increases, the mean value of this estimated probability approaches the

true probability for this set of specific parameter values.30 The expected number of defeated incumbents

with this specific value of terms served is the product of the mean estimated probability for this ts

category and the total number of actual defeated incumbents for all ts categories.31 Having tabulated the

number of actual defeated incumbents for each terms-served category from our data set for the thirteen

elections from 1974 through 1998 and having calculated the expected number of defeated incumbents for

each category for thirteen elections, we can test the null hypothesis that the actual number and simulated

number of defeated incumbents by terms-served categories or cells are generated by the same process by

calculating the value of the chi-squared statistic according to equation (6).

Although there are defeated incumbents with terms served ranging from 1 to 21, we use 12 terms-

served categories or cells instead of 21. This is because the distribution of actual incumbents defeated by

terms served becomes sparse after the cell for eleven terms served.32 The chi-squared test statistic

becomes unreliable when the distribution is sparse. This problem of a sparse distribution can be handled

by combining cells into aggregated cells (Moore (1977)). We therefore choose cells for 1, 2, 3, …, 11,

and 12 through 21 terms served.
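The aggregation of sparse cells and the calculation of expected counts can be sketched as follows. The function names are hypothetical. The counts 79, 10, and 3 in the example are those reported in footnote 32; the remaining numbers are made up for illustration.

```python
def collapse_cells(counts_by_ts, last_cell=12):
    """Collapse counts keyed by terms served into cells 1, 2, ..., 11, and 12 or more."""
    cells = [0] * last_cell
    for ts, n in counts_by_ts.items():
        cells[min(ts, last_cell) - 1] += n
    return cells


def expected_counts(replication_fractions, total_actual_defeats):
    """Mean estimated probability per cell times the total number of actual defeats.

    replication_fractions: one list of cell fractions per replication
    The result is rounded to integers, as noted in footnote 31.
    """
    n_rep = len(replication_fractions)
    n_cells = len(replication_fractions[0])
    means = [sum(rep[c] for rep in replication_fractions) / n_rep
             for c in range(n_cells)]
    return [round(m * total_actual_defeats) for m in means]


# 79, 10, and 3 are from footnote 32; the entries for 15 and 21 terms are hypothetical.
print(collapse_cells({1: 79, 11: 10, 12: 3, 15: 1, 21: 1}))
```

Python's `round` resolves exact ties to the nearest even integer; this tie-breaking rule is a detail of the sketch and is not specified in the text.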

After calculating the value of the chi-squared statistic given our initially chosen parameter values,

we choose new values of the parameters and repeat the process. We continue the process searching the

30 Our criterion for the number of replications was to increase the number of replications until the value of the chi-squared statistic demonstrated stability with respect to additional replications. Stability became evident at fifty replications per set of parameter values.

31 Note that for the chi-squared test statistic the determination of the expected number of defeated incumbents in any specific terms-served category is conditioned on the total number of actual defeated incumbents in the data set. See Moore (1977). Also note that it is necessary to round the expected number of defeated incumbents in each terms-served category to an integer.

32 There are seventy-nine defeated incumbents with one term served. This number declines almost monotonically until there are ten defeated incumbents with eleven terms served. There are only three defeated incumbents with twelve terms served. There are no more than four incumbents in any of the cells for terms served of twelve or more. Two of these cells have values of zero and four have values of one.

reasonable parameter space until we have the values of the parameters that minimize the value of the chi-

squared statistic.

5.2.4. Joint distribution of preference parameter, terms served, and age for the initial congress

The assignment of the values of η, ts, and As to the congressmen in the initial congress cannot be

done arbitrarily because any arbitrary assignment necessarily biases the estimated values of the

parameters. For example, if we assign values of the unobserved η that are lower than the actual η values

of the congressmen in the 1974 congress, then our estimated value of λ will likely be biased upwards in

order to obtain simulated reelection rates as close as possible to the actual reelection rates for the thirteen

congresses from 1974 through 1998. More generally, any attempt to construct an initial congress that is as

close as possible to the 1974 congress arbitrarily puts more weight on the 1974 congress than on the 1976

through 1998 congresses in the estimation of the parameter values. It is necessary to construct a joint

distribution of η, ts, and As for the initial congress and assign values of η, ts, and As to the

congressmen in this congress by random draw.

The marginal distribution of η is described in equation (5).33 The age of a sitting congressman is

determined by the age at which he was first elected (i.e., when he first became a successful challenger)

and the number of terms he has served by As = Ac + 2ts, where the marginal distribution of Ac is the

previously discussed empirical distribution. The joint distribution of (ts,η,As) follows directly from the

joint distribution of (ts,η,Ac), which is obtained from the product rule for probabilities, treating η and Ac as independent -- i.e., Pr(ts,η,Ac) = Pr(ts|η,Ac)Pr(η)Pr(Ac).

We now consider the conditional distribution of ts.

The conditional distribution of ts on η and Ac can be approximated as a limiting distribution as

follows. We randomly draw 435 values of η from the marginal distribution characterized by specific

33 Note that the parameters, c and d, of the marginal distribution described in (5) are estimated jointly with the other parameters, α and λ of the sorting function in (2) and a and b of the marginal distribution described in (4), using all thirteen congresses. Consequently, the initial or 1974 congress is given no more additional weight in the estimation of the joint distribution of ts, η, and As than any of the other twelve congresses.

values of c and d in (5). We then pair these η values with 435 values of Ac randomly drawn from the

marginal distribution of challenger ages. We now have 435 congressmen, each characterized by an η

value and an Ac value. Each congressman starts with a ts value of 1 and an As value equal to his Ac+2.

The congressman runs in five hundred consecutive elections34 with the voters' sorting function

characterized by specific values of α and λ. If the congressman is reelected, his ts value is augmented by

one and his As value is augmented by two for the next election provided that the augmented As value is

less than the mandatory retirement age of R; but if his augmented As value is greater than or equal to R or

if he is not reelected, his ts value is set equal to 1 and his As value is set equal to his Ac+2 for the next

election. For this congressman with his specific η value and Ac value, we can calculate the fractions of the

five hundred times that ts takes the values 1, 2, 3, …, 21 and treat these fractions as the probabilities that ts

equals 1, 2, 3, …, 21. When done for all 435 congressmen, we will have the conditional distribution of ts

on η and Ac.
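The limiting-distribution calculation for a single congressman can be sketched as follows. This is an illustration under stated assumptions: `reelect_prob` is a hypothetical callable standing in for the reelection probability implied by the congressman's optimal shirking at each value of ts, which is not computed here.

```python
import random
from collections import Counter


def ts_distribution(ac, reelect_prob, n_elections, retirement_age, rng):
    """Fractions of elections at which one seat's occupant has each value of ts.

    ac: age when first elected (Ac)
    reelect_prob: hypothetical callable mapping ts to a reelection probability
    """
    counts = Counter()
    ts, age = 1, ac + 2
    for _ in range(n_elections):
        counts[ts] += 1
        reelected = rng.random() < reelect_prob(ts)
        if reelected and age + 2 < retirement_age:
            ts, age = ts + 1, age + 2   # augment terms served and age
        else:
            ts, age = 1, ac + 2         # defeat or age limit: restart the career
    return {t: n / n_elections for t, n in sorted(counts.items())}


# Hypothetical constant reelection probability of 0.9, with R = 80 and 500 elections.
dist = ts_distribution(44, lambda ts: 0.9, 500, 80, random.Random(0))
print(dist)
```

Repeating this for each of the 435 (η, Ac) pairs gives the approximation to the conditional distribution of ts described above.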

Given the conditional distribution of ts on η and Ac, the marginal distribution of η for

incumbents, and the marginal distribution of Ac, application of the product rule yields the joint distribution of

(ts,η,Ac) for the initially chosen values of parameters α, λ, c, and d. The initial congress is constructed by

randomly drawing 435 3-tuples of ts, η, and Ac from this joint distribution. These 3-tuples are directly

converted into 3-tuples of ts, η, and As by As=Ac+2ts.

5.2.5. Numerical methodology for the parameter value search

An initial set of values of the parameters, α, λ, a, b, c, and d, is selected. As described earlier, the

terms served, η values, and ages of the congressmen in the initial congress are randomly drawn from the

joint distribution of (ts,η,Ac), where the construction of this joint distribution depends upon the specific

values of the parameters, c and d, of the marginal distribution of η in (5) and upon the specific values of

34 Ideally we would like to run a thousand or even more consecutive elections for estimating the limiting distribution but the demands on computer time would be too great.

the parameters, α and λ, of the voters' sorting function in (2). Given these specific values of α and λ this

initial congress goes through thirteen reelection cycles with reelected congressmen having their ts values

augmented by one and retired or defeated congressmen replaced by challengers with ts values equal to 1

and η and Ac values randomly drawn from the joint distribution of (η,Ac) for challengers, where the

precise marginal distribution of η in (4) depends upon the specific values of the parameters, a and b.

Based on fifty replications of the selection of the initial congress and the thirteen successive election

cycles, the value of the chi-squared statistic is calculated for the thirteen congresses.35 The values of the

parameters, α, λ, a, b, c, and d, are then perturbed and the whole process of creating the initial congress,

simulating the thirteen reelection cycles, and calculating the chi-squared statistic is repeated. The process

is stopped when the value of the chi-squared statistic is minimized.36 If the value of this chi-squared

statistic indicates that the null hypothesis that the actual data and the simulated data of electoral defeats by

terms-served categories are generated by the same process cannot be rejected, then the parameter values

that generated this minimum chi-squared value are accepted as the parameter values of the model.
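The perturb-and-rerun search can be illustrated with a simplified coordinate-wise routine in the spirit of the alternating variables method. It is a sketch and not our actual search program: the step-halving rule, the tolerance, and the toy quadratic objective are assumptions, and in the estimation the objective would be the simulated chi-squared statistic, which is noisy.

```python
def alternating_variables(objective, start, step=0.1, shrink=0.5, tol=1e-4):
    """Minimize objective by perturbing one parameter at a time.

    Each parameter is moved up or down by `step`; a move is kept only if it
    lowers the objective. When no single move helps, the step is shrunk.
    """
    x = list(start)
    best = objective(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                value = objective(trial)
                if value < best:
                    x, best, improved = trial, value, True
                    break
        if not improved:
            step *= shrink
    return x, best


# Toy objective with a known minimum at (1, -2); illustration only.
def toy(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


print(alternating_variables(toy, [0.0, 0.0]))
```

With a noisy objective, a routine of this kind can stall on flat regions, which is the situation in which parameter values were changed by hand as described in footnote 36.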

6. ESTIMATED PARAMETER VALUES

Based on the above methodology, the estimated parameter values are: a = 2.0097, b = 5.6512, c =

1.6878, d = 6.3485, α = 0.9692, and λ = 0.9217. The value of the chi-squared statistic based on the fifty

replications of the model with these parameter values is 8.207. This is below 9.236, which is the critical

value at the .1 significance level for a chi-squared test with 5 degrees of freedom.37 We cannot reject the

null hypothesis that the data and the simulated data of electoral defeats by terms-served categories are

35 The necessity of replications was explained previously in subsection 5.2.3., and the choice of fifty replications was explained in footnote 30.

36 The optimization algorithm used to search for the parameter values is the alternating variables method. See Fletcher (1987). On those occasions when the search algorithm became stuck in a “flat plain” in the seven-dimensional search space we would discretely change selected parameter values. In making these changes we referred to “diagnostic” output along the lines of Table 1 (presented in the next section).

37 The number of degrees of freedom is equal to the number of cells minus one minus the number of parameters. See Moore (1977). The 5 degrees of freedom is equal to 12-1-6.

generated by the same process. Furthermore, the power of the test,38 or the probability of rejecting the null

hypothesis when it is false, is .698.
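The degrees of freedom and the comparison with the critical value can be checked with a short calculation. The sketch uses the closed-form upper-tail probability of a chi-squared distribution with 5 degrees of freedom; the function and constant names are hypothetical, and the power calculation is not reproduced.

```python
import math

N_CELLS, N_PARAMS = 12, 6
DEGREES_OF_FREEDOM = N_CELLS - 1 - N_PARAMS  # 12 - 1 - 6 = 5


def chi2_upper_tail_5df(x):
    """P(X > x) for X chi-squared with 5 degrees of freedom (closed form for odd df)."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 / math.pi) * math.exp(-x / 2)
            * (math.sqrt(x) + x ** 1.5 / 3))


# The tail probability at the critical value 9.236 should be close to .10,
# and the tail probability at the observed 8.207 should exceed .10.
print(DEGREES_OF_FREEDOM, chi2_upper_tail_5df(9.236), chi2_upper_tail_5df(8.207))
```

A tail probability above .10 at the observed statistic is equivalent to the statistic lying below the critical value.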

Table 1 provides a more detailed picture of the ability of the model to fit the data. Columns 2 and

3 of the table present the observed number and expected number of defeated incumbents by terms-served

categories. It becomes quickly apparent that the model does an excellent job of explaining the life spans

in office of congressmen serving 3, 4, 5, …, and 11 terms until electoral defeat. The model has a little

difficulty explaining the thin tail of defeated congressmen with 12 or more terms and a little more

difficulty explaining defeats of congressmen with 1 and 2 terms. Column 4 confirms this by indicating the

contribution to the overall value of the chi-squared statistic from each terms-served category.

The estimated parameter values are consistent with the model. Consider the parameters of the

voters’ sorting function (2). The parameter α is the vertical intercept of the sorting function and indicates

the probability that a perfectly faithful incumbent (x(t)=0) running for reelection is reelected. The

estimated value of α is 0.9692, implying that a perfectly faithful incumbent has almost a 97 percent

chance of being reelected. This suggests, at least in the case of an incumbent who does not shirk at all,

that the voters perceive the incumbent’s performance with relatively little noise. The parameter, λ, which

has an estimated value of 0.9217, reflects the degree to which the voters punish shirking by reducing the

probability of reelection.

In order to give an idea of how the estimated parameters of the model imply the probability of an

incumbent successfully running for reelection, an incumbent who shirks with a degree of unfaithfulness

of x(t)=0.02 would have a probability of reelection of 0.951 according to equation (2). In contrast,

incumbents with x(t) values of 0.20 and 0.50 would have reelection probabilities of 0.806 and 0.611. In

order to give an idea of how the model in general performed, the mean probability of reelection of .939

for the thirteen simulated congresses compares favorably to the mean probability of .918 for the thirteen

congresses comprising our data sample.
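The reelection probabilities quoted above can be reproduced with a short calculation. Equation (2) is not restated in this section, so the sketch assumes the exponential form p(x) = α·exp(−λx); this form is an assumption, adopted here because it matches the three quoted probabilities to three decimal places.

```python
import math

ALPHA, LAMBDA = 0.9692, 0.9217  # estimated parameters of the sorting function


def reelection_probability(x, alpha=ALPHA, lam=LAMBDA):
    """Reelection probability at shirking level x (assumed exponential form)."""
    return alpha * math.exp(-lam * x)


for x in (0.0, 0.02, 0.20, 0.50):
    print(x, round(reelection_probability(x), 3))
```

At x = 0 the function returns α, the reelection probability of a perfectly faithful incumbent.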

38 See Brownlee (1965, pp. 98-99) for a general discussion of the power of a test, and see Agresti (1990, pp. 241-

The parameters, a and b, are for the beta distribution of η for the population of challengers and

have estimated values of 2.0097 and 5.6512. These values indicate a unimodal distribution whose mass is

concentrated to the left, at low values of η (a long right tail, i.e., positive skewness in the statistical sense), with an expected value of 0.26. Potential challengers whose interests are closer to the interests

of the constituents are more likely to seek office than potential challengers whose interests are not. As

explained in section 4, this concentration of mass at low values of η is consistent with rational self-selection by challengers based

on an expected return to running for office criterion39 in the light of sorting of incumbents by the voters.

Indeed, in the absence of sorting there would be no incentive for challengers to self-select and the best

guess for the distribution of η for the population of challengers would be a

uniform distribution (i.e., a=1=b).

The parameters, c and d, are for the beta distribution of η for congressmen in the initial

legislature and have estimated values of 1.6878 and 6.3485. This distribution is unimodal, with its mass concentrated

to the left at low values of η (a long right tail), and has an expected value of 0.21. The expected value of η for the congressmen in the initial

legislature is less than the expected value of η for the population of challengers. This is consistent with

sorting by the voters because those challengers with the lower η values who are elected are likely to

survive longer in office than those challengers with higher η values who are elected .40
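
The summary statistics of the two estimated beta distributions can be recomputed from the standard closed-form moments of a beta(a, b) distribution. The sketch below is only an illustrative check; the function and variable names are hypothetical, and the only inputs taken from the paper are the four estimated parameter values.

```python
import math


def beta_summary(a, b):
    """Closed-form summary statistics of a beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    mode = (a - 1) / (a + b - 2)  # valid only when a > 1 and b > 1
    skewness = (2 * (b - a) * math.sqrt(a + b + 1)
                / ((a + b + 2) * math.sqrt(a * b)))
    return {"mean": mean, "sd": math.sqrt(variance),
            "mode": mode, "skewness": skewness}


# Estimated parameter values reported in the text.
challengers = beta_summary(2.0097, 5.6512)
initial_congress = beta_summary(1.6878, 6.3485)

if __name__ == "__main__":
    for label, stats in (("challengers", challengers),
                         ("initial congress", initial_congress)):
        print(label, {k: round(v, 4) for k, v in stats.items()})
```

Both distributions have a mode below the mean and a positive skewness coefficient, which is the formal counterpart of the mass sitting at low values of η with a longer tail toward high values.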

7. ANALYSIS OF SHIRKING

Our analysis of shirking focuses on three issues. First, to what degree does the voters’ sorting

process or the reelection constraint directly limit incumbent congressmen’s shirking? Second, to what

243) for a discussion of power for the chi-squared test.

39 The consistency of our estimated parameter values with an expected utility criterion for seeking office also lends support to Osborne and Slivinski (1996) and Besley and Coate (1997) to the extent that candidates are policy-motivated as well as office-motivated.

40 While consistent with sorting, the smaller expected value of η for the incumbents in the initial congress has to be interpreted with some care. In the absence of an analytic solution for the composition of the congress in the steady state, there is no way of knowing with certainty whether the initial congress in the data set is representative of the congress in the steady state. The composition of the initial congress may have been the result of an exogenous shock to the electoral system. Indeed, the steady state may not be a specific composition of the congress but instead may consist of oscillations around a particular composition. These caveats notwithstanding, it is still accurate to state that the smaller expected value of η for incumbents in the initial congress than for challengers is consistent with voter sorting.

degree does the sorting process indirectly limit shirking by inducing political challengers to self-select on

the basis of their likelihood of shirking if elected? Third, what explains the literature’s finding of no

significant last-period shirking?

We start by using the estimated parameter values to simulate the thirteen congresses. Based on

ten replications of the simulations of the thirteen congresses,41 mean values of incumbent preferences (η)

and incumbent shirking (x(t)) are calculated for each congress. Figure 1 presents plots of these mean

values.

The mean value of η starts at about 0.21 in simulated congress 1 and gradually increases to about

0.24 by congress 13. This is expected. The distribution of η for the initial congress has an expected value

of 0.21, while the distribution of η for challengers has an expected value of 0.26. As incumbents leave

office over time because of voter sorting or retirement, they are replaced by new congressmen from the

challenger population. Over time the expected value of η from the challenger population will become the

upper bound of the possible mean η values of the incumbent population. This turnover of the congress

will push the mean value of η for the congress towards this upper bound of 0.26. Sorting will cause the

mean value to be less than this upper bound as over time voters sort out of office more quickly those

congressmen who are more inclined to shirk.

The mean value of x(t) is fairly constant at about 0.05. In the absence of the reelection constraint,

a congressman will optimally shirk by an amount equal to η. The mean value of η for the congressmen is

0.24, and the expected value of η for the population of challengers is 0.26. Sorting by the voters not only

limits shirking to a small absolute amount of 0.05 but also limits shirking to about twenty percent of what

its unconstrained level would be.

This is, however, only the direct impact of sorting on shirking. The expected value of η of 0.26 is

itself indirectly determined by voter sorting. In the presence of voter sorting there is a higher return to

41 There were negligible differences in the plots of figure 1 when going from three to four replications. Ten replications are therefore more than sufficient to be confident that any noise in the plots is quite small.

holding office for those congressmen with lower values of η. Such congressmen will receive more utility

per term served and can expect to serve more terms than congressmen with higher values of η.

Self-selection by challengers on the basis of expected return therefore will cause the distribution of η

values for challengers to be concentrated toward the left of the unit interval (toward low values of η)

and to have an expected value less than 0.5, as is the case for the estimated distribution of challenger

η. In the absence of sorting there would be no reason for challengers

to self-select on the basis of their preferences because the return to holding office would be the same for

all congressmen. Each would continue in office according to a random process, and each would optimally

shirk by an amount of x(t) = η. In the absence of specific information about the distribution of preferences

in the general population, the most likely distribution of η would be uniform with its expected value of

0.5. Shirking would be nontrivially higher given such a distribution of challenger preferences. Sorting by

the voters therefore not only limits shirking by constraining the behavior of incumbent congressmen but

also by inducing potential congressmen to self-select on the basis of the degree to which their interests

coincide with their constituents’ interests. In short, sorting mitigates both the moral hazard problem and

the adverse selection problem.

The literature has found that there is no significant amount of last-period shirking.42 These studies

do not directly measure shirking but instead indirectly identify the existence of shirking. Specifically,

these studies focus on the change in the congressman’s voting record, as measured by one or more of the

interest group voting index scores such as Americans for Democratic Action, in the last term in office

before retirement. Since in the last term before retirement there is no reelection constraint, a change in

voting index score in that last term that is not larger than either the normal term-to-term change in the

voting records of the congressman or of continuing congressmen in general is reasonably interpreted as

evidence of a lack of shirking. The inference is that the voters have been able to detect shirking early

enough in the congressmen’s careers so as to sort out of office those congressmen whose interests do not

42 See Lott (1987; 1990), Van Beek (1991), and Lott and Bronars (1993).

coincide with the constituents’ interests before the last period.43 However, these studies use the change in

voting score index in the last term as an empirical proxy for the existence of shirking but do not actually

estimate the degree of shirking.

This paper has the ability to estimate the degree of shirking, not only in the last period but in

every period. Figure 2 presents a plot of the mean value of shirking by congressmen, x(t), for all

congressmen in their term j in office, j=1,2,3,...,27, regardless of which of the thirteen congresses in our

sample the jth term occurred in.44 Note that figure 2 is not measuring last-period shirking per se by

congressmen who have served n terms because the shirking by a congressman who serves n terms in the

sample period contributes to the mean value of shirking for up to n different categories of terms served.

Further note that figure 2 takes into account shirking by congressmen whether they are reelected,

defeated, or retire.45
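
The pooling by terms served that underlies figure 2 can be sketched as follows. This is a minimal illustration rather than the simulation code used for the paper: the function name, the data layout, and every shirking value below are hypothetical, and the three careers simply mirror the kinds of cases the pooling has to handle (a career already under way in the first congress, a career still under way in the last congress, and a short completed career).

```python
from collections import defaultdict


def mean_shirking_by_term(careers):
    """Pool shirking values by term served, across all congresses.

    careers: iterable of (first_observed_term, shirking_values), where
    shirking_values[k] is x(t) in term first_observed_term + k.
    Returns {term j: mean x(t) over all congressmen observed in term j}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for first_term, shirking_values in careers:
        for offset, x in enumerate(shirking_values):
            term = first_term + offset
            totals[term] += x
            counts[term] += 1
    return {term: totals[term] / counts[term] for term in sorted(totals)}


# Hypothetical shirking values, for illustration only.
careers = [
    (4, [0.04] * 9),           # observed in terms 4 through 12
    (1, [0.03, 0.03, 0.03]),   # observed in terms 1 through 3
    (1, [0.05, 0.05]),         # observed in terms 1 and 2
]
result = mean_shirking_by_term(careers)

if __name__ == "__main__":
    for term, mean_x in result.items():
        print(f"term {term:2d}: mean shirking = {mean_x:.3f}")
```

Because each congressman contributes to every terms-served category he passes through during the sample, the category means mix congressmen at different points in their careers, which is why the plot does not isolate last-period shirking.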

The plot of figure 2 indicates that mean shirking by congressmen is in the range of approximately

0.04 to 0.05 for one through nineteen terms served. Shirking decreases dramatically after nineteen terms

and reaches zero by twenty-seven terms. This is a strong result, but it is also a result that appears

unintuitive. The individual congressman optimally increases his shirking as his number of terms served

increases, which by itself suggests that the plot of mean shirking in figure 2 should be monotonically

increasing with terms served. In contrast, sorting by the voters more severely limits the time in office of

congressmen with higher rather than lower preferences to shirk, which by itself suggests that the plot of

mean shirking in figure 2 should be monotonically decreasing with terms served. The plot of mean

shirking in figure 2 appears to be inconsistent with the behavior of the individual congressman as well as

43 This analysis first appeared in Lott (1987).

44 To illustrate, if the congressman was beginning his fourth term during the initial simulated congress and served through his twelfth term, then his shirking values for each of the terms four through twelve respectively would be used to calculate the mean value of shirking for congressmen in their fourth term, the mean value of shirking for congressmen in their fifth term, and so on through the mean value of shirking for congressmen in their twelfth term. The shirking of a congressman first elected in the eleventh simulated congress and who was still serving during the thirteenth simulated congress would be used to calculate mean shirking for congressmen in their first, second and third terms. The shirking of a congressman first elected in the seventh simulated congress and who served two terms would be used to calculate mean shirking for congressmen in their first and second terms.

with the sorting of shirking congressmen by the voters. In fact this apparent inconsistency is not a real

inconsistency. Figure 2 reflects both of the partial effects described above at work.

The individual congressman has an incentive to shirk more as his tenure increases because the

cost of shirking decreases as the number of potential terms remaining in the congressman’s time horizon

decreases. In the limit, shirking during the last term before retirement is costless and the congressman will

shirk by an amount equal to his unconstrained value of shirking or x(t)= η. Furthermore, the entire time

path of optimal shirking is higher for those congressmen with greater preference to shirk.

Figures 3a, 3b, and 3c present illustrations of the optimal shirking by a congressman in each term

served and the probability of serving n terms for an arbitrarily chosen twenty-term time horizon given that

the values chosen for α and λ are their estimated parameter values of 0.9692 and 0.9217 and that the

values of η in the three figures respectively are 0.26, the expected value of the distribution of η for the

population of challengers and therefore for freshman congressmen, and 0.41 and 0.11, which are equal to

the expected value of the distribution of η plus and minus one standard deviation. In figure 3a the value of

optimal shirking is 0.04 for terms 1 through 5, increases to 0.06 for terms 6 through 14, and then increases

to 0.26 by term 20. For a twenty-term horizon, this is the time path of optimal shirking of the “average”

challenger elected to office. Even though this average freshman congressman has an unconstrained

optimal value of shirking of 0.26, sorting by the voters dramatically decreases his constrained optimal

shirking in the earlier stages of his career. Furthermore, there is a probability of 0.76 that this average

freshman survives through term 5 but a probability of only 0.36 that he survives through term 14.46 Even

in term 15 the value of shirking is only 0.08. As illustrated in figure 3b, a congressman with a large value of

η of 0.41 has a value of shirking of only 0.10 for the first twelve terms and has a probability of only 0.26

of surviving through term 12. In contrast, a freshman with a value of η of 0.11 in figure 3c has a value of

45 Since there is no formal model relating retirement to terms served in the literature, we have stochastically retired congressmen in the light of the empirical distribution of retirement by terms served when estimating the parameters of our model. We therefore look at all congressmen in figure 2.

46 The probability of surviving for n terms is calculated from equation (2) by treating the probabilities of reelection in successive terms as independent.

shirking of only 0.02 for terms 1 through 15 and a probability of 0.50 of surviving through term 15.
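
The survival probabilities quoted for figures 3a through 3c can be approximately reproduced with the sketch below, which rests on three assumptions that are not taken from the paper's own code: (i) equation (2) has the exponential form α·exp(-λ·x(t)); (ii) surviving through term n means winning the n-1 reelection contests at the ends of terms 1 through n-1, with those contests treated as independent; and (iii) the shirking paths are the step functions described in the text. All function and variable names are hypothetical.

```python
import math

ALPHA, LAMBDA = 0.9692, 0.9217  # estimated values reported in the text


def reelection_probability(x):
    """ASSUMED form of equation (2): alpha * exp(-lambda * x)."""
    return ALPHA * math.exp(-LAMBDA * x)


def survival_through(term, shirking_path):
    """Probability of still being in office in `term`.

    ASSUMPTION: this requires winning reelection at the end of each of
    terms 1..term-1, with the outcomes treated as independent.
    shirking_path[k] is the shirking chosen in term k + 1.
    """
    probability = 1.0
    for x in shirking_path[:term - 1]:
        probability *= reelection_probability(x)
    return probability


# Step-function readings of the paths described for figures 3a, 3b, and 3c.
path_3a = [0.04] * 5 + [0.06] * 9   # terms 1-5, then terms 6-14
path_3b = [0.10] * 12               # terms 1-12
path_3c = [0.02] * 15               # terms 1-15

if __name__ == "__main__":
    print(f"3a, through term 5:  {survival_through(5, path_3a):.2f}")
    print(f"3a, through term 14: {survival_through(14, path_3a):.2f}")
    print(f"3b, through term 12: {survival_through(12, path_3b):.2f}")
    print(f"3c, through term 15: {survival_through(15, path_3c):.2f}")
```

Under these assumptions the computed values round to the probabilities reported in the text (0.76, 0.36, 0.26, and 0.50), which suggests, without confirming, that this reading of the calculation is close to the one used.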

These calculations indicate two things. First, even congressmen with relatively high preferences

for shirking optimally will severely limit their shirking in the earlier stages of their potential careers as

congressmen. Second, sorting by the voters is efficient enough to give these relatively high potential

shirkers a quite low probability of remaining in office long enough for them to increase their constrained

optimal shirking nontrivially. Indeed, the same can be said about those congressmen with an average or

even lower preference to shirk.47 Taken together, the relatively small amounts of shirking by congressmen

in the earlier parts of their potential careers, even by those congressmen with average or above-average

preferences for shirking, and the lower expected number of terms in office by those congressmen with

greater preferences for shirking account for the relatively low and fairly stable levels of mean shirking

through the first nineteen terms served in figure 2. The increase in shirking by each congressman as his

number of terms served increases is offset by the voters’ sorting out of office those congressmen with

greater preferences for shirking. The rapid decrease in mean shirking as the number of terms served

increases beyond nineteen implies that only those congressmen with very low preferences for shirking

can survive that long.48

The direct measures of mean shirking in figure 2 support the conclusions as well as the reasoning

of those researchers who infer from analyses using proxies for shirking that the amount of last-period

shirking is insignificant. The plots in figure 2, however, present results that are more general than just

last-period shirking. Since the cost of shirking decreases as the congressman serves more terms and

approaches the end of his time horizon, the last term before retirement is just the limiting case of a more

general process. In this sense figure 2 is presenting evidence on the each-and-every-period problem, not

just the last-period problem. Figure 2 is consistent with voters having the ability to constrain congressmen

47 This ability to sort out of office congressmen who shirk even a small amount is consistent with Lott and Bronars (1993).

48 The spikes in the plot of mean shirking in figure 2, particularly in this downward sloping portion of the plot, likely reflect the complexity of the model and the small number of simulated congressmen who achieve tenure greater than nineteen terms. Even though the plot of figure 2 was based on one hundred replications of the thirteen simulated

to choose time paths of optimal shirking that lie well below their unconstrained levels of shirking and the

ability to sort out of office politicians with higher time paths of optimal shirking more quickly than

politicians with lower time paths of optimal shirking.

8. CONCLUSIONS

This paper has developed and estimated a stochastic dynamic model in which the degree of

political shirking by congressmen can be directly calculated using the estimated parameter values. The

model not only analyzes optimal shirking by incumbent congressmen but also self-selection by

challengers. The estimation of shirking is done without resorting to the use of an empirical proxy for

shirking despite the fact that the degree of shirking is not directly observable.

The model is quite parsimonious, consisting of a relatively small number of behavioral equations

and requiring a relatively small number of parameters to be estimated. Each parameter has an explicit and

precise interpretation regarding the voters’ sorting function or the self-selection of challengers. Given that

the electoral process being modeled is complex, dynamic, and stochastic and that the choice variable of

the incumbent, the degree of shirking, is not directly measurable, we do not estimate the parameter values

by traditional regression analysis. Instead, the parameter values of the model are numerically estimated

using simulation by the method of minimum chi-squared.

The null hypothesis that the simulated and actual numbers of defeated congressmen by terms-

served categories are generated by the same process cannot be statistically rejected. Estimated parameter

values are consistent with voters sorting incumbents and with self-selection by challengers according to

the degree to which challengers’ interests coincide with constituents’ interests.

Estimated shirking is small both in absolute terms and relative to what its level would be in the absence of

sorting by the voters. The sorting process not only directly constrains shirking by incumbents but also

indirectly constrains shirking by inducing challengers to self-select on the basis of their likelihood of not

congresses, only 2,777 of the 565,500 observations of terms served were for greater than nineteen terms served. Only 255 observations were for greater than 23 terms served.

shirking. Finally, estimates of mean shirking by terms-served categories are consistent with the

literature’s empirical findings using proxy variables of insignificant last-period shirking. We find that the

mean value of shirking over the whole congress at first is both low and fairly constant and then decreases

with the number of terms served even though optimal shirking by an individual congressman increases

with terms served. Both the voters’ threat of not reelecting shirking congressmen and the actual sorting

out of office of shirking congressmen reconcile these two apparently contradictory findings. Indeed, the

last-period problem is the limiting case of the each-and-every-period problem.

References

Agresti, Alan, Categorical Data Analysis. New York: Wiley, 1990.

Barone, Michael and Grant Ujifusa, The Almanac of American Politics (1976 through

2000 editions). Washington, D.C.: National Journal.

Bender, Bruce, “A Reexamination of the Principal-Agent Relationship in Politics,” Journal of Public

Economics (January, 1994), vol. 53, 149-163.

Bender, Bruce and John R. Lott, Jr., “Legislator Voting and Shirking: A Critical Review of the

Literature,” Public Choice (April, 1996), vol. 87, 67-100.

Besley, Timothy and Stephen Coate, “An Economic Model of Representative Democracy,” Quarterly

Journal of Economics (February, 1997), vol. 112, 85-114.

Bernhardt, M. Daniel and Daniel Ingberman, “Candidate Reputations and the ‘Incumbency Effect’,”

Journal of Public Economics (1985), vol. 27, no. 1, 47-67.

Brownlee, K. A., Statistical Theory and Methodology in Science and Engineering (second edition).

New York: Wiley, 1965.

Burnett, J., C. Paul, and A. Wilhite, “Political Campaigns as Rent-Seeking Games: Take the Money and

Run,” Public Finance Review (September, 1997), vol. 25.

Carson, Richard T. and Joe A. Oppenheimer, “A Method of Estimating the Personal Ideology of Political

Representatives,” American Political Science Review (March, 1984), vol. 78, 163-178.

Dougan, William R. and Michael C. Munger, “The Rationality of Ideology,” Journal of Law and

Economics (April, 1989), vol. 32, 119-142.

Fenno, Richard F. Jr., Home Style: House Members in Their Districts. Boston: Little, Brown, 1978.

Fiorina, Morris P., Representatives, Roll Calls, and Constituencies. Lexington, MA: Heath, 1974.

Fletcher, Roger, Practical Methods of Optimization. Chichester, UK: Wiley, 1987.

Francis, Wayne L., Lawrence W. Kenny, Rebecca B. Morton, and Amy B. Schmidt, “Retrospective

Voting and Political Mobility,” American Journal of Political Science (November, 1994), vol.

38, 999-1024.

Goff, Brian L. and Kevin B. Grier, “On the (Mis)measurement of Legislator Ideology and Shirking,”

Public Choice (June, 1993), vol. 76, 5-19.

Ingberman, Daniel, “Reputational Dynamics in Spatial Competition,” Journal of Mathematical and

Computer Modelling (1989), vol. 12, no. 4/5, 479-496.

Jackson, John E. and John W. Kingdon, “Ideology, Interest Group Scores, and Legislative Votes,”

American Journal of Political Science (August, 1992), vol. 36, 805-823.

Jung, Gi-Ryong, Lawrence W. Kenny, and John R. Lott, Jr., “An Examination of Why Senators from the

Same State Vote Differently So Frequently,” Journal of Public Economics (May, 1994), vol. 54,

65-96.

Kalt, Joseph P. and Mark A. Zupan, “Capture and Ideology in the Economic Theory of Politics,”

American Economic Review (June, 1984), vol. 74, 279-300.

Lott, John R., Jr., “Political Cheating,” Public Choice (1987), vol. 52, 169-186.

Lott, John R., Jr., “Attendance Rates, Political Shirking, and the Effect of Post-Elective Office

Employment,” Economic Inquiry (January, 1990), vol. 28, 133-150.

Lott, John R., Jr., and Stephen G. Bronars, “Time Series Evidence on Shirking in the U. S. House of

Representatives,” Public Choice (June, 1993), vol. 76, 125-149.

Lott, John R., Jr., and Mark L. Davis, “A Critical Review and an Extension of the Political Shirking

Literature,” Public Choice (December, 1992), vol. 74, 461-484.

Lott, John R. Jr. and W. Robert Reed, “Shirking and Sorting in a Political Market with Finite-lived

Politicians,” Public Choice (April, 1989), vol. 61, 75-96.

Moore, David S., “Generalized Inverses, Wald’s Method, and the Construction of Chi-Squared Tests of

Fit,” Journal of the American Statistical Association (March, 1977), vol. 72, 131-137.

Neyman, Jerzy, “Contributions to the Theory of the χ2 Test.” In Proceedings of the First Berkeley

Symposium on Mathematical Statistics and Probability, Jerzy Neyman, editor. Berkeley:

University of California Press, 1949.

Osborne, Martin J. and Al Slivinski, “A Model of Political Competition with Citizen-Candidates,”

Quarterly Journal of Economics (February, 1996), vol. 111, 65-96.

Peltzman, Sam, “Constituent Interest and Congressional Voting,” Journal of Law and Economics

(April, 1984), vol. 27, 181-210.

Romer, Thomas and Howard Rosenthal, “Voting Models and Empirical Evidence,” American Scientist

(September, 1984), vol. 72, 465-473.

Schansberg, D. Eric, “Moving Out of the House: An Analysis of Congressional Quits,” Economic

Inquiry (July, 1994), vol. 32, 445-456.

Stigler, George, “Economic Competition and Political Competition,” Public Choice (Fall, 1972), vol. 13,

91-106.

Sutter, Daniel, “Leviathan at Bay? Constitutional Versus Political Controls on Government,” Economic

Inquiry (October, 1998), vol. 36, 670-678.

United States Congress, Congressional Biographical Directory. Washington, D.C.:

U. S. Government Printing Office, 2000. (http://bioguide.congress.gov).

Van Beek, James R., “Does the Decision to Retire Increase the Amount of Political Shirking?”

Public Finance Quarterly (October, 1991), vol. 19, 444-456.

Table 1

Observed and Expected Number of Defeated Incumbents and Contributions to the Value of the Chi-Squared Statistic by Terms-Served Categories

Terms-Served    Observed Number of     Expected Number of         Contribution to the Value of
Category        Defeated Incumbents    Defeated Incumbents (a)    the Chi-Squared Statistic (b)

1               79                     65                         3.015
2               41                     51                         1.960
3               44                     43                         0.023
4               34                     34                         0.000
5               20                     24                         0.666
6               17                     20                         0.450
7               14                     16                         0.250
8               12                     14                         0.285
9               11                     11                         0.000
10               8                      9                         0.111
11              10                      9                         0.111
12-21           16                     12                         1.333

Notes:

(a) The expected number of defeated incumbents over all the terms-served categories is 308, whereas the actual number of defeated incumbents is 306. The difference reflects rounding to integer values when calculating the expected number of defeated incumbents for each of the terms-served categories.

(b) The sum of the contributions to the value of the chi-squared statistic in the fourth column of the table of 8.204 differs from the chi-squared value of 8.207 reported in the text because each of the contributions reported in the table is truncated to three decimal places.
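
The entries in the last column of Table 1 can be recomputed from the observed and expected counts using the usual Pearson contribution (O - E)^2 / E. The sketch below is an illustrative check only; the variable names are hypothetical, and the expected counts are the rounded integers shown in the table rather than the unrounded simulation output.

```python
# Counts copied from Table 1, in terms-served order (1, 2, ..., 11, 12-21).
observed = [79, 41, 44, 34, 20, 17, 14, 12, 11, 8, 10, 16]
expected = [65, 51, 43, 34, 24, 20, 16, 14, 11, 9, 9, 12]

# Pearson contribution of each terms-served category.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_squared = sum(contributions)

# Truncate (not round) each contribution to three decimals, as in the table,
# keeping integer thousandths to avoid floating-point drift in the sum.
truncated_thousandths = [int(c * 1000) for c in contributions]

if __name__ == "__main__":
    print("observed total:", sum(observed))
    print("expected total:", sum(expected))
    print("chi-squared from unrounded contributions:", round(chi_squared, 3))
    print("sum of truncated contributions:", sum(truncated_thousandths) / 1000)
```

Summing the untruncated contributions gives the value of 8.207 reported in the text, while summing the truncated table entries gives 8.204, which matches the explanation in note (b).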