
Some more statistical properties of Data Envelopment Analysis

Abstract

Data envelopment analysis (DEA) is a popular tool for estimation of technical efficiency in a wide range of applications. There has been relatively little work on the finite sample properties of the resultant estimates. This paper reports on Monte Carlo simulations for a number of DEA models to investigate the likely impact of several aspects of the data generation process on the size of confidence intervals. These include the probability distributions of the underlying data, the form of the production correspondence, the dimensions of the problem and specification error. It concludes that, for the case of the base model and variations around it:

− Probability distributions and production correspondences have relatively little influence;
− The number of dimensions relative to the number of observations, and specification errors, could be a major source of inaccuracy;
− The variable returns to scale model seems to perform poorly in a range of DGPs.

Key Words: Data Envelopment Analysis, Monte Carlo, statistical properties

Contact details: John Cubbin, Economics Department, City University, Northampton Square, London EC1V 0HB; [email protected]; +44 20 7040 8533

1. Introduction¹

Large numbers of efficiency analyses are reported each year in the economics and managerial science literature, based on mathematical programming methods, the most popular of which appears to be Data Envelopment (or Envelope) Analysis (DEA). Many of these are purely academic studies, but the technique has also seen growing use by regulators of network utilities (especially electricity distribution networks). The quality of the resulting estimates can have knock-on effects not only for the price of electricity, but also for the investment decisions made by firms operating under regulatory frameworks using this type of quantification. However, we know comparatively little about the statistical properties of the estimates. This paper aims to add to that small pool of knowledge by reporting on some Monte Carlo simulations of DEA estimates under a range of assumptions. In particular, although we know that DEA is consistent and can give good estimates given a large enough number of observations and the correct specification, we know little about its properties with the numbers of observations that might actually be available for these studies, and given the inherent uncertainty about whether the correct specification has been chosen.

1 I am grateful to Graham Shuttleworth, Gian Carlo Scarsi, and colleagues at City University for comments on previous drafts. Remaining errors are my own.


The paper is organised as follows: Section 2 sets out the basic DEA framework; Section 3 describes previous work on statistical aspects of DEA; Section 4 sets out the conceptual framework; Section 5 describes the simulation process; Section 6 sets out the statistical measures being used; Section 7 reports on some results; and Section 8 discusses the findings.

2. DEA

Generally DEA deals with a number of outputs and inputs. For the sake of this paper we shall restrict attention to a single input, which may be a physical input or an appropriate aggregation such as cost. This reflects the approach adopted in most regulatory situations. Figure 2.1 shows a combination of outputs and inputs observed in an empirical application. Data envelopment analysis involves selecting those observations that together form a convex hull around the remaining data points. This hull may be regarded either as simply a description of the data which provides information on relative efficiency, or as an approximation to some theoretically attainable frontier given by the current state of technological knowledge.

Figure 2.1: The envelope

[Figure 2.1 plots output y against input x, marking the observations (x1, y1), (x2, y2), (x3, y3), the unit of interest (x0, y0), and its projection onto the frontier at (x0′, y0).]

The figure shows the simplified two-dimensional case of a single input x and a single output y. There is a set of n observations (xi, yi), and the figure identifies four of these together with the projected point (x0′, y0).

The observation we are interested in is (x0,y0). One of two approaches is commonly employed:

− Input minimisation: for a given y0, by how much could x0 be reduced?

− Output maximisation: for a given x0, by how much could y0 be increased?

We focus here on the input minimisation problem. This is typically the orientation adopted in regulatory situations, and is represented in the figure by a shrinkage down the vertical dotted line to (x0′, y0). This is a weighted sum of the two observations in unit 0's reference set.

This solution can be found by solving a linear program, which is the dual of the program defined by Charnes, Cooper and Rhodes (1979):


Choose variables θ, λ1, λ2, … λn to minimise

θ = x0′/x0 (1)

subject to

λ1x1 + λ2x2 + … + λnxn ≤ θx0 (2a)

λ1y1 + λ2y2 + … + λnyn ≥ y0 (2b)

with λ1, λ2, … λn ≥ 0. In the variable returns to scale case (shown in Figure 2.1) we add the condition:

λ1 + λ2 + … + λn = 1 (3)

With k1 inputs and k2 outputs we simply add an extra constraint for each input and output. θ is defined as the technical efficiency of unit 0. In discussing the data generation process it is usually easier to work with the inefficiency or distance measure Φ = 1/θ. Although the basic ideas have been developed with physical inputs, the logic of the above applies equally to an aggregate input such as cost; Φ is then the ratio of actual to efficient cost.
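For concreteness, program (1)–(3) can be set up with any off-the-shelf LP solver. The sketch below is illustrative only and is not the Fortran 90 program used for the simulations reported here; the function name, the matrix layout, and the use of NumPy and SciPy are assumptions of the illustration.

```python
import numpy as np
from scipy.optimize import linprog

def dea_input_efficiency(x, y, j, vrs=False):
    """Input-oriented DEA efficiency (theta) for unit j.

    x : (n, k1) array of inputs, y : (n, k2) array of outputs.
    Solves: min theta  s.t.  sum_i lambda_i x_i <= theta * x_j,
                             sum_i lambda_i y_i >= y_j,
                             lambda_i >= 0 (and sum_i lambda_i = 1 if vrs).
    """
    n, k1 = x.shape
    k2 = y.shape[1]
    # Decision vector: [theta, lambda_1, ..., lambda_n]
    c = np.zeros(n + 1)
    c[0] = 1.0                                   # minimise theta

    # Input constraints (2a): sum_i lambda_i x_i[k] - theta * x_j[k] <= 0
    A_in = np.hstack([-x[j].reshape(-1, 1), x.T])
    # Output constraints (2b): -sum_i lambda_i y_i[m] <= -y_j[m]
    A_out = np.hstack([np.zeros((k2, 1)), -y.T])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([np.zeros(k1), -y[j]])

    A_eq, b_eq = None, None
    if vrs:                                      # convexity constraint (3)
        A_eq = np.hstack([[0.0], np.ones(n)]).reshape(1, -1)
        b_eq = [1.0]

    bounds = [(None, None)] + [(0, None)] * n    # theta free, lambdas non-negative
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[0]                              # theta: estimated technical efficiency
```

With a single cost input and three outputs, as in the base case below, x would be an n×1 matrix of costs and y an n×3 matrix of outputs; setting vrs=True adds the convexity constraint (3).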

3. Previous work

Grosskopf (1996) provided a useful survey on the issue of statistical inference in DEA under different assumptions about the distribution of Φ, and drew attention to the problems in adopting two-stage DEA, where the second stage involves regressing inefficiency estimated from a DEA first stage on explanatory variables. Banker (1993) showed that the properties of DEA depended on the distribution of inefficiencies in the data generating process, but that a monotone decreasing probability density function is sufficient to guarantee that DEA is a maximum likelihood estimator of efficiency. However, this does not guarantee the usual MLE properties of consistency, asymptotic efficiency, and normality, since the number of parameters to be estimated increases with the number of observations. Banker shows the DEA estimator to be consistent for a broader class of distributions, provided that the probability distribution function for the inputs is everywhere positive within a compact subset of R^k. Korostelev et al. (1995) provide further results on the conditions under which DEA is a maximum likelihood estimator for one output and p inputs.

Kittelsen (1993) makes use of a result from Banker to calculate tests of specifications for Norwegian electricity distribution under the assumption that the distribution of Φ is (a) half-normal and (b) exponential. Gijbels et al. (1999) examine the special case of one input and one output. They note that analytic results for bias in DEA estimates of efficiency may be intractable for problems of higher dimensionality. However, they make the general point that the degree of curvature of the underlying production or cost function will affect the bias. Although one of DEA's advantages is its ability to deal with an arbitrary shape for the transformation curve, this is bought at a potential cost.


Banker et al. (1989) suggest a rule of thumb that the number of observations should be at least three times the total number of inputs and outputs. Kittelsen (1995) addressed the problem of bias in DEA using Monte Carlo analysis and concluded that "bias is important, increasing with dimensionality [i.e. k] and decreasing with sample size, average efficiency, and a high density of observations near the frontier."

Pedraja-Chaparro, Salinas-Jimenez and Smith (1997) have found, in a Monte Carlo simulation, that the mean bias could be reduced, and correlation with true efficiency scores improved, by imposing restrictions on the weightings of the inputs and outputs so that they were more similar for different observations. Pedraja-Chaparro et al. (1999) set out some performance measures according to:

− The distribution of true efficiencies;
− Sample size;
− Number of outputs and inputs; and
− The correlation between the variables.

Their results show that the performance of the DEA estimates can be sensitive to model specification. For example, the loss in performance due to increased numbers of variables is substantially reduced when there is a correlation of 0.8 between the variables, and the Banker et al. rule would reject some models with reasonable levels of performance.

One of the most comprehensive explorations is by Kittelsen (1999), who carries out a series of Monte Carlo studies using 5-10 runs of 1000 samples generated with different true values of parameters such as the elasticity of scale. He uses a single-output, multiple-input Cobb-Douglas production function. Although Kittelsen reports values of bias, his main preoccupation is with the power of certain tests of significance that have been proposed. He examines the impact of sample size, the average efficiency level, the inefficiency distributional form, the distribution of output, and the number of inputs. He examines inefficiency distributions of the following forms: half-normal, gamma, and exponential. The outputs are distributed normally in most trials, except for one where he tries uniform and lognormal distributions.

Finally, Simar and Wilson have, in several papers, developed approaches to bootstrapping. The normal, or naive, bootstrap, which simply takes repeated subsamples from the existing sample, has been shown to be of limited value (Simar and Wilson, 1999). Instead, Simar and Wilson (1998, 2000) advocate resampling from a semi-artificial sample generated from a density function for the inefficiency values based on a smoothed representation of the observed distribution of efficiency estimates. In empirical examples they find their confidence intervals robust to the bandwidth parameter used in their smoothing algorithm. More recently, they (Simar and Wilson 2002) propose bootstrap procedures which address the problem of serial correlation in the two-stage procedures which seek to investigate the determinants of technical efficiency.


4. The conceptual background

This section presents heuristic arguments describing some sources of inaccuracy for mathematical programming measures of efficiency. DEA measures may be inaccurate for any of the following reasons:

− The reference group are themselves inside the frontier, including the special case where an observation references itself even though it is not on the frontier.

− Measurement error
− Specification error (exclusion of a relevant variable)

Conversely, DEA estimates will be accurate to the extent that:

− The relevant part of the frontier is well-populated for all observations;
− Measurement error is absent; and
− The specification is correct.

This paper will focus on the first of these issues. We shall look at the special case where measurement error is absent. For a treatment of measurement error see Lasdon et al (1993) and for an application of “stochastic DEA” where panel data are used to estimate measurement error see Fehti, Jackson, and Weyman-Jones (2001). The problem of specification error is frequently faced in practical DEA applications. In this paper we shall look briefly at the effects of excluding variables “of minor interest”. We can identify the following main influences on the ability of the data to generate a sufficient frontier:

− The number of observations;
− The probability distribution of (in)efficiencies;
− The dimensions of the problem;
− The true shape of the frontier;
− The distribution of the variables; and
− Specification errors.

4.1 The number of observations

As with almost all measurement techniques, increasing the number of observations results in improved estimates. However, frontier methods are likely to be more sensitive to observation numbers since they depend for their calibration on a subset of the observations – those on or near the frontier.

4.2 The probability distribution of the inefficiencies

This is an extension of the above point: the more observations at or close to the frontier, the better will be the resultant estimates of distances from it. Figure 4.1 shows two alternative distributions for Φ – the uniform distribution and the exponential. Although they have the same mean, one might expect the latter to produce a lower bias than the former, since it has relatively more observations clustered near the frontier.

Figure 4.1: Alternative probability distributions for inefficiency


[Figure 4.1 plots probability density against Φ; the horizontal axis marks the points Φ0 and Φ1.]

4.3 The dimensions of the problem

The dimensions of the problem refer to the number of inputs and outputs, plus an extra dimension if variable rather than constant returns is assumed. The more dimensions, the more points are required to populate the frontier. The so-called "curse of dimensionality" is the fact that the number of observations required to obtain a given degree of accuracy varies, not just proportionately, but with the power of the number of dimensions. Korostelev et al. (1995) show that a measure of the discrepancy between θ and its DEA estimate varies in proportion to n^(−2/(p+2)) for large n, where n is the number of observations and p is the number of dimensions.
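Written out, with d(·,·) standing for the discrepancy measure, the rate quoted above and its implication for sample size can be put as follows (the rearrangement for n is an illustrative restatement rather than a result from the paper):

```latex
% Convergence rate for the DEA estimator in p dimensions, as quoted above.
E\, d\!\left(\hat{\theta}_n, \theta\right) \;=\; O\!\left(n^{-2/(p+2)}\right),
\qquad\text{hence } n \;\propto\; \varepsilon^{-(p+2)/2}
\text{ for a target accuracy } \varepsilon .
```

With p = 4, for example, halving the target error requires roughly 2^3 = 8 times as many observations, which is the sense in which accuracy deteriorates "with the power of the number of dimensions".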

4.4 The true shape of the frontier

The true shape of the frontier determines the number of efficient observations necessary to provide sufficient reference groups so that no observations are mistakenly classified as efficient. With constant returns to scale and one input and one output, only one efficient observation is needed. If there are variable returns to scale, or if there is another output or input, additional efficient observations are required to define the frontier. To define it exactly requires an arbitrarily large number of observations if we do not have a parametric form for the shape of the frontier.

For example, consider the case of two outputs and one input. Figure 4.2 shows three cases – a linear frontier A and two curved frontiers B and C. The points P and Q represent extreme data points on the ends of the frontier. In case A the two observations P and Q define the frontier exactly within the cone OPQ. For all observations lying within the cone the DEA measure will be accurate. If the frontier is really like B, an additional efficient observation mid-way between the two might be enough to give a high degree of accuracy. However, if the true frontier is like C it may take four observations on the frontier to describe it accurately.

Figure 4.2: The impact of curvature on the accuracy of DEA

[Figure 4.2 plots Output 1 against Output 2, showing the linear frontier A, the curved frontiers B and C, and the extreme points P and Q.]


If four efficient observations are required to approximate the frontier in two dimensions, we require 16 in three dimensions, 64 in four dimensions, and generally 4^(d−1) in d dimensions. Factoring in the proportion of observations which are at or near the frontier at random increases the required numbers by orders of magnitude.

4.5 The distribution of the variables

The number of observations required to describe the frontier depends on the range of the data. In Figure 4.3, the closer to each other are the extreme values P and Q, the more accurately will a piecewise linear envelope describe the frontier. On the other hand, where there are outliers (see point S in Figure 4.3) they may be falsely reported as efficient simply because of the absence of a truly efficient reference group.

Figure 4.3: Outliers and the frontier

[Figure 4.3 plots Output 1 against Output 2, showing the true frontier, the estimated frontier through the extreme points P and Q, and an outlier S.]

4.6 Specification error

The main types of error are:

− Omission of a relevant variable;
− Inclusion of an irrelevant variable; and
− Choice of the wrong functional form.

(In DEA there is a fourth possibility – the wrong classification of a variable as an input or output.) Omitting a relevant variable will reduce accuracy to the extent that the omitted variable affects the level of output or input requirements. Adding in an irrelevant variable will increase the bias and raise the proportion of falsely attributed efficiency scores of 1. Since DEA only assumes a quasi-concave production function, the only way the functional form assumption can be violated is if the production function is convex – for example, if there are diseconomies of scope (economies of specialisation). Although this may seem an unlikely scenario, some empirical econometric applications imply such a functional form. However, we do not examine this in the current paper. Since DEA allows for both constant and variable returns to scale, specification error may also occur through using the wrong returns to scale model.


5. Simulation

In order to be able to generate a large number of replications, the simulations were carried out using a program written by the author in Fortran 90. The LP is based on the simplex procedure described in Press et al. (1986). The random number generators are also derived from procedures set out in the same book. The core model was one of a single input ("cost") and three outputs. The following cost functions were applied:

Linear: Cost = a + b1x1 + b2x2 + b3x3 (4)

Quadratic (constant returns): Cost = (a + b1x1² + b2x2² + b3x3²)^0.5 (5)

Quadratic (variable returns to scale): Cost = (a + b1x1² + b2x2² + b3x3²)^(0.5c) (6)

In all cases actual cost = efficient cost × Φ. Three types of inefficiency distribution are used: uniform, normal, and exponential. The default parameters and specifications are set out in Tables A1 – A3. Using different random number seeds as the starting point, it was found that 250,000 replications were required to give performance results invariant to two significant figures. The base case involves only one input and three outputs with varying weights. This represents about the minimum number of variables which would make it worthwhile to perform DEA.
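The data generation process for the base case can be sketched as follows. This is an illustrative reconstruction using the linear cost function (4) and the default parameters of Tables A1 and A2, not the Fortran 90 code used to produce the results; the function name, the uniform output range [0.1, 1.9] (lower bound 0.1 with a mean-to-minimum ratio of 10), and the reading of N(0.5, 0.5) as mean 0.5 and standard deviation 0.5 are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_sample(n, a=0.0, b=(0.45, 0.35, 0.20), ineff="normal"):
    """Draw one simulated sample: outputs, actual cost, and true efficiencies.

    Outputs are uniform on [0.1, 1.9].  Inefficiency is Phi = 1 + u, with u drawn
    from the chosen distribution (mean 0.5 in each case, as in Table A2).
    """
    k = len(b)
    x = rng.uniform(0.1, 1.9, size=(n, k))        # outputs
    efficient_cost = a + x @ np.array(b)          # linear cost function (4)

    if ineff == "uniform":
        u = rng.uniform(0.0, 1.0, size=n)
    elif ineff == "exponential":
        u = rng.exponential(0.5, size=n)
    else:                                         # N(0.5, 0.5), negative draws rejected
        u = rng.normal(0.5, 0.5, size=n)
        while np.any(u < 0):
            neg = u < 0
            u[neg] = rng.normal(0.5, 0.5, size=neg.sum())

    phi = 1.0 + u                                 # ratio of actual to efficient cost
    actual_cost = efficient_cost * phi
    return x, actual_cost, 1.0 / phi              # true efficiencies theta = 1/phi
```

Each replication pairs the simulated costs and outputs with a DEA routine such as the one sketched in Section 2; the gap between 1/Φ and the estimated θ is then the error whose distribution is summarised below.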

6. Statistical measures of performance

There are several ways of estimating the quality of the DEA measure. For our first investigation we report the following:

Bias: For each observation the estimated efficiency will be not less than the true value as long as all relevant factors are included. Our measure of bias is the average estimated value less the average true value.

False efficiencies: The proportion of observations whose true efficiency is less than 95% but which are reported as more than 99.9% efficient.

Pearson correlation coefficient: This is simply the Pearson correlation coefficient between the true and estimated efficiency. In many applications one is willing to settle for a measure of efficiency relative to the observed frontier rather than the (possibly unobservable) true frontier.

Root mean square error (RMSE): An estimate of the typical distance between the true and estimated efficiency.

Confidence intervals: The widths of the 99, 95, and 90% confidence intervals are reported. These are calculated as follows: the largest and smallest deviations were stored in order of size; for every 10,000 replications the upper and lower half-percentile, 2.5% and 5% points were collected. To obtain 250,000 replications requires 25 of these cycles of 10,000. The reported confidence intervals are the averages of the 25 confidence intervals from these cycles.

Since the efficiency scores are calculated on a scale of 0-1.00, the measures of bias, RMSE and confidence intervals are all measured on the same scale. We occasionally refer to them also as percentages, found simply by multiplying the raw numbers by 100. Reporting all these measures is potentially wasteful of space and confusing so, having compared the alternative measures for Investigation 1, we focus on reporting the width of the 95% confidence interval for the remainder of the paper. Other statistics are reported when noteworthy results are found.
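As a rough illustration of how these measures could be assembled from the simulated true efficiencies and their DEA estimates (the per-10,000-replication bookkeeping is collapsed into a single percentile call here, and the function name is illustrative):

```python
import numpy as np

def performance_stats(theta_true, theta_hat):
    """Summary measures for one batch of replications.

    theta_true, theta_hat : 1-D arrays of true and DEA-estimated efficiencies.
    """
    err = theta_hat - theta_true
    bias = err.mean()                                    # average over-statement
    false_eff = np.mean((theta_true < 0.95) & (theta_hat > 0.999))
    corr = np.corrcoef(theta_true, theta_hat)[0, 1]      # Pearson correlation
    rmse = np.sqrt(np.mean(err ** 2))
    # Width of the empirical 95% interval for the error (2.5% to 97.5% points)
    lo, hi = np.percentile(err, [2.5, 97.5])
    ci95_width = hi - lo
    return {"bias": bias, "false_efficiency": false_eff,
            "correlation": corr, "rmse": rmse, "ci95_width": ci95_width}
```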

7. The investigations

7.1 Investigation 1: Effect of numbers and inefficiency distribution

For this analysis we assumed initially a uniform distribution of outputs. (This might be considered the most favourable distribution for DEA, since it would tend to populate the frontier more evenly than the others. As we see below this is not necessarily the case.) Bias is the most obvious measure of performance. In Table 7.1 below we consider also the proportion of observations which are falsely classed as efficient.

Table 7.1: Bias and false efficiency, by number of observations and inefficiency distribution

                              Bias                          False efficiency (%)
No of observations   Normal   Uniform   Exponential   Normal   Uniform   Exponential
8                    0.167    0.211     0.157         43.51    43.43     39.83
16                   0.121    0.156     0.111         26.34    26.17     24.26
32                   0.084    0.109     0.075         14.54    14.80     13.35
64                   0.055    0.072     0.048          7.30     7.78      6.78
125                  0.036    0.047     0.030          3.52     3.88      3.21
250                  0.022    0.030     0.019          1.56     1.79      1.41
500                  0.013    0.018     0.011          0.65     0.77      0.58
1000                 0.008    0.011     0.007          0.26     0.32      0.22

With only 32 observations the average bias is of the order of 10%, with one in six observations being mis-classified as efficient. The type of error distribution seems to make little difference. By contrast the effect of the number of observations is dramatic. Table 7.2 reports two other measures of performance: root mean square error and the Pearson correlation coefficient. A correlation coefficient of 0.73 for eight observations looks reasonable until one realises that the RMSE is of the order of 20%.


Table 7.2: Numbers, inefficiency distributions, correlation coefficients, and RMSE

                         Correlation coefficient              Root mean square error
No of observations   Normal   Uniform   Exponential   Normal   Uniform   Exponential
8                    0.731    0.631     0.820         0.195    0.240     0.189
16                   0.833    0.764     0.895         0.147    0.184     0.139
32                   0.896    0.852     0.938         0.108    0.135     0.099
64                   0.937    0.909     0.964         0.076    0.096     0.070
125                  0.964    0.946     0.980         0.054    0.068     0.048
250                  0.980    0.969     0.990         0.037    0.047     0.032
500                  0.990    0.984     0.995         0.025    0.032     0.022
1000                 0.995    0.992     0.998         0.017    0.021     0.014

Since we could not guarantee that the distribution of errors was well behaved (e.g. a normal distribution), we calculated the confidence intervals directly from the calculated errors. They tell the same general story as the RMSE, although the average ratio (confidence interval/RMSE) was 2.55.

Table 7.3: Effect of numbers and inefficiency distribution on confidence intervals

                         95% confidence interval
Number of observations   Normal   Uniform   Exponential
8                        0.38     0.43      0.40
16                       0.31     0.37      0.31
32                       0.25     0.30      0.24
64                       0.20     0.24      0.18
125                      0.15     0.18      0.13
250                      0.11     0.13      0.09
500                      0.07     0.09      0.07
1000                     0.05     0.06      0.04

The above tables imply that the number of observations matters more than the type of distribution for the inefficiency term. The relationship between the number of observations and the width of the confidence interval is shown in Figure 7.1.


Figure 7.1: 95% confidence intervals for different inefficiency distributions

[Plot of the width of the 95% confidence interval against the number of observations (0 to 1200) for the normal, uniform, and exponential inefficiency distributions.]

As expected the exponential distribution performed best overall. The normal distribution came a fairly close second, despite not having a concentration of observations near the frontier. The uniform distribution led to the worst performance statistics. This result is consistent with Kittelsen's (1999) findings, although he employed the half-normal, gamma, and exponential distributions².

7.2 Investigation 2: The number of dimensions and the distribution of the outputs

Given that performance of the base case seems to be most sensitive to the number of observations, we shall limit the analysis to two cases: 40 and 100 observations. We also examine the possible significance of the type of distribution for the outputs. As Table 7.4 and Figure 7.2 show, the choice between normal and uniform distribution of outputs made little difference, although outputs distributed according to the exponential distribution added a few percentage points to the confidence interval when there were 6, 7 and 8 outputs.

2 See Kittelsen (1999) Table 5.


Table 7.4: Effect of output distribution on 95% confidence intervals

Number of outputs   n=40 (Norm)   n=100 (Norm)   n=40 (Unif)   n=100 (Unif)   n=40 (Expo)
1                   0.041         0.017          0.041         0.018          0.041
2                   0.165         0.101          0.166         0.101          0.156
3                   0.231         0.162          0.232         0.162          0.226
4                   0.269         0.205          0.271         0.210          0.273
5                   0.297         0.235          0.295         0.238          0.310
6                   0.316         0.257          0.312         0.256          0.338
7                   0.335         0.274          0.327         0.272          0.360
8                   0.346         0.288          0.337         0.283          0.378

Key: n is the number of observations. Norm, Unif and Expo refer to normal, uniform, and exponential distributions respectively for the outputs.

Figure 7.2: Effect of dimensions and output distribution on confidence intervals

[Plot of the width of the 95% confidence interval against the number of outputs (1 to 8) for the five cases in Table 7.4: n=40 and n=100 with normal, uniform, and exponential output distributions.]

Table 7.4 and Figure 7.2 demonstrate a well-known phenomenon of DEA – that additional variables can bring significant extra requirements in terms of the number of observations required to produce a given degree of precision. Because of this there is a temptation to drop variables where their relevance has not been conclusively established. The next section examines some possible consequences.

7.3 Investigation 3: Specification errors

So far we have assumed that the researcher knows what the true data generation process is, at least as far as being able to identify correctly the relevant inputs and outputs. In practice this may not be so easy. Others have investigated the possibility of hypothesis testing both within the DEA model (Banker, 1993; Kittelsen 1993) and in a two-stage bootstrapping process (Simar and Wilson 2000). We can identify the following kinds of specification error:


− An irrelevant variable is wrongly included;
− A relevant variable is wrongly omitted; and
− The wrong choice is made between the constant and varying returns to scale formulations.

The inclusion of an irrelevant variable has the tendency to increase the upward bias in DEA scores, as does the inappropriate use of the varying returns to scale model. Omitting a relevant variable or unjustifiably using the constant returns model will create a bias in the opposite direction.

Adding in an irrelevant variable

Table 7.5: Experiment 3.1 – Adding in an irrelevant variable (95% CI)

Outputs   Correct   1 Irrelevant
1         0.04      0.27
2         0.17      0.34
3         0.23      0.37
4         0.27      0.38
5         0.30      0.40
6         0.32      0.41
7         0.34      0.42

Note: based on 40 observations; reported for normally distributed outputs and inefficiencies. The irrelevant variable in question had the same distribution as the other outputs.

Figure 7.3: Specification error – one irrelevant variable

[Plot of the width of the 95% confidence interval against the number of true outputs (1 to 7), comparing the correct specification with the specification including one irrelevant variable.]

The effect is greatest where there are few true outputs: with up to five true outputs the effect is to increase the confidence interval by at least ten percentage points.


Exclusion of a relevant variable

The danger of including an irrelevant variable is balanced by the possibility of excluding a relevant variable. The effect of this will of course depend on the impact the variable has on costs. Table 7.6 and Figure 7.4 show the effect on the 95% confidence interval for our model with 40 observations.

Table 7.6: Effect of omitted variable on bias with different numbers of outputs

                     95% confidence intervals
No of true outputs   1 output omitted   Correct specification   Bias
2                    0.78               0.17                    −0.22
3                    0.49               0.23                    −0.03
4                    0.53               0.27                     0.01
5                    0.46               0.30                     0.07
6                    0.40               0.32                     0.11
7                    0.37               0.34                     0.14
8                    0.35               0.35                     0.16

Note: based on 40 observations

Figure 7.4: Effect of omitted variable

[Plot of the 95% confidence interval and bias against the number of true outputs (2 to 8), showing the specification with one omitted variable, its bias, and the correct specification.]

The incremental effect of the omitted variable on the confidence interval diminishes with the number of true variables. Indeed, with eight true variables, the addition of the last one increases the bias but does not appear to improve the confidence intervals³. As may be expected, increasing the number of observations does little to neutralise the effects of a missing variable. Table 7.7 shows that even with 1000 observations the width of the 95% confidence interval remains at around 40% when a variable with a 20% weight in costs is omitted, and the bias becomes more negative, approaching −12% at 1000 observations.

3 Note the non-monotonicity of the upper graph in Figure 7.4 is due to the fact that the weight of the output omitted at 3 true outputs is only 20%. It is 25% at 4, 20% at 5, etc.

Table 7.7: Omitted variable bias and numbers of observations

Number of observations   95% CI   Bias
8                        0.58      0.08414
16                       0.54      0.02814
32                       0.51     −0.01539
64                       0.47     −0.04873
125                      0.45     −0.07371
250                      0.42     −0.09323
500                      0.41     −0.10782
1000                     0.40     −0.11867

Figure 7.5: Increasing sample size in the presence of specification error

[Plot of the 95% confidence interval and bias against the number of observations (0 to 1200), as in Table 7.7.]

Variable returns to scale

The DEA modeller may choose between a constant and a variable returns to scale formulation. The results of choosing variable returns to scale when constant returns are appropriate are analysed in Table 7.8 and Figure 7.6.


Table 7.8: Effect of incorrectly using the variable returns to scale model

                         95% confidence interval
No of observations   Variable returns to scale   Constant returns
8                    0.48                        0.38
16                   0.46                        0.31
32                   0.43                        0.25
64                   0.40                        0.20
125                  0.36                        0.15
250                  0.30                        0.11
500                  0.24                        0.07
1000                 0.18                        0.05

Note: The true data generation process was based on a linear constant returns model.

Figure 7.6: VRS specification error – effect of using the VRS model

[Plot of the width of the 95% confidence interval against the number of observations (0 to 1200) for the VRS and CRS models, as in Table 7.8.]

These indicate that the use of the variable returns to scale model when the data generation process involves a constant returns model may substantially reduce the precision of the estimates. What happens if the data generating process involves varying returns to scale? To test this we created a quadratic variable returns to scale model:

C = (0.1 + 0.45x1² + 0.35x2² + 0.2x3²)^0.75 (7)

For the radial expansion path x1 = x2 = x3 this is plotted in Figure 7.7.


Figure 7.7: Quadratic VRS function

[Plot of cost against scale (0 to 1.4) for equation (7).]

Note: Scale is defined as x = x1 = x2 = x3 in equation (7).
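The curve in Figure 7.7 can be reproduced directly from equation (7); the few lines below are an illustrative sketch (the function name is not from the original):

```python
import numpy as np

def qvrs_cost(scale):
    """Quadratic VRS cost function (7) along the radial path x1 = x2 = x3 = scale."""
    return (0.1 + 0.45 * scale**2 + 0.35 * scale**2 + 0.2 * scale**2) ** 0.75

scale = np.linspace(0.0, 1.4, 15)
print(np.column_stack([scale, qvrs_cost(scale)]))   # cost rises from about 0.18 at scale 0
```

Because the three weights sum to one, the expression reduces to (0.1 + scale²)^0.75 along this path, so the fixed term produces increasing returns at small scale while cost grows roughly with scale^1.5, and hence exhibits decreasing returns, at larger scale.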

Table 7.9 and Figure 7.8 show, for our base case, how the confidence interval varies with the number of observations when the DGP involves variable returns to scale. The effect is similar to the constant returns DGP in that the VRS model performs worse than the CRS model.


Table 7.9: VRS as the correct specification – width of 95% confidence interval

Number of observations   Constant returns to scale   Variable returns to scale
8                        0.37                        0.48
16                       0.30                        0.46
32                       0.24                        0.43
64                       0.20                        0.39
125                      0.17                        0.34
250                      0.14                        0.28
500                      0.13                        0.21
1000                     0.12                        0.15

Figure 7.8: VRS as the correct specification (base case)

[Plot of the width of the 95% confidence interval against the number of observations (0 to 1200) for the CRS and VRS models, as in Table 7.9.]

In this case the incorrect CRS specification always performs better than the correct VRS specification! This is possibly because the true cost function is too close to a CRS function. To see how this result holds up with more pronounced returns to scale, we varied the constant term for 40 observations; the results are reported in Table 7.10.

Table 7.10: Raising the constant term – 95% CI

Constant   CRS    VRS
0.15       0.22   0.41
0.30       0.29   0.41
0.45       0.37   0.41
0.60       0.42   0.41

Note: 40 observations

The higher the constant term, the less appropriate is the CRS model and it performs correspondingly worse. The performance of the VRS model does not get significantly worse. However, at no stage can it be said to provide an accurate estimate of the efficiency of the units concerned.

8. Conclusions, limitations and possible extensions

It is well known that the accuracy of DEA depends on many factors. In this paper we have looked at several aspects. For our base model and small variations around it we came to the following conclusions about the significance of various factors. In descending order of significance these appear to be:

1. Large dimensions with limited observations;
2. Specification error (omission or inclusion, wrong type of scale assumption);
3. Underlying probability distributions for the variables and inefficiency scores; and
4. The form of the cost function.

Within our model the form of the underlying cost function does not seem to matter much in comparison to the other factors, as long as it is convex or quasi-convex and shows constant returns to scale. As expected, the exponential distribution of outputs, with its greater production of "outliers," generates a DEA performance slightly worse than the normal or uniform. Conversely, having an exponential inefficiency distribution generates a better performance for the DEA measure. These effects are small, however, compared with the potential effects of specification error, whether errors of omission or commission.

One of the most problematic areas was the performance of the variable returns to scale model. The existence of economies and diseconomies of scale reduced the performance of the CRS model as expected. Less expected, however, was the finding that the VRS model did not improve the precision of our estimates within the region of our base model. This creates a dilemma where the use of DEA is contemplated for a combination of a limited number of observations and varying returns to scale.

We have omitted mention of other potentially significant factors: for example, correlation between the outputs and errors in variables. In DEA, the presence of multicollinearity may prove to be an advantage since there is no need to estimate accurately the separate impact of the different variables and, as others⁴ have reported, it increases relative precision. Errors in variables will work the other way. There is likely to be a trade-off between including a relevant variable which is not an accurate proxy and the reduced precision arising from including a variable which is tainted with irrelevant noise.

John Cubbin
City University
April 2003
Revised July 2003

4 Notably Pedraja-Chaparro et al. (1999).


Appendix: specifications

Table A1: Parameters used for cost functions – base cases

Model                 a     b1    b2    b3    bii   bij   c
Linear                0     .45   .35   .20   –     –     –
Linear, >3 outputs    0     1/n   1/n   1/n   –     –     –
Quadratic             0     .45   .35   .20   –     –     –
QVRS                  0.1   .45   .35   .20   –     –     1.5

Note: n = number of outputs in the case where n > 3.

Table A2: Default values of parameters

Description                                          Value
Distribution of outputs:
  Lower bound of uniform distribution                0.1
  Lower bound of truncated normal distribution       0.1
  Ratio of mean to minimum                           10.0
Distribution of inefficiencies = Φ – 1:
  Uniform distribution range                         0 to 1.0
  Exponential distribution mean inefficiency         0.5
  Normal distribution                                Based on N(0.5, 0.5) (negative values rejected)

Table A3: Base case and alternatives

Aspect                                   Base case            Alternatives
Cost function                            Linear               Quadratic, QVRS
Output distribution                      Uniform              Normal, exponential
Inefficiency distribution                Normal               Uniform, exponential
Returns to scale in true cost function   Constant             Fixed term (=> increasing returns) + increasing cost in variable terms
Returns to scale in DEA                  Constant             Variable
Number of observations                   40                   100, 8, 32, 64, 125, 250, 500, 1000
Number of inputs                         1                    –
Number of outputs                        3                    1-8
DEA orientation                          Input minimisation   n.a.


References

Banker RD (1993). Maximum Likelihood, Consistency, and Data Envelopment Analysis: A Statistical Foundation. Management Science 39: 1265-1273.

Banker RD et al (1989). An introduction to data envelopment analysis and some of its models and its uses. Research in Government and Nonprofit Accounting 5: 125-163.

Charnes A, Cooper WW, and Rhodes E (1979). Measuring the efficiency of decision making units. European Journal of Operational Research 2: 429-44.

Fehti M, Jackson PM, and Weyman-Jones TG (2001). European airlines: A stochastic study of efficiency with market liberalisation. Paper presented at the Seventh European Workshop on Efficiency and Productivity Analysis, University of Oviedo, Oviedo, Spain, September.

Ganley JA and Cubbin JS (1992). Public Sector Efficiency Measurement: Applications of Data Envelopment Analysis. Elsevier.

Gijbels I, Mammen E, Park BU and Simar L (1999). On estimation of monotone and concave production functions. Journal of the American Statistical Association 94: 220-228.

Grosskopf S (1996). Statistical Inference and Nonparametric Efficiency: A Selective Survey. Journal of Productivity Analysis 7: 161-76.

Kittelsen SAC (1993). Stepwise DEA: Choosing Variables for Measuring Technical Efficiency in Norwegian Electricity Distribution. University of Oslo, Department of Economics, Memorandum No 6, April.

Kittelsen SAC (1995). Monte Carlo simulations of DEA Efficiency Measures and Hypothesis Tests. University of Oslo doctoral thesis, presented at the Georgia productivity workshop, 1994.

Kittelsen SAC (1999). Monte Carlo Simulations of DEA Efficiency Measures and Hypothesis Tests. Frisch Centre Memorandum 9/1999.

Korostelev A, Simar L, and Tsybakov A (1995). On estimation of monotone and convex boundaries. The Annals of Statistics 23: 476-489.

Pedraja-Chaparro F, Salinas-Jiménez J, and Smith P (1997) ref page 3

Pedraja-Chaparro F, Salinas-Jiménez J, and Smith P (1999). On the quality of the data envelopment analysis model. Journal of the Operational Research Society 50: 636-644.

Press WH, Flannery BP, Teukolsky SA, and Vetterling WT (1986). Numerical Recipes: The Art of Scientific Computing. Cambridge University Press.


Simar L and Wilson PW (1998). Sensitivity Analysis of DEA scores: How to bootstrap in nonparametric frontier models. Management Science 44: 49-61.

Simar L and Wilson PW (1999). Of course we can bootstrap DEA scores! But does it mean anything? Logic trumps wishful thinking. Journal of Productivity Analysis 11: 93-97.

Simar L and Wilson PW (2000). A general methodology for bootstrapping in frontier models. Journal of Applied Statistics 27: 779-802.

Simar L and Wilson PW (2002). Estimation and inference in Two-stage, Semi-Parametric Models of production processes. Mimeo: Institut de Statistique, Université Catholique de Louvain, and Department of Economics, University of Texas.