Key Issues in Benchmark Baselines for the CDM: Aggregation, Stringency, Cohorts, and Updating

Prepared by: Michael Lazarus, Sivan Kartha, Steve Bernow

Tellus Institute, Boston
Stockholm Environment Institute – Boston

June 2000

Prepared for: U.S. EPA, Contract No. 68-W6-0055
Analysis of Issues Associated with Joint Implementation

U.S. EPA Work Assignment Manager: Shari Friedman

Project Manager: Heidi Nelson-Ries, Stratus Consulting, Inc.

Table of Contents

EXECUTIVE SUMMARY

SECTION 1. BENCHMARK AGGREGATION ISSUES AND OPTIONS
1.1 DISAGGREGATION BY FUEL TYPE
1.2 LOAD PROFILE/DUTY CYCLE
1.3 RETROFIT PROJECTS
1.4 OFF-GRID PROJECTS
1.5 SPATIAL DISAGGREGATION
1.6 AN INTEGRATED APPROACH TO MAXIMIZE ADDITIONALITY AND REDUCE TRANSACTION COSTS
1.7 SUMMARY AND FINDINGS

SECTION 2. MINIMIZING THE RISK OF NON-ADDITIONAL CREDITS
2.1 STRINGENCY
2.2 PROJECT SCREENS AND ADDITIONALITY TESTS
2.3 CREDIT DISCOUNTING/STANDARDIZED CRITERIA
2.4 CONCLUSIONS

SECTION 3. COLLECTING DATA AND DEFINING COHORTS
3.1 COLLECTING DATA
3.2 PROJECTIONS AS A SOURCE OF DATA FOR BENCHMARKS
3.3 DEFINING THE COHORT
3.4 CONCLUSIONS

SECTION 4. UPDATING ISSUES AND OPTIONS
4.1 BASELINES UPDATING AND INVESTOR RISK
4.2 KEY QUESTIONS IN BASELINE UPDATING
4.3 RENEWING BENCHMARKS FOR EXISTING PROJECTS
4.4 OPTIONS FOR STANDARDIZED UPDATING METHODOLOGIES
4.5 WHEN SHOULD THE INITIAL BASELINE METHODOLOGY BE RECONSIDERED?
4.6 REVISING BENCHMARKS FOR NEW PROJECTS
4.7 BENCHMARK UPDATING: EXAMPLES
4.8 CONCLUSIONS

REFERENCES:

APPENDIX A: COMPARING CARBON INTENSITY AND CAPACITY FACTOR

APPENDIX B. FUEL SHARES OF ELECTRICITY PRODUCTION, 1995

Acknowledgments

The authors express their appreciation to Shari Friedman and Maurice LeFranc of the United States Environmental Protection Agency for their ideas, inspiration, and support. We also thank Jane Ellis, Steve Meyers, Joel Swisher, Eric Smith, Axel Michelowa, and Ingo Puhl, who provided insightful review comments and helped refine some of the baseline approaches presented here. Finally, we thank Heidi Nelson-Ries and Joel Smith of Stratus Consulting for managing this effort.

The views expressed in this report are those of the authors. This report does not represent the viewpoint of the United States Government or of the United States Environmental Protection Agency.

KEY ISSUES IN BENCHMARK BASELINES TELLUS INSTITUTE

Executive Summary

The Clean Development Mechanism (CDM) would create a new, global marketplace for greenhouse gas (GHG) emission reduction projects. A central feature of proposed changes to the UN Framework Convention on Climate Change, the CDM would enable countries with formal GHG emission reduction obligations – the industrialized “Annex B” countries – to gain credit for projects that reduce emissions in countries that thus far have no formal obligations. In principle, the CDM could unleash significant new flows of investment in developing countries to further sustainable development while reducing global GHG emissions. These GHG emission reductions, if verified and shown to be additional – that they would not have occurred but for the CDM – could then be translated into credits that Annex B countries could use in reaching their emission reduction targets.

Efficient and successful implementation of the CDM – assuming a protocol eventually enters into force – will require the establishment of practical, transparent, and credible methods for estimating project baselines. Baselines are the assumed counterfactual situation; in other words, the best guess as to what would have happened in the absence of the CDM. Since baselines will be used to determine whether a proposed CDM project is considered additional (i.e., that the emission reductions would not have occurred anyway), and how many emission credits the project sponsors will accrue, much is at stake. Systematic error in baseline estimation could result in increased global emissions (if baselines are too high and CDM credits enable Annex B countries to increase emissions without a compensating offset from the CDM project) or in lost opportunities for GHG reducing investments in developing countries (if baselines are too low, reducing credits and the added economic incentive for GHG mitigation projects they provide).

Two principal baseline methods are under active consideration: multi-project (or “benchmark”) and project-specific (Ellis and Bosi, 1999). These approaches are distinguished mainly by their degree of standardization. There has been little or no standardization thus far with project-specific baselines, at least in the CDM’s pilot phase, activities implemented jointly (AIJ). This lack of standardization has been a source of concern and criticism. While a more standardized methodology for project-specific baseline determination is conceivable, it would entail a degree of specificity and detail that could be costly, lack transparency, give a false sense of precision, invite gaming, and require a difficult and subjective review process. The multi-project approach uses standardization to address these concerns, by creating consistent benchmarks that could be applied to broad categories of projects through an open and efficient process.1 Ultimately, all baseline approaches must address the tension between two fundamental objectives: maximizing environmental integrity (minimizing unwarranted credits) and maximizing investment in good CDM projects (minimizing transaction costs and crediting all additional projects).

In a previous report, we reviewed key issues and options involved in establishing benchmark baselines (Lazarus, Kartha, Ruth, Bernow, and Dunmire, 1999). We identified several dimensions that are critical to their effectiveness: aggregation, stringency, data and sample populations, and dynamics (updating). In the present report, we examine these four dimensions, looking only at the power sector – likely to be a major source of CDM credits – using detailed (plant level performance) data collected from five case study countries.2

1 Project-specific baselines could also be standardized to some extent, by specifying required procedures (e.g. application of system planning models) or assumptions (e.g. for oil prices).

2 Previous report was based on a generic international database with no performance data included (Utility Data Institute UDI). Plant level data provides insights into the distributions of carbon intensities and the effect of different methodologies on inclusion and crediting.

Aggregation

The grouping of various types of potential projects into a single category with a corresponding single baseline is the defining aspect of benchmarks. Not surprisingly, then, the definition of these categories is one of the principal challenges. To make benchmarking worthwhile, the grouping or aggregation should be broad enough to encompass many CDM projects and reduce transaction costs, but not so broad that baseline accuracy is compromised, excessive credits are awarded, or significant investment opportunities are lost.

A single benchmark for an entire sector such as power supply would provide an incentive to invest in the least carbon-intensive options available throughout the entire sector. However, a single sector-wide benchmark may provide no incentive for CDM projects that improve the efficiency of the relatively carbon-intensive options (e.g. upgrading the efficiency of a coal plant). Such projects are likely to have carbon intensities above a sector-wide benchmark and would thus be uncreditable under this approach. Conversely, a sector-wide benchmark might overestimate credits if a project is merely improving upon an already low carbon activity, e.g. improving the efficiency of a natural gas plant, where the sector-wide benchmark reflects coal or oil-based generation.
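
The two effects just described can be made concrete with a small sketch. It assumes that credits equal generation multiplied by the amount by which a project's carbon intensity falls below the benchmark, with no credits for projects above it. The function name `credits` and all carbon intensities (in tCO2/MWh) are hypothetical values chosen for illustration; they are not data from the case study countries.

```python
def credits(benchmark, project_intensity, generation_mwh):
    """Credits (tCO2) = generation x (benchmark - project intensity), floored at zero."""
    return max(0.0, benchmark - project_intensity) * generation_mwh

# Hypothetical carbon intensities in tCO2/MWh (illustrative assumptions only).
SECTOR_WIDE = 0.70      # assumed sector-wide benchmark reflecting coal/oil generation
COAL_BENCHMARK = 1.00   # assumed fuel-specific benchmark for coal
GAS_BENCHMARK = 0.45    # assumed fuel-specific benchmark for natural gas

upgraded_coal = 0.90    # coal plant after an efficiency upgrade
improved_gas = 0.40     # gas plant after an efficiency improvement
generation = 1_000_000  # MWh per year

# Coal upgrade: uncreditable under the sector-wide benchmark,
# creditable under a coal-specific benchmark.
coal_sector = credits(SECTOR_WIDE, upgraded_coal, generation)
coal_fuel = credits(COAL_BENCHMARK, upgraded_coal, generation)

# Gas improvement: credited far more under the sector-wide benchmark
# than under a gas-specific benchmark.
gas_sector = credits(SECTOR_WIDE, improved_gas, generation)
gas_fuel = credits(GAS_BENCHMARK, improved_gas, generation)

print(coal_sector, coal_fuel, gas_sector, gas_fuel)
```

With these assumed numbers the coal upgrade earns nothing against the sector-wide benchmark, while the gas improvement earns several times more credits against the sector-wide benchmark than against a gas-specific one.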

This report examines five possible bases for disaggregating power sector benchmarks:

• Duty cycle/load profile: An aggregate, sector-wide benchmark would treat all CDM projects largely as if they are displacing a mix of new baseload and intermediate power, since these sources dominate electricity production. This should be a reasonable approximation for most large power supply projects, but less so for projects that produce electricity predominantly during peak periods or those small enough to have little effect on what is otherwise constructed (i.e. not displacing new capacity). However, the creation of separate baseload and peakload benchmarks to account for the former situation (high peak coincidence) is problematic, as illustrated in Section 1.2, and may not be warranted, given that projects which seek CDM credits based on high peakload generation or savings are likely to be few. Since the number of hours of peakload operation is rather limited, the carbon savings and resulting credits for peakload operation are unlikely to be large enough to be significant in terms of economic incentive or total credit volume. Nonetheless, a separate approach may be warranted for demand-side and small renewable projects, such as the load curve approach suggested by Meyers et al (2000), if the principal effect of these projects is to reduce the operation of existing (or new) facilities rather than alter what would otherwise be built.

• Retrofit vs. new facilities: Retrofits of existing facilities are a potentially large source of CDM activity, and an important category to consider separately. It is useful to think of a retrofit as two separate projects: one that “replaces” the original plant, and another that is responsible for increased generation. A power plant retrofit could increase generation by increasing power plant capacity (e.g. repowering or generator replacement), by lowering operating costs (e.g. improved management or more efficient boiler), or by extending the operating lifetime. The carbon intensity of the pre-retrofit plant can provide an appropriate benchmark to the extent that generation does not increase. (To account for the possibility that the retrofit activity would have occurred anyway in the near future, the penetration rate among similar facilities could be tracked.) To the extent that electricity production does increase, this “added generation” should have the same benchmark (fuel-specific or sector-wide) as new capacity. The two-part methodology is straightforward and offers consistency with the methodology used for other CDM projects. A parallel approach could be applied to project-specific baselines as well.

• Off-grid: Off-grid CDM projects could be large in number, but due to the small amounts of electricity involved, they are unlikely to generate a significant fraction of total credits generated through the CDM. For this reason, it is important that the costs of baseline determination for this project category be kept low. A benchmark based on diesel generators, the world’s most common source of off-grid electricity, whether used directly or for battery charging, might offer a reasonable baseline that could be applied throughout non-Annex 1 countries.

• Spatial disaggregation: The appropriate level of spatial aggregation for benchmarks is difficult to determine without reference to local grid conditions. Where there are few technological (transmission capacity) or political/regulatory constraints on electricity flowing freely among several countries, a multi-country benchmark would make the most sense. A CDM project in one country would then be avoiding new electricity production throughout the region. Conversely, if transmission or other constraints limit interchange among parts of a country, then sub-national benchmarks would be technically preferable. However, sub-national benchmarks may be administratively burdensome in all but the largest non-Annex 1 countries (e.g. India, China, Indonesia). National level benchmarks are a reasonable starting point. Case-by-case decisions could then determine whether to opt for regional benchmarks (where regional interconnections and planning are strong) or sub-national benchmarks (for countries with distinct grids or power pools that possess different resource profiles).

• Fuel/technology type: Of the five bases, the question of fuel/technology disaggregation is the most challenging. There are two principal options: single sector-wide benchmarks (all fuels or all fossil) and/or individual benchmarks for each fuel/technology. The single benchmark is simple and transparent, but only rewards low-carbon projects. Opportunities to improve the performance of high-carbon fuels and technologies might not be rewarded. These opportunities can be captured instead by the use of fuel-specific benchmarks. But use of fuel-specific benchmarks alone will not encourage or reward projects that involve switching to lower-carbon fuels. Therefore combining the two approaches may be desirable if benchmark baselines are to be capable of encouraging and rewarding both types of projects: those that improve (supply-side) efficiency and those that switch from a higher to a lower carbon fuel (or to demand-side efficiency).
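
The two-part retrofit methodology described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed procedure: the function name `retrofit_credits`, the flooring of credits at zero, and every number are hypothetical. Output up to the pre-retrofit generation level is benchmarked against the plant's own pre-retrofit carbon intensity; any added generation is benchmarked as new capacity.

```python
def retrofit_credits(pre_intensity, post_intensity,
                     pre_generation_mwh, post_generation_mwh,
                     new_capacity_benchmark):
    """Return (replacement credits, added-generation credits) in tCO2."""
    # Generation that "replaces" the original plant's output.
    replaced = min(pre_generation_mwh, post_generation_mwh)
    # Generation beyond what the pre-retrofit plant produced.
    added = max(0.0, post_generation_mwh - pre_generation_mwh)

    # Part 1: benchmark is the carbon intensity of the pre-retrofit plant.
    replacement_credits = max(0.0, pre_intensity - post_intensity) * replaced
    # Part 2: added generation gets the same benchmark as new capacity.
    added_credits = max(0.0, new_capacity_benchmark - post_intensity) * added
    return replacement_credits, added_credits

# Hypothetical coal retrofit: intensity falls from 1.10 to 0.95 tCO2/MWh and
# annual output rises from 800,000 to 1,000,000 MWh. The new-capacity
# benchmark (0.70 tCO2/MWh) is an assumed sector-wide value.
replacement, added_generation = retrofit_credits(1.10, 0.95, 800_000, 1_000_000, 0.70)
print(replacement, added_generation)
```

In this example the efficiency gain on the original output earns credits, while the added generation earns none because the retrofitted plant remains more carbon-intensive than the assumed new-capacity benchmark.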

For most on-grid new capacity, an integrated approach might prove feasible, which combines the key advantages of benchmark baselines (transparency, consistency, reduced gaming) with those of project-specific baselines (additionality testing). One option is to assign a default benchmark to CDM projects, provided that they satisfy a specified additionality test that requires some project-specific assessment. Table ES-1 illustrates some of the key features of such an integrated approach, and differences with the straight benchmark approaches (both sector-wide and fuel-specific benchmarks) and project-specific approach. (Additionality testing is addressed in more detail in the following section.)

Table ES-1. How the integrated approach differs from the straight benchmark or project-specific approaches (All metrics relative to other options in a given row)

| | 1. Sector-wide benchmark, no additionality tests | 2. Fuel-specific benchmark, no additionality tests | 3. Integrated benchmark approach (above) | Pure single-project |
|---|---|---|---|---|
| Value of the baseline | Sector-wide benchmark | Benchmark for each fuel | 1. or 2. depending on option used | Project-specific, established through review process |
| Additionality testing | None (aside from benchmark) | None (aside from benchmark) | Mostly for large projects | All projects |
| Cost of baseline development and approval | Low-Medium (depending on scale economies from project flow) | Low-Medium (depending on scale economies from project flow) | Low-High (depends on how additionality test is done and how frequently required) | Medium-High |
| Non-additional projects: likelihood of significant questionable crediting | Medium-High (depends partly on stringency) | Low-Medium (depends partly on stringency) | Low | Low-High (depending on developer behavior; rigor of the approach and review process) |
| Non-additional projects: likely types of non-additional projects | All low-carbon technologies (hydro, nuclear, gas in coal-based countries) | Better-than-benchmark coal, and to less extent NG, plants | Small-scale low-carbon projects; some highly cost-competitive technologies | Highly cost-competitive technologies (NGCC, hydro, wind, etc.) and projects (efficiency retrofits) |
| Potential for overcrediting of projects that are additional | Medium-High (depending on stringency) | Low-Medium (depending on stringency) | Low | Low-High (depending on developer behavior; rigor of the approach and review process) |
| Missed opportunities: potential | Medium-High (esp. if efficient coal projects are allowed) | Medium-High (if low credit margins stifle fuel-switching projects) | Low-Medium | Low-Medium (depending on whether transaction costs are a significant share of expected credit revenue) |
| Missed opportunities: types of projects most likely to be missed | Higher-carbon projects (e.g. efficient coal) | Fuel-switching projects (coal to gas; any fossil to renewables) | Smaller projects, e.g. renewables (due to transaction costs) | |

Minimizing the risk of non-additional credits

The risk of non-additional credits is a central concern of the CDM, whether project-specific or benchmark baselines are used. With benchmarking, this concern would be strongest with the approach that is arguably the simplest and most intuitively attractive – average performance benchmarks with automatic approval of all projects having better-than-average carbon intensities. This report discusses three possible mechanisms to minimize the risk of granting credits to non-additional activity under the CDM: (1) stringency, (2) additionality testing (and project screens), and (3) credit discounting.

If the CDM were an instrument that both credited activity below the benchmark and taxed activity above the benchmark, then the average carbon intensity for a given category of projects would be the theoretical ideal benchmark. In essence, such a benchmark would function like a zero-sum, revenue-neutral GHG tax/rebate system and provide an incentive for every investor in the sector to lower GHG emissions. However, the CDM is not a symmetric instrument. An average benchmark would reward all activities with carbon intensities below the benchmark, with no compensating penalty for activities above the benchmark. The CDM might generate large volumes of credits that had little correspondence to real GHG mitigation activity. Therefore, it is important to consider alternative mechanisms whose purpose would be: a) to avoid unwarranted crediting of free riders and b) to reduce the potential for over-crediting some projects.
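
A short numerical sketch may help show why an average benchmark is zero-sum only under a symmetric instrument. The four plants below are hypothetical, and the benchmark is assumed to be the generation-weighted average carbon intensity of the category; neither assumption comes from the case study data.

```python
# Hypothetical plants: (carbon intensity in tCO2/MWh, annual generation in MWh).
plants = [(1.05, 400_000), (0.95, 300_000), (0.45, 200_000), (0.0, 100_000)]

total_generation = sum(g for _, g in plants)
# Generation-weighted average carbon intensity, used here as the benchmark.
average = sum(ci * g for ci, g in plants) / total_generation

# Symmetric instrument: credits below the benchmark, taxes above it.
symmetric_net = sum((average - ci) * g for ci, g in plants)

# CDM-style asymmetric instrument: credits below the benchmark, no penalty above.
asymmetric_credits = sum(max(0.0, average - ci) * g for ci, g in plants)

print(average, symmetric_net, asymmetric_credits)
```

Under the symmetric rule the credits and taxes cancel (up to rounding), whereas the asymmetric rule issues a positive volume of credits even though none of these plants has changed its behavior.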

Stringent benchmarks: One approach to constructing benchmarks that reduces the risk of granting unwarranted credits to non-additional activity is stringency. A more stringent benchmark implies a benchmark that is set at a lower carbon intensity. A more stringent benchmark allows fewer free rider projects, and awards fewer excess credits to allowed projects, compared with a more lenient benchmark such as average carbon intensity. Minimizing excess or unwarranted credits through stringency, however, simultaneously creates an opposite problem. As benchmarks are set at lower, more stringent, levels, credits for some legitimate (i.e., additional) CDM projects would be eliminated or reduced, thereby diminishing the incentives for undertaking these projects, and making them less competitive on the international carbon credit market.
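
The trade-off can be illustrated with a minimal sketch that compares an average benchmark with a "best X plants" benchmark for one category. The list of carbon intensities, the choice of X = 2, and the function names are assumptions made for illustration only.

```python
from statistics import mean

# Hypothetical carbon intensities (tCO2/MWh) of recent plants in one category.
intensities = [0.80, 0.85, 0.90, 0.95, 1.00, 1.05, 1.10, 1.15]

def average_benchmark(values):
    """Lenient benchmark: average carbon intensity of the category."""
    return mean(values)

def best_n_benchmark(values, n):
    """Stringent benchmark: average of the n lowest-intensity plants."""
    return mean(sorted(values)[:n])

def credit_margin(benchmark, project_intensity):
    """Credits per MWh (tCO2/MWh); zero if the project is above the benchmark."""
    return max(0.0, benchmark - project_intensity)

lenient = average_benchmark(intensities)
stringent = best_n_benchmark(intensities, 2)

# A plant at 0.90 tCO2/MWh that would have been built anyway earns credits
# under the lenient benchmark but none under the stringent one.
free_rider_lenient = credit_margin(lenient, 0.90)
free_rider_stringent = credit_margin(stringent, 0.90)

# A genuinely additional project at 0.80 tCO2/MWh sees its margin shrink too.
additional_lenient = credit_margin(lenient, 0.80)
additional_stringent = credit_margin(stringent, 0.80)
```

The same tightening that removes the free rider's credits also cuts the credit margin of the additional project, which is the opposite problem described above.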

Two main alternatives or complements to stringency that could help to balance the objectives of maximizing environmental integrity and maximizing investment in good CDM projects are further additionality testing (with project screens) and credit discounting. The principal features of these methods are outlined in Table ES-2.

Further additionality tests: Further additionality testing entails some elements of project-specific analysis, using criteria or assessments to judge the likelihood that a given project would have occurred in the absence of the CDM. Testing methods might include expert judgment, multi-objective assessment, or barrier removal criteria. (Financial additionality testing has also been suggested, i.e. showing that a project would be uneconomic but for the CDM credit revenues, but this approach has significant drawbacks: transparent economic analysis is a formidable challenge and legitimate projects may already be economic but not implemented due to market hurdles.) Since such testing could increase the complexity and transaction costs of baseline development, it is desirable to avoid requiring such tests for every proposed CDM project. For this reason, additionality testing might be accompanied by project screens, which could be applied to: a) limit additionality tests to only those project types with the highest risk of questionable credits (large projects or those with already significant market penetration); or b) automatically exclude activities that are considered likely to be non-additional in a given context (e.g. large hydro in countries with low-cost sites). The added transaction costs of additionality tests would be largely borne by sizeable projects, for which they would likely represent a very small expenditure relative to the project investment.

Additionality testing combined with average performance benchmarks could be effective at maximizing credits for additional projects while limiting non-additional ones. The challenge is to develop simple tests that effectively assess additionality while avoiding the potentially costly analysis and cumbersome review process of project-specific baselines.

Credit discounting: A third mechanism, credit discounting, could be used to scale down the number of credits by a factor based upon the likelihood of non-additionality, as illustrated in Table 2-3. Discounting is inherently no more complex than stringency or additionality testing: all involve using some judgment (as does any baseline method) to set thresholds or categories.
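
As a rough sketch of how discounting might work in practice, the snippet below scales a project's nominal credits by a factor tied to its assumed likelihood of being additional. The category names and discount factors are hypothetical placeholders; they are not the values of Table 2-3, which is not reproduced in this summary.

```python
# Hypothetical discount factors: share of nominal credits actually awarded.
# Categories and values are illustrative assumptions, not those of Table 2-3.
DISCOUNT_FACTORS = {
    "high_likelihood_additional": 1.0,
    "medium_likelihood_additional": 0.6,
    "low_likelihood_additional": 0.2,
}

def discounted_credits(nominal_credits, category):
    """Scale nominal credits (tCO2) by the discount factor for the category."""
    return nominal_credits * DISCOUNT_FACTORS[category]

# A project with 100,000 tCO2 of nominal credits in each assumed category.
awarded = {category: discounted_credits(100_000, category)
           for category in DISCOUNT_FACTORS}
print(awarded)
```

The judgment lies in assigning projects to categories and choosing the factors, which parallels the judgment needed to set a stringency level or an additionality threshold.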

While several details would need to be worked out, these methods offer practical options for reducing the excess credits that would be generated by an average performance benchmark. Whichever method is adopted, some means to reduce excess credits will be essential to the environmental integrity of the CDM.

Table ES-2. Contrasting methods for minimizing non-additional credits

| Criteria \ Method | A. Stringency (Section 2.1) | B. Project screens and additionality tests (Section 2.2) | C. Credit Discounting/Standardized Criteria (Section 2.3) |
|---|---|---|---|
| Key parameters to agree on | Stringency criterion (best X plants, Yth percentile, etc.) | De minimis threshold, penetration threshold. Measures of additionality (rate of return, penetration levels, market barriers, etc.) | Criteria for and extent of discounting |
| Effectiveness at minimizing excess credits | Relatively effective if additionality is well correlated with low-carbon intensity within benchmark category (e.g. fuel-specific, sector-wide), i.e. carbon intensities for additional projects would be concentrated at low end of distribution. | Reliable additionality tests would limit non-additional low-carbon projects that might otherwise elude simpler stringency and discounting criteria. | Depends on the extent to which discounting criteria can accurately reflect the probability of additionality of various project types (as illustrated in Table 2.3). |
| Reduced crediting of “additional” projects | Potentially significant. If benchmarks are adequately stringent, CDM projects that displace high carbon intensity activities (e.g., gas instead of coal power plant) might earn far fewer credits under a stringent benchmark than “deserved”. | Less significant. Average performance benchmarks could also reduce some crediting relative to project-specific baseline methods but the magnitude is difficult to estimate. | Same as above. |
| Data adequacy and availability | Currently available in many, but not all countries. An active CDM regime could motivate more countries to compile and release the needed data. | For project screens: Sizing information readily available for a de minimis threshold; data for penetration threshold is more complex. For additionality tests: Financial additionality could require confidential data; penetration rate is simpler for larger projects (wind farms), more complex and data-intensive for smaller, distributed investments (efficiency). | Will depend on the criteria for discounting. |
| Cost of analysis and review | Lower than project-specific due to economies of scale if volume of projects is significant. | Costs of additionality test could be significant if more complex methods used. Costs would be similar to project-specific baselines, but only applied to larger projects. | Could be relatively low cost, if criteria are simple. |
| Administrative feasibility | Simple, once stringency criterion is set. | All projects passing screen will require review similar to project-specific baselines. | Simple, once discounting rules are set. |

Collection and use of data for benchmarking

All baseline methods will require the collection and analysis of reliable data and reasonable assumptions regarding presumed counterfactual activities, or proxies thereof. Benchmarks for power sector projects will require specific types of data to be collected about existing power plants, plants under construction, and possibly those planned or otherwise expected to be built in the near term. A promising way to establish accurate benchmarks is to define and track the performance of a cohort of power plants that reflect the characteristics of the presumed counterfactual to a CDM project.

Defining cohorts as an empirical source of data for baselines: The counterfactual to a CDM project cannot itself be observed and measured, but a good approximation can be made by defining an appropriate cohort. The cohort would be a sample of actual plants that adequately reflects the presumed counterfactual (in terms of vintage, fuel, technical characteristics, etc.), to provide an empirical basis for defining and updating benchmarks. For example, if a CDM project consists of a new efficient power plant, and the presumed counterfactual is a new conventional power plant, the cohort could be defined, say, as the set of baseload power plants constructed within the last 3 years, under construction, or planned. Or, if a CDM project consists of improving the efficiency of an existing coal power plant that came into operation in 1990, the appropriate counterfactual is the power plant itself without a boiler upgrade, and a reasonable cohort would then be a set of 1990-vintage coal-based power plants. Changes in the performance of the cohort (and, by assumption, the counterfactual) can be expected to occur if efficiency autonomously improves, operating behavior changes, or fuel quality or type changes.
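
The two cohort definitions in the examples above can be expressed as simple filters over a plant inventory. The sketch below is only illustrative: the `Plant` record, its field names, and the five sample plants are hypothetical and are not drawn from the case study data.

```python
from dataclasses import dataclass

@dataclass
class Plant:
    name: str
    fuel: str
    duty: str                # "baseload" or "peaking"
    status: str              # "operating", "under_construction", or "planned"
    online_year: int         # actual or expected first year of operation
    carbon_intensity: float  # tCO2/MWh

def new_baseload_cohort(plants, current_year, window_years=3):
    """Baseload plants built within the window, under construction, or planned."""
    return [
        p for p in plants
        if p.duty == "baseload"
        and (p.status in ("under_construction", "planned")
             or (p.status == "operating"
                 and current_year - p.online_year <= window_years))
    ]

def vintage_cohort(plants, fuel, year):
    """Operating plants of a given fuel and vintage, e.g. 1990 coal plants."""
    return [p for p in plants
            if p.fuel == fuel and p.online_year == year
            and p.status == "operating"]

# Hypothetical inventory (illustrative values only).
inventory = [
    Plant("A", "coal", "baseload", "operating", 1990, 1.10),
    Plant("B", "coal", "baseload", "operating", 1999, 0.95),
    Plant("C", "gas", "baseload", "under_construction", 2001, 0.45),
    Plant("D", "gas", "peaking", "planned", 2002, 0.60),
    Plant("E", "coal", "baseload", "operating", 1990, 1.05),
]

new_cohort = new_baseload_cohort(inventory, current_year=2000)
coal_1990 = vintage_cohort(inventory, fuel="coal", year=1990)
```

Tracking the carbon intensities of the plants in `coal_1990` over time would then give an empirical picture of how the counterfactual to a 1990-vintage coal retrofit is assumed to evolve.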

Using projection-based data: Especially under conditions of relatively rapid change, recent history might not be an accurate reflection of near-term power sector behavior. Projections based on electric sector planning assessments or expert judgment can yield data about power plants that are under construction, planned, or expected in the near term (~2 years), supplementing historical data for deriving benchmarks. On the other hand, longer-term forecasts of new facility types or plant efficiency are generally not accurate, and do not provide a credible basis for benchmarks that are fixed for more than a few years.

Data requirements: Although certain data are assembled globally through international bodies or private firms, it is necessary to go to the primary sources of data within the electric sectors of individual countries to acquire the requisite data. Presently, few countries collect the full complement of required data and make them publicly available; it is reasonable to assume that a CDM regime could provide the impetus for more complete and systematic collection of power sector data, filling many or all of the existing gaps.

Updating benchmarks

Updates using new data and predetermined methodology. Compared with ad hoc project-specific baselines, standardized methodologies like benchmarks offer important advantages for keeping benchmarks accurate over time. Revision of benchmarks for new projects and renewal of benchmarks for existing projects incorporate new empirical data, but use the same predetermined, standardized methodologies. This lowers costs and allows more frequent revision and renewal, which improves baseline accuracy. This procedure would limit investor uncertainty to measured changes in ambient conditions, rather than methodological changes. Baseline changes are therefore less likely to be excessively unpredictable. (Under some conditions, it may be necessary to adjust the methodology itself, which might cause baseline changes that are more rapid and harder to anticipate; for example, methodological changes might be warranted if the required data become unavailable due to privatization.)

Providing investor certainty without compromising environmental integrity. Although reducing the risk of a changing baseline will not fully eliminate project risk, it can provide some degree of certainty for project investors. Because dynamic baselines can be more accurate than static baselines, which may rely on unreliable forecasting, it is important to consider instruments that would allow investors to manage the risk of a dynamic baseline. Two mechanisms are mentioned in this report. First, investors could be offered the choice between an annually updated baseline and a fixed baseline that is comparatively conservative. Second, risk pooling mechanisms such as insurance or government guarantees could enable investors to recover financial losses due to a declining dynamic baseline. Using either approach, the cost of reducing investor risk and uncertainty is internalized as a straightforward, predictable and manageable cost. This can help make dynamic baselines acceptable to investors, and thereby avoid the conflict between providing investor certainty and ensuring environmental integrity.


Using cohorts to update benchmarks: Our observations suggest that benchmarks, if based on a cohort approach, can reflect future changes in counterfactual activity. Gathering the data to track a cohort may not be overly onerous. In many electric sectors, performance data is already collected annually, so the incremental technical effort for updating baselines based on a cohort's evolving performance could be relatively small. However, if monitoring the cohort proves costly, updates could be limited to intervals such as every 5 years. This would help incorporate new trends while keeping transaction costs low.

Differentiating updating requirements by project category: Ultimately, decisions will have to be made about the frequency of updating (renewing and revising) baselines, balancing the need for baseline accuracy, the technical and administrative costs of updating, and concerns about investor uncertainty (to the extent that they remain unaddressed by other risk management instruments as discussed above). It may be helpful to tailor updating requirements to the conditions faced by different types of projects.

• Retrofit projects are likely to have more ephemeral counterfactuals, thus it might be necessary to renew their baselines annually.
• New power projects are likely to have more consistent counterfactuals, although even in these cases there might be important variations over a CDM project's lifetime.
• Where market barriers play an important role, conditions could change rapidly and frequent (annual) updating might be necessary.
• Where counterfactuals are especially sensitive to changeable parameters such as fuel availability, technological advances, or regulations, baselines should be updated frequently.


Section 1. Benchmark Aggregation Issues and Options

Aggregation of many potential project activities into a single category with a corresponding baseline is the defining aspect of benchmarks. Not surprisingly, then, the definition of these categories is one of the principal challenges.

A single benchmark based on aggregation of an entire sector (i.e., electricity supply) would provide an incentive to invest in the least carbon-intensive options available throughout the entire sector. However, a single sector-wide benchmark may provide no incentive for CDM projects that improve the efficiency of the relatively carbon-intensive options (e.g., upgrading the efficiency of a coal plant). Such projects are likely to have carbon intensities above a sector-wide benchmark and would thus be uncreditable under this approach. Conversely, a sector-wide benchmark might overestimate credits if a project is merely improving upon a lower-than-benchmark activity (e.g., a higher efficiency natural gas plant measured against an oil baseline) or would have been undertaken without the CDM.

Greater disaggregation of project types thus has many attractions, such as:

• closer matching of baselines to individual project circumstances in cases where potential project activities can be disaggregated into distinct unrelated types, and as a result;
• fewer missed opportunities and uncredited emission reductions, in principle; and,
• fewer free riders and unwarranted credits, again, in principle.

At the same time, more aggregate (sector-wide or regional) benchmarks have their advantages, such as:

• potentially improved estimation of avoided emissions where counterfactual project activities might span the full range of activities in a sector (e.g., aggregate multi-country benchmarks where electricity markets cross national or regional boundaries, due to strong interconnections);
• reduced transaction costs imposed by baseline development activity; and,
• greater simplicity and transparency for investors and other interested parties.

Resolution of the aggregation question requires a balancing of these factors and a closer examination of the costs and benefits of alternative configurations. This section examines five possible bases for disaggregating power sector benchmarks:

1) Fuel/technology type: Under what conditions (if any) should benchmarks be determined and applied specific to an individual fuel and/or technology?

2) Duty cycle/load profile: Should there be separate benchmarks (or a benchmark algorithm) to reflect the load profile of a CDM project (i.e., peakload vs. baseload, dispatchable vs. must-run)?

3) Retrofits vs. new facilities: How should one account for the fact that a retrofit could both improve upon an existing facility and displace others if its life is extended or production increased?

4) Off-grid: How can benchmarks be applied to projects that are not connected to the main grid?

5) Spatial disaggregation: What is the appropriate geographic reach of a power sector benchmark: nation, power pool, or other unit?

This section concludes by offering a possible hybrid approach that combines criteria for determining which type of baseline method should apply to a given project type with added safeguards to minimize the occurrence of free riders and lost project opportunities.


1.1 Disaggregation by Fuel Type

The diversity of fuel choices and generation technologies gives rise to a wide range of potential mitigation options in the electric sector. An effective benchmark methodology must recognize the resulting variety of potential CDM project activities (and counterfactual possibilities), and define multi-project categories accordingly. A key question that must be resolved is the following:

Should a single benchmark category encompass all fuels, or should each fuel have its own category and its own benchmark? (Or are there viable approaches that can combine the best of these two options?)

The choices of fuel and, to a lesser extent, technology are the primary determinants of a project's carbon intensity. Different fuels have very different carbon contents (per unit of energy), and generation technologies convert fuels into power with very different efficiencies. Renewables and nuclear are approximately zero carbon³, natural gas using an efficient combined cycle produces electricity at less than 0.5 kg CO2 / kWh, and typical coal plants can exceed 1.0 kg CO2 / kWh. Where fuel switching is an option, a single benchmark across fuels can encourage switching to the least carbon-intensive options. But where use of high-carbon fuels (e.g., coal or oil) is unavoidable, a single benchmark might not encourage CDM projects that make their use more efficient (e.g., integrated coal gasification combined cycle, or IGCC, facilities). Therefore, since investments in higher-carbon fossil technologies may be supported by the CDM (though some in the climate policy community argue that CDM resources should be directed solely toward lower-carbon fuels and technologies), alternatives to sector-wide benchmarks should be considered.

Figure 1.1 illustrates the way that fuel-specific and sector-wide benchmarks could be applied to a set of four indicative CDM projects. The figure shows several power supply technologies positioned on a line according to their illustrative carbon intensity values. Several typical electric sector investments are shown: approximately zero-carbon options (renewables, hydro, or nuclear), and lower-carbon and conventional technologies for both coal and natural gas. Some of these options could potentially be viable mitigation activities, depending on what would have occurred otherwise.

The large arrows on the right side of the figure show the "actual carbon savings" of four illustrative CDM projects, pointing from a presumed counterfactual activity to the carbon intensity of the project. The table at the bottom of Figure 1.1 lists the projects and the activities they displace. We identify the displaced activities or counterfactual investment for pedagogic purposes only; it is impossible in reality to know them with certainty (otherwise baseline setting would be a straightforward exercise!). For simplicity, we assume that each CDM project displaces a single generating option, although in reality it could be a mix of options.

The fine double-headed arrows reflect the credit earned by each project activity relative to a particular benchmark. Projects A and B are compared to their respective fuel-specific benchmarks, and projects C and D are compared to a single sector-wide benchmark.

3 The indirect, upstream emissions can be non-trivial and should be included where they are estimated to be significant.


Figure 1.1 A possible integrated approach to benchmarks. (Same-fuel) efficiency improvement projects would be eligible for a fuel-specific benchmark. Fuel switching projects would be eligible for a sector-wide benchmark. Simple rules would determine which benchmark would apply.

[Figure: generation options and benchmarks positioned on a carbon-intensity scale (kgCO2/kWh). Markers distinguish possible CDM activities from possible counterfactuals. Large arrows show the "actual" reductions (0.26 for A, 0.13 for B, 0.53 for C, 1.01 for D); fine double-headed arrows show the credit against a benchmark. The two efficiency improvement projects (A & B) are compared to fuel-specific benchmarks; the two fuel switching projects (C & D) are compared to a sector-wide benchmark.]

Generation option | kgCO2/kWh
Coal steam (standard) | 1.01
Coal steam (advanced) | 0.90
Coal IGCC (Project A) | 0.75
Gas (standard) (Project C) | 0.48
Gas CC (advanced) (Project B) | 0.35
Renewables (Project D) | ~0.00

Benchmark | kgCO2/kWh
Coal benchmark | 0.95
Sector-wide benchmark | 0.65
Gas benchmark | 0.45

Project | CDM project activity | kgCO2/kWh | Displaced activity | kgCO2/kWh
Project A | IGCC coal power plant | 0.75 | Coal steam (standard) | 1.01
Project B | Advanced Gas CC | 0.35 | Gas (standard) | 0.48
Project C | Gas (standard) | 0.48 | Coal steam (standard) | 1.01
Project D | Renewables (e.g. biomass, wind) | 0.00 | Coal steam (standard) | 1.01


Table 1.1 below shows the numerical results in terms of credits earned per kWh generated for each project, depending on whether a fuel-specific or sector-wide benchmark is used. The first column lists the projects and their carbon intensities. The second column gives the "actual" emissions reduction, relative to the actual displaced activity, were we able to know it. The third column shows the credits that would be awarded under fuel-specific benchmarks, and the fourth column shows the credits that would be awarded under a single sector-wide benchmark. For example, Project A, a highly efficient IGCC coal plant (0.75 kgCO2/kWh), would be saving carbon at a rate of 0.26 kgCO2/kWh (second column) if it were "actually" replacing a standard coal plant (1.01 kgCO2/kWh). It would be awarded credits at a rate of 0.20 kgCO2/kWh (third column) against a fuel-specific benchmark (0.95 kgCO2/kWh). Or it might be awarded no credits at all, were the sector-wide benchmark (0.65 kgCO2/kWh) to apply (fourth column).

Table 1.1 Crediting illustrative CDM projects against alternative benchmarks (kgCO2/kWh)

Project activity | Project carbon intensity | "Actual" emissions reduction | Credit earned, fuel-specific benchmark | Credit earned, sector-wide benchmark
Project A (advanced coal IGCC) | 0.75 | 0.26 | 0.20 | 0.00*
Project B (advanced gas CC) | 0.35 | 0.13 | 0.10 | 0.30
Project C (conventional gas displaces coal) | 0.48 | 0.53 | 0.00* | 0.17
Project D (renewables displaces coal) | 0.00 | 1.01 | 0.00* | 0.65

*Projects receive no credit when their carbon intensity exceeds the benchmark.

Items in italics are not shown in Figure 1.1.
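The crediting rule behind Table 1.1 can be sketched in a few lines of Python. This is an illustration only: the function names are ours, the benchmark values are those given in Figure 1.1, and Project D is omitted from the fuel-specific comparison because the text does not state a renewables-specific benchmark.

```python
# Benchmark values from Figure 1.1, in kgCO2/kWh.
COAL_BENCHMARK = 0.95
GAS_BENCHMARK = 0.45
SECTOR_BENCHMARK = 0.65


def credit(benchmark, project_ci):
    """Credit per kWh against a benchmark; zero if the project exceeds it."""
    return round(max(0.0, benchmark - project_ci), 2)


def actual_reduction(displaced_ci, project_ci):
    """Reduction per kWh relative to the (normally unknowable) displaced activity."""
    return round(displaced_ci - project_ci, 2)


# project: (project intensity, displaced intensity, fuel-specific benchmark)
projects = {
    "A": (0.75, 1.01, COAL_BENCHMARK),
    "B": (0.35, 0.48, GAS_BENCHMARK),
    "C": (0.48, 1.01, GAS_BENCHMARK),
}

for name, (ci, displaced_ci, fuel_benchmark) in projects.items():
    print(
        name,
        actual_reduction(displaced_ci, ci),
        credit(fuel_benchmark, ci),
        credit(SECTOR_BENCHMARK, ci),
    )
```

Each printed row gives the "actual" reduction, the fuel-specific credit, and the sector-wide credit for one project, which can be checked against the corresponding row of Table 1.1.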

Not surprisingly, the fuel-specific benchmarks provide a more accurate reflection of reductions for projects involving efficiency improvements (projects A and B), but a less accurate reflection in the case of shifts between fuels (projects C and D). The fuel-specific approach might be most appropriate in situations where fuel choice is limited. The sector-wide benchmark more accurately reflects emission reductions resulting from shifts from higher-carbon to lower-carbon fuels (projects C and D). Although these examples are only illustrative, these tendencies should be found in most situations. The major exception would be where the sector-wide benchmark poorly reflects the type of generation actually avoided by a project. For example, if a country is soon running out of hydro resources, and planning to rely on coal in the future, a benchmark that is partly based on recent capacity additions (all hydro) may underestimate the benefits of a project like C above (gas displaces new coal). This situation may call for looking largely at plants under construction or expected to be commissioned in the near future (see Section 4.1).

How could the benchmarks be set?

Fuel-specific benchmarks: The population of existing power plants for a given fuel will span a distribution of carbon intensities that is relatively continuous, simple and narrow compared to the broad and complex distribution of the sector as a whole. Therefore, it will often be possible to define a benchmark based on a percentile threshold. For this example, fuel-specific benchmarks were set for coal and natural gas at the top 20% of power plants for each fuel. Based on data for South Africa and India shown in Figures 1.2 and 1.3 below, the 20th percentile for coal power plants corresponds to approximately 0.95 kgCO2/kWh. For natural gas, data are not available for South Africa or India to similarly determine the 20th percentile; we have assumed 0.45 kgCO2/kWh for the exercise here.
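A percentile threshold of this kind could be computed as follows. This is a minimal sketch that assumes the percentile is weighted by generation, as the cumulative-generation plots in Figures 1.2 and 1.3 suggest; the plant data are hypothetical and are not taken from the case studies.

```python
def percentile_benchmark(plants, share=0.20):
    """Return the carbon intensity at which the cleanest plants reach
    `share` of total generation.

    plants: list of (carbon_intensity_kgCO2_per_kWh, generation_TWh) tuples.
    """
    ordered = sorted(plants)  # lowest carbon intensity first
    total = sum(generation for _, generation in ordered)
    cumulative = 0.0
    for carbon_intensity, generation in ordered:
        cumulative += generation
        if cumulative >= share * total:
            return carbon_intensity
    raise ValueError("plants must contain positive generation")


# Hypothetical coal fleet (illustrative values only).
coal_plants = [(1.10, 30), (0.92, 10), (0.95, 15), (1.02, 25), (1.20, 20)]
print(percentile_benchmark(coal_plants))
```

In this hypothetical fleet the two cleanest plants together supply 25% of generation, so the 20% threshold falls on the second of them.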


Figure 1.2 Cumulative generation versus carbon intensity for coal power plants in South Africa

Source: SAIC, 1999

Figure 1.3 Cumulative generation (MWh) versus carbon intensity for coal power plants in India, 1992-1998

Source: SAIC, 1999

Sector-wide benchmarks: Setting the sector-wide benchmark involves determining what new capacity (or mixture of new capacity) is likely to be built (or backed down) at the margin (Lazarus, Kartha, Ruth, Bernow, Dunmire, 1999). A benchmark set at this level might be appropriate, provided that a reliable method is in place to reduce the risk of significant excess crediting. Such methods (stringency, additionality testing, and credit discounting) are discussed in Section 2. For this example, we hypothesize a power sector with 50% coal, 30% gas, and 20% hydro on the margin (recently built and/or planned and under construction), which yields a sector-wide benchmark of 0.65 kgCO2/kWh.
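The 0.65 kgCO2/kWh figure is consistent with a generation-weighted average of the marginal mix. The sketch below assumes, since the text does not say which intensities were used, that marginal coal and gas plants have the "standard" values from Figure 1.1 (1.01 and 0.48 kgCO2/kWh) and that hydro is zero carbon.

```python
# Marginal mix from the text; carbon intensities are ASSUMED here to be the
# "standard" coal and gas values of Figure 1.1, with hydro at zero.
marginal_mix = {"coal": 0.50, "gas": 0.30, "hydro": 0.20}
assumed_ci = {"coal": 1.01, "gas": 0.48, "hydro": 0.00}  # kgCO2/kWh


def sector_benchmark(mix, carbon_intensity):
    """Generation-weighted average carbon intensity of the marginal mix."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix shares must sum to 1")
    return sum(share * carbon_intensity[fuel] for fuel, share in mix.items())


benchmark = sector_benchmark(marginal_mix, assumed_ci)
print(round(benchmark, 2))
```

Under these assumptions the weighted average is 0.649 kgCO2/kWh, which rounds to the 0.65 used in the example.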

[Plots for Figures 1.2 and 1.3: carbon intensity (kgCO2/kWh, multi-year average) on the vertical axis against cumulative generation in 1998 (TWh) on the horizontal axis. Each plot marks the 20th percentile at 0.94 kgCO2/kWh.]


Conclusion

Since CDM projects in the electric sector span such a wide range of possible mitigation options (i.e., fuel switching and efficiency improvements based on a diversity of fuels and technologies), it is important to carefully consider the appropriate level of multi-project category aggregation. This example demonstrates how each of the two different benchmark approaches, fuel-specific and sector-wide, can be the more accurate one depending on the type of project: same-fuel efficiency improvements or fuel switching, respectively. This outcome suggests that an effective benchmark methodology might combine these two approaches, and include a method to limit excess credits. Such a methodology is explored in the final section.

1.2 Load Profile/Duty Cycle

The application of sector-wide benchmarks to different types of CDM projects would ignore differences in the timing and control of power plant operation. Power plants differ in terms of duty cycle, i.e. whether they tend to produce electricity evenly throughout the year or primarily at peak times. The former are referred to as baseload, and the latter as peakload facilities. Baseload plants typically operate 60-80% of the year, while peakers often have capacity factors of around 10% or less; as implied, intermediate plants operate somewhere in between these levels. Other facilities are defined by their intermittence, such as wind, solar, and some biomass power plants. These facilities typically operate when resources are available (the sun shines or the wind blows) and cannot be dispatched as either peak or baseload. The variation in a CDM project's duty cycle and dispatchability will determine the type of electricity generation and resulting emissions it is likely to displace.

Since aggregate statistics for the power sector are dominated by baseload and intermediate power plants, which often account for 90% or more of all generation, benchmarks derived from aggregate statistics may be inhospitable to CDM projects that disproportionately produce or save peakload electricity. For example, a sector-wide benchmark or a fuel-specific natural gas benchmark might be too low to reward a CDM project that installs a high-efficiency combustion turbine (CT). Such benchmarks may also under-reward demand-side management (DSM) projects with high peak coincidence, such as lighting or air conditioning programs.

Therefore it is worth considering whether the accuracy of CDM benchmarks could be improved by distinguishing between peak and off-peak generation. Two alternative approaches could be used:

1. The first method could be called duty cycle benchmarking. One could classify plants into baseload, intermediate, and peaking categories and develop separate benchmarks for each, using procedures similar to those described in the previous section. A CDM project would then be credited according to the mix of baseload, intermediate, and peakload electricity it is likely to displace. A CDM project such as a geothermal plant, biomass facility (with reliable, low-cost feedstock), or high-efficiency fossil plant might use the baseload benchmark exclusively. The underlying notion would be that such a project would avoid (or delay) the construction of a similar amount of other baseload capacity. Other projects, such as a demand-side management (DSM) project that displaces more peakload than baseload electricity, would be credited according to a ratio that depends on their peak coincidence. A similar method could be applied to must-run wind or solar projects.⁴

4 One could construct a simple algorithm for allocating part of a power plant's output to baseload and part to peaking, and calculate baselines and emissions reductions accordingly. For example, one could define the peak periods of a year based on when certain designated plants were run or when a certain system-wide generation level was exceeded, and credit CDM projects separately for generation within each period according to a baseline calculated for each. Or one could designate certain periods of the day as peak time [e.g. 7am-9am, 6pm-10pm] and the remaining period as baseload, and credit accordingly.


A major challenge with the duty cycle benchmarking approach is classifying existing plants and CDM projects as baseload, peakload, or intermediate. This might be done on the basis of (a) generation technology (e.g. assume a CT is always considered as peaking or a coal steam plant always as baseload), (b) capacity factor, or (c) reported running cost (or generation profile). Since it is the typical basis for system dispatch (deciding when a plant should run), running cost would be the preferred basis. However, running cost information is typically difficult to obtain, especially where competition among generators exists or is emerging. Furthermore, in countries with several grid systems there may be multiple (non-additive) load curves.

The first approach, technology type, is not always definitive. Technologies commonly built for peaking purposes (gas turbines) in some regions are often used for baseload/intermediate duty in others, because of under-capacity on the system (they need to run), capital constraints (they're cheap to build), or access to very cheap fuel (at the gas wellhead). This situation is found, for instance, in Venezuela, as described in Appendix A. Appendix A presents a more detailed investigation of duty cycle benchmarks and their development using available data. Capacity factor, as Figures A.1 and A.2 show, is also problematic for developing duty cycle benchmarks.

Box 1.1 Duty cycles and carbon emissions

Because they are built to provide inexpensive capacity rather than electricity, peaking technologies are generally less fuel-efficient than baseload technologies using similar fuels.¹ For example, natural gas combustion turbines (CTs) are less efficient than combined cycle (CC) power plants, and most diesel generators are less efficient than oil steam plants. Not only are the technologies typically less efficient, but the manner in which peakers are operated (e.g. frequent startup and part-load operation) further degrades their efficiency. Therefore, leaving aside differences in fuel type, peakload plants are usually significantly more carbon intensive than their baseload counterparts. This can be seen in the table below, which shows carbon intensities for baseload/intermediate and peakload technologies, comparing generic values for new technologies with average performance of existing technologies in the case study countries. These case study countries and the data collection process are described in Section 3.

Table 1.3 Carbon intensities of typical fossil power plants (kgCO2/kWh)

Baseload/Intermediate Technologies

Fuel | Technology | Generic, New | Case study, Existing
Natural Gas | Steam | 0.56 | V (0.68)
Natural Gas | CC current | 0.47 | -
Natural Gas | CC advanced | 0.36 | -
Natural Gas | CT current (>30% CF) | 0.70* | J (0.65), V (0.81)
Natural Gas | CT advanced | 0.47* | -
Oil | Steam | 0.77 | J (0.81)
Oil | CC current | 0.65 | -
Oil | CC advanced | 0.52 | -
Coal | Steam | - | SA (0.99), I (1.17)
Coal | Steam advanced | 0.91 | -
Coal | IGCC | 0.76 | -

Peakload Technologies

Fuel | Technology | Generic, New | Case study, Existing
Natural Gas | CT current (<30% CF) | 0.70* | V (1.17)
Natural Gas | CT advanced | 0.47* | -
Oil | CT current | 0.97 | J (1.04), V (1.36)
Oil | CT advanced | 0.65 | -
Oil | Diesel | - | V (1.36)

Case study data are weighted average carbon intensities for India (I), Jordan (J), South Africa (SA), and Venezuela (V).
CF = Capacity factor
CT = Combustion turbine
CC = Combined cycle
Sources: SAIC, Tellus, 1999.

2. The second method could be called a load curve approach. A load curve would be assembled from running cost information and/or expert judgment. The load curve would show the unit types on the operating margin (highest cost) during different load periods (peak, off-peak, etc.). Based on when a CDM project operates during the year, it would be credited according to the carbon intensity of the plant on the operating margin. This is similar to an approach currently being investigated by researchers at LBNL (Meyers et al., 2000, forthcoming).

Unlike the benchmarking methods explored in much of this report, the load curve approach implicitly assumes that the CDM project would displace existing units with the highest running costs during each time slice, rather than affect new capacity that might otherwise have been built. This is a reasonable approximation for projects like DSM that are small and short-lived enough that they have little effect on capacity expansion, i.e. what would otherwise be built or when it would be built.

There are two principal challenges to this approach. The first is obtaining running cost data or otherwise using expert judgment to assemble an approximate load curve. This may prove challenging in some countries, as noted above. The second is the potential discontinuity between smaller and larger projects. For larger projects, crediting against the operating margin would be problematic. For example, it is more likely that a new geothermal or NGCC project inspired by the CDM would be displacing or delaying another investment, rather than simply displacing generation from other existing facilities. (Note that if the CDM project were to produce electricity for otherwise unmet demand, then no carbon savings would result.) The load curve approach also implies that a significant fraction of the electricity displaced by a baseload facility will always be from peaking and intermediate facilities, but this seems rather unlikely over the long run.
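The load curve approach amounts to crediting a project period by period against the unit on the operating margin. The sketch below is illustrative only: the two load periods, the project's output in each, and the marginal carbon intensities are hypothetical inputs (the intensities borrow the generic oil CT and gas CC values from Table 1.3), and kgCO2/kWh is treated as numerically equal to tCO2/MWh.

```python
def load_curve_credits(project_mwh, marginal_ci, project_ci=0.0):
    """Credits in tCO2, summed over load periods.

    project_mwh: MWh generated (or saved) by the project in each period
    marginal_ci: tCO2/MWh of the unit on the operating margin in each period
    project_ci:  tCO2/MWh of the project itself (zero for a DSM project)
    """
    return sum(
        mwh * (marginal_ci[period] - project_ci)
        for period, mwh in project_mwh.items()
    )


# Hypothetical DSM project with high peak coincidence.
savings = {"peak": 1000.0, "off_peak": 500.0}   # MWh saved per year
margin = {"peak": 0.97, "off_peak": 0.47}       # assumed marginal units
print(load_curve_credits(savings, margin))
```

With these inputs the project earns about 970 tCO2 for its peak savings and about 235 tCO2 for its off-peak savings, which shows how peak coincidence raises the credit relative to an aggregate benchmark.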

Conclusions

Although attractive at first glance, the development of separate benchmarks for peak and baseload generation is unlikely to prove feasible and worthwhile. The duty cycle benchmark approach is likely to prove more effort than it is worth. The number of projects that seek CDM credits based on high peakload generation or savings is likely to be small: because the number of hours of peakload operation is by definition rather limited, the small additional carbon savings and resulting credits are unlikely to motivate many "additional" projects. In any case, for such peakload projects, fuel/technology-specific benchmarks could provide an appropriate alternative. For instance, a benchmark specific to natural gas combustion turbines could be used to credit high-efficiency NGCT projects under the CDM.

Since, in most countries, most of the electricity is generated by plants operating at above a 20% or 30% capacity factor, an aggregate (generation-weighted) benchmark will, in effect, treat CDM projects as if they are displacing baseload or intermediate power. This could be a reasonable approximation for most power supply projects, especially larger ones. For demand-side and small renewable projects, the load curve approach warrants closer consideration.


1.3 Retrofit⁵ Projects

Improvements in the carbon intensity of existing power plants are likely to prove an abundant and inexpensive source of prospective CDM projects. Retrofit opportunities range from simple improvements in fuel quality and power plant management to the repowering of gas turbines into combined cycle facilities.

Because of their distinct character, retrofit projects deserve a somewhat different baseline approach than new projects. Like wholly new capacity projects, they place a power plant with new emissions characteristics into operation. However, unlike new projects, they also simultaneously "remove" an existing facility from operation. This suggests an obvious baseline: the pre-retrofit operating characteristics of the existing facility, most notably its carbon intensity (CI_old). This baseline might suffice for simple "fuel-saving retrofits". Examples of fuel-saving retrofits might include boiler replacements or improved fuel quality, where the post-retrofit plant produces the same (or less) electricity as it would have otherwise. (See discussion below of using cohorts to adjust CI_old for what might have happened anyway. For instance, a coal retrofit CDM project might lose half its credits once half of a cohort of similar plants were to retrofit similarly.)

However, many retrofits will also increase generation levels above those of the pre-retrofit plant by either 1) improving reliability and lowering production costs (due to better fuel efficiency) or 2) increasing the output capacity of the plant, e.g. by improving turbine efficiency or by adding a steam turbine to convert a CT into a CC⁶. In some cases, these renovations may prolong the plant's operating life. For this additional generation, the pre-retrofit plant is no longer a reasonable counterfactual; the retrofit plant is not displacing generation from the pre-retrofit plant⁷, rather it is displacing the need for new generation on the margin. For the additional generation, the retrofit plant is largely indistinguishable from other new capacity. Therefore, consistency suggests that the same baseline methodology be applied to both. Under the benchmarking approaches discussed in this report, a CDM project would then be credited for additional generation using the new capacity benchmark, either sector-wide or fuel-specific.

The two distinct impacts of retrofit projects can be reflected in two separate equations for calculating credits. Up to the level of generation of the pre-retrofit plant (GEN_old), "fuel-saving credits" can be calculated as:

Fuel-saving Credits = min(GENold ,GENretrofit) x (CIold – CIretrofit), (Eq. 1)where

GENretrofit is the ex post measurement of plant generation (post-retrofit) in MWhGENold is the average of most recent years’ generation (pre-retrofit) in MWhCIretrofit is the ex post measurement of carbon intensity (post-retrofit) in tCO2/MWhCIold is the average of most recent years’ carbon intensity (pre-retrofit) in tCO2/MWhAnd min is the lower of the two values in parenthesis

5 We include the terms “brownfield” and “greenfield” in the title for familiarity, but these terms are less appropriate than “retrofit” and “new” for much of this report. “New” activity can also occur at a brownfield site, e.g. where a new generating unit is built without affecting the operation of other units at that site. Such a project should not be treated as a retrofit.

6 A retrofit that extends the life of a plant should be treated as new capacity, since the old plant would otherwise have stopped generating. (GENold in Eq. 1 and Eq. 2 would in this case be zero.)

7 One exception would be where the old plant would have increased its capacity factor anyway due to tightening reserve margins, e.g. in response to increasing electricity demands. The converse situation is also possible: the old plant could decrease production in the future, as lower cost plants come on line. These effects may roughly cancel, and could be accounted for using a cohort-based approach that tracks future changes in similar plant types.

If GENretrofit > GENold, then

Credits for Additional Generation = (GENretrofit – GENold) x (CIbenchmark – CIretrofit)     (Eq. 2)

where

CIbenchmark is the benchmark (sector-wide or fuel-specific) for new capacity in tCO2/MWh

and

Total Credits for Retrofit = Fuel-saving Credits + Credits for Additional Generation     (Eq. 3)

Where the retrofit results in significant additional generation, the second term (credits for additional generation) in Eq. 3 could become quite large. In cases where a developer is improving an already low-carbon facility (e.g., replacing turbines on a dam or NGCC) and the carbon intensity of the retrofitted plant is lower than the benchmark, the added credits could be significant and positive. Conversely, if the carbon intensity of the retrofitted plant exceeds the benchmark, the effect could be a loss of credits.

Assume, for instance, that a CDM project involves retrofitting an old natural gas steam plant (0.50 kgCO2 per kWh) to increase generator performance and efficiency (0.45 kgCO2 per kWh). The first term in Eq. 3 would reward the project for the 10% reduction in fuel use. However, if lower fuel costs led to increased generation in a context where new NGCC plants are the current plant of choice (resulting in a fuel-specific natural gas benchmark of 0.40 kgCO2 per kWh), then the second term would result in negative credits, because this additional generation is essentially displacing NGCC rather than old NG steam plant electricity. These negative credits would partially or wholly offset the fuel-saving credits.
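Equations 1 to 3 can be sketched in a few lines of Python. The function name `retrofit_credits` and the generation figures (1,000 GWh/yr before the retrofit, 1,200 GWh/yr after) are illustrative assumptions, not values from this report; the carbon intensities are those of the natural gas steam example. Because 1 kgCO2/kWh equals 1 tCO2/MWh, generation in GWh multiplied by intensity gives credits in ktCO2.

```python
def retrofit_credits(gen_old, gen_retrofit, ci_old, ci_retrofit, ci_benchmark):
    """Return (fuel_saving, additional, total) credits following Eq. 1-3.

    Generation in GWh/yr, carbon intensity in tCO2/MWh, credits in ktCO2/yr.
    """
    # Eq. 1: generation up to the pre-retrofit level is credited against CIold.
    fuel_saving = min(gen_old, gen_retrofit) * (ci_old - ci_retrofit)
    # Eq. 2: generation above the pre-retrofit level is credited against the
    # new-capacity benchmark (zero if generation did not increase).
    additional = max(gen_retrofit - gen_old, 0.0) * (ci_benchmark - ci_retrofit)
    # Eq. 3: total credits are the sum of the two terms.
    return fuel_saving, additional, fuel_saving + additional


# Hypothetical generation levels; intensities from the gas steam example.
fuel_saving, additional, total = retrofit_credits(
    gen_old=1000.0, gen_retrofit=1200.0,
    ci_old=0.50, ci_retrofit=0.45, ci_benchmark=0.40,
)
print(f"fuel-saving {fuel_saving:.1f}, additional {additional:.1f}, "
      f"total {total:.1f} ktCO2/yr")
```

Under these assumed generation levels, about 50 ktCO2/yr of fuel-saving credits are reduced by a debit of about 10 ktCO2/yr for the additional generation.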

Table 1.3 provides a more detailed illustration of how the two-part methodology would work for an existing 1000 MW coal plant. Several alternative retrofit cases are shown. For this example, we assume that a sector-wide benchmark of 0.5 tCO2/MWh applies.

Table 1.3 Example of retrofit method applied to hypothetical coal plant retrofits

| Row | Parameter | Units | Existing Facility | Case 1: Boiler eff. ↑10% | Case 2: Generator eff. ↑10% | Case 3: Both ↑10% | Case 4: Gen eff. ↑10%, CF +10% |
|---|---|---|---|---|---|---|---|
| | Basic Parameters | | | | | | |
| A | Fuel Use | MtCoal/yr | 3.1 | 2.8 | 3.1 | 2.8 | 3.2 |
| B | Electricity Generation (GENnew) | GWh/yr | 6132 (GENold) | 6132 | 6745 | 6745 | 7709 |
| C | Effective Capacity | MW | 1000 | 1000 | 1100 | 1100 | 1100 |
| D | Capacity Factor | | 70% | 70% | 70% | 70% | 80% |
| E | Total Carbon | MtCO2/yr | 6.13 | 5.57 | 6.13 | 5.57 | 6.37 |
| F | Carbon Intensity (CInew) | tCO2/MWh | 1.00 (CIold) | 0.91 | 0.91 | 0.83 | 0.91 |
| | Equation 1: Fuel-saving Credits | | | | | | |
| G | Credit Rate (CIold-CInew) | tCO2/MWh | | 0.09 | 0.09 | 0.17 | 0.09 |
| H | Fuel-saving Credits (G*GENold) | MtCO2/yr | | 0.56 | 0.56 | 1.06 | 0.56 |
| | Equation 2: Additional Generation Credits | | | | | | |
| I | Add'l Generation | GWh/yr | | 0 | 613 | 613 | 1577 |
| J | New capacity benchmark (CIbenchmark) | tCO2/MWh | | | 0.5 | 0.5 | 0.5 |
| K | Credit Rate (CIbenchmark-CInew) | tCO2/MWh | | | -0.41 | -0.33 | -0.41 |
| L | Credits for Add'l Generation (I*K) | MtCO2/yr | | 0.00 | -0.25 | -0.20 | -0.65 |
| | Equation 3: Total Credits | | | | | | |
| M | Total Credit (H+L) | MtCO2/yr | | 0.56 | 0.31 | 0.86 | -0.09 |

• Case 1 is a 10% improvement in boiler efficiency. Overall capacity (MW) and capacity factor remain unchanged, while fuel use decreases. This is a purely “fuel-saving” retrofit. Only Equation 1 applies, resulting in credits of 0.56 million tCO2/year (rows H and M).

• Case 2 is a 10% improvement in generator efficiency, where overall generation (and capacity) goes up by 10%, while fuel use remains at pre-retrofit levels. Because the additional generation is more carbon-intensive than the benchmark, application of Equation 2 results in a debit of 0.25 million tCO2/year (row L). Once this is deducted from the fuel-saving credits, the total credits come to 0.31 million tCO2/year (row M).

• Case 3 combines the two changes in a single plant. As a result, the overall efficiency and carbon intensity improve by nearly 20%, yielding more fuel savings (1.06 million tCO2/year) and fewer debits (0.20 million tCO2/year). Total credits come to 0.86 million tCO2/year.

• Case 4 combines the same 10% improvement in generator efficiency as Case 2 with an increase in capacity factor from 70% to 80%, which might occur as the result of lower operating costs. The additional coal generation results in a debit of 0.65 million tCO2/year, sufficient to offset the fuel-saving effects of the retrofit (0.56 million tCO2/year). As a result, the retrofit project actually produces net emissions of 0.09 million tCO2/year, assuming the benchmark reasonably represents the generation avoided.
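The totals in row M of Table 1.3 can be reproduced with a short script. This is a sketch under two assumptions that the table does not state explicitly: generation is taken as capacity × capacity factor × 8,760 hours, and each 10% efficiency gain divides carbon intensity by 1.1 (so 1.00 becomes about 0.91, and two gains give about 0.83). The names used are illustrative.

```python
HOURS_PER_YEAR = 8760
CI_OLD = 1.00        # tCO2/MWh, existing plant (row F)
CI_BENCHMARK = 0.50  # tCO2/MWh, sector-wide benchmark assumed in the text
GEN_OLD = 1000 * 0.70 * HOURS_PER_YEAR / 1000  # GWh/yr, about 6132 (row B)

# case -> (capacity in MW, capacity factor, number of 10% efficiency gains)
CASES = {
    "Case 1": (1000, 0.70, 1),
    "Case 2": (1100, 0.70, 1),
    "Case 3": (1100, 0.70, 2),
    "Case 4": (1100, 0.80, 1),
}


def total_credits(capacity_mw, capacity_factor, efficiency_gains):
    """Total credits in MtCO2/yr (row M), combining Eq. 1, 2 and 3."""
    gen_new = capacity_mw * capacity_factor * HOURS_PER_YEAR / 1000  # GWh/yr
    ci_new = CI_OLD / 1.1 ** efficiency_gains                        # tCO2/MWh
    fuel_saving = min(GEN_OLD, gen_new) * (CI_OLD - ci_new)          # ktCO2/yr
    additional = max(gen_new - GEN_OLD, 0.0) * (CI_BENCHMARK - ci_new)
    return (fuel_saving + additional) / 1000                         # MtCO2/yr


for name, params in CASES.items():
    print(name, round(total_credits(*params), 2))
```

With these assumptions the script gives 0.56, 0.31, 0.86 and -0.09 MtCO2/yr for Cases 1 to 4, matching row M.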

What about the possibility that the plant would have altered its operating characteristics in the absence of the CDM project? Can the CIold still be used as the baseline for fuel saving credits?

For some facilities, especially where prevailing conditions are changing rapidly, operating characteristics may change sufficiently to render CIold an inappropriate baseline. For instance, in many former Soviet countries, coal- and fuel oil-based district heating facilities are being retrofitted for improved efficiency and cleaner fuels (natural gas and biomass) through AIJ initiatives. These facilities are also now switching on their own, especially in the Baltics where natural gas availability has improved (Kartha et al., 1998). Therefore, in the absence of AIJ, the carbon intensity of the targeted plants may have improved anyway. A parallel situation may exist for many inefficient and older power plants, for which retrofit activities may already be under consideration, especially if life extension is being considered. For these reasons, some level of review of the unretrofitted project’s prospects for continued operation should be part of any baseline approach.

One method to simulate what would have happened to the plant in the absence of the CDM investment is to create a control sample or cohort of other similar, yet still unretrofitted, facilities at the time of project approval. The operation of these facilities can then be tracked over time. If these facilities were subject to premature shutdown (due to high fuel costs, etc.) or significant changes in operating practice (fuel switching, retrofits, use of washed coal, etc.), then such changes could influence the value used for CIold in Equation 1 above. For instance, if a cohort of coal plants similar to the example in Table 1.3 were used, and half installed the Case 1 retrofit (10% improved boiler efficiency), then CIold could be adjusted downward by 5%. (See Sections 3 and 4 for further discussion of cohorts and updating.)
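The cohort adjustment can be illustrated numerically. The sketch below assumes that the adjusted CIold is simply the unweighted average carbon intensity of the tracked cohort, which is only one possible reading of the approach; the function name and the four-plant cohort are hypothetical. As in Table 1.3, a 10% efficiency gain is taken to divide carbon intensity by 1.1.

```python
def cohort_adjusted_ci(cohort_intensities):
    """Unweighted mean carbon intensity (tCO2/MWh) of the tracked cohort."""
    return sum(cohort_intensities) / len(cohort_intensities)


CI_OLD = 1.00               # tCO2/MWh, pre-retrofit coal plant
CI_RETROFIT = CI_OLD / 1.1  # about 0.91 after the Case 1 retrofit

# Hypothetical cohort: half of the plants have installed the Case 1 retrofit.
cohort = [CI_OLD, CI_OLD, CI_RETROFIT, CI_RETROFIT]
ci_adjusted = cohort_adjusted_ci(cohort)

original_rate = CI_OLD - CI_RETROFIT       # credit rate without adjustment
adjusted_rate = ci_adjusted - CI_RETROFIT  # credit rate with cohort adjustment

print(f"adjusted CIold: {ci_adjusted:.3f} tCO2/MWh")
print(f"credit rate: {original_rate:.3f} -> {adjusted_rate:.3f} tCO2/MWh")
```

Under this reading the baseline falls by about 4.5% (roughly the 5% cited above), and the fuel-saving credit rate for a Case 1 project is halved.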

Whether such a reduction in credits would be warranted depends on whether the CDM project itself influenced the adoption of carbon saving measures at other facilities. These types of “positive spillover effects” have occurred in the context of some AIJ projects, such as Baltic fuel-switching projects, where the AIJ projects created market infrastructure for biofuels that non-project facilities have since used (Kartha et al., 1998).

A simpler alternative would be to limit the number of years for which the original carbon intensity could be used as a baseline, e.g. 5 or 10 years, or to discount the credits. This would eliminate the need to estimate penetration rates or to track a cohort of similar facilities, while guaranteeing a credit rate for a fixed period.

Conclusion

This section presents a transparent and workable methodology, as illustrated in Table 1.4, for dealing with power sector retrofits in a manner distinct from, yet consistent with, new CDM projects. It involves the application of up to two separate carbon intensity baselines, one readily derived from historical data regarding the plant prior to retrofit, and the other being equivalent to the benchmark baseline derived for new capacity. Cohorts or a limited crediting period could be used to account for the possibility that improvements in carbon intensity (or plant shutdown) might have occurred in the absence of the retrofit.

Table 1.4 Comparing methods for new and retrofit projects

Credits earned = electricity generation (kWh) × [ benchmark (tCO2/kWh) – project C intensity (tCO2/kWh) ]

| | Electricity generation (kWh) | Benchmark (tCO2/kWh) | Project C intensity (tCO2/kWh) |
|---|---|---|---|
| New projects | Electricity generation is measured ex post based on plant operating data | Sector-wide or fuel-specific | Measured ex post |
| Retrofit projects | Same as above | Based on original plant (for generation <= original); same as above (for generation > original) | Same as above |

1.4 Off-grid Projects

The prevailing conditions and opportunities for off-grid (or remote) CDM projects are so vastly different from those for on-grid projects that separate baselines are clearly warranted (see footnote 8). Off-grid CDM projects such as solar home systems, renewable/hybrid mini-grids, etc. will typically compete against diesel generators or a combination of non-electric resources, such as kerosene and candle lighting, battery recharging, etc. Often off-grid projects deliver electricity services where none were previously available, nor would be soon. Off-grid generation is increasingly pursued in the recognition that the grid may take decades to reach many remote parts of developing countries. Only in rare instances, at the fringe of the grid, may off-grid electrification serve as a direct alternative to grid electrification. As a result, on-grid electricity is typically an inappropriate counterfactual, and the benchmarking methods described above would be inappropriate for off-grid CDM projects.

Should off-grid projects be subject to single-project or multi-project baselines?

Given that off-grid CDM projects will tend to be much smaller in scale than on-grid projects, the minimization of transaction costs is a strong argument in favor of multi-project baselines. Although site-specific circumstances will vary, diesel generators can easily be said to be the prevailing off-grid generation technology in most developing countries (see footnote 9). Typical carbon intensity for a small diesel generator operating at 20% efficiency is approximately 1.3 kgCO2/kWh. In many instances, lower-income households rely on car batteries charged by the grid or by diesel units; one estimate puts the fraction of unelectrified households using lead-acid batteries at 10% (Kaufman et al., 2000). In these cases, batteries introduce additional losses from the battery discharge and any unused charge, thereby increasing the effective carbon intensity. Where off-grid electrification is used primarily to deliver lighting services, and kerosene use (for lamps) is displaced, carbon savings will be even larger due to the very low efficiency of kerosene lamps. Therefore, diesel provides a reasonably conservative benchmark.
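The figure of roughly 1.3 kgCO2/kWh can be checked with simple arithmetic. The sketch below assumes a diesel emission factor of 74.1 kgCO2 per GJ of fuel energy (the IPCC default for gas/diesel oil), which is not stated in this report; the 70% battery efficiency is purely illustrative and is included only to show the direction of the battery-loss effect.

```python
DIESEL_EF_KG_PER_GJ = 74.1  # assumption: IPCC default for gas/diesel oil
GJ_PER_KWH = 0.0036         # 1 kWh = 3.6 MJ


def carbon_intensity(generator_efficiency, battery_efficiency=1.0):
    """kgCO2 per kWh of electricity delivered to the end user."""
    # Emissions per kWh of fuel energy burned.
    kg_per_kwh_fuel = DIESEL_EF_KG_PER_GJ * GJ_PER_KWH
    # Losses in the generator (and battery, if any) raise emissions per kWh delivered.
    return kg_per_kwh_fuel / (generator_efficiency * battery_efficiency)


direct = carbon_intensity(0.20)
via_battery = carbon_intensity(0.20, battery_efficiency=0.70)  # illustrative

print(f"diesel generator at 20% efficiency: {direct:.2f} kgCO2/kWh")
print(f"same, delivered through a battery:  {via_battery:.2f} kgCO2/kWh")
```

With the assumed emission factor, the direct case comes to about 1.33 kgCO2/kWh, consistent with the approximate value quoted above.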

Conclusion

We conclude that a standard diesel-based benchmark could be used for off-grid electrification projects. Applied across non-Annex 1 countries, a common benchmark would have the benefit of keeping transaction costs to a minimum, while risking little in terms of the magnitude of potentially excess credits.

1.5 Spatial Disaggregation

What is the appropriate spatial scale for developing benchmarks?

National level benchmarks are attractive for several reasons. Countries generally have a common and unique set of prevailing social, institutional, economic, fiscal, legal, regulatory, and financial conditions that together influence the types of CDM projects that might occur and the activity these projects might displace. Nations may also possess their own defined energy, forestry, or even climate strategies, resulting in common national incentives (or disincentives) for specific high or low-carbon technologies, which in turn might influence baseline activity. A good example of this is Costa Rica’s avowed policy of investing in a heavily renewable electricity supply. Another example is China’s historical policies that have favored coal and hydro development. In addition, national geographic boundaries may be less complicated if countries are allowed to choose whether to adopt multi-project baselines or project-specific baselines.

8 Off-grid or remote generation can be defined as any generation that is not grid connected. Generation can be non-grid connected for several reasons: because the grid is too distant, because the application is temporary, or because the application is for grid back-up power, which is necessary in many capacity-constrained regions with frequent load-shedding.

9 In most off-grid project documents, particularly those of the Global Environment Facility, off-grid renewable projects are assumed to displace more costly grid-connected electricity, stand-alone diesel generation, or some combination of non-electric resources (kerosene and candle lighting, battery recharging, etc.).

There may be situations where wider or narrower boundaries for benchmark setting are desirable, especially for the power sector. Multi-national benchmarks may be appropriate where there are:

• Regional electricity grids and planning entities. An example is SADELEC in the Southern African region, which is heavily interconnected. Southern African utilities have been actively considering development of dams in hydro-rich countries (Congo, Mozambique, etc.), which could provide an important source of new capacity, especially for the major load centers in South Africa. ESKOM is also considering construction of new coal facilities in South Africa as regional demand grows. Either way, in interconnected regions such as Southern Africa, if a new power plant is proposed as a CDM project, “what would have otherwise occurred” may have occurred wholly or in part in another country.

• Scale economies from fewer benchmark development efforts.

Smaller than national scale benchmarks may be appropriate where there are:

• Multiple electric grids, distinct power markets, or state-level planning agencies (e.g. in India) within a country, coupled with

• Differential access to (or cost of) generation resources among these regions.

Where these conditions exist, benchmarks set at the level of power market, planning unit, or interconnected grid might provide better approximations of the carbon intensity of avoided electricity generation.

How might power markets be defined for the purpose of CDM benchmarks?

It would be ideal if there were a standard threshold measure of interconnectedness (e.g., maximum import load flow/total demand) and differential resource costs (e.g., % difference in prices or availabilities) to define power markets for benchmarking purposes in a uniform and consistent manner across countries. However, in practice, transmission systems are highly complex, and power markets are rarely clearly circumscribed. Regional power markets and pools do exist within and across many countries. However, the criteria for their definition can differ based on the particulars of system dispatch, generation markets, transmission networks, transmission pricing rules, and political relations within a given country or region. In the US, for instance, FERC has established elaborate rules for defining power markets, but these are too cumbersome for benchmarking purposes and too idiosyncratic for widespread replication.

In many large countries, power systems have naturally evolved into regional markets with transmission constraints among them. These traditional power markets could naturally serve as “benchmark regions”, if benchmarking at this level of disaggregation were viewed as: a) necessary for sufficiently accurate benchmarks, given regional differences in access to fuels and technologies; and b) feasible in terms of cost and effort. Sub-national benchmark setting could be coordinated, so that a single, national data collection and analysis effort could yield sub-national benchmarks without significantly increased effort (assuming the data requirements, i.e. performance data for all plants, were essentially the same).

Does the optimal spatial scale differ between fuel/technology-specific and sector-wide benchmarks?

Much of the above discussion applies largely to the setting of sector-wide benchmarks, where differing resource availabilities among power markets affect the carbon intensity of capacity and generation on the margin. However, for benchmarks specific to fuel/technology combinations, there is likely to be far less divergence among regions in emissions performance. Such differences will be defined more by differing resource costs (due to proximity to mines, ports, etc.) than by power markets. For instance, in countries like Argentina with differing regional natural gas prices, NGCTs have been built and operated as baseload facilities where gas is very cheap, while NGCTs elsewhere may be used largely for peaking and intermediate load, with NGCCs as baseload.

Despite these occasional variations, the argument to make initial fuel/technology-specific benchmarks national, if not regional, would appear rather strong. Fuel/technology benchmarks already represent a significant level of disaggregation, and defining sub-national regions or conditions for separate benchmarks would not only be difficult to do consistently but also potentially costly and tricky to implement. Fuel/technology benchmarks might even be set usefully as international or regional standards to encourage adoption of best available technologies.

Conclusion

At least at early stages of the CDM, it might be simplest for nations to act as the default spatial level of aggregation for benchmark setting. National decision-makers could then decide whether to develop coordinated sector-wide benchmark baselines with other heavily interconnected countries (or parts of them), and/or whether to develop separate benchmarks for regions within the country corresponding to different (or weakly interconnected) grids or power markets. Fuel/technology-specific benchmarks at the regional/international level should also be considered.

1.6 An Integrated Approach to Maximize Additionality and Reduce Transaction Costs

The previous Tellus/Stratus report on benchmarks described the potential for project overcrediting and missed project opportunities induced by multi-project baselines (Lazarus et al., 1999). A simple benchmark approach will credit all projects that are low emitting relative to some common benchmark, while granting no credit to projects that reduce emissions (relative to what would have otherwise happened) but to levels that are still above the benchmark. The latter represent potential missed CDM project opportunities. The crediting of all low-emitting projects, in turn, presents the risk of overcrediting projects that are either non-additional or displacing activities with lower emissions than the benchmark. These concerns are similar to those expressed for single-project baselines, where systematic biases (gaming by project developers) and random biases can lead to significant over- or undercrediting.

We suggest two mechanisms that together could significantly reduce the potential for both overcrediting and missed opportunities resulting from benchmark baselines: (1) specification of criteria whereby either the sector-wide or the fuel-specific benchmark is deemed the “default” baseline, and (2) further additionality testing for projects that do not pass a project screen or for which the non-default benchmark is desired.

The default benchmark

The first mechanism would seek to get the best of both benchmark types: more accurate crediting and fewer missed opportunities. For example, in a hydro-dominated country, such as Brazil or Argentina, an advanced NGCC plant (with a carbon intensity of 0.35 tCO2/MWh) might not be credited under a sector-wide benchmark for electricity generation (below 0.30 tCO2/MWh). However, if it could be shown that the NGCC was in fact displacing a lower efficiency gas plant (e.g., 0.40 tCO2/MWh), then a fuel/technology specific benchmark could provide credit. Conversely, the identical same-fuel plant improvement project would be overcredited were it to take place in a coal-dominated country and a sector-wide benchmark were to apply.

Left to choose between the two benchmark options (one fuel-specific, the other sector-wide), the project developer would have the incentive to choose the one with the highest carbon intensity in order to maximize credit revenues, all other things being equal. To the extent that this inflates the number of awarded credits, such a situation could undermine the credibility and environmental integrity of the CDM. So what could be done to determine which benchmark option applies most accurately to a given project, short of a costly full project-specific assessment, which itself wouldn’t necessarily yield a better answer? There are at least four choices:

• Option 1. The sector-wide benchmark is the default baseline for all projects.
• Option 2. The fuel-specific benchmark is the default baseline for all projects.
• Option 3. The fuel-specific benchmark is the default baseline for all fossil-fuel projects and the sector-wide benchmark is the default baseline for all non-fossil projects.
• Option 4. There is no default baseline; the developer needs to show (or the relevant CDM body must decide) which of the two baselines should apply to a given project. By requiring a review for every project, transaction costs might approach those of project-specific baselines.

The choice between options 1, 2, and 3 could be partly driven by the objective of minimizing the number of reviews required. To do this, one would need to estimate which instance (fuel switching or same-fuel efficiency improvement) would be most likely to occur. A reasonable guess is that at the low credit prices (e.g., $10/tCO2) many expect to see in the initial years of the CDM, the levels of financial incentive provided through CERs will be more likely to affect improvements in efficiency than switching to new fuels (see footnote 10). This would weigh in favor of options 2 or 3.

There are also reasons to favor the default of sector-wide benchmarks for non-fossil options, especially in the case of technologies with low penetration levels, such as wind, solar, or biomass, since these projects are unlikely to be displacing “same-fuel projects”. Together, these factors suggest option 3 above: fuel-specific benchmarks as the default for fossil projects and sector-wide for non-fossil.
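The Option 3 rule can be sketched as a small lookup. In the sketch below, the function names and the set of fossil fuels are illustrative assumptions; the intensities are those of the hydro-dominated country example, with the sector-wide benchmark described as "below 0.30" simply set to 0.30 tCO2/MWh and the 0.40 tCO2/MWh lower-efficiency gas plant treated as the fuel-specific benchmark.

```python
FOSSIL_FUELS = {"coal", "oil", "natural gas"}  # illustrative classification


def default_benchmark(fuel, sector_wide, fuel_specific):
    """Option 3: fuel-specific default for fossil projects, sector-wide otherwise."""
    if fuel in FOSSIL_FUELS:
        return fuel_specific[fuel]
    return sector_wide


def credit_rate(benchmark, project_ci):
    """Credits per MWh generated (tCO2/MWh); negative means no credits."""
    return benchmark - project_ci


sector_wide = 0.30                     # tCO2/MWh, hydro-dominated country
fuel_specific = {"natural gas": 0.40}  # tCO2/MWh, treated here as the gas benchmark
ngcc_ci = 0.35                         # tCO2/MWh, advanced NGCC plant

# Under the sector-wide benchmark the NGCC earns nothing...
print(credit_rate(sector_wide, ngcc_ci))
# ...while the Option 3 default (fuel-specific) yields a positive credit rate.
print(credit_rate(default_benchmark("natural gas", sector_wide, fuel_specific), ngcc_ci))
# A zero-emission wind project defaults to the sector-wide benchmark.
print(credit_rate(default_benchmark("wind", sector_wide, fuel_specific), 0.0))
```

A real scheme would need an agreed fuel classification and benchmark values for each fuel; the dictionary here stands in for that.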

Automatic approval of projects passing project screen

As discussed in Section 2.2, a project screen could identify projects that, due to small size and low penetration, are unlikely to produce significant excess credits if automatically awarded the default benchmark. All projects falling below this de minimis threshold would become automatically eligible for approval under the default benchmark. Larger projects or those technologies with already significant market penetration would then be subject to further additionality testing.

This simple step could address a critique of benchmark baselines: that they could automatically allow a significant quantity of questionable credits due to free rider projects. For large projects, e.g. above a threshold of 10 MW, the transaction costs of more rigorous testing of additionality would likely comprise a very small fraction of project investment, and thus should not markedly hinder project development. Technologies that have already achieved significant market penetration, e.g. greater than 5% market share, as might currently be the case for CFLs in many countries, are the other major potential source of free riders, thus potentially requiring additional review.

10 One could also get a better sense of the relative markets and costs for each type of project by examining data from AIJ projects, GEF projects, and potential investments investigated in national GHG mitigation studies. An effort to gather such information (and limited analysis thereof) was recently conducted for the Dutch government, with the accumulated data entered in a consistent database. See ECN, AED, SEI-Boston, 1999. Potential and Cost of Clean Development Mechanism Options in the Energy Sector: Inventory of Options in Non-Annex I Countries to Reduce GHG Emissions, for the Netherlands Development Cooperation (DGIS), ECN Report C--99-095, Netherlands, December.

Further additionality tests

There are thus two conditions under which a further additionality test would be required:

• projects above the de minimis threshold, and
• projects for which the “non-default” benchmark is desired. (In the case of option 3, for a fossil-fuel project, the default would be the fuel-specific benchmark and the “non-default” would be the sector-wide benchmark. The proponent of a natural gas facility, for instance, may claim that without CDM credits, natural gas would not otherwise be used for new power generation, and thus the fuel-specific benchmark should not apply.)

Possible methods for this type of testing are discussed in Section 2.2: financial additionality, penetration rates (if not used for the de minimis threshold), market barrier assessment, or “project-specific” review, i.e. whatever submission and review procedures would otherwise be used (in the absence of multi-project/benchmark baselines).
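The screening logic described above can be sketched as a simple decision function. The 10 MW and 5% thresholds are the illustrative values mentioned in the text, not proposed rules, and the `Project` fields and function name are hypothetical.

```python
from dataclasses import dataclass

DE_MINIMIS_MW = 10.0      # example size threshold from the text
PENETRATION_LIMIT = 0.05  # example market-share threshold from the text


@dataclass
class Project:
    capacity_mw: float
    technology_market_share: float      # fraction between 0 and 1
    requests_non_default: bool = False  # developer wants the non-default benchmark


def needs_additionality_test(project: Project) -> bool:
    """True if the project cannot be approved automatically under the default benchmark."""
    passes_screen = (
        project.capacity_mw <= DE_MINIMIS_MW
        and project.technology_market_share <= PENETRATION_LIMIT
    )
    return (not passes_screen) or project.requests_non_default


# Small, low-penetration project: automatic approval under the default benchmark.
print(needs_additionality_test(Project(0.05, 0.01)))
# Large plant: further testing required.
print(needs_additionality_test(Project(300.0, 0.01)))
# Small project requesting the non-default benchmark: further testing required.
print(needs_additionality_test(Project(5.0, 0.02, requests_non_default=True)))
```

Which additionality test is then applied (financial, penetration-based, barrier assessment, or project-specific review) is left open here, as in the text.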

Table 1.5 compares the major attributes of three possible benchmarking approaches: straight use of fuel-specific and sector-wide benchmarks without project screens or additionality testing, the integrated approach presented here, and project-specific baselines. The integrated approach is somewhat less demanding than a project-specific approach in type of effort and level of transaction cost for projects that don’t pass the initial screen, and significantly less demanding for those that do pass. In principle, the likelihood of inaccurate crediting, both under and over (due to inappropriate benchmarks or developer gaming of project-specific baselines), should be significantly reduced, with the financial incentives of the CDM preserved.

Table 1.5 How the integrated approach differs from the straight benchmark or project-specific approaches (All metrics relative to other options in a given row)

| | 1. Sector-wide benchmark, no additionality tests | 2. Fuel-specific benchmark, no additionality tests | 3. Integrated benchmark approach (above) | Pure single-project |
|---|---|---|---|---|
| Value of the baseline | Sector-wide benchmark | Benchmark for each fuel | 1. or 2. depending on option used | Project-specific, established through review process |
| Additionality testing | None (aside from benchmark) | None (aside from benchmark) | Mostly for large projects | All projects |
| Cost of baseline development and approval | Low-Medium (depending on scale economies from project flow) | Low-Medium (depending on scale economies from project flow) | Low-High (depends on how additionality test is done and how frequently required) | Medium-High |
| Non-additional projects: likelihood of significant questionable crediting | Medium-High (depends partly on stringency) | Low-Medium (depends partly on stringency) | Low | Low-High (depending on developer behavior; rigor of the approach and review process) |
| Non-additional projects: likely types of non-additional projects | All low-carbon technologies (hydro, nuclear, gas in coal-based countries) | Better-than-benchmark coal, and to less extent NG, plants | Small-scale low-carbon projects; some highly cost-competitive technologies | Highly cost-competitive technologies (NGCC, hydro, wind, etc.) and projects (efficiency retrofits) |
| Potential for overcrediting of projects that are additional | Medium-High (depending on stringency) | Low-Medium (depending on stringency) | Low | Low-High (depending on developer behavior; rigor of the approach and review process) |
| Missed opportunities: potential | Medium-High (esp. if efficient coal projects are allowed) | Medium-High (if low credit margins stifle fuel-switching projects) | Low-Medium | Low-Medium (depending on whether transaction costs are a significant share of expected credit revenue) |
| Missed opportunities: types of projects most likely to be missed | Higher-carbon projects (e.g. efficient coal) | Fuel-switching projects (coal to gas; any fossil to renewables) | | Smaller projects, e.g. renewables (due to transaction costs) |

1.7 Summary and Findings

Of the five aggregation dimensions reviewed here, the question of fuel/technology disaggregation is the most challenging. There are two principal options: single sector-wide benchmarks (all fuels or all fossil) and/or individual benchmarks for each fuel/technology. Each approach has both distinct advantages and blind spots. The single benchmark is simple and transparent, but only rewards low-carbon projects. Opportunities to improve the performance of high-carbon fuels and technologies might not be rewarded. These opportunities can be captured instead by the use of fuel-specific benchmarks. But use of fuel-specific benchmarks alone will not encourage or reward projects that involve switching to lower-carbon fuels. Therefore an integrated approach may be needed if multi-project or benchmark baselines are to be capable of encouraging and rewarding both types of projects: those that improve (supply-side) efficiency and those that switch from a higher-carbon to a lower-carbon fuel (or to demand-side efficiency).

While the distinction between peakload and baseload and between dispatchable and non-dispatchable resources is central to power system operation and its resulting emissions, we conclude that it is likely to prove impractical, in most cases, to disaggregate by duty cycle for setting CDM benchmarks. Simple methods for developing and applying separate peak and off-peak benchmarks could lead to misleading outcomes. Furthermore, few supply-side CDM projects are likely to disproportionately and reliably produce (or avoid) electricity on peak, and if they do (e.g., superefficient gas turbine; windfarm with very high peak coincidence) they might be effectively handled by using fuel/technology-specific benchmarks for peakers such as gas turbines. Most benchmark methods described here are generation-weighted averages and, since baseload and intermediate plants produce the bulk of electricity, will more closely reflect these types of facilities. The most likely source of highly peak-coincident CDM projects is demand-side efficiency, such as programs for lighting or air conditioner improvement where these are operated heavily during peak periods. For these and other types of small projects, a load curve approach is worth exploring further.

Retrofits are an important category of projects to consider separately from new projects. Opportunities to improve the fuel efficiency of power plants are abundant: repowering gas turbines into combined cycle facilities; improvements in fuel quality, handling, and boiler operations; and so on. In some cases, like a repowering or generator replacement, a retrofit is likely to increase the capacity (maximum production) of a power plant. In others, like improved boiler operation, lower fuel costs could increase capacity factor (extent of operation). In both situations, it is useful to think of a retrofit as two separate projects: one that “replaces” the original plant, and another that is responsible for increased generation.

To the extent that overall electricity production does not increase, the carbon intensity of the pre-retrofit plant can provide an appropriate benchmark. The retrofit is simply replacing the old pre-retrofit facility, which would have otherwise been operating at its higher carbon emissions level. It is possible, however, that the retrofit activity would occur anyway, without CDM credits, or that the retrofit will extend the lifetime of the power plant. These possibilities could be addressed by several means, such as tracking a cohort of similar facilities, or limiting the application of this pre-retrofit carbon intensity benchmark to a fixed number of years.

To the extent that electricity production does increase, this “added generation” can be treated as new capacity, since this added generation would not have occurred without the retrofit activity. The appropriate benchmark (fuel-specific or sector-wide) for new capacity would then apply only to the generation that exceeds previous levels. By accounting for the potential for increased generation at retrofit sites, this methodology also offers important environmental safeguards as well as the opportunity for investors to obtain additional credits (where the plant emissions rate is lower than the applicable benchmark). The two-part methodology is straightforward and offers consistency with the methodology used for other CDM projects. A similar approach can also be used for single-project baselines.
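The two-part retrofit methodology can be illustrated with a short calculation. The sketch below is a minimal illustration rather than a prescribed method: the function name, the plant figures, and the choice to let the added-generation component go negative when the retrofitted plant is dirtier than the new-capacity benchmark are all assumptions made for the example.

```python
def retrofit_credits(pre_gen_mwh, post_gen_mwh, pre_intensity,
                     post_intensity, new_capacity_benchmark):
    """Split annual retrofit credits (tCO2) into two hypothetical 'projects'.

    Intensities and the benchmark are in tCO2/MWh.
    """
    # Part 1: generation that merely replaces the pre-retrofit plant is
    # credited against the pre-retrofit carbon intensity.
    replaced_mwh = min(pre_gen_mwh, post_gen_mwh)
    replacement = replaced_mwh * (pre_intensity - post_intensity)

    # Part 2: generation above previous levels is treated as new capacity
    # and credited against the new-capacity benchmark. Assumption: this
    # component may be negative, which debits the project.
    added_mwh = max(post_gen_mwh - pre_gen_mwh, 0.0)
    added = added_mwh * (new_capacity_benchmark - post_intensity)
    return replacement, added


# Hypothetical plant: output rises from 1,000 to 1,200 GWh/yr and
# intensity falls from 1.20 to 1.00 tCO2/MWh; benchmark is 0.95.
replacement, added = retrofit_credits(1_000_000, 1_200_000, 1.20, 1.00, 0.95)
print(f"replacement credits: {replacement:,.0f} tCO2/yr")
print(f"added-generation credits: {added:,.0f} tCO2/yr")
```

In this hypothetical case the added generation is dirtier than the new-capacity benchmark, so the second component offsets part of the first. This is one way to read the environmental safeguard described above.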

Off-grid CDM projects could be large in number, but due to the small amounts of electricity involved, are unlikely to generate a significant fraction of total CDM activity. For this reason, low transaction costs for baselines are particularly important for this category of projects. A benchmark based on diesel generators, the world’s most common source of off-grid electricity whether used directly or for battery charging, might offer a reasonable baseline that can be applied throughout non-Annex 1 countries.


The appropriate level of spatial aggregation for benchmarks is difficult to determine without reference to local grid conditions. Where there are few technological (transmission capacity) or political/regulatory constraints for electricity to flow freely among several countries, then a multi-country benchmark would make the most sense. A CDM project in one country would then be avoiding new electricity production throughout the region. Conversely, if transmission or other constraints limit interchange among parts of a country, then sub-national benchmarks would be technically preferable. However, sub-national benchmarks may be administratively burdensome in all but the largest non-Annex 1 countries (e.g., India, China, Indonesia). National level benchmarks are a reasonable starting point. Case-by-case decisions could then determine whether to opt for regional benchmarks (where regional interconnections and planning are strong) or sub-national benchmarks (for countries with distinct grids or power pools that possess different resource profiles).

This section concludes by presenting a possible integrated approach to balance the key advantages of benchmark baselines (transparency, consistency, reduced developer gaming) and those of project-specific baselines (testing for additionality). One option is for the default benchmark to be fuel-specific for fossil projects and sector-wide for non-fossil projects. All projects passing under a project screen, that is, below a de minimis threshold in total credits, size (MW or GWh) and/or penetration rate, could be automatically approved for use with the default benchmark. The screen would require further additionality testing of: a) larger or more common project types; and b) projects whose developers request the non-default benchmark (e.g., a gas project requesting the sector-wide benchmark, claiming displacement of coal). As discussed in Section 2, this additionality test might be simple or complex, but in no case more so than a project-specific baseline. Since these added transaction costs would be largely borne by sizeable projects, they would likely represent a very small expenditure relative to the project investment. The benefits would be, in principle, a lower incidence of free riders and, as a consequence, greater credibility of the CDM.


Section 2. Minimizing the Risk of Non-Additional Credits

This section reviews three possible mechanisms to minimize the risk of granting credits to non-additional activity under the CDM: stringency, additionality testing, and credit discounting. The risk of non-additional credits is a central concern of the CDM, regardless of whether project-specific or benchmarked baselines are used. With benchmarking, the main concern is that the benchmark will be set at a level that allows too many non-additional projects and unwarranted credits. Therefore, it is important to consider alternative mechanisms whose purpose would be:

1. To reduce excess credits from free riders – activities that would be happening even in the absence of the CDM. Any project that has a lower GHG-intensity than a benchmark would be eligible to earn CDM credits. If a benchmark is set at the average GHG intensity for a project category, then half of all activity is, by definition, “better-than-average”. Absent a mechanism to weed out these “anyway” projects, they would be eligible for crediting. The sheer magnitude of these unwarranted credits could be quite large – potentially much greater than the credits earned by real additional reductions – and as a result undermine the credibility of the CDM.

2. To reduce the potential for over-crediting some projects. Over-crediting would result for any project that reduces emissions from a level that is already below the benchmark.

These and other baseline methods must balance maximizing environmental integrity (minimizing excess credits) with maximizing incentives for real reductions (minimizing transaction costs and crediting additional projects).

2.1 Stringency

One approach to constructing benchmarks that reduces the risk of granting unwarranted credits to non-additional activity is stringency. A more stringent benchmark implies a benchmark that is set at a lower carbon intensity. A more stringent benchmark allows fewer free rider projects, and awards fewer excess credits to allowed projects, compared with a more lenient benchmark such as average carbon intensity.

Figure 2.1 illustrates how stringency works. Project B, for example, would qualify under an average benchmark, but not under the more stringent, better-than-average, one shown. Project C would qualify under both, yet receive significantly fewer credits per MWh with the more stringent benchmark (the crediting rate is the distance between point C and each benchmark).

Figure 2.1 Average vs. better-than-average benchmarks (A-C are indicative projects)

[Figure: indicative projects A, B, and C plotted by carbon intensity (tCO2/MWh), with lines marking the average benchmark and the better-than-average benchmark.]


Minimizing excess credits through stringency could create an opposite concern. As benchmarks get lower, credits for some legitimate (additional) CDM projects would be eliminated or reduced, diminishing the incentives for undertaking these projects, and making them less competitive on the international carbon credit market. A stringent benchmark therefore must not be set so stringently as to reduce or eliminate the incentive for real reductions.

Stringency is one of several key parameters that define the benchmarking approach, and is discussed at some length in our earlier report. In this section, we review alternative methods for calculating better-than-average benchmarks, and then demonstrate how stringent benchmarks could be developed, both fuel-specific and sector-wide. (See Section 1.) For each, we show how benchmarks can be developed using local data to reflect technological and operational circumstances.

If stringent means better-than-average performance, then the obvious questions are: Whose average? How much better? And how would this level be determined? With respect to the first question, power plant performance could be measured relative to experience at the power pool, national, or international level, or some combination thereof. The broader question of the spatial/geographic level of benchmark aggregation is discussed in Section 1. If we assume that benchmarking is done nationally for the power sector, then good performance could be determined based on: a) the actual operating experience of power plants within a given country; b) design efficiencies of these power plants; or c) international experience or estimates. The choice among these options is a function of data availability (and cost) and judgment as to whether past experience provides a reasonable indicator of good performance.

Section 3 covers these and other data issues in greater detail. Assuming that a sufficient data sample is available, the following are three options for defining more stringent benchmarks:

• Percentile benchmarks (best X%), which are based on a relative definition of “good performance.” The distribution of facilities in terms of carbon intensity (or efficiency) is established, and a better-than-average criterion for good performance is set, such as the 25th percentile. The U.S. EnergyStar program for certifying energy-efficient equipment is an example of a percentile-based, better-than-average approach that uses the 25th percentile as the threshold for “good performance”.

• An average of historical good performance, e.g. the best X plants or best Y% in the most recent Z years. Since percentile methods can be subject to “knife-edge” solutions where distributions of carbon intensities are discontinuous, this approach will tend to yield more stable and predictable benchmarks. While discontinuity is more of a problem with sector-wide benchmarks where the distribution spans different fuels (see Figure 2.3), it can also occur with single fuels, such as natural gas, where there might be a gap between NGCC and other gas plants. There are obviously many possible variants of this method (values for X, Y, and Z). Ideally, the plants should be among the most recent vintage, e.g. past 5 years, to better reflect the characteristics of more current technologies. To be meaningful there should be a minimum number of plants of a particular type; a reasonable requirement would be that X should be less than or equal to half of the total number of plants in the past Z years.

• Performance standards. Another, more normative approach would simply be to establish an acceptable level of good practice by fuel type, such as current combined cycle technology for baseload natural gas plants. This would require little data collection, but would instead require building consensus on what constitutes good performance.
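The first two options can be made concrete with a small calculation. The sketch below is illustrative only: the plant data are invented, the `Plant` record and function names are hypothetical, and both the generation-weighted percentile and the generation-weighted "best X" average are assumptions about how such benchmarks might be computed, not the procedure used for the tables in this report.

```python
from dataclasses import dataclass


@dataclass
class Plant:
    name: str
    intensity: float       # tCO2/MWh
    generation_gwh: float  # annual generation
    year_online: int


def weighted_average(plants):
    """Generation-weighted average carbon intensity."""
    total = sum(p.generation_gwh for p in plants)
    return sum(p.intensity * p.generation_gwh for p in plants) / total


def percentile_benchmark(plants, pct):
    """Intensity at which cumulative generation, ranked from cleanest to
    dirtiest plant, first reaches pct percent of total generation."""
    total = sum(p.generation_gwh for p in plants)
    ranked = sorted(plants, key=lambda p: p.intensity)
    cumulative = 0.0
    for p in ranked:
        cumulative += p.generation_gwh
        if cumulative / total * 100 >= pct:
            return p.intensity
    return ranked[-1].intensity


def best_x_average(plants, x, since_year):
    """Average of the best X plants commissioned since a given year."""
    recent = [p for p in plants if p.year_online >= since_year]
    # Rule of thumb from the text: X at most half the plants in the window.
    if x > len(recent) / 2:
        raise ValueError("X should be at most half the plants in the window")
    best = sorted(recent, key=lambda p: p.intensity)[:x]
    return weighted_average(best)


# Invented sample of six plants (not data from this report).
plants = [
    Plant("A", 0.90, 400, 1995), Plant("B", 0.95, 600, 1990),
    Plant("C", 1.00, 1000, 1985), Plant("D", 1.10, 1500, 1980),
    Plant("E", 1.20, 500, 1996), Plant("F", 1.30, 1000, 1975),
]
print("average:", round(weighted_average(plants), 3))
print("10th percentile:", percentile_benchmark(plants, 10))
print("25th percentile:", percentile_benchmark(plants, 25))
print("best 2 since 1985:", round(best_x_average(plants, 2, 1985), 3))
```

Because the percentile benchmark jumps from one plant's intensity to the next, a small change in the chosen percentile can move the benchmark discontinuously. This is the "knife-edge" behavior that the averaging option is meant to dampen.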

Stringency and fuel-specific benchmarks

These options apply in the most straightforward fashion to fuel-specific benchmarks. We applied them to develop some sample stringent benchmarks for Indian coal plants. We then tested these benchmarks against the last five coal plants built in India. Clearly these plants were “non-additional” as they were built in the absence of any CDM incentives, i.e. they would have happened anyway. This approach allows us to determine the extent of free-rider credits, were a similar set of plants to seek credits by registering as CDM projects.

Table 2.1 Four alternative benchmarks for the Indian power sector, and the non-additional credits that would have been rewarded to the last five commissioned power plants

Non-additional credits from last 5 coal plants (ktCO2/yr):
  (a) Only counting years of better-than-benchmark performance
  (b) Net effects including years of worse-than-benchmark performance

Basis for benchmark                    Benchmark (tCO2/MWh)    (a)      (b)
Average of all coal plants [11]        1.105                   1094     1047
Last 5 coal plants commissioned        1.008                   324      0
10%ile of all coal plants              0.946                   134      -640
Best 5 coal plants in last 15 years    0.913                   122      -1030
Best available technology (IGCC)       0.760                   0

Table 2.1 shows five different methods for setting benchmarks, and the amount of non-additional credit each one generates, for the example of the Indian power sector. These benchmarks are calculated from the three most recent years of available data (1996-1998). The first column shows the basis on which the benchmark was established, and the second column shows the benchmark itself. Figure 2.2 indicates the 10th and 25th percentile benchmarks, and the five most recent plants.

The third and fourth columns of Table 2.1 require some prefacing remarks. Carbon intensity of coal stations can vary from year to year based on differences in plant operation, fuel quality, and other conditions. [12] As a result, a plant whose average performance over a series of years beats the benchmark may still have years in which the benchmark is actually exceeded. For instance, reduced operation and increased cycling due to low demand or forced outages during a given year will deteriorate a plant’s heat rate and carbon intensity. Should the investor’s “credit account” be debited to the extent its project’s carbon intensity exceeds the benchmark in particular years? Note that the general issue of worse-than-baseline performance must be addressed in CDM rules regardless of the baseline method chosen. Environmental integrity would appear to dictate that “negative crediting” should occur in such cases.

The difference between columns three and four of Table 2.1 illustrates the importance of “negative crediting”. Applying the “counterfactual test” described above, these columns show the quantity of non-additional credit that would be generated if the last 5 coal plants were registered as CDM projects, and if they earned credits to the extent that they performed better than the benchmark, based on performance during the years 1996-1998. The third column shows what happens if negative crediting is not done, i.e. years in which plants exceed the benchmark are simply ignored. The fourth column assumes that a power plant will earn negative credits in a given year if its performance does not surpass the benchmark. [13]
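The effect of negative crediting can be seen in a short calculation. The sketch below uses invented annual figures for a single hypothetical plant; the function name and data are assumptions for illustration and do not reproduce the values in Table 2.1.

```python
def total_credits(annual_data, benchmark, negative_crediting):
    """Sum credits (tCO2) over years of (intensity tCO2/MWh, generation MWh).

    With negative_crediting=False, years in which the plant exceeds the
    benchmark are ignored; with True, those years debit the account.
    """
    total = 0.0
    for intensity, generation_mwh in annual_data:
        credit = (benchmark - intensity) * generation_mwh
        if credit < 0 and not negative_crediting:
            credit = 0.0
        total += credit
    return total


# Hypothetical plant whose three-year average equals the benchmark.
years = [(0.90, 1_000_000), (1.00, 1_000_000), (1.10, 1_000_000)]
benchmark = 1.00

print(round(total_credits(years, benchmark, negative_crediting=False)))
print(round(total_credits(years, benchmark, negative_crediting=True)))
```

A plant that only matches the benchmark on average still collects credits for its good year when bad years are ignored, and collects essentially nothing when they are netted out.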

11 Based on generation-weighted average data for 1992-93 through 1998-99. Data collected by SAIC.

12 Variations in fuel quality and heat content were not tracked: the Indian coal plant data collected by SAIC assume a constant heat content for each plant’s coal. Variations in fuel quality may thus lead to overstated changes in carbon intensity.

13 The figures should be viewed merely as illustrative. The data years with low enough carbon intensities to fall below the benchmark for these recent plants are the years of start-up, where accounting (when tonnages of coal are measured) or other issues appear to create nonsensically low values. Nonetheless, the general points still hold, as similar variations occur for other well-established facilities.


Figure 2.2 Carbon Intensity of Indian coal power plants

To provide a sense of the scale of the non-additional credits shown in columns 3 and 4 of Table 2.1, consider two figures. First, a financial comparison: 100,000 tCO2/yr of credits at $10/tCO2 would be valued at $1 million/yr, an amount that could provide enough incentive to motivate the owners of these plants to register their projects for these non-additional credits. Second, a comparison in terms of mitigation activity: 100 ktCO2/yr is approximately the amount of credit that would be earned by one million solar home systems (at a typical average capacity of 50 watts per home).

Stringency and sector-wide benchmarks

One can also apply a similar stringency approach to sector-wide benchmarks. To set a sector-wide benchmark that reflects the range of fuels available to a particular electric sector, one cannot simply base the benchmark on the lowest-emitting examples of the sector’s power plants (as one could do for a fuel-specific benchmark). This would most likely yield a benchmark that is determined entirely by the lowest-carbon fuels. In the case of the US (see Figure 2.3), any benchmark more stringent than the 35th percentile would reflect only hydroelectric and nuclear generation.

Figure 2.3 Distribution of power plant carbon intensities in the US

[Figure, two panels. Left: histogram of generation (TWh) by carbon intensity (kg CO2/kWh). Right: carbon intensity (kg CO2/kWh) against cumulative generation (TWh), with curves for coal steam plants and for all generators. Marked reference levels: NGCC carbon intensity, average U.S. carbon intensity, median U.S. carbon intensity.]

The histogram on the left shows the distribution of plant generation by carbon intensity for a recent historical year. The cumulative generation chart on the right shows that for all generators the first nearly 1000 TWh (see bar in histogram next to y axis), or 35% of total generation, has zero carbon intensity. Since a percentile benchmark would be derived by picking a point along the “all generators” curve, the value of the benchmark could be highly sensitive to the precise percentile chosen.

[Plot for Figure 2.2: carbon intensity (tCO2/MWh) against cumulative generation, 96-98 avg (GWh), with the 10th and 25th percentiles marked.]


Our previous report made a case for looking at the mix of recent additions (including plants under construction) as a proxy for what is being built on the margin, and thus as a better estimate of the counterfactual than the mix of all plants in a given system (Lazarus et al., 1999). One method to reflect the mix of recent additions, and apply the notion of stringency, is as follows:

1. Derive the generation mix coming from recent additions, using the data sample guidelines referred to above.

2. Apply stringent fuel-specific benchmarks to each fuel.

3. Calculate the weighted average sector-wide carbon intensity.

Table 2.2 demonstrates this approach for the case of India, which has a mix of coal, gas, hydroelectric, and nuclear capacity.

Table 2.2 A sector-wide benchmark for the Indian power sector, constructed from a weighted average of fuel-specific benchmarks based on last five plants commissioned

Fuel                Fuel fraction of generation    Fuel-specific benchmark based on the best 5 plants
                                                   commissioned in the last 15 years (tCO2/MWh)
Coal                50%                            0.91
Gas/oil             33%                            0.50*
Hydro               16%                            0.00*
Wind/other          N/A                            N/A
Weighted average    -                              0.63

*In the absence of empirical data on performance of these technologies in India, hypothetical values are used.

Note that the same approach can be used with either empirically derived fuel-specific benchmarks or performance standards as discussed previously.
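The weighted-average step can be sketched in a few lines. The shares and fuel-specific benchmarks below are those listed in Table 2.2; dividing by the sum of the listed shares (which total 99%) is an assumption made here, chosen because it reproduces the table's 0.63 after rounding. The function and variable names are hypothetical.

```python
# (share of generation from recent additions, fuel-specific benchmark in tCO2/MWh)
fuel_mix = {
    "coal": (0.50, 0.91),
    "gas_oil": (0.33, 0.50),
    "hydro": (0.16, 0.00),
}


def sector_wide_benchmark(mix):
    """Share-weighted average of fuel-specific benchmarks.

    Assumption: shares are renormalized so that they sum to one.
    """
    total_share = sum(share for share, _ in mix.values())
    weighted = sum(share * bench for share, bench in mix.values())
    return weighted / total_share


print(round(sector_wide_benchmark(fuel_mix), 2))
```

Tightening any one fuel-specific benchmark lowers the sector-wide value in proportion to that fuel's share, so the stringency choice for the dominant fuel (coal in this example) matters most.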

2.2 Project Screens and Additionality Tests

Further additionality testing would be a step towards project-specific analysis, using criteria or assessments to judge the likelihood that a given project would have occurred in the absence of CDM. Testing methods might include financial additionality analysis, expert judgment, multi-objective assessment, or barriers removal criteria. This approach introduces a project-specific element, because it requires that projects be individually assessed with regard to specific criteria.

Since such testing could increase the complexity and transaction costs for baseline development, project screens could be applied: (i) to limit the additionality tests solely to those project types with the highest risk of generating questionable credits (large projects or those with already significant market penetration); or (ii) to exclude altogether certain activities that are most likely to be non-additional in a given context (e.g., large hydro in countries with low-cost sites).

For example, a project screen could be used to identify large projects or those with already significant market penetration. These projects are then subject to the further additionality test. The notion here is that the bulk of questionable credits (i.e., free riders) are likely to come either from larger projects, simply due to the scale of credits involved, or from smaller projects that employ commonly used technologies (e.g., DSM programs promoting compact fluorescent bulbs in countries with high penetration rates). Furthermore, the transaction costs imposed by further additionality testing are likely to be relatively insignificant for large investment projects, such as large-scale power plants, and in any case no greater than these projects would face under a project-specific baseline approach. Therefore, additionality testing, if designed appropriately and applied only to larger facilities and those smaller ones with a greater likelihood of non-additionality, might be implemented without significantly inhibiting the market for valuable CDM projects.

What would constitute reasonable project screens for identifying projects that should undergo further additionality tests?

Project screens seek to shield from the cost of further additionality tests those projects with the lowest likelihood of creating significant numbers of questionable credits. Because such projects are likely to be small in terms of total volume of credits created and involve technologies with limited penetration in a given market, project size and technology penetration could usefully serve as threshold criteria. There are two options for project screens that could be used separately or in conjunction with each other:

• Minimum Size: All projects above a certain size would be required to apply the additionality test. This de minimis threshold could be defined in terms of either MW, GWh per year, or tCO2 of credits per year. Consider, for instance, a threshold of 50 GWh per year, equivalent to a 10 MW power plant operating at about 60% capacity factor, a reasonable level at which to distinguish small from large scale power projects. The power plant itself might cost about $5-15 million, assuming capital costs in the range of $500 to $1500/kW typical of most thermal, hydro, and other renewable plants. If the project were credited at 0.2 tCO2/MWh, and the resulting 10,000 tCO2/year of CERs were to trade at $5/tCO2, then the annual credit revenue stream would be approximately $50,000 per year. Assuming the additionality test were to cost, say, approximately $10,000 per project in administrative and analysis fees for the first year only, it would comprise 20% of one year’s credit revenue. This figure is at the low end of the size threshold; for a 200 MW facility it would comprise only 1% of first year revenues. There are no well-established estimates for the key parameters (cost of additionality test and price of carbon), but this ballpark estimate suggests the costs involved might be manageable.

• Penetration rate (for smaller projects): All projects lower than the minimum size, but above a minimum penetration rate, would also need to pass the additionality test. Penetration rates reflect the extent to which a particular technology has already been adopted. Penetration rates are expressed as a percentage of the total population of technologies yielding comparable services, and apply to demand side technologies and activities as well as supply side, e.g., the proportion of lighting services provided by CFLs (as opposed to incandescent fixtures) in residential buildings, or the proportion of electricity produced from efficient natural gas combined cycles.

The purpose of the penetration rate threshold would be to identify those technologies that may be small in individual size but are already widely deployed. A technology possessing significant market share is, prima facie, more likely to be a free rider than other technologies. For electricity production, one could make first order penetration rate thresholds by simply using fuel shares, and ignore, for a moment, the question of specific technologies. Appendix B shows, for instance, that if set at a fuel share of 3%, a penetration rate trigger would force an additionality test for small geothermal projects in a handful of countries with established resources, like Indonesia and the Philippines.

• Penetration rate (larger projects). It is also possible to conceive of a stricter penetration rate threshold that could be applied to larger projects. In general, but for larger projects especially, penetration rate might be most usefully measured after the project is implemented, since the first project of a certain kind could easily take the penetration rate from 0% to 20% or beyond in a small country (e.g., an NGCC where it is the first natural gas plant).

If setting a discrete threshold seems arbitrary, an alternative is a graduated approach that provides partial crediting for an intermediate degree of penetration, analogous to the method suggested by Meyers below for financial additionality. For example, the approach could provide full credit if a technology is below some penetration level, zero credit if the technology is above an upper penetration level, and credit that is discounted linearly if the technology falls between the limits.

The use of penetration rate as an indicator of activity additionality will need to be tailored to the particular project type in question. The threshold penetration that reflects a transition from emerging to commercial will depend on the activity and the context. For example, a penetration rate of 1% might reflect a newly emerging status for a bioenergy technology in a country with large unexploited biomass resources, whereas 1% might correspond to nearly complete market saturation for hydroelectricity if that country has tapped much of its existing hydro resources. Therefore, the penetration rate threshold would have to be realistic for the activity and the sectoral context, and will have to adequately distinguish emerging from commercial technologies.
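The ballpark arithmetic for the minimum size screen, and the way the two screens combine, can be checked with a short script. This is a sketch under the illustrative values quoted above (0.2 tCO2/MWh, $5/tCO2, a $10,000 test, a 50 GWh size threshold and a 3% penetration trigger); the function names are hypothetical and the thresholds are examples, not recommendations.

```python
HOURS_PER_YEAR = 8760


def annual_gwh(capacity_mw, capacity_factor):
    """Annual generation in GWh for a plant of the given size."""
    return capacity_mw * capacity_factor * HOURS_PER_YEAR / 1000


def screening_cost_share(generation_gwh, credit_rate=0.2,
                         price_usd_per_t=5.0, test_cost_usd=10_000):
    """Additionality test cost as a fraction of one year's credit revenue."""
    credits_t = generation_gwh * 1000 * credit_rate  # tCO2 per year
    revenue_usd = credits_t * price_usd_per_t
    return test_cost_usd / revenue_usd


def needs_additionality_test(generation_gwh, penetration,
                             size_threshold_gwh=50.0,
                             penetration_threshold=0.03):
    """Large projects are always tested; small ones only if the
    technology's penetration rate is above the trigger."""
    return (generation_gwh > size_threshold_gwh
            or penetration > penetration_threshold)


print(round(annual_gwh(10, 0.6), 2))               # 10 MW at 60% capacity factor
print(f"{screening_cost_share(50):.0%}")           # project at the 50 GWh threshold
print(f"{screening_cost_share(1000):.0%}")         # project twenty times larger
print(needs_additionality_test(20, penetration=0.10))
```

The cost share falls in inverse proportion to project size, which is why the burden of testing is lightest on the projects most likely to generate large volumes of questionable credits.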

How would the additionality test be done?

Several methods have been proposed:

1. Financial additionality. Several observers have suggested the possibility of determining additionality based on financial analysis. The rationale is that a truly additional project would not be financially attractive were it not for the expected financial value of carbon credits. Following this reasoning, one should be able to show that the project is financially unattractive, e.g. that its anticipated return on investment (ROI) is below a certain threshold of acceptable return (or best return from any other mutually exclusive project). Meyers (1999) suggests the establishment of a range, which could be used to judge a project’s probability of additionality:

If the ROI falls below some minimum threshold, the project is presumed to be additional with 100% probability. If the ROI falls above some maximum threshold, it is presumed non-additional with 100% probability. Between the thresholds, the presumed probability of the project’s non-adoption is interpolated. To calculate carbon credits, the amount of emission reductions associated with the project would be scaled by the estimated probability of additionality. If the acceptable ROI range for a given sector is 20-40%, for example, a project with an estimated ROI of 30%/year would receive credit for half of the calculated emissions reductions. (p.6)

There are significant challenges with the financial additionality approach. It requires the developer to agree to divulge financial information that is often considered confidential. It does not take into account non-financial barriers that make it difficult to implement certain types of projects. It may prove difficult to account for the trade-off between risk and reward that may exist in a sector. And it is sensitive to assumptions about economic parameters that may not be easy to agree on and that could be subject to gaming unless standardized (fuel prices, credit price, etc.).

2. Market barriers. Even if it is cost-competitive, a project may not be adopted because of identifiable market barriers. (IEA, 1997) In many cases, market barriers prevent the adoption of technologies, and CDM projects can foster truly additional activities by helping to eliminate those market barriers that would otherwise be insurmountable. On the other hand, there might be certain market barriers that could and should be addressed independently of any CDM activities for reasons of economic efficiency. For example, many argue for the elimination of subsidies for fossil fuels, which make energy efficiency or renewable alternatives uncompetitive. Screening a project to determine whether market barriers are being effectively overcome might be a workable methodology, especially for small-scale and efficiency projects where market failures are more likely to exist. Testing additionality by assessing the relevant market barriers will therefore have to address the issue of what barriers are responsible for the exclusion of a technology from the market and whether those barriers are legitimate grounds for deeming a prospective CDM project additional. (Michaelowa and Dutschke, 1999; Puhl, 1999)

3. Project-specific. Leave it up to the developer to make its best showing to a review panel.

A simple yes/no determination of additionality will be nearly impossible. In other words, there is some probability that a given project would have happened anyway. As noted above, the first option and the penetration rate screen for larger projects can both reflect a probability of additionality in the discounting of the credits available (see next section). This probability could also be assessed on the basis of a multi-objective assessment by the CDM review panel.
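The ROI range that Meyers (1999) describes under financial additionality can be illustrated with a short sketch. This is only one reading of the quoted passage: it assumes linear interpolation between the two thresholds, and the function names and the percentage units are illustrative choices rather than anything specified in the source.

```python
def additionality_probability(roi, roi_min, roi_max):
    """Presumed probability that a project is additional, given its ROI.

    Assumption: linear interpolation between the thresholds. At or below
    roi_min the project is presumed additional (probability 1); at or
    above roi_max it is presumed non-additional (probability 0).
    All arguments are in percent per year.
    """
    if roi_min >= roi_max:
        raise ValueError("roi_min must be below roi_max")
    if roi <= roi_min:
        return 1.0
    if roi >= roi_max:
        return 0.0
    return (roi_max - roi) / (roi_max - roi_min)


def scaled_credits(emission_reductions, roi, roi_min, roi_max):
    """Scale calculated emission reductions by the probability of additionality."""
    return emission_reductions * additionality_probability(roi, roi_min, roi_max)


# The example from the quoted passage: a 20-40% range and a 30% ROI.
print(additionality_probability(30, 20, 40))  # 0.5
print(scaled_credits(1000.0, 30, 20, 40))     # 500.0
```

With the 20-40% range from the quotation, a project at 30% receives credit for half of its calculated reductions, which matches the figure Meyers gives.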

2.3 Credit Discounting/Standardized Criteria

Another means to minimize excess credits would be to set the benchmark at average performance levels, apply standardized criteria to judge a project’s additionality, and then to “discount” the calculated credits according to the likelihood of non-additionality. Credits would be awarded as follows:

Not discounted: Credits = (Benchmark – Project Carbon Intensity) × kWh generated

Discounted: Credits = D × (Benchmark – Project Carbon Intensity) × kWh generated

where 0 ≤ D ≤ 1

D would be set so as to offset the impact of non-additional projects. Two options are possible: D can be common to all projects, so as to reflect the probability of non-additionality on average across the whole sector. The notion is that, on aggregate, the sector as a whole generates no more credits than are warranted; the amount of credits lost by fully additional projects would be roughly equivalent to those provided to non-additional projects. In theory, the result, if such a D could be estimated accurately, would be environmentally neutral. However, the possible response to any mechanism that automatically lets in a significant amount of non-additional credits (projects that pose little additional cost to developers) could be to depress credit prices, and undermine the market for truly additional projects.

Or D could be customized to different project types. This would provide greater accuracy if one had a reasonable notion of which project types had the greatest probability of being additional. In this case, D might be determined on the basis of technology penetration rate or ROI for generic investments.

Table 2.3 is a hypothetical example of the type of general criteria that could be used for setting D. Projects that are virtually assured of being additional would be assigned values of D equal (or close) to 1, whereas projects that have some probability of being non-additional are assigned values less than 1.


Table 2.3 Discounting credits to reflect likelihood of non-additionality: a hypothetical framework

| Discounting factor D | Project criteria | Example |
| --- | --- | --- |
| 1.00 | Advanced, emerging, not yet proven technologies in a context where they are clearly not yet cost-effective. | Grid-connected fuel cells where other less costly technologies are available. |
| 0.75 | Advanced technologies that are not yet deployed, though they might be cost-effective absent market barriers, lack of information, riskiness, etc. | Grid-connected wind, where resources are good but technical infrastructure is lacking. |
| 0.50 | Technologies that are not yet deployed, but are cost effective and barriers are small | Compact fluorescent lighting |
| 0.25 | Cost-effective technologies that are just emerging in the market | Advanced NG combined cycle power plants |
| 0.00* | Technologies that are already widely deployed in the market | Standard hydro facilities at low-cost sites |

*Note: setting the discounting factor equal to zero would be equivalent to rejecting the investment as a CDM project.
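The discounting formula and the hypothetical factors of Table 2.3 can be combined in a small sketch. The category labels below are invented shorthand for the table rows, and the benchmark, project intensity, and generation figures are made-up inputs chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical labels paraphrasing the rows of Table 2.3 (not from the source).
DISCOUNT_FACTORS = {
    "unproven_not_cost_effective": 1.00,
    "not_deployed_with_barriers": 0.75,
    "not_deployed_small_barriers": 0.50,
    "emerging_cost_effective": 0.25,
    "widely_deployed": 0.00,
}


def credits(benchmark, project_intensity, kwh_generated, discount=1.0):
    """Credits = D x (Benchmark - Project Carbon Intensity) x kWh generated.

    Intensities are in kg CO2/kWh, so the result is in kg CO2.
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount factor D must lie between 0 and 1")
    return discount * (benchmark - project_intensity) * kwh_generated


def credits_for_category(category, benchmark, project_intensity, kwh_generated):
    """Look up D for a project category and apply the discounted formula."""
    d = DISCOUNT_FACTORS[category]
    return credits(benchmark, project_intensity, kwh_generated, d)


# Illustrative inputs: benchmark 1.0 and project 0.5 kg CO2/kWh, 1 GWh generated.
full = credits(1.0, 0.5, 1_000_000)
discounted = credits_for_category("not_deployed_with_barriers", 1.0, 0.5, 1_000_000)
print(full, discounted)
```

A factor of zero returns no credits at all, which is the equivalence the table note points out.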

The appeal of this approach is that it recognizes that evaluations of additionality are uncertain by their very nature, and that an unambiguous yes/no assessment of a project’s additionality is difficult. This approach provides a simple way to preserve some level of incentive for projects that are potentially additional, while reducing the impact of false credits generated when projects prove non-additional. However, it could prove hard to agree upon standard criteria and discounting factors, and the outcome could well deter many valuable CDM projects.

2.4 Conclusions

This section presents three methods for minimizing excess credits: stringency, additionality screening/testing, and credit discounting. These are not mutually exclusive methods. The principal features of each are presented in Table 2.4. None of these methods is fully objective or prescriptive; each presents key parameters that would require agreement either at the national or international levels: stringency criteria, additionality thresholds, or discounting criteria. Each method is likely to deter some amount or type of potentially valuable CDM activity, while reducing the likelihood of free-rider and over-credited projects.

Stringency appears the most straightforward of the methods; a single value for the benchmark would be developed and applied to all projects in a common manner. There are several ways to define a better-than-average benchmark. These are summarized along with suggested approaches (in bold) in Table 2.5 below.


Table 2.5 Options for setting better-than-average benchmarks

| If sufficient, reliable data… | Fuel-specific benchmarks | Sector-wide benchmarks |
| --- | --- | --- |
| …are available | • Weighted average; • Percentile basis (best X %); • Best X plants or Y% in the last Z years; • Combination of the above | • Same as fuel-specific, all plants; • Same as fuel-specific, weighted by mix of recent additions (last Y plants or Z years) |
| …are unavailable | • Same as above for region or other countries in region; • Performance standards (set nationally or internationally) | • Same as fuel-specific, weighted by mix of recent additions (last Y plants or Z years) |

However, relative to average performance, stringent benchmarks will reduce the amount of credits granted to all CDM projects. In cases where the project carbon intensity is only slightly better than the stringent benchmark, the difference could have a significant impact on total credit revenues. Like benchmarks in general, stringency will only help to distinguish additional projects from non-additional projects to the extent that a project’s likelihood of being additional correlates well with how low its carbon intensity is.
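To make the effect of stringency on credits concrete, the following sketch computes three of the benchmark options (generation-weighted average, best X plants, and a percentile) for a made-up plant sample, and compares the credit per kWh a project would earn under each. The plant figures, the nearest-rank percentile rule, and the function names are all assumptions for illustration; the report does not prescribe a particular percentile definition.

```python
import math

# Made-up sample: (carbon intensity in kg CO2/kWh, annual generation in GWh).
PLANTS = [(1.2, 100), (1.0, 200), (0.9, 300), (0.8, 300), (0.6, 100)]


def weighted_average(plants):
    """Generation-weighted average carbon intensity."""
    total_generation = sum(gen for _, gen in plants)
    return sum(ci * gen for ci, gen in plants) / total_generation


def best_n_average(plants, n):
    """Generation-weighted average of the n lowest-intensity plants."""
    best = sorted(plants, key=lambda plant: plant[0])[:n]
    return weighted_average(best)


def percentile_intensity(plants, pct):
    """Nearest-rank percentile of plant intensities (an assumed convention)."""
    intensities = sorted(ci for ci, _ in plants)
    rank = max(1, math.ceil(pct / 100 * len(intensities)))
    return intensities[rank - 1]


average_benchmark = weighted_average(PLANTS)
stringent_benchmark = best_n_average(PLANTS, 2)
percentile_benchmark = percentile_intensity(PLANTS, 25)

project_intensity = 0.7  # hypothetical CDM project, kg CO2/kWh
credit_average = average_benchmark - project_intensity
credit_stringent = stringent_benchmark - project_intensity
reduction = 1 - credit_stringent / credit_average

print(round(average_benchmark, 3), round(stringent_benchmark, 3), percentile_benchmark)
print(round(reduction, 2))
```

In this sample the gap between the average and stringent benchmarks exceeds the gap between the stringent benchmark and the project, so the project loses well over half of its credits, which is the situation the paragraph above warns about.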

The other two methods – additionality tests (with project screens) and discounting – could be applied with average benchmarks, resulting in more credits and wider eligibility (on the basis of carbon intensity) for projects than more stringent benchmarks would allow. They could be applied with more stringent benchmarks, as well. However, the challenge is to come up with simple additionality tests that obviate the potentially costly analysis and cumbersome review process of project-specific baselines. These methods might also prove more politically challenging to adopt, as each appears to favor certain types of investments: smaller projects in the case of additionality testing and emerging technologies (for example) in the case of discounting.

In summary, all three methods are worthy of further consideration. While several details would need to be worked out, these methods offer practical options for reducing the excess credits that would be generated by an average performance benchmark. Whichever method is adopted, some means to reduce excess credits will be essential to the environmental integrity of the CDM.


Table 2.4 Contrasting methods for minimizing non-additional credits

| Criteria | A. Stringency (Section 2.1) | B. Project screens and additionality tests (Section 2.2) | C. Credit Discounting/Standardized Criteria (Section 2.3) |
| --- | --- | --- | --- |
| Key parameters to agree on | Stringency criterion (best X plants, Yth percentile, etc.) | De minimis threshold, penetration threshold; measures of additionality (rate of return, penetration levels, market barriers, etc) | Criteria for and extent of discounting |
| Effectiveness at minimizing excess credits | Relatively effective if additionality is well correlated with low-carbon intensity within category (e.g. fuel-specific or sector-wide), i.e. carbon intensities for additional projects would be concentrated at low end of distribution. | If additionality tests are reliable, this is the most effective method. (Only this method will limit non-additional low-carbon projects that pass stringency and discounting criteria.) | Depends on the extent to which discounting criteria can accurately reflect the probability of additionality of various project types. |
| Reduced crediting of “additional” projects | Potentially significant (relative to average benchmark method). If difference between stringent and average performance benchmarks were larger than the difference between project carbon intensity and benchmark, then the number of credits would be reduced by 50% or more, with a corresponding reduction in economic incentives. | Less significant. As discussed in Tellus 1999, average performance benchmarks could also reduce some crediting relative to project-specific baseline methods but the magnitude is difficult to estimate. | Same as above. |
| Data adequacy and availability | Available in many, but not all countries | For project screens: sizing information readily available for a de minimis threshold; data for penetration threshold is more complex. For additionality tests: financial additionality could require confidential data; penetration rate is simpler for larger projects (wind farms), more complex and data-intensive for smaller, distributed investments (fuel cells, efficiency) | Will depend on the criteria for discounting. |
| Cost of analysis and review | Lower than project-specific due to economies of scale if volume of projects is significant. | Costs of additionality test could be significant if more complex methods used. Costs would be similar to project-specific baselines, but only applied to larger projects. | Could be relatively low cost, if criteria are simple. |
| Administrative feasibility | Simple, once stringency criterion is set. | All projects passing screen will require review similar to project-specific baselines | Simple, once discounting rules are set. |


Section 3. Collecting Data and Defining Cohorts

All baseline methods will require the collection and analysis of reliable data and reasonable assumptions regarding presumed counterfactual activities, or proxies thereof. Benchmarks for power sector projects will require specific types of data to be collected about existing power plants, plants under construction, and possibly those planned or otherwise expected to be built in the near term. This section briefly reviews these requirements, and the data collected for case study countries and analytical exercises described elsewhere in this report. The potential for using near-term projections as sources of “data” for benchmarking is also discussed.

The setting of empirically-based benchmarks requires defining a cohort, a sample of relevant existing or planned facilities whose performance characteristics (e.g., carbon intensity) would be used to determine the benchmark value. Section 3.3 describes how a cohort could be developed, taking into account factors such as in-service date and data gaps.

3.1 Collecting Data

The data requirements for setting electric sector benchmarks are fairly straightforward. Ideally, the following parameters should be available for several recent years:

• plant capacity (MW)
• annual generation (MWh)
• plant efficiency (%), heat rate (Btu/kWh), or specific consumption (kg fuel/kWh)
• fuel type and characteristics (carbon and energy content)
• technology type
• in-service date
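The parameters listed above can be held in a simple per-plant, per-year record from which carbon intensity follows directly. The sketch below is one possible layout, not a format used in the study: the field names and units are assumptions, fuel use is expressed as annual tonnes rather than as a heat rate, and the sample values are invented.

```python
from dataclasses import dataclass

CO2_PER_C = 44.0 / 12.0  # mass ratio of CO2 to carbon
HOURS_PER_YEAR = 8760.0


@dataclass
class PlantYear:
    """One year of data for one plant (hypothetical field names and units)."""
    name: str
    capacity_mw: float
    generation_mwh: float
    fuel_type: str
    fuel_tonnes: float                # annual fuel consumption
    heat_content_gj_per_tonne: float  # energy content of the fuel
    carbon_content_kgc_per_gj: float  # carbon content of the fuel
    technology: str
    in_service_year: int

    def carbon_intensity(self):
        """Carbon intensity in kg CO2 per kWh generated."""
        energy_gj = self.fuel_tonnes * self.heat_content_gj_per_tonne
        co2_kg = energy_gj * self.carbon_content_kgc_per_gj * CO2_PER_C
        return co2_kg / (self.generation_mwh * 1000.0)

    def capacity_factor(self):
        """Share of the year the plant ran at the equivalent of full output."""
        return self.generation_mwh / (self.capacity_mw * HOURS_PER_YEAR)


# Invented values for a coal plant, for illustration only.
plant = PlantYear(
    name="Example Coal 1", capacity_mw=500.0, generation_mwh=3_000_000.0,
    fuel_type="coal", fuel_tonnes=1_800_000.0,
    heat_content_gj_per_tonne=18.0, carbon_content_kgc_per_gj=25.8,
    technology="subcritical steam", in_service_year=1995,
)
print(round(plant.carbon_intensity(), 3), round(plant.capacity_factor(), 3))
```

Storing fuel quantity, heat content, and carbon content separately keeps each uncertain input visible, which matters when heat content varies between fuel deliveries.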

The process of obtaining reliable data for these parameters is not straightforward.14 No international compilations of electric sector data cover these specific elements. The International Energy Agency collects information aggregated across the entire electric sector, but not at the individual plant level.15 The Utility Data Institute’s World Electric Power Plants Database provides information about nearly 100,000 generating units in over 220 countries including capacity, design fuel type, and in-service date, but coverage is incomplete in many countries and no data are provided about actual operation (MWh generated, efficiency, fuel consumption).16

Since the requisite data is not presently compiled at the international level, it is necessary to assemble the data from primary sources of data within the electric sectors of individual countries. Few countries presently collect the full complement of required data and make them publicly available.17 Therefore, as part of this present analytical effort, USEPA commissioned SAIC and Tellus Institute to collect this type of data at the national level. Plant level data were collected for India, Jordan, South Africa, Thailand, and Venezuela. The rationale for choosing these countries had more to do with ready access to data through existing working relationships than with any other factor. The process of communication, data collection,

14 The rationale and requirements for collecting plant-level data are discussed in Section 4.2 of Lazarus et al., 1999 (http://www.tellus.org/seib/publications/benchmarking_full.pdf).
15 http://www.iea.org
16 http://www.udidata.com
17 For instance, the data made available by the US Energy Information Administration (http://www.doe.eia.gov) are largely sufficient, but such sources are the exception rather than the rule.


and processing into an analysis-ready format took 2-3 months, with total transaction and labor costs ranging from approximately $1,000 to $10,000 per country. These costs are provided for reference, but are not necessarily indicative of the “transaction costs” implied by power sector benchmarking.

Ideally, information would be based on actual operating data. However, an alternative, less accurate option is to substitute “nameplate” parameters based on manufacturer’s specifications for actual operating data. Such data were collected for Thailand. A third option is the use of regional or global defaults, based on typical values. The latter two are feasible for parameters such as fuel type and plant efficiency.

For none of the countries investigated were we able to gather the full complement of data listed above. Gaps include missing regions (a few states in India), some technology/fuel types (oil and gas plant performance in India), and some generators (municipal producers in South Africa, independent power producers in Thailand, and others). In most cases, the data do exist but could not be collected due to the level of effort required to reach individual utilities or plant operators (as was the case in India). The existence of common reporting requirements (to system operators, national utilities or government agencies) could likely overcome this constraint.

The potential for political and commercial sensitivity and bias in reporting performance data poses more formidable constraints. Where electric sectors are restructured and competition is introduced, such data can be regarded as proprietary by competing producers who may wish to keep their cost structures

Box 3.1 Data collected for Indian coal plants: an example

Coal plants in India are potentially an important CDM project category, considering the relative contribution of Indian coal plants to developing world power sector emissions. The population of Indian coal plants for which data were collected is shown below. This chart shows the relationship between coal plant carbon intensity and power plant age. Each point depicts one plant’s online date and average carbon intensity. This data set is a clear example of the wide spread in power plant performance and its variation over time.

[Figure: Indian Coal Power Plants. Scatter plot of carbon intensity (kgCO2/kWh, axis from 0.500 to 2.100) against power plant online date (axis from 1945 to 2005).]

These data are sufficient to examine the implications of different benchmarking approaches using historical data. However, results should be regarded as indicative, as more complete data would be needed for establishing reliable benchmarks (closer attention to heat content, completeness of national coverage, etc.).


confidential. Deregulation of the US power sector, for example, has already had an observable effect on the availability of data, as the US Department of Energy’s Energy Information Administration no longer provides data on individual power plant efficiencies. Ironically, the existence of a global climate regime might itself make power sector data less available, as data on energy use and GHG emissions in non-Annex I countries becomes important in the negotiations around both baselines and national commitments. Finally, even in cases where data is available, there are serious concerns about data reliability, since both the host and investor parties see a strong financial incentive to submit data that inflates a CDM project’s emissions reductions. (Repetto, 2000) It should be emphasized that all baseline techniques are subject to these concerns, although some level of data verification could be incorporated into baseline review.

Though data collected for our sample countries would not be an adequate basis for setting actual baselines for CDM projects, they suffice for analysis of benchmark methods. Furthermore, it is reasonable to assume that a CDM regime could provide the impetus for more complete and systematic collection of power sector data, filling many or all of the gaps identified here.

3.2 Projections as a Source of Data for Benchmarks

Electric sector plans and projections are available in many countries. These documents could provide the basis for creating (or adjusting) baselines that consider likely future changes not accounted for by historical data and experience. For example, a country may be nearing the end of its exploitable hydro resources or just beginning to tap natural gas resources available due to recent discovery or a new pipeline.

Or take the case of gas and oil-fired power technology, which is currently undergoing considerable change. Until recently, most oil and gas plants have consisted of steam and combustion turbines. Combined cycle facilities, offering 20-40% lower fuel consumption and carbon intensities, are rapidly becoming the technology of choice for new baseload and intermediate capacity. CC technology itself is improving rapidly, with advanced systems offering 10% or more in fuel savings in the years to come (see Table 1.3). Even a very stringent fuel-specific benchmark, if set on historical data alone, would likely be higher than the carbon intensity of new combined cycle facilities currently being built around the world.

Over the short-term, e.g., up to five years ahead, forecasts and expert judgment might be reasonably reliable at predicting the mix of new capacity types and their general performance characteristics. Due to the construction time of many larger facilities (e.g., 2-5 years), major changes in the type of new facilities built can generally be well anticipated over this time horizon. But changes in fuel prices, emergence of new technologies, regulatory changes, and structural changes (such as the shift to deregulated electricity markets) render it difficult to make credible longer-term forecasts.

Projections inevitably embody the subjective perceptions and biases of their makers. In some cases, projections are not attempts to forecast future electric sector developments, but are planning targets that are admittedly over-ambitious. Sometimes, projections are biased by financial motives, for example when regulated electricity rates – and hence utility revenues – are investment-based. If the CDM becomes an important source of revenue and electric sector baselines are based on projections, then there will be a further bias due to the incentive to maximize credit revenues by publishing projections that inflate estimates of carbon emissions.

However, it may be unnecessary to use longer-term projections for CDM baseline setting. If one assumes that a CDM project is replacing another generation investment or delaying it for a few years, then a benchmark need only reflect the counterfactual conditions around the time the CDM project decision is made. (If one assumes instead that no investment is displaced, and the CDM project merely avoids


generation from the then-current units on the operation margin, then a long-term forecast or fully dynamic baseline would be necessary to reflect future generation mix.) Information about power plants that are under construction, planned, or expected over the coming X (e.g. 5) years18 can then supplement historical data for deriving a benchmark.

3.3 Defining the Cohort

Baselines are supposed to reflect the source of electric power that is likely to be used in the absence of CDM activities. Obviously, this counterfactual source of power is only a conjectured scenario that cannot actually be observed and measured. But ideally, the counterfactual source of power can be inferred by looking at the real world electric sector, and defining a control sample, or cohort, for the counterfactual activity. This cohort would be a sample of plants that are of the appropriate vintage, technical characteristics, etc.

The concept of a cohort is relevant for both project-specific baselines and multi-project baselines. For the former, the cohort would be a sample population that embodies a particular counterfactual, which is defined on a project-specific basis. For the latter, the cohort would be a sample population that reflects whatever generation category the benchmark is designed for (such as sector-wide generation, fuel-specific generation, fossil-only, renewables, off-grid, regional sub-sectors, etc). The following discussion pertains primarily to cohorts in the context of benchmarks, but is largely relevant to either situation.

For example, if a CDM project consists of a new efficient power plant, and the presumed counterfactual is a new conventional power plant, the cohort could be defined, say, as a set of ten baseload power plants constructed within the last 3 years. As a second example, consider a CDM project that consists of improving the efficiency of an existing coal power plant that came into operation in 1990 (say, by upgrading the boiler). The appropriate counterfactual is the power plant itself without a boiler upgrade, and a reasonable cohort would then be a set of similar vintage coal power plants.

Which plants should be included in the cohort for new power plants?

As discussed in other sections of this report, a cohort could be defined in many ways. A few examples are:

- the best 3 plants in the most recent 10 years
- the most recent 5 plants built and under construction
- the 25th percentile of plants built in the last 15 years

To most closely reflect the counterfactual activity, the ideal cohort would consist of new capacity that is contemporary with the CDM project, i.e. that was commissioned or came into service around the same time as the CDM project.19 This suggests several options for categories of plants that could provide data on plants contemporary with the CDM project in question, in order of increasing inclusiveness:

1. Operational plants for which data is already available.
2. All operational plants.
3. All operational plants and plants under construction.
4. All operational plants, plants under construction, and planned plants.

18 Where a competitive power sector environment exists and has displaced the planning process, expert judgment can be used in the place of planning documents to gain insights on likely construction activity.
19 Defining what is “contemporary” with the CDM project is subject to debate. Is a contemporaneous project one that would have been conceived or operational at the same time? Time of conception would likely make the most sense, both for practical reasons as well as better approximating the counterfactual.


The first category offers the most certainty to the project developer, since all the required information is already available and benchmarks can be established before the finalization of the project. Complete and official national-level energy data (e.g., IEA, UDI, or national compilations) lags by anywhere from a few months to two years depending on the situation, which might exclude from consideration plants that would otherwise be good cohort members. However, for data as specific as the operational performance of specific power plants, data need not necessarily lag more than weeks or a few months, as evidenced by data collected (by SAIC) as part of this research effort. (It might be necessary to discard data for a plant’s first year or two of operation, because early performance can suffer because of start up conditions that are unrepresentative of long-term behavior.)

The subsequent three categories increasingly rely on ex post data, requiring that benchmarks be set after the project approval process is complete. This poses some uncertainty to the investor, but this risk would be limited if the investor had a good sense of what other plants are under construction or planned.

As the fourth category suffers from unreliability and the potential for gaming of forecasts, the third option is probably the best that can be done to develop a cohort as contemporary as possible with the project.

Some general guidelines for defining a cohort for new power plants

To construct a meaningful sample of local plant performance data, a number of guidelines and adjustments would be required, examples of which are the following (with possible values given in parentheses):

• Consider plants coming into operation within the most recent Y (5) years – starting with the most recent year for which full data are available. This period should be extended up to Y* (15) for countries/regions with limited activity in this plant type, or until the total sample reaches a minimum number (10) of plants. The method for determining the sample will depend on the stringency method (percentile or average – see below) and updating issues (see Section 1 of this report).

• For carbon intensities20, use generation-weighted averages21 over the most recent years, from one to Y** (5) years22. Exclude the first 1-2 years of operation if start-up performance (heat rate and availability) is poorer than expected over the longer term.

• Estimate average capacity factors if necessary, since new plants often take several years to achieve typical performance characteristics. Rather than rely on initial operating characteristics, benchmarks should use expected or national average capacity factors for these plants when calculating generation-weighted averages. For instance, for India, we used capacity factors of 60% for new coal plants less than 3 years old where calculations are marked “adjusted generation” below.

20 Carbon intensity is typically derived largely from four key parameters: electricity output, fuel consumption (in physical terms, e.g. tons), heat content of the fuel, and carbon content. Where data are available, the first two parameters are generally accurate to within a few percent (fuel consumption less so, electricity output more). Heat content of the fuel, however, especially for unwashed coal, can be highly variable from one fuel delivery to the next, especially if multiple mine sources are used.
21 All calculations of carbon intensity should be weighted by plant generation, so that each kWh equally contributes to the average.
22 Using more than one year is important. Temporarily low and high reserve margins due to economic upturns or downturns, prolonged droughts, fuel supply changes, and unplanned outages can cause a few years of operating data to be somewhat unrepresentative as an indicator of future conditions. Using an averaging period of 3-5 years should smooth over deviations that are merely transient.


• Include plants under construction. If a plant is considered X% (10%) complete, then it too should be included in the analysis. Design heat rates should be used, along with characteristics of expected fuel source (national average fuel characteristics could be used if not determined).

• Specify the expected dominant fuel (over the next 5 years) for dual fuel facilities,• Adjust for new environmental controls . Where environmental requirements have negatively

affected power plant carbon intensities (e.g., scrubbers on newer coal plants) over the time periodconsidered, then the performance of older uncontrolled facilities should either be adjusted tosimulate the addition of controls or excluded altogether.

• Exclude other CDM-supported facilities.

• Include non-CO2 GHGs if emissions are significant (e.g., methane from some reservoirs). Full fuel-cycle analysis, however, is not generally recommended due to its inherent complexity and site-specificity. Nevertheless, since off-site emissions can be significant, it might be advisable to use generic upstream emission factors provided by a credible source (such as the IPCC).
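The data-handling rules listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for the country analyses: the `Plant` fields, the function names, and the unit (tCO2/MWh) are hypothetical, and the 60% capacity factor and 3-year age cutoff simply mirror the India example. The sketch substitutes an expected capacity factor for young plants ("adjusted generation"), excludes CDM-supported plants, and weights carbon intensity by generation so that each kWh counts equally.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760


@dataclass
class Plant:
    # All field names are hypothetical, chosen for this sketch only.
    name: str
    capacity_mw: float
    generation_mwh: float        # reported annual generation
    intensity_t_per_mwh: float   # carbon intensity, assumed here in tCO2/MWh
    age_years: float
    cdm_supported: bool = False


def adjusted_generation(plant, young_age=3.0, expected_cf=0.60):
    """Replace reported output of young plants with expected output."""
    if plant.age_years < young_age:
        return plant.capacity_mw * HOURS_PER_YEAR * expected_cf
    return plant.generation_mwh


def benchmark_intensity(plants, young_age=3.0, expected_cf=0.60):
    """Generation-weighted average intensity, excluding CDM-supported plants."""
    eligible = [p for p in plants if not p.cdm_supported]
    weights = [adjusted_generation(p, young_age, expected_cf) for p in eligible]
    total = sum(weights)
    if total <= 0:
        raise ValueError("no eligible generation to average over")
    weighted = sum(w * p.intensity_t_per_mwh for w, p in zip(weights, eligible))
    return weighted / total
```

Multi-year averaging (3-5 years, as suggested in the footnote on averaging periods) could be layered on top by applying the same weighting to generation and emissions summed over the chosen years.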

What about other timing/data issues?

There are many further nuances to issues of which plants and years of data should be considered in benchmark/baseline setting. Some of these are linked to unresolved questions of timing related to project submission, approval, commencement of operation, and crediting. For instance, over how many years should plant performance be averaged? When should the original baseline be set (project submission or commencement)? Should updating occur during the year prior to its application for project crediting, or can it occur during the same year (e.g., once the previous year’s data are available)? Most issues of this type will need to be addressed regardless of what type of baseline approach is used. The process of updating adds a few additional decision points, but these should be relatively simple to resolve.

What can be done if sufficient empirical data to define a cohort are lacking?

In some countries, available data may be insufficient for setting an empirically based cohort for a specific category of project. Many countries do not compile or make available information about power plant fuel consumption (or, equivalently, efficiency or carbon intensity). In some cases this is simply because such data have not previously been worth having, and a system is not in place to gather them. In other cases the information is withheld because it is considered proprietary. The latter circumstance could become increasingly common as power generation becomes more widely privatized. In still other cases, in a country/region that has had little need to build new capacity in recent years due to overcapacity (e.g., South Africa), slow growth, or limited demand (smaller, less developed countries), there may be insufficient recent activity to generate a cohort.

To help countries develop preliminary benchmarks in cases where sufficient data are unavailable, the following methods could be used:

• Regional or neighboring country benchmarks, assuming sufficient data are available at these levels, and relevant conditions (technology availability, fuel quality, operating conditions) are sufficiently similar.

• Performance standards, which can be based on best available technology, efficiency standards, carbon content standards, consistency with environmental policies, or other criteria. For instance, the standard for natural gas could be set at natural gas combined cycle units operating at average national or international levels. Performance standards could even be established for fuel-specific benchmarks at the regional/global level using an impartial expert panel convened by the CDM governing body. Default values should be rather stringent, giving countries an incentive to make data available.

Using cohort data for benchmark updating

Once the cohort is defined, its performance can be tracked over time and changes taken into account when updating the benchmark (see Section 4). Take the example of a new efficient power plant, where the cohort is a set of ten baseload power plants constructed within the last 3 years. The performance of this set of ten plants would be tracked over time, and their performance reflected in the baseline. Even though the individual plants in the cohort will remain constant (to the extent possible), their performance could evolve for a variety of reasons. For example, performance of baseload coal power plants in some electric sectors has been seen to improve because of the introduction of coal washing and the adoption of improved boiler maintenance. Or these plants could shut down as the result of new regulations or power sector restructuring. Since it is reasonable to assume that a counterfactual coal plant displaced by the CDM project would have also improved in performance (or also shut down), it would be appropriate to update the benchmark so as to reflect these performance improvements.

However, other methods (declining baselines, shorter credit lifetimes for retrofits) should also be considered, as in some instances it will be inappropriate for changes in cohort performance to be reflected in adjusted benchmarks. For instance, performance could decline at some of the plants in the cohort due to mismanagement or unusual operating conditions; the resulting benchmark would increase, allowing the CDM project to garner more credits. Conversely, a CDM project might induce performance improvements at other facilities, as a positive spillover effect, and the benchmark would decrease, reducing credits. To lessen the chances that a project is penalized for positive spillover effects or rewarded for poor performance at other facilities, some level of review may be required.
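As a rough illustration of tracking a fixed cohort, the sketch below recomputes a generation-weighted benchmark for each year and flags large year-to-year movements for review. The data layout, the function names, and the 5% review threshold are assumptions made for this example; the text does not prescribe a review trigger, and a real review would need to judge whether a change reflects conditions that would also have affected the counterfactual plant.

```python
def cohort_benchmark(records):
    """Generation-weighted carbon intensity for one year.

    records: list of (generation_mwh, intensity) pairs, one per cohort plant.
    """
    total = sum(generation for generation, _ in records)
    if total <= 0:
        raise ValueError("cohort has no generation in this year")
    return sum(generation * intensity for generation, intensity in records) / total


def update_benchmarks(yearly_records, review_threshold=0.05):
    """Return (year, benchmark, needs_review) for each year in order.

    needs_review is True when the benchmark moved by more than
    review_threshold (a hypothetical relative tolerance) since the prior year.
    """
    results = []
    previous = None
    for year in sorted(yearly_records):
        benchmark = cohort_benchmark(yearly_records[year])
        needs_review = (
            previous is not None
            and abs(benchmark - previous) / previous > review_threshold
        )
        results.append((year, benchmark, needs_review))
        previous = benchmark
    return results
```

Flagging both increases and decreases matches the two cases described above: a rising benchmark may reward poor performance elsewhere, while a falling one may penalize positive spillover.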

3.4 Conclusions

Data requirements: Although certain data are available through international bodies such as the International Energy Agency, or private firms such as Utility Data Institute, the requisite data for setting benchmarks are not presently compiled at the international level. It is therefore necessary to assemble the data from primary sources within the electric sectors of individual countries; however, few countries presently collect the full complement of required data and make them publicly available. For the present research effort, data were assembled for India, Jordan, South Africa, Thailand, and Venezuela. The process of communication, data collection, and processing into an analysis-ready format took 2-3 months, with total transaction and labor costs of approximately $1,000-$10,000 per country. This effort yielded much, but not all, of the required complement of data, and left some important gaps remaining. This price may not be indicative of the cost for a host country to collect the data. One may reasonably expect that a host country could collect its data at a lower cost than an independent outside source.

Projections: Electric sectors continually evolve because of changes in fuel prices, emergence of new technologies, regulatory changes, and structural changes. Historical data may be unable to reflect the likely performance of new power plant technologies where they are undergoing rapid change, as in the case of oil and gas, where combined cycle facilities are rapidly becoming the technology of choice, with 20-40% improvements in fuel efficiency over their predecessors. Projections based on electric sector planning assessments or expert judgement can help to reveal how an electric sector might develop in the near term, and how well new plants are likely to perform. Due to the construction time of many larger facilities (e.g., 2-5 years), major changes in the type of new facilities built can generally be well anticipated over this time horizon, and information about power plants that are under construction, planned, or expected can then supplement the historical data for deriving a benchmark. On the other hand, longer-term forecasts of new facility types or plant efficiency are generally not accurate or objective, and do not provide a credible basis for future benchmarks or benchmarks that are fixed for more than a few years.

Cohorts: The counterfactual to a CDM project cannot itself be observed and measured, but it can be approximated by defining an appropriate cohort. The cohort would be a sample of actual plants that adequately reflects the presumed counterfactual, in terms of vintage, fuel, technical characteristics, etc. The cohort would provide empirical data for the purpose of defining and, if appropriate, updating the benchmark. For example, if a CDM project consists of a new efficient power plant, and the presumed counterfactual is a new conventional power plant, the cohort could be defined, say, as the set of baseload power plants constructed within the last 3 years, under construction, or planned. Or, if a CDM project consists of improving the efficiency of an existing coal power plant that came into operation in 1990 (say, by upgrading the boiler), the appropriate counterfactual is the power plant itself without a boiler upgrade, and a reasonable cohort would then be a set of 1990-vintage coal-based power plants. In both these examples, the benchmark would be set based on the performance of the cohort, and could be updated if important changes over time in performance are observed. Such changes can be expected to occur if efficiency autonomously improves (e.g., boiler retrofits), operating behavior changes (e.g., improved maintenance), or fuel quality or type changes (coal washing or switching to natural gas, etc.). Cohorts can also provide an empirical basis for determining the appropriate project lifetime.

If sufficient empirical data for defining a cohort are unavailable (for example, because a region has few other operating plants, or does not assemble the needed data), other options for setting the benchmark include (1) using other regional data for defining and observing a suitable cohort, and (2) relying on the expert judgement of an impartial third party to provide performance standards.

Section 4. Updating Issues and Options

In the context of benchmark baselines, two occasions for updating arise. The first is the revision of benchmarks for new projects. The second is the renewal of benchmarks for existing projects. Revision of baselines for new projects is more straightforward than renewal of baselines for existing ones. Revised baselines should be as up-to-date as possible, with time and effort being the primary constraints on the frequency of revision. Baselines for existing projects should be kept up-to-date as well, but doing this can be complicated by the need to respect the concerns of investors in existing projects. For investors, renewing baselines might appear risky since it might reduce credit revenues they may be counting on.

In this section, we consider the following questions:

• How can the desire for investor certainty be balanced with the benefits of baseline updating?

• Should baseline setting and revision be done on an ex ante or ex post basis?

• When and how often should baselines for existing projects be renewed?

• When and how often should baselines for new projects be revised?

• How should the frequency and procedures for updating depend on project type or characteristics?

Most of these questions pertain to both benchmarks and project-specific baselines.

4.1 Baselines Updating and Investor Risk

Ever since the notions of joint implementation and the CDM were first broached, the questions of whether and how baselines should be renewed have been widely debated. Some observers have argued that baselines should be fixed (or static) instead of revisable (dynamic), for the sake of maximizing investor certainty (as well as to reduce transaction costs). Fixed, ex ante baselines would be agreed upon when the project is conceived and would remain unchanged for an extended period, if not the full lifetime of the project. The investor could estimate beforehand how many credits the project will earn, based on assumptions about the project’s anticipated performance. Project performance would still need to be monitored to determine the actual amount of credits accrued, but the baseline against which the project is judged would remain unchanged for the life of the project.

Others have argued that baselines should be dynamic, for the sake of environmental integrity. Conditions affecting the counterfactual situation could change, e.g., due to policy shifts, price shocks, or economic downturns. Allowing adjustments to the baseline to reflect relevant changes would, in principle, improve the accuracy of estimating emission reductions. It would be important to capture only those changes that would have affected the counterfactual activity, had it taken place (e.g., using the cohort approach introduced in Section 3.3 above).23 If adjusted in a relevant fashion, dynamic baselines would minimize the risk that projects would be over-credited (generating false credits that would permit the unwarranted emissions of GHGs elsewhere) or under-credited (reducing the incentive for valid CDM projects).

23 Future changes in the mix of new power plant technologies would not necessarily be relevant. For example, an emerging technology that dominates new power plant construction five years after the CDM project is implemented may have no relevance to the counterfactual situation, since this technology could not have been installed at the time of the CDM project. It would be relevant only if the operation of assumed counterfactual facilities (e.g., a coal plant) is altered because of the presence of the new technology (e.g., if the coal plant is no longer economic to operate).

However, dynamic baselines would also increase uncertainty and risk for investors, by making the number of future credits less predictable. Thus, the goals of investor certainty and environmental integrity are often portrayed as a tradeoff. Fixed baselines provide investors with a fully predictable flow of credits (though not their economic value). On the one hand, if fixed baselines become outdated and invalid, they could risk the environmental integrity of the CDM, potentially diminishing its cost-effectiveness as a tool for reducing global emissions. On the other hand, the additional risk and uncertainty of revisable baselines could stifle some investor interest in the CDM.

There are at least two approaches that could reduce the need to compromise between investor certainty and baseline accuracy: risk pooling and conservative, fixed baselines. Risk pooling could consist of insurance, credit funds, or guarantees, as described in Box 4.1. Box 4.2 describes how investors could trade the uncertainty of dynamic baselines for a fixed but lower baseline.

Box 4.1 Managing investor risk through risk pooling

Investors routinely take steps to manage their risk, and CDM investors are likely to do the same. Instruments are likely to evolve in the initial stages of the CDM to help project investors manage the several types of risk they face, such as (a) uncertainty about project performance (e.g., fewer kWh generated than expected), (b) uncertainty about the future market value of carbon credits, and (c) uncertainty about how baselines might change when renewed, due to either volatility in ambient conditions (fuel price changes, technology advances, regulatory changes, etc.) or future changes in methodology (due to institutional factors, improved knowledge, etc.).

There are several ways that investors could manage uncertainty by pooling risk. For example, investors could purchase “carbon credit insurance” against unexpectedly low credit revenues, could pool risk among a large number of diverse projects through a common carbon credit fund, or could rely on government “carbon credit guarantees”. A project developer who is concerned about risk can resort to one of these mechanisms for protection in the case of lower-than-anticipated carbon credit revenues.

Carbon Credit Insurance

Credit insurance can be provided by private sector suppliers, who underwrite insurance based on a competent assessment of the risk of a given project. When project investors suffer “losses” due to unanticipated changes to baselines, the insurer will assess those claims and disburse compensation with credits purchased on the open market. This brings the resourcefulness of the private sector to the challenge of managing risk for CDM project developers. The cost of insurance, and the willingness of insurers to offer it, will provide an informative signal about the viability of CDM projects and the credibility of their emissions reduction claims.

In the near term, Annex I governments that are eager to jumpstart CDM activities could step in, offering credit insurance or providing incentives for private sector involvement. If governments are concerned that high insurance costs could be a barrier to early CDM investments, they could lubricate the process by subsidizing the insurance during the initial phases.

Carbon Credit Fund

Alternatively, a form of credit insurance could be embodied in the CDM itself. Risk could be pooled among projects, with projects paying premiums in the form of credits into a common carbon credit fund (administered by a body appointed by the CDM Executive Board). Claims would be paid from this fund.

Carbon Credit Guarantees

Many Annex I governments are eager to see an active CDM market arise. Just as governments often extend various types of investment guarantees to catalyze overseas investments, they could provide credit guarantees to CDM project investors. Annex I governments have at their disposal a large reserve of emissions allowances (i.e., their national targets) from which they can pay out claims.

These three mechanisms differ in terms of the roles played by individual project investors, private sector insurance institutions, national governments, and international bodies, i.e., who bears the costs of managing risk and who is responsible for the actuarial assessments regarding “premiums” and “claim payments”. These or similar mechanisms would provide investors a way to manage risk in a manner that neither interferes with the accuracy of baseline setting nor compromises the environmental integrity of the CDM.

Box 4.2 Managing investor risk through conservative fixed baselines

A convenient way to think of the challenge of reducing investor risk without sacrificing environmental integrity is to consider this a standard problem, which investors routinely face, of balancing risk and return. An annually renewed baseline might preserve environmental integrity as accurately as possible, but prove more risky to investors. Instead, an investor might prefer to trade some of this risk for a secure, but smaller, stream of credit revenues. A fixed, lower baseline could be offered by either discounting the credits by some fraction or setting a lower baseline. This reduction in crediting would be the equivalent of “environmental insurance”, with the intent that the sum of reduced credits would be sufficient to cover any baseline inaccuracies.

Between the two approaches (i.e., an annually renewed baseline and a more conservative fixed benchmark) is a continuum of options that blend the two. For example, baselines could be defined that provide full credit for annual renewal, partial reduction for renewal every two years, larger reduction for renewal every three years, etc.

In cases where project developers opt for a more conservative fixed baseline, it might be important to keep track of not only the credits awarded under this baseline, but also the credits that would have been awarded under the regularly renewed baseline. This will be useful in revealing how the fixed baselines perform relative to the more accurate renewed baselines, and in ensuring that fixed baselines are appropriately designed.

Using either approach, the cost of reducing investor risk and uncertainty (much-touted barriers to CDM investment) can be internalized as a predictable cost. In contrast, if projects are simply offered fixed baselines (that eventually become outdated and inaccurate), the cost of reducing investor risk and uncertainty is externalized as higher environmental damages and/or higher mitigation costs for others. In making decisions about how to keep baselines updated and accurate, the Parties will be deciding in part about whether to internalize or externalize these risk costs.
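The bookkeeping suggested in this box, recording credits under the conservative fixed baseline alongside the credits a regularly renewed baseline would have awarded, might look like the following sketch. The function names, the flat `discount` parameter, and the numbers in the usage note are hypothetical; the box does not specify how a discount would be set.

```python
def credits(baseline_intensity, project_intensity, generation_mwh):
    """Credits earned in one year; never negative in this sketch."""
    return max(baseline_intensity - project_intensity, 0.0) * generation_mwh


def compare_baselines(renewed_baselines, initial_baseline, discount,
                      project_intensity, generation_mwh):
    """Return rows of (year, fixed_credits, renewed_credits).

    renewed_baselines: list of (year, baseline_intensity) as renewed annually.
    discount: hypothetical fractional reduction applied once to the initial
              baseline to obtain the conservative fixed baseline.
    """
    fixed_baseline = initial_baseline * (1.0 - discount)
    rows = []
    for year, renewed in renewed_baselines:
        fixed = credits(fixed_baseline, project_intensity, generation_mwh)
        dynamic = credits(renewed, project_intensity, generation_mwh)
        rows.append((year, fixed, dynamic))
    return rows


def cumulative_gap(rows):
    """Fixed-baseline credits minus renewed-baseline credits, summed."""
    return sum(fixed - dynamic for _, fixed, dynamic in rows)
```

A positive `cumulative_gap` would indicate that the fixed baseline has over-credited relative to the renewed one, which is the signal the box suggests monitoring to check that fixed baselines are appropriately designed.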

If investor uncertainty about baseline revision can be translated into a predictable cost, then the tradeoff in choosing between fixed and revisable baselines becomes more a question of transaction cost and accuracy. What are the relevant changes in conditions and how can they be captured in the baseline revision process (and thus provide greater accuracy)? And is the revision process worth the added cost? The more frequently baselines are reviewed and updated, the more accurately they can reflect changing conditions, but the greater the costs. These questions are touched on in the following sections.

4.2 Key Questions in Baseline Updating

The central question that updating seeks to address (once a CDM project is operational) is: How have changes in conditions since the baseline was set affected the type of counterfactual activity that the CDM project originally displaced? This last phrase, the type of counterfactual activity originally displaced, is extremely important, and is best illustrated by an example that we’ll call “Project A”:

• Project A (in 2002): In 2002, say, an investor proposes CDM project A, which involves construction of a new 100 MW baseload geothermal plant in Country X. The plant is expected to be operational in the year 2004, and to continue to operate for at least 20 years. As of 2002, when the CDM governing body approves the project and baseline, all recent baseload capacity additions in Country X have been coal-based steam turbines, which also have a 2-year lead time. The coal trend is expected to continue, as reflected as well in plants under construction. This baseline is benchmarked to the carbon intensity of the most recent 3 plants built (1998, 1999, 2001).

• Project A (in 2010): Now it’s 2010, and due to the construction of LNG terminals and new gas pipelines into Country X, natural gas combined cycle has become the new plant of choice. Furthermore, throughout Country X coal plants have been upgraded. Fuel washing has improved the quality of coals, and combined with improved boiler maintenance, the overall effect is a 5% improvement in the efficiency, and a corresponding lowering of the carbon intensity, of many existing coal plants. (These trends are similar to those currently being witnessed in China and India.)

Should Project A’s baseline in 2010 remain the same as it was in 2002? If not, how should the baseline be revised?

In order to reflect changes in conditions (improved fuel quality and boiler operation) that have affected the type of counterfactual activity originally displaced (baseload generation from a new coal plant), the year 2010 Project A baseline should be revised accordingly. The fact that new capacity additions in 2010 are all natural gas is immaterial if coal capacity would have been installed had the geothermal plant not been built, and this coal capacity would still be operating in the year 2010, as it is in this example. What we are then concerned about is making as good an estimate as possible of how this coal capacity would be operating in the year 2010. In this example, changing conditions (the diffusion of coal washing and improved boilers throughout the electric sector) led to a general improvement in the carbon intensity of coal-fired power plants.
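To make the Project A revision concrete, the sketch below recomputes the baseline intensity and annual credits after an efficiency gain in the coal cohort. It rests on assumptions not given in the text: carbon intensity is taken to be inversely proportional to plant efficiency (fuel quality held constant), the 5% figure is read as a relative efficiency gain, and the capacity factor and original coal intensity used in the usage note are placeholder inputs, not data from the study.

```python
HOURS_PER_YEAR = 8760


def revised_intensity(original_intensity, relative_efficiency_gain):
    """Baseline intensity after a relative efficiency gain in the cohort.

    Assumes intensity scales as 1 / efficiency, so a 5% relative gain
    lowers intensity by a factor of 1 / 1.05 (about 4.8%).
    """
    return original_intensity / (1.0 + relative_efficiency_gain)


def annual_credits(capacity_mw, capacity_factor, baseline_intensity,
                   project_intensity=0.0):
    """Annual credits for a plant displacing baseline generation."""
    generation_mwh = capacity_mw * HOURS_PER_YEAR * capacity_factor
    return generation_mwh * (baseline_intensity - project_intensity)


def overcrediting_if_not_updated(capacity_mw, capacity_factor,
                                 original_intensity, relative_efficiency_gain):
    """Extra annual credits issued if the 2002 baseline were left in place."""
    updated = revised_intensity(original_intensity, relative_efficiency_gain)
    return (annual_credits(capacity_mw, capacity_factor, original_intensity)
            - annual_credits(capacity_mw, capacity_factor, updated))
```

For instance, `overcrediting_if_not_updated(100, 0.8, 1.0, 0.05)` estimates the yearly over-crediting for the 100 MW plant under an assumed 80% capacity factor and an assumed original intensity of 1.0 (in whatever tonnes-per-MWh unit the benchmark uses).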

The following are examples of other possible year 2010 conditions that might need to be accounted for in the baseline:

• coal plant carbon intensity has deteriorated by a few percent due to the addition of efficiency-lowering and power-consuming scrubbers to meet local air pollution concerns, or merely due to the aging of plant equipment.

• coal plant carbon intensity has improved due to technological advances or improved plant operations (e.g., improved boiler maintenance).

• coal plants now operate at about half the load factor they did a decade earlier, due to improved grid management and plant dispatch practices, or other reasons such as those listed above.

• over half of existing coal plants have shut down, including 2 of the 3 coal plants upon which the benchmark was based, perhaps due to changes in fuel costs or fuel availability, loss of competitiveness to other new generation sources under a restructured market, environmental regulations, etc. Their contribution to the grid has therefore been supplanted by some other generation source, which should be accounted for in the baseline.

If conditions that would have influenced the performance of the counterfactual coal capacity have changed, then these changes need to be reflected in updates to the baseline. In ideal circumstances, such changes could be assessed by observing the empirical changes seen at actual coal plants that can serve as a control sample for the counterfactual plant, i.e., plants of the same vintage, technical characteristics, etc. In some circumstances such a perfect control sample might exist, but even in cases where it does not, it should be possible to identify a cohort that could serve as a reasonable control sample.

4.3 Renewing Benchmarks for Existing Projects

If benchmarks are set using the cohort-based method described in Section 3, it may be straightforward to renew benchmarks regularly, assuming transaction costs can be kept manageable. In some cases, cohorts might evolve relatively rapidly (e.g., in the context of power sector restructuring, where plant efficiencies might be expected to steadily improve as a result of management reform and newly available investment flows for capital improvements), which would warrant frequent updating.

The cost and effort involved in updating baselines include a technical component (collection and analysis of the relevant data) and an administrative component (coordinating the review and updating of the baseline). In many electric sectors, performance data are already collected annually, so the incremental technical effort for updating baselines could be relatively insignificant. The amount of administrative effort will depend on factors such as whether many different projects can benefit from a single administrative structure, and whether several other administrative tasks can be packaged together (baseline updating, project monitoring, verification, crediting, etc.). If ambient conditions affecting the power sector have hardly changed, and power sector CDM investment in a given region has been limited, it may not be worthwhile to assemble the administrative effort to update, since changes to baselines might be minimal anyway.

Should benchmarks be determined ex post or ex ante?

In theory, benchmarks could be set either ex ante or ex post, i.e., either before or after the period of time over which they are to apply. An ex post approach is more accurate, in that it relies on measured data rather than projections, allowing the benchmark to reflect unanticipated changes (given that the benchmark cohort is set in advance). If updates are performed frequently (e.g., annually), then the difference between benchmarks determined ex post and ex ante will be minor. With sufficiently frequent renewal of ex ante benchmarks, the slight decrease in accuracy might be a small price to pay to provide investors with greater certainty.

How might updating needs differ among project types?

The importance and impact of regular updating depend on the type and characteristics of the presumed counterfactual activity. If the counterfactual activity is presumed to be a long-lived investment in an alternative generating technology, then it is likely to evolve slowly over time. On the other hand, if the counterfactual involves decisions that could be made anytime about small investments in efficiency or fuel costs, it could evolve more rapidly. Some of the main categories of activities are discussed here.

Investments in new long-lived capital. Frequent updating might be less important where the expected counterfactual activity involves investment in long-lived capital that is likely to operate without major changes for a long period. CDM investment in new baseload power plants is a likely example of this situation. Even if technological advances (or other unanticipated sectoral changes) alter the mix of preferred new investments in future years, baseload plants, which typically involve relatively high capital investment and low running costs, are likely to continue operation and comprise a relatively unchanging cohort. Even if it were no longer economic to build a particular baseload plant anew, it is likely to remain economic to operate due to its low operating costs. In general, if operating costs are low, as in the case of most baseload power plants, then counterfactual activities are less likely to be subject to rapid, unanticipated change.

Nonetheless, the updating process for new baseload capacity CDM projects remains important.24 Take Project A, for example. Although the coal plants remain an appropriate cohort for the geothermal project, their performance is seen to improve over time (because of coal washing and improved boiler maintenance). Absent updates to the benchmark to account for this change, a significant over-crediting of the CDM project might occur. The CDM could unintentionally continue to support technologies that are no longer worthy of carbon reduction credits.

Efficiency-improving retrofits and fuel-switching projects. Cohorts for energy efficiency retrofits or fuel switching projects that take place at existing facilities are likely to have more variable and uncertain performance, and thus frequent updating becomes even more important. Steam and gas turbines are capable of being repowered as combined cycle facilities, greatly improving their efficiencies (by as much as 20%). With current technologies and costs, 5-10% efficiency gains are readily available at many coal steam and CC units, as well as at nuclear and renewable generation facilities, by improving management, operations, controls, and other selected equipment. Furthermore, units producing residual heat could be adapted for cogeneration, which could lead to significant emission savings through displacement of fossil fuel combustion. Even coal plants can be quickly retrofitted to cofire zero-carbon biomass for as little as $50/kW; cofiring at 10% would immediately reduce the carbon intensity of an existing coal plant by 10%.

Consider a CDM project that switches a district heating system from oil to biomass. Its initial baseline should reflect the emissions credit for displacing oil use. However, what if three years later a new natural gas pipeline were to make gas available at prices lower than oil? Had the switch to biomass not been made, district heating plant operators would likely have stopped using oil in any case, and switched to gas. A fixed baseline reflecting continued use of oil would overestimate emissions. But an updated baseline methodology that tracks the fuel use of district heating systems not targeted by the CDM project might immediately reflect the fact that the CDM project is now displacing fewer CO2 emissions than in the initial years.
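The district heating case can be illustrated with a short calculation. The sketch below is not part of the original analysis. It assumes approximate IPCC default emission factors (about 77 kgCO2/GJ for residual fuel oil and 56 kgCO2/GJ for natural gas), treats biomass as zero-carbon, and uses a hypothetical 10 year project with equal boiler efficiency for all fuels and gas arriving in year 4.

```python
# Illustrative comparison of a fixed and an updated baseline for a
# district heating fuel switch (oil -> biomass).
# ASSUMPTIONS (not from the source): approximate IPCC default emission
# factors, equal boiler efficiency for all fuels, a 10 year project,
# and results expressed per GJ of annual fuel use.

EMISSION_FACTOR = {  # kgCO2 per GJ of fuel input (approximate)
    "oil": 77.0,
    "gas": 56.0,
    "biomass": 0.0,  # treated as zero-carbon
}


def counterfactual_fuel(year, gas_arrival_year):
    """Fuel the plant would have burned without the CDM project."""
    return "gas" if year >= gas_arrival_year else "oil"


def annual_credits(years, gas_arrival_year, updated):
    """Credits (kgCO2 per GJ of annual fuel use) for each project year.

    updated=False keeps the initial oil baseline for the whole period;
    updated=True tracks the fuel used by comparable non-CDM systems.
    """
    credits = []
    for year in range(1, years + 1):
        fuel = counterfactual_fuel(year, gas_arrival_year) if updated else "oil"
        credits.append(EMISSION_FACTOR[fuel] - EMISSION_FACTOR["biomass"])
    return credits


fixed = annual_credits(years=10, gas_arrival_year=4, updated=False)
tracked = annual_credits(years=10, gas_arrival_year=4, updated=True)
over_crediting = sum(fixed) - sum(tracked)

print(f"fixed baseline credits:   {sum(fixed):.0f} kgCO2")
print(f"updated baseline credits: {sum(tracked):.0f} kgCO2")
print(f"over-crediting if fixed:  {over_crediting:.0f} kgCO2")
```

Under these assumed values the fixed baseline credits roughly a quarter more than the updated one; the size of the gap depends entirely on the assumed factors and on when gas becomes available.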

In general, therefore, efficiency improvement and fuel-switching projects have counterfactuals that could evolve rapidly. If a small investment can enable the switch from one fuel to another, then a plant operator can respond rapidly to changes in relative fuel prices.

Projects that are additional because of existing market barriers. Any CDM project that is additional because of existing market barriers to cost-effective activities is likely to become common practice, changing the counterfactual. Subsequent replications of the same type of project would eventually no longer be additional. Such a situation can arise from successful market transformation activities, an increasing focus for many efficiency programs. A good example is the transformation of building heating systems in the Baltic States to biofuel and improved efficiency operations, as the result of Swedish AIJ investments (Kartha et al., 1998). Relatively rapid adoption of facility improvements could therefore arise if market barriers that are preventing facility improvements are quickly eliminated through changes in policies, prices, consumer awareness, etc.

Activities affected by other exogenous factors. Counterfactual activity might also change in efficiency and carbon intensity due to availability of new technologies, evolving environmental policies, improvements in fuel quality or availability, and other factors. Counterfactuals for demand-side management activities are even more ephemeral than for supply-side efficiency retrofits, since they are subject to ever-changing consumer behavior and technology choices. For these reasons, frequent updating of benchmarks is especially important for efficiency and fuel-switching projects.

24 For investments in new generation facilities, it might also be appropriate to distinguish between projects based on scale. Large projects (e.g., a new 50 MW biomass-fired power plant) might be displacing an investment in new generation capacity (e.g., a new 50 MW gas facility), and therefore have a long-lived counterfactual. Smaller projects (e.g., a rooftop PV system that produces only 5,000 kWh per year) might be displacing generation at the margin from the existing grid system, which might evolve over time.

In summary, the following factors affect the rate at which counterfactual situations can evolve and baselines can grow outdated:

§ long-lived, new capital investment versus incremental investment in existing facilities

§ amenability to future retrofitting, repowering, or fuel switching

§ existence of market barriers

§ potential for important changes in regulations, technologies, or fuel availability

4.4 Options for Standardized Updating Methodologies

There are several options for updating of baselines that could provide some degree of consistency, standardization, and predictability for investors.

1. Standard updating frequency for all projects. Nearly all types of existing electric sector CDM projects could be subject to carbon intensity changes of one sort or another. This is even true of investments in new capacity, where the counterfactual is a long-lived investment such as a coal plant. The carbon intensity of coal plants could change in the future as the result of:

o changes in fuel quality (e.g., China and India coal washing/cleaning programs)
o cofiring with biomass (gathering momentum in US/Europe – 10% cofiring reduces CO2 emissions by ~10%)
o improvements in boiler and generator efficiencies (India program)
o mothballing due to economic downturns

If the relevant performance data is being gathered anyway, as is the case in many electric sectors, then the technical and administrative requirements of annual updating might be modest. This would allow baselines to be updated regularly and frequently – perhaps annually – making baselines as up to date and accurate as possible. Regular and frequent updating would be acceptable especially if investors have access to instruments to help them to manage risk, such as those discussed in Boxes 4.1 and 4.2.

2. Differentiation among project types. For instance, as noted above, efficiency and fuel-switching projects are likely to have more ephemeral baselines; thus one might argue that their baselines should be updated annually, and those of new baseload power supply (greenfield) projects every 3-5 years. Conversely, one could argue the inverse on the basis of data collection costs. Standardized data for power plant performance (and fuel choice) may be readily available on an annual basis, whereas data needed to track the extent of retrofit activity or market penetration of efficient technologies may not. However, for power plants alone – and our focus here is on power supply investments – it may be necessary only to know by how much carbon intensities have changed, rather than how and why.

3. Thresholds could be established for certain indicators, such as a predetermined ratio of relative fuel prices or market penetration of retrofit practices, to trigger baseline recalculation. While this approach is appealing, achieving consensus on meaningful thresholds would likely prove difficult. For instance, there can be considerable disagreement in the econometric literature on the extent of response to changes in relative fuel prices, or there may simply be no relevant econometric or other basis for developing such thresholds in many developing country economies.

4. A checklist of factors affecting the unpredictability of counterfactual conditions (such as those listed in section 2.2) could be compiled and reviewed for each project. Projects with high indications of counterfactual volatility could be updated annually while those with expected low volatility could be updated less frequently.

4.5 When Should the Initial Baseline Methodology Be Reconsidered?

Investors need to be protected from capricious changes in baselines for already approved projects. Ideally, market mechanisms such as credit insurance will provide some level of protection through risk pooling. But such insurance may be costly or unavailable if the updating process does not provide a minimum level of predictability. Because they rely on fixed cohorts whose performance depends on factors (fuel prices, technological change) that investors and insurers can readily track, the updating procedures described above provide some predictability. However, there will be situations where the original methodology used to derive the benchmark (stringency level, averaging methods) may no longer be valid. The cohort might need to be redefined (e.g., because plants have shut down). Or accumulated CDM experience may lead decision makers to adjust averaging methods (fewer or more years) and so on. Therefore, to balance the needs for predictability and baseline accuracy, rules such as the following could be adopted:

• Revision of the cohort or baseline calculation methodology should be considered when less than a threshold amount (e.g., 50%) of the original cohort is still operational.

• Revision of the baseline determination method (e.g., cohort and averaging method) should occur no more frequently than a set interval (e.g., every 5 years).
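The two rules above can be expressed as a simple check. The following is a minimal sketch, not a procedure taken from this report: the class and function names are hypothetical, the 50% and 5 year values are the indicative figures given above, and the way the two rules are combined (the minimum interval acting as a floor on the cohort test) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CohortStatus:
    """Hypothetical summary of a benchmark cohort at review time."""
    original_plants: int      # plants in the cohort when the baseline was set
    operational_plants: int   # plants of the original cohort still running


def methodology_revision_due(cohort, years_since_last_revision,
                             min_operational_share=0.5,
                             min_interval_years=5):
    """Return True if the baseline methodology should be reconsidered.

    Assumed combination of the two rules: a revision is considered only
    when the operational share of the original cohort has fallen below
    the threshold, and never sooner than the minimum interval.
    """
    if years_since_last_revision < min_interval_years:
        return False
    share = cohort.operational_plants / cohort.original_plants
    return share < min_operational_share


# Example: 4 of 10 original plants still operate, 6 years after approval.
print(methodology_revision_due(CohortStatus(10, 4), years_since_last_revision=6))
```

Keeping both thresholds as parameters reflects the fact that the text offers them only as examples of rules that could be adopted.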

4.6 Revising Benchmarks for New Projects

The amount of time that elapses between revisions to a given sector’s benchmarks (for new projects) will depend on the speed with which the sector is evolving and benchmarks become outdated, as well as the cost of updating benchmarks. The carbon intensity of marginal generating capacity can change significantly over time scales as short as one year, and benchmarks might rapidly lose validity if they are based on old data.25 This is especially true of sector-wide benchmarks, which reflect what might be a rapidly changing fuel mix for capacity additions, but it is also true for fuel-specific technologies (such as gas turbines) whose performance has been evolving fairly rapidly.

One option that would make the updating process simpler, though less accurate, would be to combine the processes of revising baselines for new projects and renewing baselines for existing projects. When it is time to renew the baseline for an existing project, the baseline could simply be replaced with the current baseline for new projects. This eliminates the process of defining a cohort and tracking its performance over time. On the other hand, it potentially results in baselines that change more quickly and unpredictably. To offset this variability, baselines could be fixed for a period of time before being updated, using more conservative baselines as discussed in Box 4.2.

25 See Figures 4-6 to 4-11 in Lazarus et al, 1999.

4.7 Benchmark Updating: Examples

Example based on performance data from Indian coal plants

Historical data can be used to examine the feasibility and consequences of annually updated benchmarks along the lines described above. To do this, we took data for Indian coal plants and calculated fuel-specific benchmarks for the years of data collected by SAIC as part of this research effort (1992-1999). We considered several possible cohorts: all plants, the best 5 plants in the past 10 years, and the most recent 5 plants. The annual variations are shown in Table 4.2.

The year-to-year variation in carbon intensity in the case of Indian coal plants in the 1990s is shown for three different cohorts: all plants, best five plants, and most recent five plants. The greatest variation is seen for the best 5 plants cohort. The three-year rolling average for the best 5 plants cohort reduces this variation.

For the sake of illustration, imagine that the CDM were active back in 1994, and that a low-emission coal plant – an integrated gasification combined-cycle facility with a carbon intensity of 0.85 tCO2/MWh – was certified with annual updating of a coal-specific benchmark shown in Table 4.2. In the case of the benchmark based on the best 5 plants, a fixed benchmark based on the performance during the first year (1994) would credit the CDM project at a constant rate of 0.06 tCO2/MWh each year. On the other hand, if the benchmark were dynamically updated to reflect the evolving performance of the 5 plant cohort, the CDM project’s credits would vary considerably (notwithstanding the 3 year rolling average) – from 0.06 tCO2/MWh in 1994 and 1995 to 0.08 tCO2/MWh in 1998. Using any other of the benchmarks shown, the variation in credits would be as high or higher.

Table 4.2 Effects of updating benchmarks for coal power plants, India 1994-1998
Generation-Weighted Carbon Intensity (tCO2/MWh)

                                              1994   1995   1996   1997   1998
best 5 plants (1982-91)*, 3 yr rolling avg.   0.91   0.91   0.92   0.92   0.93
efficient coal plant as CDM project           0.85   0.85   0.85   0.85   0.85
Credit for CDM project                        0.06   0.06   0.07   0.07   0.08

Other possible benchmarks
average of all plants                         1.11   1.09   1.11   1.11   1.10
most recent 5 plants (as of 1991)*            1.01   1.02   1.01   1.00   0.99
best 5 plants (1982-91)*                      0.92   0.90   0.94   0.92   0.92

Based on data collected by SAIC.
*The initial years of plant operation are not included, due to performance data that suggest unrepresentative behavior during plant start-up and burn-in.

This example shows that changes in performance, even for a fixed set of plants, can be rapid enough and significant enough to warrant annual renewing of baselines. In this example, a cohort consisting of the best five plants as of 1991 varied considerably in its performance over the period 1994-1998. Arguably, the benchmark should be adjusted dynamically to account for this evolving performance.
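The credit figures in Table 4.2 follow from subtracting the project's carbon intensity from the benchmark in each year. The sketch below reproduces that arithmetic from the values in the table. The function names are illustrative, and the trailing (backward-looking) three-year window is an assumption; it is consistent with the 1996-1998 values shown, while the 1994 and 1995 averages would require earlier data not reported in the table.

```python
# Carbon intensities (tCO2/MWh) transcribed from Table 4.2.
YEARS = [1994, 1995, 1996, 1997, 1998]
BEST5_ANNUAL = [0.92, 0.90, 0.94, 0.92, 0.92]    # best 5 plants (1982-91)
BEST5_ROLLING = [0.91, 0.91, 0.92, 0.92, 0.93]   # 3 yr rolling average
PROJECT_INTENSITY = 0.85                         # efficient coal plant


def rolling_average(values, window=3):
    """Trailing rolling mean, defined only once `window` values exist."""
    return [
        round(sum(values[i - window + 1 : i + 1]) / window, 2)
        for i in range(window - 1, len(values))
    ]


def credits(benchmark, project):
    """Credit per MWh = benchmark intensity minus project intensity."""
    return [round(b - project, 2) for b in benchmark]


# Fixed benchmark: the 1994 value is held constant for every year.
fixed_credit = credits([BEST5_ROLLING[0]] * len(YEARS), PROJECT_INTENSITY)
# Updated benchmark: the rolling average is renewed each year.
updated_credit = credits(BEST5_ROLLING, PROJECT_INTENSITY)
# Rolling averages that can be recomputed from the annual row (1996-1998).
rolling_1996_98 = rolling_average(BEST5_ANNUAL)

for year, fixed, updated in zip(YEARS, fixed_credit, updated_credit):
    print(year, fixed, updated)
```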

An unexpected feature of this Indian power sector example is the apparent decline in performance, and resulting rise in the benchmark. Before credits are awarded based on this trend, the underlying basis should be investigated and verified as a trend that should rightfully allow more credits to be awarded, rather than an artifact of the data or some perverse effect.

Example based on economic incentives for various prospective CDM projects

A benchmark might not need to be fixed for very long to adequately reduce investor risk. Since future revenues are discounted, the credit revenues become progressively less important in later years. Therefore, the benchmark is most important to the investor in the earlier years. To put this in economic terms, the NPV of the credits earned through the first 5 years of a 30 year project is 40% of the value throughout the project’s lifetime (assuming a constant credit price), and 75% of the value will have been earned in the first 13 years (which corresponds to the maximum crediting period from 2000 through the end of the first budget period in 2012).26 Thus fixing benchmarks for only a short period (e.g. 4 to 7 years) could preserve a large fraction of the economic value of credits (if credit price escalation is significantly less than the discount rate27). Fixing benchmarks for a longer period may therefore be unnecessary to preserve a sufficient degree of investor certainty and to provide adequate incentive for good projects.
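The 40% and 75% figures can be checked with a short present-value calculation. The sketch below assumes, as stated in the text, a 30 year lifetime, a 10% discount rate, and a constant credit price and benchmark. The function names are illustrative, and year-end discounting is an assumption that does not affect the ratios. The last function computes the effective discount rate when credit prices escalate.

```python
def present_value_factor(years, rate):
    """Present value of one unit of credit revenue per year, paid at year end."""
    return sum(1.0 / (1.0 + rate) ** t for t in range(1, years + 1))


def share_of_lifetime_value(first_years, lifetime, rate):
    """Fraction of lifetime NPV earned in the first `first_years` years."""
    return present_value_factor(first_years, rate) / present_value_factor(lifetime, rate)


def net_discount_rate(discount_rate, price_escalation):
    """Effective discount rate when credit prices escalate every year."""
    return (1.0 + discount_rate) / (1.0 + price_escalation) - 1.0


print(f"{share_of_lifetime_value(5, 30, 0.10):.0%}")    # first 5 of 30 years
print(f"{share_of_lifetime_value(13, 30, 0.10):.0%}")   # first 13 of 30 years
print(f"{net_discount_rate(0.20, 0.10):.1%}")           # 20% discount, 10% escalation
```

With a 20% discount rate and 10% price escalation, the effective rate comes out slightly above 9%, close to the 10% case used for the main calculation.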

Table 4.3 Impact of updating on credits earned, assuming 10% discount rate

             total      initial credit          updated credit             credits earned
             lifetime                                                      relative to baseline 1
Project A    10 years
baseline 1              100% (through year 3)   100% (from year 4 to 10)   100.0%
baseline 2              100% (through year 3)   75% (from year 4 to 10)    85.6%
baseline 3              100% (through year 3)   50% (from year 4 to 10)    71.3%

Project B    25 years
baseline 1              100% (through year 7)   100% (from year 8 to 25)   100.0%
baseline 2              100% (through year 7)   75% (from year 8 to 25)    95.2%
baseline 3              100% (through year 7)   50% (from year 8 to 25)    90.4%

4.8 Conclusions

Compared with ad hoc project-specific baselines, standardized methodologies like benchmarks offer important advantages for maintaining baseline relevance over time. Revision of baselines for new projects and renewal of baselines for existing projects can be based on predetermined algorithms. This lowers costs and allows more frequent revision and renewal, while improving baseline accuracy. This procedure would inherently limit investor uncertainty to measurable changes in ambient conditions. Baseline changes are unlikely to be excessively rapid or unpredictable. However, it may also be necessary under some conditions to adjust the methodology itself, which might cause baseline changes that are more rapid and harder to anticipate.

26 These calculations derive directly from the assumption of a 30 year project lifetime, discount rate of 10%, constant credit price, and constant benchmark.

27 A 20% discount rate is not unreasonable for many risky developing country investments, and 10% escalation in credit prices is also conceivable. If both were to happen together, the net effect would be approximately the same as the case shown (10% discount rate, 0% credit escalation).

Providing investor certainty. The risk of a changing baseline is but one of three major sources of risk in future credit revenues that investors must face. Project performance and especially the future market value of carbon credits are likely to introduce as much or more uncertainty. The risk of a varying baseline can be managed in at least two ways. First, investors could be offered the choice between an annually updated baseline and a fixed baseline that is comparatively conservative. Second, risk pooling mechanisms such as insurance or government guarantees could enable investors to recover financial losses due to a declining dynamic baseline. Using either approach, the cost of reducing investor risk and uncertainty is internalized as a straightforward, predictable and manageable cost. This can help make dynamic baselines acceptable to investors, and thereby avoid the conflict between providing investor certainty and ensuring environmental integrity.

Using cohorts to update benchmarks: Our observations suggest that benchmarks, if based on a cohort approach, can reflect future changes in counterfactual activity, while not being onerous to implement and update frequently. In many electric sectors, performance data is already collected annually, so the incremental technical effort for updating baselines based on a cohort’s evolving performance could be relatively small. However, if monitoring the cohort proves costly, updates could be limited to intervals such as every 5 years. This would help incorporate new trends, but keep the transaction costs low.

Differentiating updating requirements by project category: Ultimately, decisions will have to be made about the frequency of updating (renewing and revising) baselines, balancing the need for baseline accuracy, the technical and administrative costs of updating, and concerns about investor uncertainty. It may be helpful to tailor updating requirements to the conditions faced by different types of projects.

§ Retrofit projects are likely to have more ephemeral counterfactuals, thus it might be necessary to renew their baselines annually.

§ New power projects are likely to have more consistent counterfactuals, although even in these cases there might be important variations over a CDM project’s lifetime. Baselines might not need to be updated annually to remain reasonably accurate, although the example in Section 4.7 suggests that the credits could vary significantly even over periods of two or three years.

§ Where market barriers play an important role, conditions could change rapidly and frequent (annual) updating might be necessary.

§ Where counterfactuals are especially sensitive to changeable parameters such as fuel availability, technological advances, or regulations, baselines should also be updated frequently.

References:

Ellis, J., and Bosi, M., 1999. Options for project emission baselines, OECD and IEA Information Paper, Organisation for Economic Co-operation and Development, Paris, October.

Kartha, S., T. Kaalaste, M. Lazarus, and E. Martinot, 1998. Programme for an Environmentally Adapted Energy System (EAES): An Assessment of Joint Implementation Between Sweden and the Baltic States, A Report to the Swedish National Energy Administration, Stockholm Environment Institute, Boston, MA.

Kaufman, S., Duke, R., Hansen, R., Rogers, J., Schwartz, R., and Trexler, M. Rural Electrification with Solar Energy as a Climate Protection Strategy, Renewable Energy Policy Project, Research Report #9, Washington, DC.

IEA, 1997. Activities Implemented Jointly: Partnerships for Climate and Development. International Energy Agency, Paris.

IEA, 1997b. Energy Balances of non-OECD Countries, 1960-1995. Electronic Distribution. International Energy Agency, Paris.

Lazarus, M., Kartha, S., Ruth, M., Bernow, S., and Dunmire, C., 1999. Evaluation of Benchmarking as an Approach for Establishing Clean Development Baselines. Tellus Institute and Stratus Consulting, October.

Meyers, S., 1999. Additionality of Emissions Reductions From Clean Development Mechanism Projects: Issues and Options for Project-Level Assessment, Environmental Energy Technologies Division, Ernest Orlando Lawrence Berkeley National Laboratory, Report LBNL-43704, July.

Meyers, S., Marnay, C., Schumacher, K., and Sathaye, J., 2000. Establishing Benchmarks for Estimating Carbon Emissions Avoided by Electricity Generation and Efficiency Projects: A Standardized Method. Ernest Orlando Lawrence Berkeley National Laboratory, forthcoming.

Michaelowa, A., Dutschke, M., 1999. “Economic and Political Aspects of Baselines in the CDM Context”, in Promoting development while limiting greenhouse gas emissions: trends & baselines, José Goldemberg, Walter Reid (eds.), New York, p. 115-134.

Repetto, R., 2000. The Clean Development Mechanism: Institutional Breakthrough or Institutional Nightmare?, Institute for Policy Implementation, University of Colorado, Denver.

SAIC, IIEC, and Tellus Institute, 1999. Electric sector data set for Jordan, India, Venezuela, South Africa, and Thailand.

Appendix A: Comparing carbon intensity and capacity factor (as proxy for duty cycle) in case study countries

Figures A.1 and A.2 show how different plant types vary in capacity factor and carbon intensity in each country. The case of Jordan shown in Figure A.2 is deceptively simple and neat. Due to low fuel costs at the wellhead site, the natural gas combustion turbine plants are operated as baseload plants, with the highest capacity factors in the system. Oil steam turbines also operate in the baseload range, and the less efficient oil CTs run to cover the intermediate and peakload requirements. As one might ideally expect, the more continuously a plant operates, the more efficient and lower carbon-emitting it is. From this chart, one might deduce that a CDM project operating at low capacity factor and on peak would avoid peaking power (CTs at about 1.3-1.8 kgCO2/kWh) with approximately twice the carbon intensity of baseload power (NGCT and oil steam at 0.6-1.0 kgCO2/kWh). This suggests that a CDM project with a peaking profile should have a considerably higher benchmark than a baseload facility, and that these separate benchmarks could be readily derived from available data.
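One way such separate benchmarks might be derived is to group plants by capacity factor and compute a generation-weighted carbon intensity for each group. The sketch below is illustrative only: the plant records are hypothetical placeholders rather than case study data, and the 0.5 capacity factor cut-off between baseload and peaking duty is an assumed parameter, not a value proposed in this report.

```python
from collections import defaultdict

# HYPOTHETICAL plant records: (name, capacity factor, generation in GWh,
# emissions in ktCO2). None of these values come from the case study data.
PLANTS = [
    ("gas CT A", 0.80, 700.0, 490.0),
    ("oil steam B", 0.65, 570.0, 513.0),
    ("oil CT C", 0.20, 90.0, 135.0),
    ("oil CT D", 0.10, 40.0, 68.0),
]


def duty_class(capacity_factor, baseload_threshold=0.5):
    """Crude duty-cycle proxy: a high capacity factor is treated as baseload."""
    return "baseload" if capacity_factor >= baseload_threshold else "peaking"


def benchmarks_by_duty(plants, baseload_threshold=0.5):
    """Generation-weighted carbon intensity (kgCO2/kWh) per duty class."""
    generation = defaultdict(float)
    emissions = defaultdict(float)
    for _name, cf, gwh, kt_co2 in plants:
        cls = duty_class(cf, baseload_threshold)
        generation[cls] += gwh
        emissions[cls] += kt_co2
    # ktCO2 per GWh is numerically equal to kgCO2 per kWh.
    return {cls: emissions[cls] / generation[cls] for cls in generation}


benchmarks = benchmarks_by_duty(PLANTS)
for cls, intensity in sorted(benchmarks.items()):
    print(f"{cls}: {intensity:.2f} kgCO2/kWh")
```

Weighting by generation rather than averaging plant intensities keeps a small, rarely run unit from dominating its class benchmark.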

Figure A.1 Carbon intensity as a function of plant type and duty cycle, Venezuela

[Figure: Venezuela, 1990-1999. Carbon intensity (vertical axis, 0.00 to 1.80) plotted against capacity factor (horizontal axis, 0 to 1) for the plant types Hydro, NG/GT, NG/GT-Diesel, NG/Steam, NG/Steam/Gas, Oil/GT, Oil/Diesel, and Oil/Steam/Gas.]

Figure A.2 Carbon intensity as a function of plant type and duty cycle, Jordan

[Figure: Jordan, 1990-1998. Carbon intensity in kgCO2/kWh (vertical axis, 0 to 2) plotted against average capacity factor (horizontal axis, 0 to 1) for the plant types GT/Oil, Steam/Oil, and GT/NatGas.]

While the relatively neat relationship between capacity factor and carbon intensity in Jordan suggests that benchmarking methods or algorithms can readily capture peakload/baseload differences, there are several complications:

• The simple and straightforward conditions found in Jordan are typical only of that minority of countries with limited fuel choice (no hydro, in particular), low fuel price variability (within the country), a reasonable reserve margin (15-40%) and system reliability, and homogeneous groups of technologies (i.e., all coal steam or CTs of similar characteristics28).

• On the other hand, technologies commonly built for peaking purposes (gas turbines) in some regions are often used for baseload/intermediate duty in others, because of under-capacity on the system (they need to run), capital constraints (they're cheap to build), or access to very cheap fuel (at the gas wellhead). This situation is found, for instance, in Venezuela. As shown in Figure A.1, capacity factor is not neatly correlated with fuel/technology mix. Some natural gas CTs and oil steam plants operate at high capacity factor while others operate more as peaking or intermediate capacity. As a result, there is no simple method for a priori classification of technologies as peakload or baseload; each grid must be examined individually.

• A plant’s average capacity factor is a crude indicator of peak vs. baseload operation. Running cost information or a detailed load curve provides a far better indicator, but assembling a load curve, averaged over an appropriately long historical period, could considerably increase the data collection requirements and the costs of benchmark development. In countries with several grid systems there may be multiple (non-additive) load curves.

• Some CDM projects will not fall cleanly into either baseload or peakload categories, especially DSM and intermittent renewables that are not typically dispatched.

• The existence of peaking hydro (or pumped storage) capacity further complicates the peak/base distinction. In Thailand, peak power is provided by a mix of gas turbines and peaking hydro facilities, while intermediate power is typically purchased from IPPs29, and baseload power is generated by thermal and combined cycle (CC) units using coal, oil and gas. Because a significant fraction of it is low-carbon hydro, peak generation in Thailand is, on average, much less carbon-intensive than baseload power. However, it would be misleading to then deduce that projects displacing baseload power save more carbon than those that displace peak. Because of their storage capabilities, high-head hydro (and pumped storage) sites are often operated to meet peak demands, when the economic value of generation is highest. Their running costs are typically low, unlike other peaking facilities such as diesels and combustion turbines. If operated less on peak they would operate more during other periods, displacing other intermediate load facilities (gas, oil, or coal). Therefore, a CDM project that generated (or saved) electricity on peak might displace fossil generation at other times. This effect could be accounted for with knowledge of the relative running costs of different facility types. Nonetheless, the CDM project could have systematic effects on overall system operation that would be hard to predict without the use of a model.

28 For example, as a result of considerable technological advances in recent years, new combustion turbines may be far more efficient and less costly than older models. As a result, CTs have become more attractive relative to steam turbines as baseload technologies, overtaking them in cost and efficiency.

29 Note that the lack of reporting requirements on IPPs adds further uncertainty as to the carbon intensity of intermediate power.

Appendix B. Fuel Shares of Electricity Production, 1995

Threshold 3%

Country      Coal   Oil    Gas    Hydro  Nuclear  Biomass  Geoth
ALGERIA      0%     3%     96%    1%     0%       0%       0%
ANGOLA       0%     6%     0%     94%    0%       0%       0%
ARGENTINA    3%     5%     40%    41%    11%      0%       0%
BAHRAIN      0%     0%     100%   0%     0%       0%       0%
BANGLADESH   0%     16%    80%    3%     0%       0%       0%
BENIN        0%     100%   0%     0%     0%       0%       0%
BOLIVIA      0%     8%     37%    53%    0%       1%       0%
BRAZIL       1%     3%     0%     92%    1%       2%       0%
BRUNEI       0%     14%    86%    0%     0%       0%       0%
CAMEROON     0%     3%     0%     97%    0%       0%       0%
CHILE        25%    8%     1%     66%    0%       1%       0%
CHINA        73%    6%     0%     19%    1%       0%       0%
CHINESETAI   34%    26%    4%     7%     29%      0%       0%
COLOMBIA     11%    1%     17%    70%    0%       1%       0%
CONGO        0%     1%     1%     99%    0%       0%       0%
ZAIRE        0%     4%     0%     96%    0%       0%       0%
COSTARICA    0%     14%    0%     86%    0%       0%       0%
CUBA         0%     91%    0%     1%     0%       8%       0%
DOMINICANR   5%     64%    0%     31%    0%       0%       0%
ECUADOR      0%     38%    0%     62%    0%       0%       0%
EGYPT        0%     37%    43%    20%    0%       0%       0%
ELSALVADOR   0%     36%    0%     42%    0%       2%       20%
ETHIOPIA     0%     8%     0%     87%    0%       0%       5%
GABON        0%     12%    11%    77%    0%       0%       0%
GHANA        0%     1%     0%     99%    0%       0%       0%
GUATEMALA    0%     26%    0%     67%    0%       7%       0%
HAITI        0%     21%    0%     75%    0%       4%       0%
HONDURAS     0%     0%     0%     100%   0%       0%       0%
HONGKONG     98%    2%     0%     0%     0%       0%       0%
INDIA        69%    3%     6%     20%    2%       0%       0%
INDONESIA    23%    27%    32%    14%    0%       0%       4%
IRAN         0%     36%    56%    9%     0%       0%       0%
IRAQ         0%     98%    0%     2%     0%       0%       0%
IVORYCOAST   0%     58%    0%     42%    0%       0%       0%
JAMAICA      0%     93%    0%     2%     0%       5%       0%
JORDAN       0%     86%    13%    0%     0%       0%       0%
KENYA        0%     9%     0%     83%    0%       0%       8%
NORTHKOREA   36%    0%     0%     64%    0%       0%       0%
SOUTHKOREA   26%    23%    12%    3%     36%      0%       0%
KUWAIT       0%     22%    78%    0%     0%       0%       0%
LEBANON      0%     86%    0%     14%    0%       0%       0%
LIBYA        0%     100%   0%     0%     0%       0%       0%
MALAYSIA     9%     39%    38%    14%    0%       0%       0%
MOROCCO      46%    49%    0%     5%     0%       0%       0%
MOZAMBIQUE   13%    64%    15%    9%     0%       0%       0%
MYANMAR      0%     14%    45%    40%    0%       0%       0%
NANTILLES    0%     100%   0%     0%     0%       0%       0%
NEPAL        0%     3%     0%     97%    0%       0%       0%
NICARAGUA    0%     57%    0%     23%    0%       2%       17%
NIGERIA      0%     24%    38%    38%    0%       0%       0%
OMAN         0%     19%    81%    0%     0%       0%       0%
PAKISTAN     0%     29%    27%    43%    1%       0%       0%
PANAMA       0%     31%    0%     67%    0%       2%       0%
PARAGUAY     0%     0%     0%     100%   0%       0%       0%
PERU         0%     12%    0%     86%    0%       1%       0%
PHILIPPINE   7%     63%    0%     11%    0%       0%       20%
QATAR        0%     0%     100%   0%     0%       0%       0%
SAUDIARABI   0%     55%    45%    0%     0%       0%       0%
SENEGAL      0%     100%   0%     0%     0%       0%       0%
SINGAPORE    0%     83%    17%    0%     0%       0%       0%
SOUTHAFRIC   93%    0%     0%     1%     6%       0%       0%
SRILANKA     0%     7%     0%     93%    0%       0%       0%
SUDAN        0%     10%    0%     71%    0%       0%       0%
SYRIA        0%     32%    23%    45%    0%       0%       0%
TANZANIA     0%     14%    0%     86%    0%       0%       0%
THAILAND     18%    30%    42%    8%     0%       0%       0%
TRINIDAD     0%     0%     99%    0%     0%       1%       0%
TUNISIA      0%     65%    34%    1%     0%       0%       0%
UAE          0%     19%    81%    0%     0%       0%       0%
URUGUAY      0%     11%    0%     88%    0%       1%       0%
VENEZUELA    0%     5%     24%    70%    0%       0%       0%
VIETNAM      7%     8%     6%     78%    0%       0%       0%
YEMEN        0%     100%   0%     0%     0%       0%       0%
ZAMBIA       0%     0%     0%     100%   0%       0%       0%
ZIMBABWE     68%    0%     0%     32%    0%       0%       0%

Source: IEA, 1997b