
Agglomeration and productivity in China: Firm level evidence☆

Cui HU a,⁎, Zhaoyuan XU b, Naomitsu YASHIRO c

a School of International Trade and Economics, Central University of Finance and Economics (CUFE), 39 South College Road, Haidian District, Beijing 100081, PR China

b Industrial Economy Research Department, Development Research Center of the State Council, R.216, No. 225, Chaoyangmen Nei Dajie, Beijing 100010, PR China

c Organization for Economic Cooperation and Development (OECD), Economics Department, 2 Rue André Pascal, Paris 75016, France

Article info

Article history:

Received 25 August 2013

Received in revised form 2 January 2015

Accepted 3 January 2015

Available online 9 January 2015

Abstract

This paper conducts an in-depth evaluation of the role of industrial agglomeration in the productivity growth of China's industrial sector by exploiting a large dataset of manufacturing firms active in 176 three-digit industries and in 2860 counties. We also complement our analysis with the 2004 Census data to capture the agglomeration of small firms. Unlike previous studies that often focused on specific industries, we assess the impact of agglomeration in a comprehensive range of industries and extend the scope of analysis to upstream industries as well. Moreover, we explore how the ownership of Chinese firms shapes their ability to benefit from agglomeration effects as well as to act as the source of externality. We find that congestion and fiercer competition offset the benefits of agglomeration for firms operating within agglomerated regions. On the other hand, the co-location of large firms contributes significantly to productivity. We also find a more important contribution from the agglomeration of upstream industries than from that of the same industry. Private enterprises are the primary source of agglomeration effects, especially in upstream industries, whereas their productivity is boosted most by the agglomeration of other private enterprises. We reckon that industrial agglomeration contributed up to 14% of the productivity growth in China's industrial sector between 2000 and 2007.

© 2015 Elsevier Inc. All rights reserved.

JEL classification: D24, R12

Keywords: Agglomeration, Productivity, China

1. Introduction

The sustainability of China's competitiveness as a manufacturing base has recently been questioned as enterprises in China face rising labor and resource costs, currency appreciation and tighter constraints associated with environmental protection. However, if the competitiveness of Chinese industry is mainly founded on productivity improvement as opposed to cheap labor, China is likely to retain its position as the world's major exporter of manufactured goods. A likely source of such productivity growth is industrial agglomeration, which has increased consistently since the mid-1990s, driven by the globalization of the Chinese economy (Lu & Tao, 2006). The geographic concentration of industrial activities was associated with a dramatic increase in the number of firms, namely the clustering of interconnected small firms (Long & Zhang, 2012). Case studies reported that such clustering of small and medium enterprises (SMEs) reduced the technological barrier to entry and promoted quality upgrading of Chinese firms (Fleisher, Hu, McGuire, & Zhang, 2010; Huang, Zhang, & Zhu, 2008). However, it is less clear whether the contribution of agglomeration can be generalized beyond the findings on specific industries or regions.

China Economic Review 33 (2015) 50–66

☆ We thank China Humanities and Social Sciences Youth Foundation of Education Ministry (Project code: 13YJC79004) and the China National Natural Science

Foundation (Project code: 71403302/G0304 & 71173058/G0301) for funding this paper.

⁎ Corresponding author. Tel.: +86 13501193242 (mobile).

E-mail address: [email protected] (C. Hu).

http://dx.doi.org/10.1016/j.chieco.2015.01.001

1043-951X/© 2015 Elsevier Inc. All rights reserved.



Industrial agglomeration is often considered to generate positive externalities such as knowledge spillovers, more efficient input sharing and richer labor pooling (Marshall, 1890). On the other hand, it can be associated with negative effects on productivity due to congestion and fiercer competition. Comprehensive assessments of the net effects of agglomeration in China are still limited and their findings remain inconclusive. For instance, Batisse (2002) reported a negative relationship between the industrial specialization of a province and its growth in value-added, whereas Fan and Scott (2003) reported a positive relationship between industrial concentration and province-level productivity. Empirical evidence at the firm level is even more limited and is often confined to specific industries. An example is Lin, Li, and Yang (2011), who studied the impact of agglomeration on the productivity of textile industry firms and reported that initial positive externalities are overwhelmed by diseconomies as the agglomeration intensifies.

This paper provides a comprehensive assessment of agglomeration effects on the productivity of Chinese firms by exploiting large survey data on manufacturing firms across 176 three-digit industries between 2000 and 2007. Because the annual survey data cover only firms with sales of 5 million RMB or above, observing agglomeration effects solely on the basis of those data ignores the clustering of smaller firms, which are found to play an important role in the upgrading of Chinese industries. Therefore, we complement our base analysis by incorporating the agglomeration of smaller firms using the 2004 Census data, which cover all manufacturing firms.

Our analysis also comprises three notable features. First, we use the number and the average output size of firms in a spatial unit to capture the size and the quality of an agglomeration. The number of firms is highly relevant to many aspects of Marshallian externality. For instance, knowledge spillover within an agglomeration is proportional to the number of firms when each firm engages in some type of knowledge creation and the nearby firms all benefit from its outcome (Henderson, 2003). Also, a larger number of firms in a region increases the scale and depth of input demand, allowing more efficient input sharing. Finally, it enhances the efficiency of matching between firms and workers via deeper labor pooling. On the other hand, the average size of firms in an agglomeration captures not only the size of input demand and labor pooling for a given number of firms, but also the co-location of productive enterprises. Firm size has often been seen as a proxy for productivity, and several studies have indeed reported significant externalities arising from the co-location of large firms (Greenstone, Hornbeck, & Moretti, 2010; Li, Lu, & Wu, 2012).

Second, we explicitly incorporate the agglomeration of upstream industries. The presence of rich supporting industries that provide high quality intermediate goods is essential for a country's industrial competitiveness. The agglomeration of upstream industries increases the varieties of inputs supplied and allows firms to specialize in their core activities while outsourcing some of their in-house production (Broda & Weinstein, 2006; Holmes, 1999). Given that the sharing of specialized inputs has always been considered an essential element of Marshallian externality, it is somewhat surprising that previous studies did not take into account the agglomeration of upstream industries.

Finally, we explore how a firm's ownership structure shapes its ability to benefit from agglomeration effects as well as its ability to act as the source of agglomeration effects. Differences in corporate behavior and culture can define the extent to which a firm benefits from an agglomeration (Saxenian, 1994). Chinese firms differ substantially in their performance and corporate behavior across ownership types (Jefferson, Rawski, Wang, & Zheng, 2000). State-owned enterprises (henceforth, SOEs) enjoy privileges in administrative treatment, good access to finance and sometimes monopolistic power, but can be driven by policy rather than profit. Private enterprises (henceforth, PEs) are managed by vibrant entrepreneurs and are highly profit-oriented, but often face barriers to market entry and difficulty in accessing credit. Foreign-invested enterprises (henceforth, FIEs) possess advanced technologies but often engage in low value-added activities such as the assembly and re-export of imported intermediate inputs, referred to as processing trade. Such stark differences across ownership types may shape not only the ability of Chinese firms to absorb knowledge spillovers but also their ability to act as the source of spillovers.

We find positive and sizable contributions by the agglomeration of upstream industries to the total factor productivity (TFP) of Chinese firms. Our estimates imply that doubling the number of firms in upstream industries raises TFP by 3.2%. On the other hand, agglomeration effects within the same industry are more complex: an increase in the number of firms within the same county is seen to suppress TFP. This somewhat surprising finding suggests that severe congestion and intense competition associated with agglomeration are offsetting the Marshallian externality. An increase in the number of firms at a moderate distance is associated with a smaller negative impact or even a positive impact on productivity. We also find that an increase in average firm size contributes importantly to higher TFP. This suggests that the qualitative aspect of an agglomeration, namely the co-location of large firms possessing a rich knowledge stock, defines the benefits of agglomeration. When looking across different ownership types, PEs seem to be the main recipients of both positive and negative agglomeration effects, but this becomes less clear when the agglomeration of smaller firms is incorporated. However, we find that the agglomeration of PEs affects the productivity of Chinese firms more than that of SOEs or FIEs. In particular, PEs are the main source of the sizable agglomeration effects from upstream industries. The agglomeration effects from PEs are self-reinforcing in the sense that the productivity of a PE is most enhanced by the agglomeration of PEs. Based on our estimated results, we reckon that industrial agglomeration contributed up to 14% of TFP growth in China's industrial sector during 2000–2007.

The next section provides a non-exhaustive review of works on industrial agglomeration, especially in the context of China. Section 3 describes the dataset used in our analysis, the methods used to construct indicators of agglomeration and to estimate TFP, and our empirical model. Section 4 lays out our estimation results and discusses the prominent features of agglomeration effects in China. Based on the results from Section 4, Section 5 computes the role of agglomeration in the productivity growth of China's industrial sector during the sample period. Section 6 concludes with policy implications.


2. Agglomeration and productivity: a brief review

2.1. An overview of studies on agglomeration

Economic activity is geographically concentrated. Previous studies on why an agglomeration occurs and how it benefits economic agents have been mainly based on two perspectives (Rosenthal & Strange, 2004): the localization economies, which are the Marshallian externalities arising from a concentration of firms in the same industry; and the urbanization economies, which arise from an increase in city size that enables cross-fertilization of ideas among diverse economic activities (Jacobs, 1969). While both industry concentration and city size were found to be associated with higher productivity, when assessed together, empirical evidence tended to support the localization economies more than the urbanization economies (for example, Henderson, 2003; Li et al., 2012). While this paper is primarily oriented toward the localization perspective, it goes beyond the concept by explicitly incorporating the agglomeration of upstream industries.

There are several channels through which localization economies manifest. First, an agglomeration enables the sharing of knowledge and skills through formal and informal interactions among firms or individuals (Marshall, 1890; Saxenian, 1994). Knowledge spillovers occur mostly within proximity. For instance, Jaffe, Trajtenberg, and Henderson (1993) found that patent citations are more likely to come from the same state or metropolitan area as the original patent. The shared knowledge may not be confined to advanced technologies but can also include management skills and business knowledge. For example, Greenaway and Kneller (2008) reported that a larger number of exporting firms in proximity promotes other firms' export entry, suggesting that agglomeration of experienced exporters can reduce the entry costs of exporting through spillovers of knowledge on foreign markets. Second, a concentration of an industry enables the production of specialized intermediate inputs to attain a level that is sufficient to exploit scale economies (Marshall, 1890). This, in turn, allows firms to outsource a higher share of their intermediate inputs and specialize in the most profitable activities (Holmes, 1999). Third, geographical concentration of firms and workers reduces the search frictions between jobs and workers, allowing more efficient matching. Deeper labor pooling reduces the risks that a worker remains unemployed and that a firm cannot fill its vacancies (Krugman & Venables, 1995). Finally, there is a self-reinforcing “home-market effect” that induces firms to locate near a large market, especially when increasing returns to scale and transportation costs are important (Davis & Weinstein, 1999).

On the other hand, agglomeration can be associated with several diseconomies. Dense firm location may generate congestion, which increases business costs. Congestion may be especially severe if infrastructure is a bottleneck to economic activities. For instance, Lall, Shalizi, and Deichmann (2004) found that geographic concentration of the same industry lowers the productivity of Indian firms, whereas access to markets through improvement in inter-regional infrastructure has a favorable effect. Furthermore, fiercer competition in product and factor markets can suppress a firm's mark-up and productivity through lower product prices and higher input costs. In particular, severe competition for workers with specialized skills can result in a disproportionate increase in their wages relative to their productivity, at least in the short run. Therefore, whether the benefit of agglomeration to productivity exceeds the diseconomies is an empirical issue.

Various approaches to capturing agglomerations exist, ranging from the Gini coefficients of the geographical distribution of industrial output or employment to the EG index proposed by Ellison and Glaeser (1997), which simultaneously controls for the share of an industry's employment within a region, the share of aggregate manufacturing employment within a region, and the market concentration of an industry. While not as popular as the EG index, the number of firms within a region has been employed by previous studies for its tractability and straightforward implication. For example, Henderson (2003) assessed the localization economies in high-tech industries in the United States by observing the number of plants in the same industry. He claimed that knowledge spillover within an agglomeration is proportional to the number of firms when each firm engages in some type of knowledge creation and the nearby firms all benefit from its outcome. The tractability of the number of firms as a measure of agglomeration is especially beneficial when assessing agglomeration effects across numerous industries or when incorporating some aspects of distance. For instance, the EG index does not consider the geographical proximity and distance of industrial agglomeration.

Although the sharing of specialized inputs has long been considered one important aspect of localization economies, previous studies seldom explored how an agglomeration of upstream industries affects the productivity of downstream firms. The agglomeration of upstream industries increases the varieties of inputs supplied, thereby contributing to the productivity of sourcing firms (Broda & Weinstein, 2006). It also allows firms to specialize in their core activities while outsourcing some of their in-house production (Cainelli & Iacobucci, 2012; Holmes, 1999). Furthermore, stronger competition in upstream industries due to agglomeration can be associated with the supply of cheaper and better quality intermediate inputs. Consistent with this view, FDI penetration in upstream industries is often found to improve the productivity of domestic firms (for example, Ito, Yashiro, Xu, Chen, & Wakasugi, 2012; Javorcik, 2004). We therefore extend our analytical framework to explicitly incorporate the agglomeration of upstream industries.

A strand of research explored the effect of co-location with large firms on productivity growth. The size of a firm, measured in output or employment, is often closely related to its productivity as well as to its intangible competitiveness such as good management practices (Bloom, Lemos, Sadun, Scur, & Van Reenen, 2014). Locating near large firms may thus provide opportunities to absorb spillovers of advanced knowledge. For instance, Greenstone, Hornbeck, and Moretti (2010) found that United States firms in counties which saw new openings of “Million Dollar plants” realized 12% higher productivity growth after 5 years vis-à-vis the firms in other counties. More recently, Li et al. (2012) observed that Chinese firms are more likely to become larger by locating with a number of large firms than with a larger number of firms. Therefore, agglomerations that include many large firms may yield larger favorable effects on a firm's productivity. On the other hand, considering that co-location with productive firms is likely to be associated with stronger competition, the benefits of knowledge spillovers could be at least partially offset, resulting in a net impact that is not a priori clear-cut. For instance, Henderson (2003) reported that the average size of plants in the same industry and the same county does not have any significant impact on the productivity of US plants in high-tech industries. This paper explores whether the presence of large firms in an agglomeration has an impact of its own besides the size of the agglomeration captured by the number of firms.

2.2. Studies on agglomerations in China

Industrial agglomeration in China has increased consistently since the mid-1990s, driven by the globalization of the Chinese economy (Lu & Tao, 2006). Major exporting industries (such as textiles and electronics) as well as industries depending heavily on imported intermediate goods (such as machinery) concentrated in coastal areas that have better access to trade routes and advanced technology (Fujita & Hu, 2001). Long and Zhang (2012) observed that industrial concentration has been driven by the dramatic increase in the number of firms in the regions rather than by the increase in output per firm, namely by the clustering of highly interconnected small firms. Case studies reported the role of such clustering in reducing technological barriers and promoting quality upgrading. For example, Huang et al. (2008) studied the footwear industry in Wenzhou and claimed that clustering lowered the technical and capital barriers to entry through a deeper division of the complex production process, and brought a wide range of entrepreneurs in rural areas into the industry. Sonobe, Hu, and Otsuka (2004) observed that the clustering of firms producing low-voltage electric appliances in Wenzhou promoted fast quality upgrading and the introduction of new marketing strategies. Fleisher et al. (2010) reported that firms in the Zhili Township children's garment cluster in Zhejiang province have responded to rigorous competitive pressure by investing in design, branding and quality upgrading.

More general empirical evidence on agglomeration effects in China is still limited and somewhat mixed. Early studies relied on aggregated data. For instance, Batisse (2002) found a negative relationship between industrial specialization and province-level value-added growth, whereas Fan and Scott (2003) found a strong positive relation between industrial concentration and province-level productivity. Ke (2010) reported a positive and causal relationship running from industrial agglomeration to city-level labor productivity. Evidence based on firm-level data is more recent and even more limited. For example, Lin et al. (2011) explored the agglomeration effects on the productivity of firms in the textile industry and found an inverted-U type effect, which implies that the net positive impact is more than offset by diseconomies after a certain point. Yang, Lin, and Li (2013) reported that the agglomeration of production contributes to the productivity of electronics firms while the agglomeration of R&D activities has a negative impact. However, not only is the scope of those studies confined to specific industries, they are also based on the above-scale survey data, which do not capture the clustering of smaller firms. Furthermore, to our knowledge, there are not yet studies that capture how the rapidly growing non-state sector is shaping the agglomeration effects in China. Overall, there is considerable room to be filled in order to understand the role of agglomeration in China's industrial competitiveness.

3. Empirical strategy

3.1. Data

3.1.1. The Survey on above-scale manufacturing enterprises

The dataset we use for computing the productivity of Chinese enterprises is the annual survey collected by China's National Bureau of Statistics, which covers all state-owned enterprises (SOEs) and the non-SOEs with annual sales exceeding RMB 5 million (the Survey of above-scale industrial enterprises, henceforth the Survey). We use these above-scale firm data for their availability on an annual basis and their large coverage of China's industrial output. The data have been employed by numerous works to study the productivity growth of Chinese industries (such as Brandt, Biesebroeck, & Zhang, 2012) or factors behind the productivity growth (such as Yu, 2014). We use the sample spanning from 2000 to 2007 to compute the productivity of Chinese firms and the indicators of agglomeration. The number of observations for each year ranges from 162,885 (in 2000) to 336,768 (in 2007).

Quite a few observations in the Survey sample are plagued by missing information and misreporting. We therefore conduct a data-cleaning procedure following recent studies such as Cai and Liu (2009) and Feenstra et al. (2014). This involves deleting the observations with the following characteristics: (1) key variables such as total industrial output, industrial added value, fixed assets and employees have negative values; (2) the value of liquid assets or total fixed assets is larger than that of total assets; (3) the number of employees is less than 10; and (4) firm age is negative. This process reduces our observations to 160,868 in 2000 and to 335,616 in 2007, which still add up to more than 98% of the original sample. Table 1 shows that the share of retained observations is maintained throughout our sample period.
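The four deletion criteria above can be sketched as a simple filter. This is only an illustrative sketch, not the authors' code: the field names (`output`, `value_added`, `fixed_assets`, `employees`, `liquid_assets`, `total_assets`, `age`) are hypothetical, since the text does not give the Survey's variable names, and the sketch assumes that "fixed assets" in rule (1) and "total fixed assets" in rule (2) refer to the same variable.

```python
from typing import Dict, List

# Hypothetical field names: the text does not give the Survey's variable names.
KEY_VARIABLES = ("output", "value_added", "fixed_assets", "employees")


def should_drop(firm: Dict[str, float]) -> bool:
    """Return True if an observation meets any of the four deletion criteria."""
    # (1) A key variable has a negative value.
    if any(firm[name] < 0 for name in KEY_VARIABLES):
        return True
    # (2) Liquid assets or total fixed assets exceed total assets.
    if firm["liquid_assets"] > firm["total_assets"]:
        return True
    if firm["fixed_assets"] > firm["total_assets"]:
        return True
    # (3) Fewer than 10 employees.
    if firm["employees"] < 10:
        return True
    # (4) Negative firm age.
    return firm["age"] < 0


def clean(observations: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Keep only the observations that pass all four checks."""
    return [firm for firm in observations if not should_drop(firm)]


def retained_share(n_total: int, n_retained: int) -> float:
    """Share of observations kept after cleaning."""
    return n_retained / n_total
```

Applied to the counts reported for 2000, `retained_share(162885, 160868)` gives roughly 0.988, in line with the share quoted in the text.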

Table 1

Description of the Survey data set.

Source: Authors' calculation.

Year 2000 2001 2002 2003 2004 2005 2006 2007

Total observations 162,885 171,240 181,557 196,222 276,474 271,835 301,961 336,768

Observations retained 160,868 169,279 179,706 195,659 275,289 270,945 300,908 335,616

Share of retained observations 98.8% 98.9% 99.0% 99.7% 99.6% 99.7% 99.7% 99.7%




Furthermore, the original identification codes in the Survey are sometimes duplicated among several firms or are missing for some firms, preventing us from uniquely identifying each observation. We therefore assign each enterprise a new identification code based on its name in Chinese, its original identification number and the name of its legal representative. Also, China modified its industry coding system in 2002, switching from GB1994 to GB2002. Because our exercise requires consistent information on industry classification, we employ the method of Brandt et al. (2012) to prevent our observations from switching industries due to this administrative issue.

3.1.2. The 2004 Census

One issue with using the Survey data to assess industrial agglomeration is that it fails to capture the small non-SOE firms with sales under the threshold. In order to take into account the clustering of SMEs highlighted by many case studies, we exploit the 2004 Census (henceforth, the Census) to calculate an alternative set of indicators of agglomeration. To demonstrate the difference in coverage between the Census and the Survey, Table 2 compares industrial output and the number of firms in 2004 by province. The number of firms covered by the Census is about 5 times that covered by the Survey, while the total industrial output covered by the Census is only 10% larger than that covered by the Survey. These differences indicate the existence of a large mass of small firms. Indeed, the average firm size in the Census is overall about one-fifth of that in the Survey. The number of small firms covered by the Census but not by the Survey is especially large in Zhejiang, Jiangsu and Guangdong, where case studies and media have documented vibrant industrial clusters. In fact, 40% of such firms are concentrated in these three provinces. Relying solely on the Survey ignores this mass of small firms and can result in inaccurate measurement of agglomerations in China. We therefore construct an alternative set of indicators of agglomeration using the Census data.

The Census and the Survey share some common variables such as a firm's name, identification number, address code and industry code. Since we construct agglomeration indicators from the Census data by spatial unit and industry (as described later in this section), we rely on the address code and the industry code to match the two datasets. However, the largest drawback of the Census data is its limited availability: within our sample period, it is only available for the year 2004. This constrains our empirical analysis using the Census data to a cross-sectional exercise at one specific point in time. Therefore, we complement the base analysis using the panel of the Survey data with an additional analysis based on the cross-section of the Census data.

Table 2

Comparison between the Census and the Survey data (year 2004).

Source: Authors' computation based on 2004 Census and 2004 Survey of Above-scale Industrial Enterprises.

Province; Number of firms (Census, Survey); Industrial output, 1000 RMB (Census, Survey); Average firm size, output in 1000 RMB (Census, Survey)

Beijing 31,669 6871 597,470,208 573,325,440 18,866 83,441

Tianjin 25,654 6462 611,914,432 585,449,536 23,853 90,599

Hebei 64,629 9284 1,018,096,896 868,167,360 15,753 93,512

Shanxi 28,884 5013 417,396,224 377,103,936 14,451 75,225

Neimeng 11,849 2275 232,745,824 209,593,584 19,643 92,129

Liaoning 54,610 10,635 914,065,792 860,390,080 16,738 80,901

Jilin 16,363 3280 355,172,064 334,378,048 21,706 101,945

Heilongjiang 20,303 3297 395,569,632 371,877,344 19,483 112,793

Shanghai 55,806 15,766 1,459,413,760 1,396,810,752 26,152 88,596

Jiangsu 188,844 40,848 2,947,678,976 2,668,238,848 15,609 65,321

Zhejiang 188,917 41,358 2,122,122,624 1,872,797,952 11,233 45,283

Anhui 39,263 4778 423,602,208 366,015,040 10,789 76,604

Fujian 49,824 11,918 751,248,256 678,341,504 15,078 56,917

Jiangxi 29,468 4019 273,658,912 221,197,920 9287 55,038

Shandong 120,673 23,915 2,467,854,336 2,252,100,864 20,451 94,171

Henan 76,896 11,474 923,664,320 757,558,784 12,012 419,733

Hubei 29,262 6232 532,919,104 496,024,640 18,212 79,593

Hunan 43,925 7523 434,186,112 365,406,624 9885 48,572

Guangdong 137,652 34,584 3,151,649,536 2,955,180,800 22,896 85,449

Guangxi 19,080 3749 224,198,624 202,548,480 11,750 54,027

Hainan 2066 588 42,941,776 40,723,348 20,785 69,257

Chongqing 20,509 2634 259,884,384 214,272,608 12,672 81,349

Sichuan 43,759 7413 530,365,728 471,684,928 12,120 63,629

Guizhou 11,121 2545 154,630,880 139,491,408 13,904 54,810

Yunnan 14,402 2332 234,406,848 209,204,800 16,276 89,710

Tibet 356 187 2,484,754 2,269,713 6980 12,138

Shaanxi 25,785 3012 315,087,840 273,521,600 12,220 90,810

Gansu 11,664 1927 169,580,784 158,260,272 14,539 82,128

Qinghai 2198 463 38,809,336 37,421,872 17,657 80,825

Ningxia 4019 662 60,518,648 55,365,568 15,058 83,634

Xinjiang 5807 1430 165,602,080 157,184,448 28,518 109,919

Total 1,375,257 276,474 22,228,940,898 20,171,908,101 16,163 72,961
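The coverage comparison in Section 3.1.2 can be checked against the Total row of Table 2. The short sketch below recomputes the three ratios quoted in the text from those totals; the variable names are ours and purely illustrative.

```python
# Totals taken from the last row of Table 2 (year 2004).
census = {"firms": 1_375_257, "output": 22_228_940_898}  # output in 1000 RMB
survey = {"firms": 276_474, "output": 20_171_908_101}    # output in 1000 RMB

# How many times more firms the Census covers than the Survey.
firm_ratio = census["firms"] / survey["firms"]

# Census output relative to Survey output.
output_ratio = census["output"] / survey["output"]

# Average firm size (output per firm, 1000 RMB) in each source.
avg_size_census = census["output"] / census["firms"]
avg_size_survey = survey["output"] / survey["firms"]

# Census average firm size as a fraction of the Survey average.
size_ratio = avg_size_census / avg_size_survey

print(f"firms: {firm_ratio:.2f}x, output: {output_ratio:.2f}x, "
      f"average size: {size_ratio:.2f}")
```

The ratios come out at roughly 4.97 for the number of firms, 1.10 for output and 0.22 for average firm size, consistent with "about 5 times", "10% larger" and "about one-fifth" in the text.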




3.2. The indicators of agglomeration

Mainland China has six layers of administration, starting from a province① and descending to a prefecture, a county, a township and a village. The spatial unit we use for our base analysis is a county. As of 2013, mainland China comprises 2856 counties. We observe the agglomerations of 131 three-digit China Industry Code (CIC) industries within a county.② Later in the paper, we conduct robustness tests to show that our main finding is not sensitive to our choice of spatial unit and industry classification.

The benefits of agglomeration are likely to decrease with distance. However, we cannot capture the relation between agglomeration effects and distance precisely because neither the Census nor the Survey provides the exact geographical location of firms (such as their longitude and latitude). As a second-best approach, we compare the impact of an agglomeration within the county where the firm is located with that of an agglomeration in other counties within the same province. We do not consider agglomeration effects across provinces given that the protectionism of provincial governments in product and factor markets is likely to limit them (Bai, Duan, Tao, & Tong, 2004).

3.2.1. Agglomeration of the same industry

We construct four measures of agglomeration that correspond to the conventional concept of localization economies, that is, the agglomeration of firms within the same industry. For a firm i operating in industry j in county k, these indicators are:

1. The number of firms in the same industry and the same county: agg_NC_jk

This corresponds to the number of firms in industry j and in county k. We include the firm itself when calculating this indicator so that it is feasible to take the logarithm of the index even when only one firm operates in a given industry and a given county.

2. The average size of firms in the same industry and the same county: agg_YC_jk

This measure captures the co-location of large firms belonging to the same industry within proximity of a given firm. It is obtained by the following formula:

$$\mathrm{agg\_YC}_{jk} = \frac{\left(\sum_{m \in j,k} Y_m\right) - Y_i}{\mathrm{agg\_NC}_{jk} - 1} \qquad (1)$$

where Y_i is the output of firm i and the sum runs over all firms m in industry j and county k. When there is only one firm in industry j in county k, we replace the index by 0. Although this measure is specific to each firm, we drop the subscript i for the sake of brevity.

We construct two more indicators of agglomeration that concern the number or the average size of the firms that belong to the same industry but are located outside the county in which firm i operates, while still within the same province.

3. The number of firms in the same industry but outside the county: agg_NP_jk

This corresponds to the number of firms in industry j but outside county k.

4. The average size of firms in the same industry but outside the county: agg_YP_jk

Unlike the second indicator, this indicator is a simple ratio between the aggregate output and the number of firms in industry j located outside county k.

3.2.2. Agglomeration of the upstream industry

In order to avoid the multicollinearity that arises from having too many variables in our estimation, we focus on agglomerations of upstream industries located within the same county. A firm's input coefficients from upstream industries should differ for each firm depending on its sourcing strategy. However, because information on inputs at the firm level is not available, we follow Javorcik (2004) and many other studies in using the industry-level input coefficients as a proxy for firm-level input coefficients. The coefficients are taken from the input–output (IO) tables of China.③ For a given firm in industry j and in county k, the indicators of agglomeration of upstream industries are constructed as follows:

1. The number of firms in the same county but upstream industries: Uagg_NC_jk

The indicator aggregates the number of firms located in county k in industries other than industry j, using the input coefficients of industry j as weights:

$$\mathit{Uagg\_NC}_{jk} = \sum_{m,\, m \neq j} \sigma_{jm} N_{mk} \tag{2}$$

① Provinces also include municipalities directly administered by the central government, such as Beijing, Tianjin, Shanghai and Chongqing, as well as autonomous regions like Guangxi, Ningxia, Neimeng, Xinjiang and Xizang.

② The CIC comprises 470 four-digit, 181 three-digit, and 38 two-digit industries, but we cover only 131 three-digit industries because there are not sufficient observations in some industries.

③ Because IO tables within our sample period are only available for 2002 and 2007, we obtain the input coefficients in the other years from a linear interpolation between the 2002 and the 2007 input coefficients. We also assume that the input coefficients in 2001 are identical to those in 2002. Furthermore, input coefficients in the IO tables are at the two-digit level, implying that three-digit industries within the same two-digit industry are all assigned the same coefficients.




The subscript m represents an industry other than j, $\sigma_{jm}$ is the input coefficient of industry j from industry m, and $N_{mk}$ is the total number of firms in industry m in county k.

2. The average size of firms in the same county but upstream industries: Uagg_YC_jk

The indicator is calculated as follows:

$$\mathit{Uagg\_YC}_{jk} = \left( \sum_{m,\, m \neq j} \sigma_{jm} Y_{mk} \right) \Big/ \mathit{Uagg\_NC}_{jk} \tag{3}$$

where $Y_{mk}$ is the sum of the output of the firms belonging to industry m in the same county k as firm i.

All the indicators are transformed into natural logarithms. Table 3 provides summary statistics of these agglomeration indicators, both when using the Survey data and when using the Census data.
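The construction in Eqs. (1) to (3) can be illustrated with a minimal sketch. The firm records, industry names and input coefficients below are hypothetical and chosen only for illustration; the function names are not from the paper. The sketch computes the indicators in levels, before the logarithmic transformation.

```python
from collections import defaultdict

# Hypothetical firm records: (firm_id, industry, county, output).
firms = [
    ("a", "textiles", "K1", 100.0),
    ("b", "textiles", "K1", 300.0),
    ("c", "textiles", "K1", 200.0),
    ("d", "yarn", "K1", 50.0),
    ("e", "yarn", "K1", 150.0),
    ("f", "dyes", "K1", 400.0),
]

# Hypothetical input coefficients sigma[j][m]: input of industry j from industry m.
sigma = {"textiles": {"yarn": 0.4, "dyes": 0.1}}


def cell_totals(records):
    """Number of firms and total output per (industry, county) cell."""
    count = defaultdict(int)
    output = defaultdict(float)
    for _, industry, county, y in records:
        count[(industry, county)] += 1
        output[(industry, county)] += y
    return count, output


def horizontal(firm, count, output):
    """agg_NC and agg_YC (Eq. 1) for one firm; agg_YC is set to 0 for a lone firm."""
    _, j, k, y_i = firm
    n = count[(j, k)]
    agg_yc = (output[(j, k)] - y_i) / (n - 1) if n > 1 else 0.0
    return n, agg_yc


def upstream(j, k, sigma, count, output):
    """Uagg_NC (Eq. 2) and Uagg_YC (Eq. 3) for industry j in county k."""
    weights = {m: w for m, w in sigma.get(j, {}).items() if m != j}
    uagg_nc = sum(w * count[(m, k)] for m, w in weights.items())
    weighted_output = sum(w * output[(m, k)] for m, w in weights.items())
    uagg_yc = weighted_output / uagg_nc if uagg_nc > 0 else 0.0
    return uagg_nc, uagg_yc


count, output = cell_totals(firms)
agg_nc, agg_yc = horizontal(firms[0], count, output)
uagg_nc, uagg_yc = upstream("textiles", "K1", sigma, count, output)
print(agg_nc, agg_yc, uagg_nc, uagg_yc)
```

Excluding the firm's own output in `horizontal` mirrors the subtraction of $Y_i$ in Eq. (1), which is why that indicator is firm-specific even though the subscript i is dropped.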

3.3. Estimation of TFP

We use total factor productivity (TFP) as the principal measure of productivity. TFP is obtained as the "residual" after deducting the contributions of labor and capital inputs from output. It is thus obtained from an estimation of the production function. We apply the method proposed by Levinsohn and Petrin (2003), which controls for the potential endogeneity in the estimates of input coefficients. Consider a firm's production function such as the one below:

$$Y_t = f(I_t, \varsigma_t; \beta) \tag{4}$$

where $I_t$ are the inputs used in production, $\beta$ are parameters, and $\varsigma_t$ is the error term, which can be interpreted as Hicks-neutral productivity.

Because firms with higher productivity are expected to use more inputs, the input vector $I_t$ may be correlated with the unobservable productivity $\varsigma_t$. In such a case, the estimates obtained from OLS are biased. Levinsohn and Petrin (2003) consider the following log-linear production function:

$$y_t = \beta_0 + \beta_l l_t + \beta_k k_t + \beta_m m_t + \upsilon_t + \zeta_t \tag{5}$$

where $y_t$, $l_t$, $k_t$ and $m_t$ are the logs of output, labor, capital and intermediate inputs, respectively; $l_t$ is deemed a free variable, whereas $k_t$ and $m_t$ are state variables. The error term $\varsigma_t$ in Eq. (4) is divided into two parts, namely $\upsilon_t$ and $\zeta_t$. $\upsilon_t$ is the unobservable state variable that affects the firm's decision on inputs, whereas $\zeta_t$ is an independently and identically distributed disturbance that has no impact on the firm's decision. Assume that the demand function for intermediate inputs is given as

$$m_t = m_t(\upsilon_t, k_t). \tag{6}$$

Table 3

Summary statistics of the indicators of agglomeration.

Source: Authors' calculation.

                      Obs        Mean     Std. Dev.   Min     Max

Survey data sample
agg_NC                609,331    0.56     0.81        0.00    6.29
agg_YC                609,331    4.40     5.11        0.00    17.69
agg_NP                609,331    4.12     1.40        0.00    8.04
agg_YP                609,331    10.69    1.29        0.00    18.24
Uagg_NC               609,217    0.60     1.59        0.00    6.01
Uagg_YC               609,058    10.69    0.98        0.08    18.86

Census data sample
agg_NC                67,013     1.70     1.28        0.00    7.65
agg_YC                67,013     7.26     3.74        0.00    17.44
agg_NP                67,013     5.86     1.44        0.00    9.20
agg_YP                67,013     9.43     1.13        0.00    16.79
Uagg_NC               67,013     2.42     1.53        0.00    7.04
Uagg_YC               67,013     9.25     0.90        3.27    14.01

Note:

1. The agglomeration indicators are all in natural logarithm.

2. agg_NC is the number of firms in the same industry and same county; agg_YC is the average size of firms in the same industry and same county; agg_NP is the number of firms in the same industry and same province but outside the county; agg_YP is the average size of firms in the same industry and same province but outside the county; Uagg_NC is the number of firms in the same county but upstream industries; and Uagg_YC is the average size of firms in the same county but upstream industries.




If this function is monotonic in $\upsilon_t$ for all $k_t$, the unobservable $\upsilon_t$ can be recovered by inverting the intermediate input demand function to get $\upsilon_t = \upsilon_t(m_t, k_t)$. This enables Eq. (5) to be rewritten as

$$y_t = \beta_l l_t + \phi_t(m_t, k_t) + \zeta_t \tag{7}$$

where $\phi_t(m_t, k_t) = \beta_0 + \beta_k k_t + \beta_m m_t + \upsilon_t(m_t, k_t)$.

Following the idea of Robinson (1988), subtracting the expectation of $y_t$ conditional on $m_t$ and $k_t$, Eq. (7) can be transformed as below:④

$$y_t - E[y_t \mid m_t, k_t] = \beta_l \left( l_t - E[l_t \mid m_t, k_t] \right) + \zeta_t. \tag{8}$$

Since $\zeta_t$ is an independently and identically distributed error, an OLS regression of Eq. (8) yields a consistent estimate of $\beta_l$.

The coefficients on capital input $\beta_k$ and on intermediate inputs $\beta_m$ are estimated as follows. First, it is assumed that $\upsilon_t$ follows a first-order Markov process and that capital does not immediately respond to $\xi_t$, the innovation in productivity over the last period's expectation, so that

$$\xi_t = \upsilon_t - E[\upsilon_t \mid \upsilon_{t-1}]. \tag{9}$$

From Eq. (5), we obtain:

$$\xi_t + \zeta_t = y_t - \hat{\beta}_l l_t - \beta_k^{*} k_t - \beta_m^{*} m_t - \hat{E}[\upsilon_t \mid \upsilon_{t-1}] \tag{10}$$

where $\hat{\beta}_l$ is the estimated coefficient on labor input obtained previously, and $\beta_k^{*}$ and $\beta_m^{*}$ are the candidate values for $\beta_k$ and $\beta_m$, respectively. $\hat{E}[\upsilon_t \mid \upsilon_{t-1}]$ is the fitted value obtained from the regression of $y_t - \hat{\beta}_l l_t - \beta_k^{*} k_t - \beta_m^{*} m_t$ on $\hat{\phi}_{t-1} - \beta_k^{*} k_{t-1} - \beta_m^{*} m_{t-1}$ using weighted quadratic least squares. The latter is therefore a function of $\beta_k^{*}$ and $\beta_m^{*}$.

The following two moment conditions are required to obtain consistent estimates:

$$E[(\xi_t + \zeta_t)\, k_t] = E[\xi_t k_t] = 0 \tag{11}$$

$$E[(\xi_t + \zeta_t)\, m_{t-1}] = E[\xi_t m_{t-1}] = 0 \tag{12}$$

The first moment condition requires that capital input is not correlated with the innovation in productivity, whereas the second requires that past intermediate inputs are not correlated with it. The estimates $\hat{\beta}_k$ and $\hat{\beta}_m$ are obtained by minimizing the following GMM criterion function:

$$Q(\beta) = \min_{\beta} \sum_{h=1}^{4} \left( \sum_{i} \sum_{t=T_{i0}}^{T_{i1}} \widehat{(\xi_{it} + \zeta_{it})}\, Z_{i,ht} \right)^{2} \tag{13}$$

where $Z_t = \{l_t, k_t, m_{t-1}, m_{t-2}\}$. In the equation above, $i$ indexes enterprises, $h$ indexes the elements of $Z$, and $T_{i0}$ and $T_{i1}$ are the second year and the last year in which enterprise $i$ is observed in the sample. Since panel data are required for the estimation of the production function, we rely on the Survey data to estimate TFP. This implies that the recipients of agglomeration effects in this paper are SOEs and relatively large non-SOEs.
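The first stage in Eq. (8) can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the data-generating process, the parameter values, and the use of a third-order polynomial in $(m_t, k_t)$ to approximate the conditional expectations. It is not the estimation code used in the paper, and it covers only the labor coefficient, not the GMM step of Eq. (13).

```python
import numpy as np


def poly_terms(m, k, degree=3):
    """All monomials m**a * k**b with a + b <= degree (the constant included)."""
    cols = [m**a * k**b for a in range(degree + 1) for b in range(degree + 1 - a)]
    return np.column_stack(cols)


def first_stage_beta_l(y, l, m, k, degree=3):
    """Regress y and l on a polynomial in (m, k), then run OLS on the residuals."""
    P = poly_terms(m, k, degree)
    y_res = y - P @ np.linalg.lstsq(P, y, rcond=None)[0]
    l_res = l - P @ np.linalg.lstsq(P, l, rcond=None)[0]
    return float(l_res @ y_res / (l_res @ l_res))


# Simulated data (hypothetical parameter values).
rng = np.random.default_rng(0)
n = 5000
beta_l_true = 0.6
k = rng.normal(size=n)                        # log capital
upsilon = rng.normal(size=n)                  # unobserved productivity
m = 0.5 * k + upsilon                         # input demand, monotonic in upsilon
l = 0.3 * k + 0.6 * upsilon + rng.normal(scale=0.5, size=n)  # labor responds to upsilon
zeta = rng.normal(scale=0.3, size=n)          # i.i.d. disturbance
y = 1.0 + beta_l_true * l + 0.3 * k + 0.1 * m + upsilon + zeta

beta_l_hat = first_stage_beta_l(y, l, m, k)
print(f"true beta_l = {beta_l_true}, first-stage estimate = {beta_l_hat:.3f}")
```

The independent shock added to labor is what leaves variation in $l_t$ after conditioning on $(m_t, k_t)$; without it the residualized labor term would vanish and $\beta_l$ could not be recovered from Eq. (8).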

Given that firm-specific prices are not available, we follow Brandt et al. (2012) in using the 2-digit industry-level price index to deflate a firm's output and intermediate input, and the fixed asset investment price to deflate its capital. Furthermore, we estimate the production function separately for each 2-digit industry, considering that input coefficients differ substantially across industries. The average TFP (in logarithm) for each year is shown in Table 4, indicating a clear uptrend that is in line with previous studies. It can also be seen that the average TFP of private enterprises is generally higher than that of SOEs and FIEs, although there has been an apparent surge in the TFP of SOEs in the later years. This is in line with the well-documented growth in profitability of SOEs as they retreat from the sectors open to competition and FDI and concentrate on highly regulated upstream industries (Du & Wang, 2013).

3.4. Empirical model of agglomeration effects

We estimate the following model that relates the TFP of a firm i in industry j located in county k to the agglomeration of the same industry and the upstream industries:

$$\ln TFP_{ijkt} = \alpha + \beta_1 Horizontal_{jkt-1} + \beta_2 Horizontal_{jmt-1} + \beta_3 Upstream_{jkt-1} + \gamma X_{ijkt-1} + \eta_i + \eta_t + \eta_j + \eta_k + \varepsilon_{ijkt}. \tag{14}$$

④ Taking the expectation of Eq. (7) conditional on $m_t$ and $k_t$, we get $E[y_t \mid m_t, k_t] = \beta_l E[l_t \mid m_t, k_t] + \phi_t(m_t, k_t)$; subtracting this from Eq. (7) yields Eq. (8).




The first term on the right-hand side is a constant, whereas the second to the fourth terms are the indicators of industrial agglomeration. The term Horizontal_jkt is a vector of the indicators of agglomeration in the same industry and in the same county, agg_NC and agg_YC. The term Horizontal_jmt is a vector of the indicators of agglomeration in the same industry but outside the county, agg_NP and agg_YP. The term Upstream_jkt is a vector of the two indicators of agglomeration of the upstream industries, Uagg_NC and Uagg_YC. Since all agglomeration indicators are in logarithm, the estimated coefficients can be interpreted as the elasticity of productivity with respect to the number of firms or the average firm size in the same industry or in upstream industries.

The signs of the estimated coefficients on the indicators of agglomeration in Horizontal_jkt are not clear a priori, given that the positive impacts of Marshallian externalities are at least partly offset by congestion effects and by intensified competition in product and factor markets that suppresses product prices while driving up input costs. However, such negative effects on a firm's productivity can be less severe when the agglomeration is located farther from the firm (i.e., in other counties) or occurs in upstream industries. On the other hand, knowledge spillovers are expected to be smaller when an agglomeration is not located within proximity or when it concerns industries using different technologies. Therefore, the signs and relative sizes of the coefficients on Horizontal_jmt and Upstream_jkt are also not clear.

The fifth term X_ijkt is the vector of control variables. It includes the following firm characteristics that can influence a firm's productivity: (1) the logarithm of R&D investment; (2) the market share, which is the firm's output share in the same industry. Firms with a larger market share often enjoy stronger monopolistic power and a higher mark-up; (3) the debt ratio, measured as the ratio of the firm's total liability to total assets, which we interpret as the firm's borrowing capability. Enterprises with strong borrowing capability may realize higher productivity because their investments are less subject to credit constraints; (4) the export intensity, measured as the ratio of the firm's exports to its total value-added. Firms that export enjoy scale economies by supplying the world market and can also tap into advanced technology through transactions with foreign partners (De Loecker, 2007); and (5) firm age, measured as the difference between the year in the sample and the year of establishment. Longer survival reflects the firm's intangible competitiveness, such as superior managerial capability. It also accounts for the productivity advantage over competitors that is founded on the "learning by doing" effect. Furthermore, a county-level average wage is included in order to control for the quality of labor in the county where the firm is located. Firms often agglomerate in areas with an abundant supply of skilled labor, the use of which is an important determinant of a firm's productivity. Controlling for the local labor quality allows us to capture the pure effects of agglomeration. Table 5 provides summary statistics of those control variables.

Because of more intense competition in product and factor markets within agglomerated regions, firms with originally high productivity are more likely to locate in such areas (Baldwin & Okubo, 2006). We address the possibility of such reverse causality from productivity to agglomeration in two ways. First, we lag all indicators of agglomeration by one period, along with the control variables. Thus, when using the indicators constructed from the Census data, Eq. (14) is estimated with a cross-sectional OLS that regresses a firm's TFP in 2005 on the indicators that capture agglomeration in the year 2004. Second, when using the indicators computed from the Survey data, we include a firm-level fixed effect $\eta_i$ that captures time-invariant unobservable firm characteristics (such as superior managerial quality). Because some firms switch industries and areas over the sample period, three-digit industry fixed effects and province fixed effects are included in the estimation. Furthermore, we include a time fixed effect to control for trend growth in productivity when using panel data. Lastly, $\varepsilon_{ijkt}$ is an i.i.d. error that follows $N(0, \sigma^2)$.
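The role of the firm fixed effect and the one-period lag can be illustrated with a minimal sketch on a simulated balanced panel. The data, the single regressor and the parameter values are hypothetical, and the sketch includes only firm and year effects; it omits the industry and province effects and the control variables of Eq. (14).

```python
import numpy as np

rng = np.random.default_rng(1)
n_firms, n_years, beta_true = 500, 7, 0.05

eta = rng.normal(size=(n_firms, 1))               # time-invariant firm effect
tau = np.linspace(0.0, 0.6, n_years)[None, :]     # common year effect (trend)
# Hypothetical agglomeration indicator, correlated with the firm effect.
x = 0.8 * eta + rng.normal(size=(n_firms, n_years))

x_lag = x[:, :-1]                                  # indicator in year t-1
noise = rng.normal(scale=0.2, size=(n_firms, n_years - 1))
tfp = beta_true * x_lag + eta + tau[:, 1:] + noise  # log TFP in year t


def two_way_demean(a):
    """Remove firm means and year means (exact for a balanced panel)."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()


y_w, x_w = two_way_demean(tfp), two_way_demean(x_lag)
beta_fe = float((x_w * y_w).sum() / (x_w**2).sum())

# Pooled OLS ignores the firm effect and picks up its correlation with x.
beta_pooled = float(np.polyfit(x_lag.ravel(), tfp.ravel(), 1)[0])

print(f"true = {beta_true}, fixed effects = {beta_fe:.3f}, pooled OLS = {beta_pooled:.3f}")
```

In this simulation the pooled estimate is pushed upward because firms with a high unobserved effect also sit in denser locations, which is the kind of selection the firm fixed effect is meant to absorb.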

Table 4

Summary of the estimated TFP by year.

Source: Authors' calculation based on survey data of above-scale industrial enterprises.

Sample       Statistics   2001    2002    2003    2004    2005    2006    2007

All firms    Mean         2.95    3.08    3.23    3.31    3.47    3.60    3.75
             Std. Dev     1.56    1.54    1.49    1.42    1.42    1.41    1.40
SOEs         Mean         2.46    2.60    2.85    3.02    3.26    3.51    3.89
             Std. Dev     1.82    1.83    1.80    1.79    1.81    1.77    1.75
PEs          Mean         3.12    3.21    3.32    3.37    3.51    3.63    3.78
             Std. Dev     1.37    1.36    1.35    1.29    1.32    1.33    1.35
FIEs         Mean         3.10    3.18    3.25    3.23    3.38    3.51    3.60
             Std. Dev     1.55    1.55    1.53    1.51    1.48    1.47    1.47

Notes:

1. TFP is in natural logarithm.

2. SOEs refer to state-owned enterprises; PEs refer to private enterprises; FIEs refer to foreign invested enterprises.

Table 5

The definition and summary statistics of control variables.

Source: Authors' calculation based on survey data of above-scale industrial enterprises.

Variables         Definition                                                   Obs         Mean    S.D.    Min     Max

R&D investment    Natural log of R&D investment                                1,303,196   0.61    1.84    0.00    16.14
Market share      The proportion of firm's output to total output of industry  1,830,664   0.00    0.00    0.00    0.80
Debt ratio        The ratio of total liability to total assets                 1,830,664   0.59    0.35    0.00    48.50
Export ratio      The ratio of export to the added value                       1,830,664   0.16    0.33    0.00    2.00
Age (year)        Firm age                                                     1,830,664   10.00   11.40   1.00    150.00
Average wage      Natural log of average wage of the county                    1,830,664   2.70    0.49    −2.36   5.74




The model is estimated for the whole sample as well as for the sub-samples of firms with different ownerships, namely the SOEs, private enterprises (PEs) and FIEs. The latter specification reveals how the differences in business conditions, corporate culture and technological endowment among the three types of firms shape their ability to benefit from agglomeration. Furthermore, we also assess how such differences shape a firm's ability to act as a source of spillovers by estimating Eq. (14) with indicators that capture an agglomeration of the firms of a specific ownership.

4. Estimation results

4.1. Agglomeration effects based on the Survey data

Table 6 presents the results of the base estimation and those over sub-samples of firms with different ownerships. As the most basic setting, column 1 shows the results where only the indicators of agglomeration of the same industry in the same county are included. The average firm size (agg_YC) has a statistically significant positive contribution to a firm's productivity, while the impact of the number of firms (agg_NC) is negative but not significant. The insignificant coefficient on the number of firms suggests that the severe congestion effect associated with agglomeration offsets the benefits from Marshallian externalities. The positive coefficient on average firm size, however, suggests that the co-location of large, productive firms within proximity does contribute to productivity.

As we add the agglomeration of the same industry outside the county and that of upstream industries within the same county in column 2, the coefficient on agg_NC now indicates a statistically significant negative effect on productivity. It implies that doubling the number of firms in the same industry within the same county can reduce a firm's productivity by up to 1.2%. On the other hand, the coefficient on the number of firms in the same industry outside the county (agg_NP) is significant and positive, implying that doubling such number raises a firm's productivity by 1.3%. This somewhat surprising finding that agglomeration is more beneficial when it is at a moderate distance indicates a specific situation in China where an agglomeration of the same industry

Table 6

Estimates of agglomeration effects based on the Survey data.

Sample                  All (1)     All (2)     SOEs (3)    PEs (4)     FIEs (5)

Indicators of agglomeration
(1) agg_YC              0.004***    0.004***    0.001       0.004***    0.003**
                        (8.97)      (7.99)      (0.52)      (8.09)      (2.92)
(2) agg_NC              −0.002      −0.012***   0.001       −0.016***   0.001
                        (−0.73)     (−4.54)     (0.06)      (−5.37)     (0.21)
(3) agg_YP                          0.010***    0.003       0.014***    −0.001
                                    (5.30)      (0.59)      (5.79)      (−0.13)
(4) agg_NP                          0.013***    0.032***    0.008**     0.009
                                    (5.12)      (3.61)      (2.72)      (1.61)
(5) Uagg_YC                         0.052***    0.028***    0.060***    0.035***
                                    (18.31)     (3.11)      (17.63)     (5.16)
(6) Uagg_NC                         0.032***    0.024**     0.037***    0.027***
                                    (13.10)     (2.93)      (12.52)     (4.60)

R&D investment          0.003***    0.003***    0.009***    0.003***    0.004**
                        (5.11)      (5.05)      (4.20)      (3.13)      (2.57)
Market share            7.065***    7.275***    3.378***    9.452***    6.881***
                        (19.79)     (20.13)     (4.14)      (15.82)     (10.26)
Debt ratio              −0.077***   −0.075***   −0.046***   −0.083***   −0.051***
                        (−17.83)    (−17.56)    (−3.92)     (−13.93)    (−6.38)
Export ratio            −0.084***   −0.084***   −0.052      −0.032***   −0.144***
                        (−13.76)    (−13.72)    (−1.19)     (−3.97)     (−15.17)
Age                     −0.010***   −0.011***   −0.039***   0.003       0.030***
                        (−4.25)     (−4.57)     (−4.15)     (1.22)      (3.37)
Average wage            0.156***    0.149***    0.087***    0.142***    0.212***
                        (28.27)     (26.96)     (4.28)      (21.96)     (16.02)
Constant                2.036***    1.298***    0.353       2.543***    −1.270
                        (8.13)      (5.13)      (0.49)      (3.01)      (−1.24)

Prob F                  0.000       0.000       0.000       0.000       0.000
R-square                0.484       0.480       0.137       0.371       0.289
Observations            849,775     849,775     77,849      582,251     183,863

Notes:

1. All estimations include firm, time, industry and province fixed effects.

2. t-statistics are in parentheses.

3. *** and ** correspond to significance at the 1% level and the 5% level, respectively.

4. The data used for estimation are from the Annual Survey of Above-scale Industrial Enterprises.

5. SOEs refer to state-owned enterprises; PEs refer to private enterprises; FIEs refer to foreign invested enterprises.

6. agg_NC is the number of firms in the same industry and same county; agg_YC is the average size of firms in the same industry and same county; agg_NP is the number of firms in the same industry and same province but outside the county; agg_YP is the average size of firms in the same industry and same province but outside the county; Uagg_NC is the number of firms in the same county but upstream industries; and Uagg_YC is the average size of firms in the same county but upstream industries. All these variables are in natural log form. For definitions of the other control variables, please refer to Table 5.




penalizes firms in proximity with severe congestion and competition that overwhelm the localization economy. The negative effect of stronger competition is also hinted at by the positive coefficient on the average size of firms in the same industry but outside the county (agg_YP), which is larger than that on the average firm size within the same county (agg_YC). This suggests that the benefit from the co-location of large, competitive firms is partly offset when they are within proximity, because they can be formidable rivals in the local product and factor markets.

Furthermore, the coefficients on the number of firms and the average firm size in upstream industries within the same county are both positive and significant. Moreover, those coefficients are considerably larger than those corresponding to the agglomeration of the same industry. For instance, they imply that doubling the number of firms in upstream industries or their average size raises a firm's productivity by 3.2% and 5.2%, respectively. These results provide strong support for our view that the agglomeration of upstream industries is an important source of competitiveness of Chinese industry. Furthermore, the benefit of agglomeration to productivity is less weakened by the negative impact of fiercer competition when agglomeration concerns non-competing industries. Finally, it is noteworthy that the F-statistic indicates the joint significance of all those additional indicators of agglomeration. This indicates that studies that failed to incorporate the agglomeration of upstream industries were indeed prone to misspecification and may not have captured the true contribution of agglomeration to productivity.
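As a worked check on how elasticities from a log-log specification translate into the effect of doubling a regressor, the sketch below compares the first-order reading (coefficient times 100%) with the exact change implied by the functional form, $2^{\beta} - 1$. The coefficients are those of column 2 in Table 6; the function name is illustrative only.

```python
def effect_of_doubling(beta):
    """Exact proportional change in TFP when a logged regressor doubles."""
    return 2.0**beta - 1.0


# Elasticities from column 2 of Table 6.
coefficients = {
    "agg_NC": -0.012,
    "agg_NP": 0.013,
    "Uagg_NC": 0.032,
    "Uagg_YC": 0.052,
}

for name, beta in coefficients.items():
    approx = beta                      # first-order reading: beta x 100%
    exact = effect_of_doubling(beta)   # implied by the log-log form
    print(f"{name}: first-order {approx:+.1%}, exact {exact:+.2%}")
```

The two readings share the same sign and ordering across indicators; the exact figure is smaller in absolute value because doubling corresponds to a change of ln 2 ≈ 0.69 in the logged regressor.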

Columns 3 to 5 report the results for the sub-samples of firms with different ownerships, namely state-owned enterprises (SOEs), private enterprises (PEs) and foreign invested enterprises (FIEs). Interestingly, many of the observations above are driven by the sub-sample of PEs. For instance, the significant and negative impact on productivity from agg_NC is only observed among the PEs. The significant positive coefficients on both agg_YC and agg_YP are also observed only for PEs. Such results may reflect the fact that PEs are most exposed to market competition and are therefore most subject to the negative impacts of agglomeration. They are also most eager to absorb knowledge spillovers given their strong profit orientation and large room for catch-up in production and managerial technology (Zhang, Zeng, Mako, & Seward, 2009). Interestingly, the number of firms outside the county (agg_NP) seems to benefit SOEs more than PEs.

As for the agglomeration of upstream industries, the number of firms (Uagg_NC) and the average firm size (Uagg_YC) are both associated with positive and significant coefficients for firms with all types of ownership. This is not surprising, since the benefits from the agglomeration of supporting industries, such as the supply of higher quality and cheaper intermediate goods, should reward firms regardless of their corporate behavior. However, PEs again enjoy the largest impact, possibly because they outsource more of their inputs. Overall, our observation reveals that PEs are the primary recipients of both positive and negative agglomeration effects in China.

Each column also displays the contribution of the control variables, which is more or less consistent across ownership. Most importantly, R&D contributes significantly to productivity, as reported by previous studies such as Wu (2006). The estimated coefficients suggest that doubling R&D expenditure increases the firm's productivity by 0.3% to 0.9%, depending on the firm's ownership. The elasticity is highest among the SOEs, which are observed to account for a large share of R&D investment in China (Pilat, Yamano, & Yashiro, 2012). This is roughly in line with the view that the return to R&D investment is shaped by a firm's capability to absorb new technology, which is formed by previous R&D activities (Cohen & Levinthal, 1989). Contrary to our conjecture, the debt ratio and the export ratio are negatively associated with the firm's productivity. The negative relation between the export ratio and productivity is especially apparent for FIEs, which engage intensively in processing trade. Firm age is also negatively associated with productivity in the case of SOEs. Lastly, the county-level average wage contributes significantly to a firm's productivity, suggesting that skilled labor is indeed an important element of the competitiveness of Chinese firms.

4.2. Agglomeration effects based on the Census data

The estimation of agglomeration effects using the indicators of agglomeration constructed from the Census data is a cross-sectional OLS that does not control for unobserved firm heterogeneity. This may result in biased coefficients if such heterogeneity affects both productivity and the likelihood of locating in an agglomerated region. Also, it only observes the relationship between productivity and agglomeration at a specific point in time, the year 2005. The estimated coefficients are therefore not comparable with those obtained from the estimation using indicators based on the Survey data. Keeping those caveats in mind, we investigate whether taking into account the agglomeration of smaller firms alters or adds new findings to our previous results based on the Survey data.

The results from the estimation displayed in Table 7 are qualitatively similar to the results in Table 6. For instance, the coefficient on the number of firms in the same industry within the same county (agg_NC) is negative and significant, confirming the severe congestion and competition effect associated with agglomeration. The negative impact of agg_NC becomes more pronounced when agglomeration outside the county and that of upstream industries are incorporated (column 2). On the other hand, contrary to the results in Table 6, the coefficient on agg_NP is also negative, albeit with limited significance. This suggests that agglomeration at a moderate distance is not as beneficial as previously found, possibly because the positive externality wanes with distance or because congestion and competition effects spread across counties. That the coefficient on agg_NP is smaller in magnitude than that on agg_NC is still in line with the previous finding. As for the average firm size, the coefficient on agg_YC and that on agg_YP are both positive, with the latter being larger than the former, in line with the results in Table 6. The contribution of the agglomeration of upstream industries is confirmed by the significant and positive coefficient on Uagg_NC. These results support our findings based on the Survey data. It is also noteworthy that the signs of the coefficients on control variables are in line with the previous results, except the one on the export ratio.⑤

⑤ There is abundant theoretical and empirical evidence on the self-selection of productive firms into exports (for instance, Melitz, 2003, Tybout et al., 1998). The strong positive correlation between productivity and export intensity may be due to the inability to control for the unobserved firm heterogeneity that raises productivity.




However, when looking across sub-samples of firms with different ownerships in columns 3 to 5, PEs are not necessarily the main recipients of agglomeration effects. For instance, it is the SOEs that suffer the significant negative impact from agg_NC, while the coefficient is insignificant for PEs and FIEs. This is likely to reflect an erosion of the market shares and profits of SOEs due to competition against PEs and FIEs. While the coefficient on Uagg_NC is positive and significant for both SOEs and PEs, the size of the coefficient suggests that SOEs benefit more than PEs. Furthermore, the coefficients on the average firm size suggest that SOEs and FIEs enjoy larger spillovers from a co-location with large firms than PEs.

4.3. Agglomeration effects from different ownerships

The difference in ownership may define not only a firm's ability to absorb favorable agglomeration effects but also the extent to which it acts as a source of agglomeration effects. For instance, FIEs have long been considered a source of technology spillovers. Also, PEs, which are significantly smaller than SOEs or FIEs, may be an active source of agglomeration effects to the extent that smaller firms are more likely to interact with and outsource to other firms due to their limited resources (Rosenthal & Strange, 2001). On the other hand, SOEs that are oriented toward government policies and enjoy policy-backed monopolistic power may generate few spillovers to other firms. In order to investigate which type of firm is the most important source of agglomeration effects, we construct the indicators of agglomeration separately for SOEs, PEs and FIEs and re-estimate the model corresponding to column 2 of Tables 6 and 7. We estimate the agglomeration effects on the TFP of all firms as well as on that of PEs, the latter being the main recipients of agglomeration effects.

Table 8 summarizes the results of this exercise. For the sake of brevity, the estimated coefficients on control variables are not displayed. Columns 1 to 3 show how the TFP of Chinese firms is affected by the agglomeration of SOEs, PEs and FIEs. The coefficient on agg_NC, the number of firms in the same industry within the same county, is insignificant regardless of the type of firms. This suggests that the negative coefficient observed in Tables 6 and 7 is a composite effect of the agglomeration of all three types of firms. On the other hand, the coefficient on agg_YC is positive only for the agglomeration of PEs and, with more limited significance, for that of FIEs.

Table 7

Estimates of agglomeration effect based on the Census data.

Sample                  All (1)     All (2)     SOEs (3)    PEs (4)     FIEs (5)

Indicators of agglomeration
(1) agg_YC              0.058***    0.052***    0.054***    0.038***    0.059***
                        (55.90)     (49.89)     (17.11)     (31.49)     (21.90)
(2) agg_NC              −0.006***   −0.023***   −0.072***   −0.004      −0.004
                        (−3.50)     (−9.98)     (−6.39)     (−1.45)     (−0.72)
(3) agg_YP                          0.066***    0.065***    0.062***    0.023**
                                    (15.30)     (5.74)      (11.96)     (2.04)
(4) agg_NP                          −0.006*     −0.011      −0.001      0.015*
                                    (−1.65)     (−0.82)     (−0.22)     (1.67)
(5) Uagg_YC                         0.124***    0.178***    0.084***    0.128***
                                    (34.74)     (14.91)     (20.61)     (14.76)
(6) Uagg_NC                         0.046***    0.155***    0.023***    −0.020***
                                    (16.61)     (15.49)     (7.24)      (−2.94)

R&D investment          0.129***    0.129***    0.187***    0.115**     0.107**
                        (99.01)     (99.02)     (48.07)     (71.83)     (38.61)
Market share            68.861***   67.599***   46.829***   74.325***   70.663***
                        (103.65)    (102.07)    (28.19)     (72.67)     (59.67)
Debt ratio              −0.172***   −0.176***   −0.272***   −0.134***   0.006
                        (−25.20)    (−25.89)    (−12.47)    (−15.82)    (0.41)
Export ratio            0.167***    0.151***    0.511***    0.125***    −0.073***
                        (22.90)     (20.82)     (9.02)      (12.98)     (−5.87)
Age                     −0.011***   −0.011***   −0.127***   0.044***    0.033***
                        (−4.30)     (−4.07)     (−13.17)    (15.21)     (4.75)
Average wage            0.037***    0.135***    0.045       0.197***    0.071***
                        (4.38)      (14.28)     (1.39)      (18.21)     (3.13)
Constant                5.753***    4.594***    3.379***    5.124***    4.927
                        (162.27)    (76.72)     (17.25)     (74.67)     (25.45)

Prob F                  0.000       0.000       0.000       0.000       0.000
R-square                0.243       0.249       0.443       0.216       0.226
Observations            215,076     215,076     21,178      147,185     46,713

Notes:

1. All estimations include control variables as well as industry and province fixed effects.

2. t-statistics are in parentheses.

3. ***, ** and * correspond to significance at the 1% level, 5% level and 10% level, respectively.

4. The data used for estimation are from the 2004 Census.

5. SOEs refer to state-owned enterprises; PEs refer to private enterprises; FIEs refer to foreign invested enterprises.

6. agg_NC is the number of firms in the same industry and same county; agg_YC is the average size of firms in the same industry and same county; agg_NP is the number of firms in the same industry and same province but outside the county; agg_YP is the average size of firms in the same industry and same province but outside the county; Uagg_NC is the number of firms in the same county but upstream industries; and Uagg_YC is the average size of firms in the same county but upstream industries. All these variables are in natural log form. For definitions of the other control variables, please refer to Table 5.




Furthermore, the positive and significant coefficient on agg_NP noted in column 2 of Table 6 is only observed for PEs. While the coefficient on agg_YC is positive when agglomeration concerns all types of firms, it is largest in the case of PEs. Therefore, Chinese firms seem to benefit most from the co-location of large, competitive PEs.

It is noteworthy that the important contribution from the agglomeration of upstream industries is generated primarily from PEs.

The coef cient on Uagg_NC is only positive for the agglomeration of PEs whereas it is insignicant or negative for those of SOEs and

FIEs. The coef cient implies that doubling the number of PEs in upstream industries raises the productivity of Chineserms by 4.8%.

While it is somewhat surprising that an agglomeration of FIEs in upstream sector has a negative impact on productivity, it is possible

Table 8

Agglomeration effects from different ownerships (the Survey data).

Samples All firms PEs

The source of agglomeration SOEs (1) PEs (2) FIEs (3) SOEs (4) PEs (5) FIEs (6)

(1) agg_YC −0.001 0.002*** 0.001** 0.000 0.027*** 0.006**

(−1.53) (4.16) (2.43) (0.14) (16.27) (2.48)

(2) agg_NC 0.012 −0.002 −0.004 −0.017 0.001 0.016**

(1.52) (−0.74) (−1.46) (−1.60) (0.25) (2.44)

(3) agg_YP 0.004*** 0.008*** 0.003*** 0.004*** 0.024*** 0.007*

(7.26) (4.94) (5.20) (6.56) (7.56) (1.92)

(4) agg_NP −0.017*** 0.016*** 0.002 −0.021*** 0.012*** −0.013*

(−8.61) (6.24) (0.69) (−8.66) (3.36) (−1.97)

(5) Uagg_YC −0.002 0.048*** 0.019*** −0.004*** 0.064*** 0.021***

(−1.50) (17.32) (10.24) (−3.04) (16.55) (2.95)

(6) Uagg_NC −0.002 0.020*** −0.052*** 0.008 0.034*** 0.015*

(−0.23) (6.51) (−13.88) (0.87) (8.28) (1.93)

Prob F 0.000 0.000 0.000 0.000 0.000 0.000

R-square 0.482 0.449 0.484 0.322 0.350 0.268

Observations 839,791 727,896 742,700 573,801 464,062 140,042

Notes:

1. All estimations include control variables as well as firm, time, industry and province fixed effects.


3. ***, ** and * correspond to significance at the 1%, 5% and 10% levels, respectively.

4. The data used for estimation are from the Annual Survey of Above-scale Industrial Enterprises.

5. SOEs refer to state-owned enterprises; PEs refer to private enterprises; FIEs refer to foreign invested enterprises.

6. agg_NC is the number of firms in the same industry and same county; agg_YC is the average size of firms in the same industry and same county; agg_NP is the number of firms in the same industry and same province but outside the county; agg_YP is the average size of firms in the same industry and same province but outside the county; Uagg_NC is the number of firms in the same county but in upstream industries; and Uagg_YC is the average size of firms in the same county but in upstream industries. All these variables are in natural log form.

Table 9

Agglomeration effects from different ownerships (the 2004 Census data).

Samples All firms PEs

The source of agglomeration SOEs (1) PEs (2) FIEs (3) SOEs (4) PEs (5) FIEs (6)

(1) agg_YC 0.013*** 0.059*** 0.018*** −0.002 0.196*** 0.007***

(10.90) (45.75) (27.75) (−1.10) (81.02) (9.31)

(2) agg_NC −0.087*** −0.012*** 0.011*** 0.038*** 0.022*** 0.012***

(−8.96) (−5.51) (3.30) (3.13) (9.04) (3.39)

(3) agg_YP 0.008*** 0.060*** 0.008*** 0.007*** 0.025*** 0.009***

(5.54) (13.19) (5.23) (4.34) (4.31) (5.54)

(4) agg_NP 0.008* −0.010*** 0.004 0.016*** 0.008* 0.018***

(1.84) (−2.58) (0.99) (3.32) (1.82) (4.44)

(5) Uagg_YC 0.030*** 0.139*** 0.042*** 0.012*** 0.095*** 0.019***

(18.64) (34.75) (20.39) (7.22) (21.28) (8.50)

(6) Uagg_NC −0.022*** 0.039*** −0.015*** −0.038*** 0.029*** −0.012***

(−3.22) (13.17) (−4.68) (−4.67) (8.84) (−3.32)

Prob F 0.000 0.000 0.000 0.000 0.000 0.000

R-square 0.233 0.249 0.228 0.205 0.255 0.210

Observations 212,324 205,464 202,179 144,875 139,607 139,617

Notes:

1. All estimations include industry and province fixed effects.


3. *** and * correspond to significance at the 1% and 10% levels, respectively.

4. The data used for estimation are from the 2004 Census.

5. SOEs refer to state-owned enterprises; PEs refer to private enterprises; FIEs refer to foreign invested enterprises.

6. agg_NC is the number of firms in the same industry and same county; agg_YC is the average size of firms in the same industry and same county; agg_NP is the number of firms in the same industry and same province but outside the county; agg_YP is the average size of firms in the same industry and same province but outside the county; Uagg_NC is the number of firms in the same county but in upstream industries; and Uagg_YC is the average size of firms in the same county but in upstream industries. All these variables are in natural log form.




that it intensifies competition in factor markets, namely for skilled labor. On the other hand, the average size of FIEs in upstream industries has a positive and significant impact, implying that the co-location of competitive FIEs in upstream industries is beneficial. However, the coefficient on Uagg_YC is even larger when it concerns PEs. Overall, we find that PEs are the main sources of agglomeration effects in China's industry. In contrast, neither the number nor the average size of SOEs in upstream industries seems to benefit the productivity of Chinese firms.

Columns 4 to 6 show how the TFP of PEs is affected by the agglomeration of different types of firms. The overall picture looks fairly similar to the case of all firms. The positive coefficients on agg_YC and agg_YP are larger when they concern PEs than when they concern SOEs or FIEs. This indicates that PEs learn significantly more from competitive PEs in proximity than from other types of firms. Furthermore, PEs enjoy a positive and significant impact from the number of PEs in the same industry outside the county (agg_NP) but not from that of SOEs or FIEs. Finally, as in the case of all firms, only the number of PEs in upstream industries contributes positively to the productivity of PEs, but with a larger magnitude (a 6.4% productivity increase when doubling the number of PEs). Overall, it can be said that PEs benefit most from the agglomeration of PEs. The agglomeration effects from PEs are self-reinforcing in that sense.

We also examine whether the indicators based on the Census data provide an alternative picture. The results, summarized in Table 9, reveal some interesting new findings. For instance, columns 1 to 3 indicate that the negative impact from agg_NC is driven mostly by the agglomeration of SOEs and PEs, whereas the number of FIEs in the same industry has a favorable impact. This suggests that the clustering of small domestic firms is generating substantial congestion and competition, whereas FIEs do not add strong competitive pressure in domestic product markets, possibly because they are more export-oriented. PEs are also the primary source of congestion effects across counties, for a negative and significant sign on agg_NP is only associated with their agglomeration. On the other hand, the coefficients on average firm size are largest in the case of an agglomeration of PEs, in line with Table 8. Similarly, the positive coefficient on Uagg_NC is still observed only for PEs.

Turning to columns 4 to 6, we find that the coefficients on agg_NC are positive and significant for the agglomeration of all types of firms. This implies that the negative net impact from the number of firms in the same industry within the same county, observed in column 2 of Tables 6 and 7, is borne by types of firms other than PEs. However, such results should be interpreted with care, for they may be driven partly by unobserved firm characteristics that increase both the productivity of PEs and their propensity to locate in agglomerated areas. Other notable findings in Table 8 are also preserved: PEs benefit most from the average size of PEs in the same industry and in upstream industries, and only the number of PEs in the upstream sector contributes to their productivity.

4.4. Robustness check

We assess the robustness of our findings based on the Survey data corresponding to column 2 of Table 6. We do so by reproducing the estimation using alternative estimates of firm-level productivity as well as a different industrial classification and regional unit. The results of this exercise are summarized in Table 10.

Table 10

Base estimates with alternative specications.

TFP_OLS (1) TFP_OP (2) Labor productivity (3) 4 Digit industries (4) 2 Digit industries (5) City level (6)

agg_YC 0.003*** 0.004*** 0.004*** 0.002*** 0.004*** 0.005***

(7.89) (7.97) (9.50) (5.93) (6.45) (7.26)

agg_NC −0.008*** −0.011*** −0.013*** −0.011*** −0.004 −0.003

(−3.45) (−4.41) (−5.47) (−4.39) (−1.56) (−1.10)

agg_YP 0.011*** 0.010*** 0.014*** 0.005*** 0.031*** 0.005***

(6.45) (5.36) (7.80) (4.83) (7.93) (6.33)

agg_NP 0.015*** 0.012*** 0.012*** 0.009*** 0.024*** 0.011***

(6.42) (4.88) (4.96) (4.51) (6.17) (4.42)

Uagg_YC 0.058*** 0.052*** 0.054*** 0.049*** 0.047*** 0.038***

(21.03) (18.43) (19.57) (17.18) (16.32) (7.47)

Uagg_NC 0.036*** 0.032*** 0.029*** 0.033*** 0.031*** 0.072***

(15.23) (13.33) (12.21) (13.75) (11.86) (16.44)

Constant −1.292*** 1.852*** 2.453*** 1.580*** 0.939*** 0.999***

(−5.27) (7.36) (10.03) (6.26) (3.39) (3.88)

Prob > F 0.000 0.000 0.000 0.000 0.000 0.000

R-square 0.062 0.440 0.043 0.471 0.332 0.461

Observations 849,771 849,285 847,308 849,771 849,771 849,776

Notes:

1. All estimations include control variables as well as firm, time, industry and province fixed effects.


3. *** corresponds to significance at the 1% level.

4. The data used for estimation are from the Annual Survey of Above-scale Industrial Enterprises.

5. agg_NC is the number of firms in the same industry and same county; agg_YC is the average size of firms in the same industry and same county; agg_NP is the number of firms in the same industry and same province but outside the county; agg_YP is the average size of firms in the same industry and same province but outside the county; Uagg_NC is the number of firms in the same county but in upstream industries; and Uagg_YC is the average size of firms in the same county but in upstream industries. All these variables are in natural log form.




First, we use the total factor productivity estimated by OLS (column 1). The novel finding that agglomeration, captured as the number of firms in the same industry, impacts productivity negatively when it is within the same county remains valid. The sizable positive impact from agglomeration in upstream industries, as well as the larger coefficients on agg_YP compared to agg_YC, are also preserved. We then use the TFP estimated by the method of Olley and Pakes (1996), obtaining similar results (column 2). Because the estimated TFP may pick up differences in prices and mark-up rates among firms, we also check whether our findings are robust to another measure of productivity, namely a firm's labor productivity measured as value added per employee. Again, we obtain very similar results (column 3).

We next assess the sensitivity of our results to industrial classification. The alternative indicators of agglomeration are constructed under a finer industrial classification (four-digit) as well as a coarser classification (two-digit). Column 4 displays the results of the estimation based on the four-digit industrial classification, which are very similar to the base results. In the estimation based on the two-digit industrial classification, however, the coefficient on agg_NC becomes insignificant while remaining negative (column 5). One interpretation of this change is that under a coarser classification, agg_NC may count firms that are not actually competing in the same industry, thereby diluting the negative impact from intensified competition. All other observations are preserved.

Finally, we construct the indicators of agglomeration at an alternative regional unit, namely at the city level. A city is a larger regional unit than a county, for it usually comprises several counties. The results presented in column 6 are somewhat similar to the case of indicators constructed on the basis of the two-digit industrial classification. The negative coefficient on agg_NC becomes insignificant while all other observations are preserved. One explanation for this change is that congestion is less severe under a larger geographical scope of industrial concentration. Put differently, agg_NC in this case comprises some of the firms previously counted in agg_NP, which has a significantly positive coefficient.
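The effect of changing the industrial classification or the regional unit can be illustrated with a small sketch. It is not the authors' code: the record fields (`industry_code`, `county`, `city`, `size`), the use of code prefixes to move between four-, three- and two-digit industries, and the choice to count every firm in the cell are all assumptions made for illustration. It computes, for each industry-region cell, the log number of firms and the log average firm size, in the spirit of agg_NC and agg_YC.

```python
import math
from collections import defaultdict


def agglomeration_indicators(firms, industry_digits=3, region_key="county"):
    """Return {(industry, region): (ln firm count, ln average size)}.

    Assumption: an industry at a coarser level is the prefix of the
    finer code, and every firm in the cell is counted.
    """
    cells = defaultdict(list)
    for firm in firms:
        industry = firm["industry_code"][:industry_digits]
        cells[(industry, firm[region_key])].append(firm["size"])
    return {
        cell: (math.log(len(sizes)), math.log(sum(sizes) / len(sizes)))
        for cell, sizes in cells.items()
    }


# Hypothetical firms: two counties (A1, A2) that belong to the same city A.
firms = [
    {"industry_code": "1711", "county": "A1", "city": "A", "size": 100.0},
    {"industry_code": "1711", "county": "A1", "city": "A", "size": 300.0},
    {"industry_code": "1712", "county": "A2", "city": "A", "size": 200.0},
    {"industry_code": "1721", "county": "A2", "city": "A", "size": 50.0},
]

# Base case: three-digit industries within counties.
county_3digit = agglomeration_indicators(firms, 3, "county")
# Coarser case: two-digit industries within cities.
city_2digit = agglomeration_indicators(firms, 2, "city")
```

In this toy data the three-digit, county-level indicators form three separate cells, while the two-digit, city-level indicator merges all four firms into one cell. This is the mechanism the text points to: a coarser cell counts firms that were previously outside it.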

5. Contribution to the industry-level productivity growth

How much did industrial agglomeration contribute to the productivity growth of China's industrial sector? We exploit the production function used to compute firm-level TFP described in Section 3.3 and the coefficients on the indicators of agglomeration in column 2 of Table 6 to compute the share of TFP growth during 2000–2007 that is explained by agglomeration. Note that this exercise is based on the Survey data, given the long time span needed. First, we obtain the industry-level TFP by plugging the employment and capital stock of each industry into the estimated production function, which has different input coefficients for each industry (see Section 3.3). We then aggregate the industry-level TFP using output shares as weights to obtain the aggregate TFP of the whole industrial sector. The average annual growth rate of this aggregate TFP, displayed in the first column of Table 11, is 8.23%. The same column displays the average annual TFP growth of the private sector obtained from the same exercise but using only the sample of PEs, which is 7.71%.

The aggregate TFP growth due to agglomeration effects is computed by plugging the industry-level mean of each indicator of agglomeration during 2000–2007 into Eq. (15), holding other variables constant. The estimated coefficients are from column 2 of Table 6. This yields the predicted TFP growth for each industrial sector j:

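\[
\begin{aligned}
\Delta TFP_{j,t} ={} & \hat{\beta}_{11}\,\Delta_t \overline{agg\_YC}_{j,t} + \hat{\beta}_{12}\,\Delta_t \overline{agg\_NC}_{j,t} + \hat{\beta}_{13}\,\Delta_t \overline{agg\_YP}_{j,t} + \hat{\beta}_{14}\,\Delta_t \overline{agg\_NP}_{j,t} \\
& + \hat{\beta}_{21}\,\Delta_t \overline{Uagg\_YC}_{j,t} + \hat{\beta}_{22}\,\Delta_t \overline{Uagg\_NC}_{j,t} + \hat{\beta}_{23}\,\Delta_t \overline{Uagg\_YP}_{j,t} + \hat{\beta}_{24}\,\Delta_t \overline{Uagg\_NP}_{j,t}
\end{aligned}
\tag{15}
\]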

where the upper bar on each indicator of agglomeration indicates the industry-level mean during 2000–2007 and the hat on each coefficient indicates that it is an estimate from the model. We once again aggregate the industry-level TFP growth using output shares as weights to obtain the aggregate TFP growth of the industrial sector predicted by agglomeration effects. This predicted growth, shown in the second column of Table 11, is 1.16%, which is equivalent to 14.09% of the average annual TFP growth of the total industrial sector. The contribution share of agglomeration is larger in the private sector, amounting to 15.43%. The contribution share of agglomeration is smaller than, but still comparable to, that of R&D investment computed following the same steps. Although not captured by the Survey data, the private sector of China's industry consists of numerous small firms that do not engage in R&D investment due to resource constraints. For those firms, agglomeration effects can be the primary driver of productivity growth.
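The steps described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the toy coefficients, indicator changes and output shares are hypothetical. Only the growth rates passed to `contribution_share` at the end are the figures reported in the text and in Table 11.

```python
def predicted_growth(coefficients, indicator_changes):
    """Eq. (15)-style prediction: sum of coefficient times indicator change."""
    return sum(coefficients[name] * indicator_changes[name] for name in coefficients)


def aggregate(values_by_industry, output_shares):
    """Output-share-weighted average of industry-level values."""
    total = sum(output_shares.values())
    return sum(
        values_by_industry[j] * output_shares[j] / total for j in values_by_industry
    )


def contribution_share(predicted, actual):
    """Predicted growth as a percentage of actual growth."""
    return 100.0 * predicted / actual


# Toy inputs (hypothetical numbers, chosen only to make the arithmetic visible).
toy_coefficients = {"agg_YC": 0.5, "Uagg_NC": 0.25}
toy_changes = {"agg_YC": 0.2, "Uagg_NC": 0.4}
toy_industry_growth = predicted_growth(toy_coefficients, toy_changes)
toy_aggregate = aggregate({"A": 0.2, "B": 0.1}, {"A": 3.0, "B": 1.0})

# Reported figures: 1.16% of 8.23% (all firms), 1.19% of 7.71% (PEs).
share_all = round(contribution_share(1.16, 8.23), 2)
share_pe = round(contribution_share(1.19, 7.71), 2)
# R&D counterparts: 1.88% of 8.23% and 1.53% of 7.71%.
share_rd_all = round(contribution_share(1.88, 8.23), 2)
share_rd_pe = round(contribution_share(1.53, 7.71), 2)
```

Dividing the predicted growth by the average annual TFP growth reproduces the contribution shares reported in the text: 14.09% and 15.43% for agglomeration, and 22.84% and 19.84% for R&D investment.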

Table 11

Contribution of agglomeration to industry-level TFP growth during 2000–2007.

Columns: (1) Average annual TFP growth rate of industrial sector (percentage); (2) Agglomeration effect, contribution to average growth rate (percentage); (3) Agglomeration effect, contribution share (percentage); (4) R&D, contribution to average growth rate (percentage); (5) R&D, contribution share (percentage).

(1) (2) (3) (4) (5)

All firms 8.23 1.16*** 14.09 1.88*** 22.84

(331.81) (25.00)

Private Enterprises 7.71 1.19*** 15.43 1.53*** 19.84

(230.73) (10.53)

Notes:

1. t-statistics are in parentheses.

2. *** indicates significance at the 1% level.

3. The computation uses the coefficients on the indicators of agglomeration in column 2 of Table 6 and data from the Annual Survey of Above-scale Industrial Enterprises.




6. Conclusions

This paper cond
