


Efficient In-network Computing with Noisy Wireless Channels

Chengzhi Li, Student Member, IEEE, and Huaiyu Dai, Senior Member, IEEE

Abstract—In this paper we study distributed function computation in a noisy multi-hop wireless network. We adopt the adversarial noise model, for which independent binary symmetric channels are assumed for any point-to-point transmissions, with (not necessarily identical) crossover probabilities bounded above by some constant ϵ. Each node takes an m-bit integer per instance and the computation is activated after each node collects N readings. The goal is to compute a global function with a certain fault tolerance in this distributed setting; we mainly deal with divisible functions, which essentially cover the main body of interest for wireless applications. We focus on protocol designs that are efficient in terms of communication complexity. We first devise a general protocol for evaluating any divisible functions, addressing both one-shot (N = O(1)) and block computation, and both constant and large m scenarios. We also analyze the bottleneck of this general protocol in different scenarios, which provides insights into designing more efficient protocols for specific functions. In particular, we endeavor to improve the design for two exemplary cases: the identity function, and size-restricted type-threshold functions, both focusing on the constant m and N scenario. We explicitly consider clustering, rather than hypothetical tessellation, in our protocol design.

Index Terms—Distributed Computing, Noisy Multi-hop Network, Clustering


1 INTRODUCTION

1.1 Motivation

Networked systems of intelligent devices are playing an increasingly important role in our life. In particular, they will facilitate monitoring and control of the nation's critical infrastructures; seamless surveillance, intelligent transportation, and secure Internet are a few such examples.

Designing efficient protocols to facilitate information processing among distributed nodes is crucial to the success of these networked systems. While there has been extensive research in traditional distributed computing [1], [2], [5], the influence of channel noise is largely ignored. In previous studies targeting VLSI or wireline networks, noise-free communications can be fairly assumed; instead some consideration was given to fault tolerance against crashed or Byzantine nodes. However, as we deal with networked information processing in wireless networks, consideration of noisy channels becomes necessary. A protocol originally designed for noiseless channels usually fails under noisy communication due to message errors and discrepancies in individual interpretations of the communication history. Furthermore, the problems of interest in wireless applications are usually different; here we are interested in some real (possibly vector-valued) functions of the data at all nodes, typically with physical meanings. A good example is the summary or statistics of collected data in wireless sensor networks.

C. Li was with NC State University, Raleigh, NC. He is currently with Broadcom Corporation, Matawan, NJ 07747. Email: [email protected]. H. Dai is with the Department of Electrical and Computer Engineering, NC State University, Raleigh, NC 27695. Email: [email protected].

Devising communication protocols with noisy channels imposes additional challenges. In some sense, it is the counterpart of Shannon's channel coding in the much more challenging network setting. In general, an increase in complexity is inevitable even if a constant error tolerance is allowed, and any protocol working in the noisy environment should make this penalty as small as possible. Meanwhile, it is also required that the protocols be oblivious, i.e., the transmission schedule of nodes be pre-determined, independent of initial inputs and the communication history; this avoids transmission contention and out-of-order execution.

Time and message complexity are two key measures for the efficiency of distributed computing protocols. In this work, we concentrate on the latter, as communication cost is typically a dominant factor in the energy consumption and determines the lifetime of wireless networks. Also, we focus on the bit complexity, representing fundamental limits in the theoretical approaches. This naturally draws the connection with the theory of communication complexity [3]–[5]. Like computational complexity, communication complexity is an inherent property of a problem; it measures the hardness of a problem in terms of the communication (rather than the execution time) required by the most efficient solution.

For a noiseless-channel protocol with communication complexity n, one could repeat the transmission of each bit O(log(n/Q)) times and take the majority of the received results; this approach, termed standard amplification in the literature, leads to a noisy-channel protocol of complexity O(n log(n/Q)) with error tolerance Q. Obtaining noisy-channel protocols with a smaller increase in complexity, especially those of complexity O(n), is highly non-trivial, and in some cases impossible. In this paper, we consider


efficient protocol designs for computing divisible functions in a noisy multi-hop network, which constitutes the main body of interest for applications in wireless sensor networks. Our contributions are summarized as follows:

• We devise a general protocol for evaluating any divisible function in a noisy multi-hop wireless network. The complexity of the general protocol is analyzed with respect to various parameters such as the number of nodes in the network n, the length of data blocks at each node N, the size of each data point in bits m, and the cardinality of the function range rn. For some specific functions such as histogram and parity, our protocol achieves the best results available in the literature. The analysis of the bottleneck of this general protocol in different scenarios motivates further studies on more efficient protocols.

• We endeavor to improve the design for two special cases: the identity function, and a special class of restricted type-threshold functions introduced in Section 2. We reveal a tight bound on the communication complexity of the identity function and propose a more efficient protocol for the special class of type-threshold functions. We also believe that the methodologies developed in these two cases may find wider applicability.

• We incorporate clustering techniques into our protocol design to improve protocol efficiency, which is more practical and flexible than network tessellation, a theoretical approach widely adopted in the literature.

1.2 Related Works

Communication complexity of distributed computing with noisy channels was first considered by El Gamal in [7] in a broadcast network, where each of the n nodes, holding one binary input, can broadcast (and listen) to all others through independent binary symmetric channels. Gallager showed that complexity O(n log log n) is achievable for computing any function in a noisy broadcast network [8], which was further shown optimal for the identity function in [14]. In [15], Yao posed the question whether there exist nontrivial Boolean functions that can be computed with O(n) broadcasts; this was answered in the affirmative for threshold functions in [16] under the independently and identically distributed (i.i.d.) random noise model, and in [9] for the OR function under the more realistic (and more general) adversarial model (where a "benign" adversary is allowed to arbitrarily reduce the error probability of each link at the beginning, or even dynamically cancel any errors on the fly, so as to prevent protocols from exploiting the stochastic regularities of the previous i.i.d. model). While most work in this area focuses on computing Boolean functions of binary variables, our recent work [18] extends the study to finding the K largest integer values among nodes in a

TABLE 1: Communication complexity in noisy broadcast networks (mN = O(1))

Function            Upper bound          Lower bound
OR                  O(n) [9]             Ω(n)
parity              O(n log log n) [8]   Ω(n)
identity            O(n log log n) [8]   Ω(n log log n) [14]
threshold function  O(n) [16]            Ω(n)

noisy broadcast network. The results for noisy broadcast networks are summarized in Table 1.

Due to concerns on energy consumption and scalability, transmissions are typically carried out in a multi-hop fashion in wireless networks. [6], [24] and [10] focused on noiseless sensor networks, i.e., the communication channels are assumed reliable. In [6] the authors exploited block computation to study the communication complexity of evaluating symmetric functions in random networks with unbounded node degrees, while [24] investigated the computation problem in networks with finite degrees. In [10] the authors explored the minimal time cost and power consumption for the evaluation of the max function.

Distributed computing with a reliability constraint in noisy multi-hop networks is arguably more challenging than in noiseless multi-hop networks and noisy broadcast networks. In particular, it takes more effort to combat the adverse effect of noise, as much more severe error propagation is expected in multi-hop transmission. So far there are only a few works in this area. [11] indicated that a symmetric function can be evaluated with complexity O(n log log n), which can be further improved to O(n) if block coding is adopted. Note that we address divisible functions in this work, including the histogram and the identity function. Therefore our protocol can compute any symmetric function (through evaluation of the histogram), and actually any function (through evaluation of the identity function). An algorithm for the max function is proposed in [13], taking advantage of the "witness discovery" protocol in [9] and the coding strategy in [12], and shown order-optimal in both the number of transmissions and the computation time.

Our scheme explores hierarchical cooperation to improve the efficiency of distributed computing in wireless networks. Similar ideas were also explored in [30], [31] for the study of network capacity, where a network is divided into different layers and distributed MIMO is exploited for performance improvement. Besides the apparent difference in problem settings, our work emphasizes using as few bits as possible to complete computational tasks, while in [30], [31] the main goal is to increase the transmission rate. Nonetheless, advanced signal processing such as MIMO techniques may help design more efficient protocols for distributed computing and deserves further study.

The remainder of this paper is organized as follows. The system model is given in Section 2, together with some preliminaries and the summary of our main results. The outline of our protocol design and some properties of clustering are presented in Section 3. A general protocol is proposed for divisible functions in Section 4, with performance analysis and further discussion. Based on the insight obtained from Section 4, in Sections 5 and 6 more efficient protocols are proposed for the identity function and some restricted type-threshold functions, respectively. The conclusion and future directions are provided in Section 7.

2 PROBLEM FORMULATION AND MAIN RESULTS

In this section, we first discuss the models and assumptions considered in this work, then introduce some existing results to serve our analysis.^1

2.1 System Model

We consider a synchronized dense network model^2, where n nodes are uniformly and independently distributed in a unit square. In our following analysis we model the network as a geometric random graph [19] G(n, tn): the n nodes are vertices, and there is an undirected link between nodes i and j, i, j ∈ {1, . . . , n} ≜ [n], if ||Pi − Pj|| ≤ tn, where Pi is the position of node i, || · || is the Euclidean norm, and tn is the transmission range, identical for all nodes. Each node i holds an m-bit integer xi(t) at time t, taking values from some finite set χ ≜ [|χ|] (|χ| ≤ 2^m). Each m-bit integer could be a measurement in the field, e.g., temperature, or some metric assigned beforehand. The computation is performed after each node collects a block of N readings. The goal is to calculate a divisible function f(x(t)) ≜ f(x1(t), x2(t), ..., xn(t)) correctly with a certain fidelity in this distributed setting. Divisible functions are defined as [6]:

Divisible Function: A function f is divisible if

• rn = |R(f, n)| is nondecreasing in n, where R(f, n) is the function range;
• given any partition Π(S) = {S1, S2, ..., Sp} of S ⊂ [n], there exists a function gΠ(S) such that for any x ∈ χ^n

    f(xS) = gΠ(S)(f(xS1), f(xS2), ..., f(xSp)),    (1)

where xS = {xi : i ∈ S}.
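For instance (a minimal Python sketch with names of our own choosing; the paper itself gives no code), the histogram is divisible: for any partition, the combining function gΠ(S) is simply an elementwise sum of the partial histograms.

```python
def histogram(xs, chi):
    """tau(x): number of occurrences of each value in {1, ..., chi}."""
    tau = [0] * chi
    for x in xs:
        tau[x - 1] += 1
    return tau

def g_sum(parts):
    """Combining function g_Pi for the histogram: elementwise sum."""
    return [sum(col) for col in zip(*parts)]

chi = 4
x = [1, 3, 3, 2, 4, 1, 2, 2]           # readings held by nodes 1..8
S1, S2 = x[:3], x[3:]                  # a partition of the index set
whole = histogram(x, chi)
combined = g_sum([histogram(S1, chi), histogram(S2, chi)])
# whole == combined, illustrating f(xS) = g(f(xS1), f(xS2))
```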

1. We use the following order notations throughout the paper. Given non-negative functions f(n) and g(n):

• f(n) = Ω(g(n)) if there exist a positive constant c1 and an integer k1 such that f(n) ≥ c1 g(n) for all n ≥ k1.
• f(n) = O(g(n)) if there exist a positive constant c2 and an integer k2 such that f(n) ≤ c2 g(n) for all n ≥ k2.
• f(n) = Θ(g(n)) if f(n) = Ω(g(n)) and f(n) = O(g(n)).
• f(n) = o(g(n)) if for every constant c3 > 0 there exists an integer k3 such that f(n) ≤ c3 g(n) for all n ≥ k3.
• f(n) = ω(g(n)) if for every constant c4 > 0 there exists an integer k4 such that f(n) ≥ c4 g(n) for all n ≥ k4.

2. Our results hold for the extended network model [32] as well, where the network space increases with the network size while the network density is fixed.

Divisible functions essentially cover the main body of interest for distributed computing in wireless networks; the identity function, histogram^3, parity, mean, and max/min are a few examples. Since divisible functions include the histogram and the identity function, a general protocol evaluating divisible functions can also compute all symmetric functions, i.e., functions invariant to permutations of their arguments and depending on the input only through its histogram, and actually all functions (through the identity function).

Without loss of generality, we assume that the result is made known to a special node, named the sink node. This is common for applications in sensor networks; when necessary, the result at the sink node can be distributed to the whole network, typically at a similar complexity as what is shown below.

We assume identical transmission power P for each node. Each point-to-point transmission is disrupted by interference from concurrent transmissions and thermal noise. To constrain the interference, we adopt the Protocol Model widely used in the literature [19]. Namely, node R(i) receives the (noisy) transmission from node i if

• the distance between the transmitter and the receiver is no more than tn, i.e., ||Pi − PR(i)|| ≤ tn;

• for every node k, k ≠ i, transmitting at the same time, ||Pk − PR(i)|| ≥ (1 + ∆)tn, where ∆ is the protocol-specified guard factor to limit interference.

To deal with noise we adopt the adversarial noise model [9], for which independent (but not necessarily identical) binary symmetric channels are assumed for any transmissions, with crossover probabilities bounded above by some constant ϵ. More precisely, for any bit vector I ∈ {0, 1}^k the received noisy copy is I′ = I ⊕ n, where n ∈ {0, 1}^k is an independent noise bit vector with Pr(ni = 1) = pi ≤ ϵ. In other words, for any transmission the received bit is flipped with some probability no larger than ϵ. In this paper, it is assumed that ϵ < 1/4 for technical convenience.
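This channel model, together with the standard amplification scheme recalled in Section 1 (repeat each bit and take a majority vote), can be simulated with a short sketch. Function names and parameter values here are our own illustrative choices, not from the paper.

```python
import math
import random

rng = random.Random(0)

def bsc(bit, p):
    """Binary symmetric channel: flip the bit with probability p <= eps."""
    return bit ^ (rng.random() < p)

def send_amplified(bit, eps, reps):
    """Transmit one bit `reps` times; the receiver takes a majority vote."""
    return int(sum(bsc(bit, eps) for _ in range(reps)) > reps / 2)

def amplified_protocol(bits, eps, Q):
    """Standard amplification: repeating each of the n bits O(log(n/Q))
    times drives the per-bit error low enough that, by a union bound,
    all n bits are decoded correctly with probability at least 1 - Q."""
    n = len(bits)
    reps = 4 * math.ceil(math.log(n / Q)) + 1   # odd, O(log(n/Q))
    return [send_amplified(b, eps, reps) for b in bits]

msg = [rng.randint(0, 1) for _ in range(200)]
out = amplified_protocol(msg, eps=0.1, Q=0.01)  # adversary may use any p_i <= 0.1
```

The total number of transmissions is n·reps = O(n log(n/Q)), matching the complexity stated above.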

2.2 Preliminaries

In this subsection we introduce a few known results, beginning with a generalization of the Chernoff bound.

2.2.1 Hoeffding inequality [25]

Lemma 1: Let Xi (1 ≤ i ≤ n) be n i.i.d. random variables over the interval [a, b]. For the sum of these variables S = ∑_{i=1}^n Xi, we have the inequality

    Pr(S − E(S) ≥ nδ) ≤ e^{−2nδ²/(b−a)²},  δ > 0.
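A quick Monte Carlo sanity check of Lemma 1, a sketch with arbitrarily chosen parameters (Bernoulli variables, which lie in [0, 1]):

```python
import math
import random

rng = random.Random(1)
n, p, delta, trials = 100, 0.3, 0.1, 20000
a, b = 0.0, 1.0

exceed = 0
for _ in range(trials):
    S = sum(rng.random() < p for _ in range(n))   # sum of n i.i.d. Bernoulli(p)
    if S - n * p >= n * delta:                    # event {S - E(S) >= n*delta}
        exceed += 1

empirical = exceed / trials
bound = math.exp(-2 * n * delta**2 / (b - a)**2)  # Hoeffding: e^{-2n delta^2/(b-a)^2}
# the empirical tail probability should not exceed the bound
```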

3. The histogram of a vector x ∈ χ^n is defined as τ(x) = [τ1(x), τ2(x), ..., τ|χ|(x)], where τi(x) := |{j : xj = i}|, the number of occurrences of i in x.


2.2.2 Constant-rate, Constant-fraction-minimum-distance Codes [26], [14]

Lemma 2: For γ ∈ (0, 1/2) there is an integer C1 = C1(γ) such that for each positive integer n and every C ≥ C1, there is a binary code Γn of size 2^n and length Cn such that for all v, w ∈ Γn with v ≠ w, the Hamming distance d(v, w) ≥ γCn.

Based on the above facts, the following result can be derived, which will be used extensively in our protocol design and analysis.

Corollary 1: With O(m) transmissions over a binary symmetric channel (with error probability bounded above by a constant ϵ), an m-bit integer can be correctly received with probability at least 1 − e^{−δm}, for any δ > 0. The proof is given in Appendix B.

Remark 1: Any positive value can be chosen for δ without affecting the scaling law. Also, m does not need to be large.
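The idea behind Corollary 1 can be illustrated with a sketch: here a random codebook stands in for the explicit codes of Lemma 2 (with high probability a random code also has minimum distance a constant fraction of its length), and the receiver performs nearest-codeword decoding. All names and parameter values are ours.

```python
import random

rng = random.Random(2)
m, C, eps = 4, 16, 0.05          # m-bit values; code length Cm = 64 = O(m)

# A random codebook with 2^m codewords of length Cm stands in for the
# constant-rate, constant-fraction-minimum-distance codes of Lemma 2.
codebook = [[rng.randint(0, 1) for _ in range(C * m)] for _ in range(2 ** m)]

def transmit(value):
    """Send the codeword of `value` over a BSC with crossover prob. eps."""
    return [b ^ (rng.random() < eps) for b in codebook[value]]

def hamming(v, w):
    return sum(a != b for a, b in zip(v, w))

def decode(word):
    """Minimum-Hamming-distance (nearest codeword) decoding."""
    return min(range(2 ** m), key=lambda i: hamming(codebook[i], word))

value = 11
decoded = decode(transmit(value))   # correct w.h.p., using only O(m) channel uses
```

Because the codewords are a constant fraction of Cm apart while the noise flips only ≈ εCm positions, the decoding error decays exponentially in m, as the corollary states.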

2.2.3 Tail Bounds for Sums of Noise Bits [14]

Lemma 3: Let n = (ni) be a vector of k independent random variables, each taking value 1 with probability pi ≤ ϵ. For any nonzero vector α ∈ R^k and any t > 0 we have

    Pr[ ∑_i αi(ni − ϵ) > t ] ≤ e^{−2t²/|α|²}.

This lemma is a straightforward generalization of Lemma 12 in [14], where i.i.d. noise components are assumed.

2.3 Overview of Main Results

In this paper, we present three protocols: a general protocol for computing all divisible functions, and two specific efficient protocols, one for the identity function and one for a special class of type-threshold functions. The metric of interest is the communication complexity of the protocol, i.e., the total number of bits exchanged in the network to complete the task. We study the scaling law of this metric with respect to various parameters including n, N, and m. Note that this metric is closely related to energy consumption; in particular, it coincides with the energy usage in [11] when multiplied by a factor tn^α, where α > 2 is the path-loss exponent.

We address both one-shot (N = O(1)) and block computation, and both constant and large m scenarios^4. Our results are summarized below.

Theorem 1: The sink node can compute any divisible function correctly with probability at least 1 − o(1), at the cost of

    O( (n/N)·max(Nm, log log n) + (n/(N log n))·max(N log rn, log n) )    (2)

4. Large m has relevance, e.g., when node identities are used as inputs, a common practice in the study of distributed computing.

transmitted bits per instance.

Remark 2: For some functions the general protocol actually achieves the best results known to us in the literature, while for others improvement can still be obtained in certain scenarios. Besides being interesting in its own right, this general protocol also provides insights into the design of more efficient protocols for specific functions.

As two exemplary cases, we further propose more efficient protocols for the identity function and a special class of restricted type-threshold functions, both focused on the one-shot case with constant-size data.

For the identity function, we have

Theorem 2: For constant m and N, the communication complexity of computing the identity function, f(x) = x, by the sink node correctly with probability at least 1 − o(1) is Θ(n√(n/log n)).

Remark 3: The identity function corresponds to the sink node collecting all the raw data held by the nodes, which enables the sink node to calculate any function. Therefore, the complexity of any function evaluated in a noisy multi-hop network is O(n√(n/log n)).

Our final interest is in type-threshold functions, defined as [6]:

Type-Threshold Function: A function f(·) is said to be type-threshold if there exists a nonnegative |χ|-dimensional vector θ, called the threshold vector, such that f(x) = f′(τ(x)) = f′(min(τ(x), θ)) for all x ∈ χ^n, where τ(x) is the histogram of x and min is understood elementwise.

Remark 4: Type-threshold functions are a subset of symmetric functions^5, whose values are solely determined by the histograms of the input data. For example, the max function is a type-threshold function with threshold vector θ = [1, 1, ..., 1]; we can find the maximum among the inputs by searching for the first non-zero position from the right in the vector min(τ(x), θ).
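This recipe for the max function can be sketched in a few lines (names are ours; χ = 8 is an arbitrary choice):

```python
def tau(xs, chi):
    """Histogram: occurrence counts of the values 1..chi."""
    counts = [0] * chi
    for x in xs:
        counts[x - 1] += 1
    return counts

def max_via_threshold(xs, chi):
    """max is type-threshold with theta = [1, ..., 1]: clip tau(x) at 1,
    then report the first non-zero position scanning from the right."""
    clipped = [min(t, 1) for t in tau(xs, chi)]      # min(tau(x), theta)
    return max(i + 1 for i, c in enumerate(clipped) if c)

xs = [2, 5, 3, 5, 1]
result = max_via_threshold(xs, chi=8)   # == max(xs) == 5
```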

This definition of a type-threshold function implies that its value can be determined by a subset xS of the total inputs, i.e., f(x) = f(xS). For example, for the max function, S could be any subset that contains the index i such that xi is the maximum of the input. Of course, this subset is typically input-dependent and not known a priori. However, it is possible to design efficient protocols that lead to the discovery of this subset. To this purpose, we further impose a restriction on the size of S, and define size-restricted type-threshold functions as follows:

Size-Restricted Type-Threshold Function: A type-threshold function f(·) is said to be K-restricted if there exists a subset S with its size bounded above by a known value K, such that f(x) = f(xS).

Remark 5: Note that only the size of the subset xS needs to be known, not its content. For example, |S| = 1 for the max function and the indicator function, and |S| = K for the function that computes the K largest values (which need not be distinct). However, the function that computes the Kth largest value (required to be strictly smaller than the (K − 1)th one) is not size-restricted, even though it is also a type-threshold function; the reason is that the size of its defining subset S cannot be pre-determined but depends on the input.

5. Although type-threshold functions are not necessarily divisible, our general protocol still serves as a benchmark for them through evaluation of the histogram.

TABLE 2: Communication complexity without block coding (mN = O(1))

Function                                 Upper bound       Lower bound
histogram (Sec. 4.3)                     O(n log log n)    Ω(n)
parity (Sec. 4.3)                        O(n log log n)    Ω(n)
identity (Sec. 5)                        O(n√(n/log n))    Ω(n√(n/log n)) [10]
max (Sec. 6)                             O(n)              Ω(n)
K (= o(log n)) largest numbers (Sec. 6)  O(n log log K)    Ω(n)

TABLE 3: Communication complexity with block coding (mN = Ω(log log n)) for all divisible functions

Function                  Upper bound          Lower bound
rn = O(n^m) (Sec. 4.3)    O(nm)                Ω(nm)
rn = Ω(n^m) (Sec. 4.3)    O(n log rn / log n)  Ω(nm)

For size-restricted type-threshold functions, we have

Theorem 3: A size-restricted type-threshold function with K = o(log n) can be computed by the sink node correctly with probability at least 1 − δ for an arbitrarily small constant δ, at the cost of O(n log log K) transmitted bits for constant m and N.

Remark 6: This is the best result known to us in the literature for this type of function. For K = O(1), size-restricted type-threshold functions admit linear complexity, which is tight. For K = ω(1), the nonlinear factor log log K grows very slowly compared with n.

We summarize our results (achievable communication complexity) for some interesting functions in Tables 2 and 3, together with the best lower bounds known in the literature. The linear lower bounds are trivial, so no references are given.

3 OUTLINE OF PROTOCOL DESIGN

As a common approach in the study of wireless networks, we take a layered approach in our protocol design for in-network computing. Each individual protocol is composed of an intra-cluster protocol and an inter-cluster protocol: the former is employed for local computation and the latter for data aggregation. The intuition behind this layered design is as follows. Generally speaking, functions can be evaluated more efficiently in broadcast networks than in multi-hop networks, since multi-hop transmission may incur additional errors. It is therefore beneficial to partition the whole multi-hop network into as few local broadcast networks as possible. Functions are first evaluated in the local broadcast networks, and then the local (partial) results are aggregated and sent to the sink node. One application of this idea is tessellating the network into regular cells [19], which is widely used in the literature [10], [11]. However, it is highly non-trivial, if possible at all, to implement tessellation in practice. As an alternative, clustering techniques are more flexible and practical, even though each cluster is not necessarily a broadcast network^6. Our protocols work on clustered networks without degrading the performance in terms of scaling law, compared with cell-based ones.

In our analysis we assume that clusters are formed so that

(1) each node uniquely belongs to one cluster;
(2) in each cluster i there is a cluster head hi, which can communicate with any other node in this cluster;
(3) ||Phi − Phj|| ≥ tn, ∀i, j.

One clustering approach satisfying the above assumptions can be found in [28] (cf. Appendix A). Clearly, the set of cluster heads is a dominating set^7. Finding a minimum dominating set is an NP-hard problem [27], which, however, is not required in our study. Two clusters c1 and c2 are neighbors if there exist nodes x and y in c1 and c2, respectively, with ||Px − Py|| ≤ tn; such x, y can serve as relay nodes. If multiple pairs of such nodes are available, one pair is randomly chosen. Two clusters c1 and c2 are potential interfering neighbors if there exist nodes x and y in c1 and c2, respectively, with ||Px − Py|| ≤ (2 + ∆)tn. A typical clustered network is shown in Fig. 1. Denote by Nc, ni, Bi and Di the number of clusters in the network, the number of nodes in cluster i, the number of neighbors of cluster i, and the number of potential interfering neighbors of cluster i, respectively.
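One simple greedy scheme consistent with assumptions (1)–(3) — a sketch of ours, not necessarily the construction of [28] — repeatedly picks a still-uncovered node as a head, so heads automatically end up pairwise more than tn apart:

```python
import math
import random

def greedy_clusters(points, tn):
    """Greedy clustering: repeatedly pick an uncovered node as a cluster
    head and assign every still-uncovered node within range tn to it.
    Result: each node belongs to exactly one cluster, every member is
    one hop from its head, and heads are pairwise more than tn apart."""
    unassigned = set(range(len(points)))
    clusters = []
    while unassigned:
        h = min(unassigned)                      # any uncovered node will do
        members = {i for i in unassigned
                   if math.dist(points[i], points[h]) <= tn}
        clusters.append((h, members))
        unassigned -= members
    return clusters

rng = random.Random(3)
n = 2000
tn = 2 * math.sqrt(math.log(n) / n)              # Theta(sqrt(log n / n))
pts = [(rng.random(), rng.random()) for _ in range(n)]
clusters = greedy_clusters(pts, tn)
```

Any node within tn of an earlier head would already have been assigned, so a later head must be farther than tn from every existing head, giving assumption (3); the head set is therefore a dominating set of G(n, tn).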

Lemma 4: Given the transmission range t_n = Θ(√(log n / n)), which guarantees full connectivity of the network, and a clustering that satisfies the assumptions indicated above, we have, with high probability^8 (w.h.p.):

a) N_c = Θ(n / log n);
b) n_i = Θ(log n);
c) 1 ≤ B_i ≤ D_i < K, where K is a constant.

Proof: Since the cluster head is one hop away from the other nodes within the same cluster, the area of each cluster is upper bounded by that of the circle centered at the cluster head with radius t_n. Thus N_c ≥ 1/(πt_n²) = Ω(n / log n), and n_i ≤ 2nπt_n² = O(log n) w.h.p., according to the central limit theorem. Since for any two cluster heads ||P_{h_i} − P_{h_j}|| ≥ t_n, there is no overlap among the circles centered at the cluster heads with radius t_n/2, which indicates that N_c ≤ 1/(π(t_n/2)²) = O(n / log n); a) is proved. It is known that the degree of a node in a geometric random

6. In a cluster satisfying our assumptions below, one node may reach another one through two hops.

7. A dominating set for a graph G = (V, E) is a subset D of V such that every vertex not in D is joined to at least one member of D by some edge.

8. The probability approaches 1 when n goes to infinity.


Fig. 1: a clustered multi-hop network

graph scales as Θ(log n) w.h.p. [22]; thus n_i ≥ Θ(log n), and b) is proved.

B_i ≥ 1 due to the fact that the transmission range t_n guarantees full connectivity of the network. Since two neighboring clusters must also be potential interfering neighbors, while the converse is not true, B_i ≤ D_i. The distance between the cluster heads of two potential interfering neighbors is at most (1 + (2 + ∆) + 1)t_n = (4 + ∆)t_n. Thus, all the potential interfering neighbors of cluster i are located within the circle centered at h_i with radius (5 + ∆)t_n, and D_i < π(5 + ∆)²t_n² / (π(t_n/2)²) = 4(5 + ∆)² ≜ K, due to the fact that there is no overlap among the circles centered at the cluster heads with radius t_n/2. Hence the proof is completed.

Note that although these properties coincide with those when tessellation is applied, clustered networks lack the regular structure of tessellated networks, which complicates the protocol analysis. Clustering approaches are out of the scope of this paper; in the following discussion we assume that clusters satisfying our assumptions and the properties in Lemma 4 are already formed.

As we mentioned at the beginning of the section, in clustered networks the protocol design is decomposed into intra-cluster and inter-cluster ones. Some existing protocols specific to noisy broadcast networks may facilitate the intra-cluster protocol design; however, the difficulty remains to bound the total error probability, since there are an unbounded number of clusters in a network. The inter-cluster protocol aims at aggregating the local results calculated within clusters and forwarding them to the sink node through a sink tree T (Fig. 2), which is a spanning tree of the graph G′, a subgraph of G composed of all the cluster heads and some relay nodes. Caution should be taken to prevent error propagation in the inter-cluster protocol design. In the following, we assume T is generated and known to all the cluster heads and relay nodes before the computation starts.

Fig. 2: a sink tree rooted at the sink node. The directed line shows the information flow from the leaves to the sink node.

The protocols are executed according to a proper scheduling scheme described as follows. We color the clusters in a way that each cluster is colored differently from its interfering neighbors. The clusters with the same color are scheduled (active) simultaneously, and nodes in the active clusters can transmit to any nodes within their transmission range. Note that 1) this scheduling scheme only constrains the interference from concurrent transmissions, and the influence of the noise should still be taken into consideration; 2) there may exist a better scheduling scheme to improve the time complexity, which is not our concern in this paper.
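A minimal sketch of such a schedule, assuming the interference relation is given as cluster-id pairs. Greedy coloring in a fixed order is our illustrative choice; since each cluster has fewer than K potential interfering neighbors by Lemma 4, the greedy rule uses at most K colors.

```python
def color_clusters(cluster_ids, interferers):
    """Assign each cluster a color different from all of its potential
    interfering neighbors; clusters of the same color are activated in
    the same slot. interferers is a set of (smaller_id, larger_id) pairs."""
    adj = {c: set() for c in cluster_ids}
    for a, b in interferers:
        adj[a].add(b)
        adj[b].add(a)
    color = {}
    for c in cluster_ids:  # a fixed order keeps the schedule oblivious
        used = {color[d] for d in adj[c] if d in color}
        color[c] = min(i for i in range(len(adj[c]) + 1) if i not in used)
    return color
```

Greedy coloring needs at most (max degree + 1) colors, so the number of slots per round is bounded by the constant K and does not affect the scaling law.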

4 A GENERAL PROTOCOL FOR DIVISIBLE FUNCTIONS

In this section we introduce a general protocol for computing any divisible function. On the practical side, a general protocol can deal with various functions and thus simplifies implementation. On the theoretical side, a general protocol serves as a uniform platform to investigate the distributed computing problem, and its bottleneck analysis provides significant insights into potential performance improvement.

4.1 Protocol

The proposed protocol is composed of an intra-cluster protocol and an inter-cluster protocol, corresponding to two sequential stages: local processing and data aggregation, respectively. The intra-cluster protocol evaluates the function over the data within the cluster through the coordination of the cluster head h_i, which takes (much) more responsibility besides its normal role as other nodes in the protocols. In the inter-cluster protocol, the local (partial) results are aggregated and routed through cluster heads and relay nodes to the sink node, with possible further processing along the way.

The intra-cluster protocol is executed in each cluster when it is active, and nodes in the active cluster take


turns (in a fixed order) to transmit their information, which conforms to the requirement that protocols dealing with noisy communications be oblivious. For the ith cluster, assuming that its members are ordered from 1 to n_i, the intra-cluster protocol is executed in two phases:

Intra-cluster protocol:

(i) Each node in the ith cluster encodes its data, composed of N observations, into a codeword of size O(max(Nm, log n_i)) and transmits it to the cluster head h_i. Then h_i decodes the noisy copy of the codeword, re-encodes the Nm-bit raw data into a codeword of size O(max(Nm, log n_i)) and broadcasts it. This phase ends when the cluster head has broadcast all the data in the cluster once.

(ii) After decoding the codewords from the cluster head, each node computes the function and encodes the output (N log r_{n_i} bits) into a codeword of size s = O(max(N log r_{n_i}, n_i)). The codeword is padded with 0s to a length that is an integral multiple of n_i, and then partitioned equally into n_i blocks. Then each node j in the ith cluster takes its turn to broadcast the jth block of its own codeword^9. The cluster head h_i collects all the blocks from its cluster and decodes them.
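The padding-and-partitioning step in phase (ii) can be sketched as follows (a noiseless illustration; the bit-list representation and function name are ours):

```python
def split_codeword(bits, ni):
    """Pad a codeword with 0s to an integral multiple of ni, then split
    it into ni equal blocks; node j broadcasts block j of its own copy."""
    block_len = -(-len(bits) // ni)          # ceil(len(bits) / ni)
    padded = bits + [0] * (block_len * ni - len(bits))
    return [padded[j * block_len:(j + 1) * block_len] for j in range(ni)]
```

Since every node holds (ideally) the same local result, the head reassembles one codeword from n_i different transmitters, which is the spatial-diversity effect noted in footnote 9.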

After executing the intra-cluster protocol, local results with a certain reliability are obtained by the cluster heads. Then the local results are aggregated and transmitted to the sink node on the sink tree described in Section 3 (see Fig. 2). The data aggregation process is performed by the inter-cluster protocol, where the information flows from the leaves (which are all cluster heads) to the root (the sink node) and each vertex on the tree transmits only once.

Inter-cluster protocol:

(i) The leaves encode the results they obtain into codewords of size O(max(N log r_n, log n)) and transmit them to their parent nodes on the tree.

(ii) Intermediate vertices on the tree, either cluster heads or relay nodes, perform decode-fuse-forward: they first decode the codeword(s) they receive, then fuse the local results according to the definition of divisible functions, and finally forward the new results to their parent nodes after encoding them into codewords of size O(max(N log r_n, log n)).
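Ignoring the coding layer, the decode-fuse-forward pass over the sink tree amounts to a post-order reduction with the fusion rule of the divisible function; the sketch below (our own illustrative framing) uses max as an example fusion rule:

```python
def aggregate(children, local_result, fuse, root):
    """One pass of the inter-cluster stage on a sink tree (noiseless
    sketch): every vertex fuses its own partial result with those of
    its children and forwards a single message toward the root."""
    def up(v):
        result = local_result[v]
        for c in children.get(v, []):   # each child transmits once
            result = fuse(result, up(c))
        return result
    return up(root)

children = {'sink': ['h1', 'h2'], 'h1': ['h3']}
local = {'sink': 4, 'h1': 9, 'h2': 1, 'h3': 7}
print(aggregate(children, local, max, 'sink'))  # → 9
```

Because each vertex transmits exactly once, the message count equals the tree size, matching the Θ(n / log n) inter-cluster transmission count derived below.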

4.2 Analysis

We analyze the performance of the intra-cluster protocol first.

Proposition 1: For any cluster i, the complexity of the intra-cluster protocol is O(max(Nm, log n_i) · n_i / N) per instance, and the cluster head h_i obtains a correct result with probability at least 1 − 2/n².

9. Note that each node transmits only a portion of its local computation result, which effectively introduces spatial diversity into the final aggregation at the cluster head.

The proof is given in Appendix C. Since n_i = Θ(log n) and there are Θ(n / log n) clusters, the total complexity of the intra-cluster protocol is O((n/N) max(Nm, log log n)). And by the union bound, the probability that all the cluster heads acquire correct information in their clusters is at least 1 − O(2 / (n log n)) = 1 − o(1).

To analyze the inter-cluster protocol, note that there are Θ(n / log n) cluster heads and at most K·Θ(n / log n) relay nodes according to Lemma 4. It follows that the size of the spanning tree is Θ(n / log n). Since each node only transmits once, the complexity of the inter-cluster protocol for one instance is^10 O((1/N) max(N log r_n, log n) · n / log n).

We now consider the error probability of the inter-cluster protocol. A codeword can be correctly decoded by the parent node with probability at least

Pr(a codeword is decoded correctly) ≥ 1 − e^{−γ max(N log r_n, log n)},   (3)

for some γ > 1, according to Corollary 1. By the union bound,

Pr(all the codewords are decoded correctly) ≥ 1 − Θ(n / log n) e^{−γ max(N log r_n, log n)} ≥ 1 − (an / log n) e^{−γ log n} = 1 − a / (n^{γ−1} log n),   (4)

where a is a constant. Combining the analysis for both the intra-cluster and inter-cluster protocols, we reach the conclusion in Theorem 1.

4.3 Discussion

Since our general protocol can compute any divisible function, it is interesting to examine its efficiency and potential bottlenecks. Depending on the relations between the system parameters, we differentiate a few cases below.

4.3.1 One-shot, constant-size data case

In this scenario N = O(1) and m = O(1). From Theorem 1, we can see that in such a case the complexity becomes O(n log log n + max(log r_n, log n) · n / log n). r_n plays an important role here and provides us some insights for the protocol design.

• r_n = O(n^{log log n}): In this case the intra-cluster operation dominates, and the complexity of our general protocol becomes O(n log log n), which provides the best upper bound we know of in the literature for some individual functions. For example, our result for the histogram computation (r_n = O(n^{|χ|})) is the same as what is achieved in [11]. For the parity function (r_n = O(1)), the best known protocol is given in [8] for a noisy broadcast network, with complexity O(n log log n); our result shows that this is actually achievable in a noisy multi-hop network as well. In addition, when r_n = O(n) the complexity of the inter-cluster protocol is O(n). Obviously, for all non-degenerate functions Ω(n) transmissions are needed, even in a noiseless broadcast network. Thus, any effort to improve the efficiency of the inter-cluster protocol is unnecessary in terms of the scaling law of communication complexity.

10. m may play a role here through r_n.

• r_n = Ω(n^{log log n}): In this case it is easily seen that the complexity of our proposed protocol is O(log r_n · n / log n). This reveals an interesting point: when the function range r_n is large enough, the computation bottleneck lies in the inter-cluster protocol. The identity function (r_n = |χ|^n) belongs to this category, and our general protocol can compute it with complexity O(n² / log n). We will give a more efficient inter-cluster protocol for computing the identity function in the next section.

4.3.2 Block computation

This scenario corresponds to mN = ω(1). It can be shown that O(nm) is achieved when r_n = O(n^m), and O(n log r_n / log n) is achieved otherwise. Intuitively, when either m or N is large enough, the computation efficiency will be improved through block coding. By examining the complexity expression of our protocol in (2), we find that this intuition is only true for the intra-cluster protocol. In particular, the first term of the complexity expression suggests that when Nm = Ω(log log n), a tight bound O(nm) for the intra-cluster protocol is achieved. For the inter-cluster protocol, when r_n = O(n), block computation helps improve the efficiency from O(n) to O(n log r_n / log n); however, when r_n = Ω(n) it does not help. For example, block computation cannot help the inter-cluster propagation in our general protocol for the computation of the identity function; in contrast, for the max function with constant m, block computation helps both the intra-cluster processing and the inter-cluster propagation.

To summarize this section, our general protocol is actually quite efficient; for example, it matches the best known results in the literature on the computation of the histogram and the parity function. On the other hand, we also identify the scenarios with potential for further improvement. In the following two sections, we discuss two such improvements.

5 IMPROVEMENT FOR THE IDENTITY FUNCTION

It is shown in the last section that if r_n is large, the inter-cluster protocol limits the performance. To demonstrate how to improve the efficiency of an inter-cluster protocol, we consider the identity function, which attracts much attention in studies of distributed computing since it is the mother function of all other functions. In the one-shot, constant-size data case, it is known that in a noisy broadcast network with n nodes the (order) optimal complexity for the identity function is Θ(n log log n) [14], which is achieved by the intra-cluster protocol above. Therefore, we focus on improving the inter-cluster protocol for the identity function in this section.

We consider the one-shot case with constant m. In the new inter-cluster protocol, each cluster head encodes its aggregated information, Θ(m log n) bits in total obtained from the intra-cluster protocol (recall from Lemma 4 that each cluster contains Θ(log n) nodes), into a codeword of size O(log n). Each codeword is transported to the sink node through the shortest path with appropriate scheduling. The decode-and-forward scheme is adopted for relaying the information. Then we have

Proposition 2: For constant m and N, the identity function can be evaluated with complexity O(n√(n / log n)) and error probability o(1).

Proof: To begin with, we show that the total number of hops N_h is Θ((n / log n)√(n / log n)). Without loss of generality, we assume the sink node is located at the center of the square. We first divide the space into one circle with radius t_n and a sequence of annuli C_i ≜ C((2i − 1)t_n, (2i + 1)t_n), i = 1, 2, ..., Θ(1/t_n), all centered at the sink node (see Fig. 3). Then the space is further tessellated^11 into cells with side s_n = t_n/√5, so that each cell contains at least one node w.h.p. Therefore, the cluster heads in C_i need at most (2i + 1)t_n/s_n = √5(2i + 1) hops to reach the sink node, and there are at most (π(2i + 1)²t_n² − π(2i − 1)²t_n²) / (π(t_n/2)²) = 32i clusters in C_i (following the same arguments as in Lemma 4). Therefore, the total number of hops N_h can be computed as:

N_h = Σ_{i=1}^{Θ(√(n/log n))} √5(2i + 1) · 32i = Θ((n / log n)√(n / log n)).   (5)
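As a quick numerical sanity check of the sum in (5) (our own illustration, suppressing the Θ-constants), the partial sum below grows like (n / log n)^{3/2}:

```python
import math

def total_hops(n):
    """Evaluate the sum in (5): annulus i holds at most 32*i clusters,
    each of whose heads needs at most sqrt(5)*(2*i + 1) hops."""
    num_annuli = int(math.sqrt(n / math.log(n)))  # Theta(1 / t_n) terms
    return sum(math.sqrt(5) * (2 * i + 1) * 32 * i
               for i in range(1, num_annuli + 1))

# the ratio to (n / log n)^{3/2} stays essentially flat as n grows
for n in (10**5, 10**6, 10**7):
    print(round(total_hops(n) / (n / math.log(n))**1.5, 2))
```

The near-constant ratio reflects that the sum is dominated by its i² term, whose partial sums scale as the cube of the number of annuli.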

Since O(log n) bits are transmitted in each hop, the communication complexity of this inter-cluster protocol is given by

O(N_h log n) = O(n√(n / log n)).   (6)

It is easy to check that in each hop a codeword can be correctly decoded by the receiver with probability at least

Pr(a codeword is decoded correctly) ≥ 1 − e^{−γ log n},   (7)

for some γ > 3/2, according to Corollary 1.

11. Tessellation here only facilitates the calculation of N_h. The protocol for the identity function is still applied in clustered networks.


Fig. 3: division and tessellation of the network. The data of each node is transmitted to the sink node through shortest path routing.

By the union bound,

Pr(all the codewords are decoded correctly) ≥ 1 − Θ((n / log n)√(n / log n)) e^{−γ log n} = 1 − a / (n^{γ−3/2}(log n)^{3/2}) = 1 − o(1),   (8)

where a is a constant.

The lower bound for evaluating the identity function

in multi-hop networks is obtained by assuming that the network is error free and each individual node transmits its integer to the sink node through shortest path routing. Note that there is no redundancy in the inputs when computing the identity function, since the sink node requires all the information in the network. According to [10], this ideal shortest path routing protocol admits complexity Θ(n√(n / log n)), which reveals that our protocol applied to noisy networks provides a sharp bound. The above analysis completes the proof of Theorem 2.

Remark 7: This inter-cluster protocol incorporates a coding scheme into shortest path routing, which is obviously applicable to any function. However, it improves the computing efficiency for the identity function, but not necessarily for other functions. As we discussed in Section 4.3, for the r_n = O(n^{log log n}) case the bottleneck lies in the intra-cluster protocol, so there is no need to improve the inter-cluster protocol.

Remark 8: The inter-cluster protocol in the general protocol exploits in-node computation, i.e., relay nodes may process the information they receive before forwarding it, which could be more efficient for small r_n. In contrast, the inter-cluster protocol here utilizes shortest path routing with the decode-and-forward scheme, which could be better for large r_n.

6 IMPROVEMENT FOR RESTRICTED TYPE-THRESHOLD FUNCTIONS

In this section we show that the intra-cluster protocol can be further improved for some restricted type-threshold functions f, which are defined in Section 2. It is not difficult to check that for this class of functions, a tight bound Θ(mn) on the communication complexity is achieved by our general protocol (Theorem 1) in block computation scenarios. Therefore, we focus on the one-shot case (N = O(1)) with constant m, for which the general protocol gives a complexity of O(n log log n). As we discussed in Section 4.3, the bottleneck of computation for this type of functions lies in the intra-cluster protocol. In this section we design a more efficient intra-cluster protocol, K best candidates, for this class of functions, with complexity O(n log log K) and a fault tolerance Q + o(1), where Q is an arbitrarily small constant. Our study is inspired by the "witness discovery" protocol in [9], which is designed for the max function (one candidate) and applied in a noisy broadcast network. Our problem is more complicated and challenging in nature, since the number of candidates (K) and the number of clusters in the network are both unbounded. The non-triviality of our design lies in the way to bound the total error probability. In the following, we use the function finding the K largest values (without distinction) to describe our design for concreteness. With suitable modification, it can be extended to computing other restricted type-threshold functions.

6.1 The intra-cluster protocol: K best candidates

The intuition behind the K best candidates protocol comes from the elimination match, where teams or individuals are grouped to compete with each other, and the winner(s) go into the next round and are grouped again. The whole process is repeated until a champion is selected. Similarly, in the K best candidates protocol the cluster head intends to find the K largest values in its cluster. First, we divide the n_i nodes in cluster i into^12 n_i/(4K) groups, each of size 4K. The cluster head h_i selects the K nodes from each group which hold the (nominally) largest values to enter the next round. Then the selected nodes are grouped and compared again. The details of the protocol are described below:

(1) The cluster head h_i collects all the measurements in its cluster. We differentiate two scenarios based on the value of K:

• K = O(1): in this case K is a constant. Each node simply encodes its integer into a codeword of size O(max(m, log(K/Q))) and transmits it to h_i;

• K = Ω(1): in this case each group is further divided into subgroups of size log 4K.

12. Throughout this protocol, extra dummy nodes with the smallest measurements may be added to make the group sizes equal.


Then the general intra-cluster protocol discussed in Section 4 is applied in each subgroup, with n_i there replaced by^13 log 4K.

(2) For each group, h_i compares the 4K values and determines the indices of the K largest values. Then h_i assigns '1' to the nodes holding the K largest values and '0' to the others. These n_i assigned bits are concatenated into a word, which is further encoded into a codeword of size O(n_i) and broadcast.

(3) Let J be the set of winning nodes in (2). The protocol (steps 1-3) proceeds by executing the protocol on J (of size n_i/4) for 3 independent times. Each time, the head node h_i obtains from the protocol output K indices^14 corresponding to the K (nominally) largest values; it takes the majority of the three outputs as the result of the protocol on cluster i (or decides arbitrarily if there is no majority).

(4) h_i encodes the K indices found in (3) into a codeword of size O(n_i) and broadcasts it. The selected nodes then encode their data into codewords of size O(n_i) and send them to h_i. The protocol ends with h_i knowing the K largest values in cluster i.

Remark 9: Recursion happens in the third step, where we obtain a winning set of reduced size (by a factor of 4) and apply this protocol to the winning set three times. That is, to find the K candidates once over z nodes, the protocol is executed over z/4 nodes three times. The recursion ends when z = 4K; in this case, h_i compares the 4K values and determines the indices of the K (nominally) largest values for three independent times.

This protocol is non-oblivious and can be turned into an oblivious one by the 'helper' idea in [9] without degrading the performance.

6.2 Analysis

Proposition 3: With O(n log log K) bits transmitted, the K largest values are received by the cluster heads incorrectly with probability Q + (log log n)/n + K/(n log n), where Q is an arbitrarily small constant.

The proof is given in Appendix D.

The inter-cluster protocol is the same as the one in the general protocol, with complexity O(max(mK, log n) · n / log n) = O(n) and error probability 1/(n log n).

Theorem 3 follows from combining the results for the intra-cluster and the inter-cluster protocols and setting

δ = Q + (log log n)/n + K/(n log n) + 1/(n log n).

13. The only difference from the intra-cluster protocol in Section 4 is that we do not need individual heads for each subgroup; h_i collects the data from each of them.

14. The result from one execution of the protocol on n_i/4 nodes.

7 CONCLUSION AND FURTHER WORK

Distributed computing in noisy multi-hop networks is an attractive yet challenging problem. In this paper we first designed a general protocol for reliably computing any divisible function with high probability in a noisy multi-hop network, with communication complexity O((n/N) max(Nm, log log n) + max(N log r_n, log n) · n/(N log n)), where n, N, m and r_n are the number of nodes in the network, the length of data blocks at each node, the size of each data point in bits, and the range of the function of interest, respectively. The merits of this general protocol include: 1) it provides a benchmark to evaluate various functions; 2) its bottleneck analysis motivates further improvement. After analyzing its bottleneck in different scenarios, we endeavored to design more efficient protocols for the identity function and some restricted type-threshold functions, focusing on the one-shot case (N = O(1)) with constant-size data (m = O(1)). We show that they can be evaluated with complexity Θ(n√(n / log n)) and O(n log log K), respectively.

In our future work we will devote our effort to good lower bounds for restricted type-threshold functions, which can reveal the tightness of the upper bounds achieved in this paper. We will also consider the computation of type-sensitive functions [6] over noisy multi-hop networks. Intuitively, they may not be computed (much) more efficiently than the identity function, since all the information in the network is required to evaluate them.

ACKNOWLEDGEMENTS

This work was supported in part by the US National Science Foundation under Grants CCF-0721815, CCF-0830462 and ECCS-1002258. A conference version of this work appeared in IEEE INFOCOM 2010 [29].

REFERENCES

[1] N. A. Lynch, Distributed Algorithms, San Francisco, CA: Morgan Kaufmann, 1997.

[2] H. Attiya and J. Welch, Distributed Computing: Fundamentals, Simulations, and Advanced Topics, 2nd edition, Wiley-Interscience, 2004.

[3] A. Orlitsky and A. El Gamal, "Communication complexity," Complexity in Information Theory, Y. S. Abu-Mostafa (ed.), pp. 16-61, 1988.

[4] L. Lovasz, "Communication complexity: A survey," Paths, Flows, and VLSI Layout, B. H. Korte (ed.), Springer-Verlag, 1990.

[5] E. Kushilevitz and N. Nisan, Communication Complexity, Cambridge Univ. Press, 1997.

[6] A. Giridhar and P. R. Kumar, "Computing and communicating functions over sensor networks," IEEE J. Sel. Areas Commun., vol. 23, no. 4, Apr. 2005.

[7] A. El Gamal, open problem presented at the 1984 Workshop on Specific Problems in Communication and Computation, sponsored by Bell Communication Research, 1984.

[8] R. Gallager, "Finding parity in a simple broadcast network," IEEE Trans. Inf. Theory, vol. 34, no. 2, pp. 176-180, Mar. 1988.

[9] I. Newman, "Computing in fault tolerance broadcast networks," IEEE Annual Conference on Computational Complexity (CCC), 2004.

[10] N. Khude, A. Kumar, and A. Karnik, "Time and energy complexity of distributed computation of a class of functions in wireless sensor networks," IEEE Trans. Mobile Computing, 2008.


[11] L. Ying, R. Srikant, and G. E. Dullerud, "Distributed symmetric function computation in noisy wireless sensor networks," IEEE Trans. Inf. Theory, 2007.

[12] S. Rajagopalan and L. Schulman, "A coding theorem for distributed computation," Proc. 26th Annual ACM Symp. Theory of Computing, 1994.

[13] Y. Kanoria and D. Manjunath, "On distributed computation in noisy random planar networks," Proc. IEEE International Symposium on Information Theory, Nice, France, June 2007.

[14] N. Goyal, G. Kindler and M. Saks, "Lower bounds for the noisy broadcast problem," IEEE Symposium on Foundations of Computer Science, 2005.

[15] A. Yao, "On the complexity of communication under noise," invited talk at the 5th ISTCS Conference, 1997.

[16] E. Kushilevitz and Y. Mansour, "Computation in noisy radio networks," ACM-SIAM Symp. Discrete Algorithms, 1998.

[17] U. Feige and J. Kilian, "Finding OR in a noisy broadcast network," Information Processing Letters, vol. 73, no. 1-2, 2000.

[18] C. Li, H. Dai and H. Li, "Finding the K largest metrics in a noisy broadcast network," Allerton Conference on Communication, Control and Computing, 2008.

[19] P. Gupta and P. R. Kumar, "The capacity of wireless networks," IEEE Trans. Inf. Theory, vol. 46, no. 2, Mar. 2000.

[20] F. Xue and P. R. Kumar, Scaling Laws for Ad Hoc Wireless Networks: An Information Theoretic Approach, Delft, The Netherlands, 2006.

[21] P. Gupta and P. R. Kumar, "Critical power for asymptotic connectivity in wireless networks," Stochastic Analysis, Control, Optimization and Applications: A Volume in Honor of W. H. Fleming, W. McEneaney, G. Yin, and Q. Zhang, Eds., Boston, MA: Birkhauser, 1998.

[22] F. Xue and P. R. Kumar, "The number of neighbors needed for connectivity of wireless networks," Wireless Networks, vol. 10, no. 2, Mar. 2004.

[23] D. Mosk-Aoyama and D. Shah, "Computing separable functions via gossip," ACM Principles of Distributed Computing, Sep. 2007.

[24] S. Subramanian, P. Gupta and S. Shakkottai, "Scaling bounds for function computation over large networks," Proc. IEEE International Symposium on Information Theory, Nice, France, June 2007.

[25] W. Hoeffding, "Probability inequalities for sums of bounded random variables," J. Amer. Stat. Assoc., vol. 58, pp. 13-30, 1963.

[26] J. H. van Lint, Introduction to Coding Theory, 3rd edition, Springer-Verlag, 1999.

[27] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, 1979.

[28] W. Li and H. Dai, "Cluster-based distributed consensus," IEEE Trans. Wireless Communications, vol. 8, no. 1, Jan. 2009.

[29] C. Li and H. Dai, "Towards efficient designs for in-network computing with noisy wireless channels," Proc. IEEE INFOCOM, 2010.

[30] A. Ozgur, O. Leveque and D. Tse, "Hierarchical cooperation achieves optimal capacity scaling in ad hoc networks," IEEE Trans. Inf. Theory, vol. 53, no. 10, pp. 3549-3572, Oct. 2007.

[31] J. Ghaderi, L. Xie and X. Shen, "Hierarchical cooperation in ad hoc networks: Optimal clustering and achievable throughput," IEEE Trans. Inf. Theory, vol. 55, no. 8, 2009.

[32] L.-L. Xie and P. R. Kumar, "A network information theory for wireless communication: Scaling laws and optimal operation," IEEE Trans. Inf. Theory, vol. 50, no. 5, 2004.

Chengzhi Li (S'10) received the B.E. degree from Nankai University, Tianjin, China, and the M.S. degree from Peking University, Beijing, China, in 2001 and 2004, respectively, and the Ph.D. degree in electrical engineering from North Carolina State University, Raleigh, NC, in 2012. He is currently a staff scientist at Broadcom Corporation, Matawan, NJ. His research interests are in wireless communications and networking, and signal processing for wireless communications.

Huaiyu Dai (M'03, SM'09) received the B.E. and M.S. degrees in electrical engineering from Tsinghua University, Beijing, China, in 1996 and 1998, respectively, and the Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ, in 2002.

He was with Bell Labs, Lucent Technologies, Holmdel, NJ, during summer 2000, and with AT&T Labs-Research, Middletown, NJ, during summer 2001. Currently he is an Associate Professor of Electrical and Computer Engineering at NC State University, Raleigh. His research interests are in the general areas of communication systems and networks, advanced signal processing for digital communications, and communication theory and information theory. His current research focuses on networked information processing and cross-layer design in wireless networks, cognitive radio networks, wireless security, and associated information-theoretic and computation-theoretic analysis.

He has served as an editor of the IEEE Transactions on Communications, Signal Processing, and Wireless Communications. He co-edited two special issues of EURASIP journals on distributed signal processing techniques for wireless sensor networks, and on multiuser information theory and related applications, respectively.


APPENDIX A
DISTRIBUTED CLUSTERING [28]

We assume each node i has an initial seed s_i which is unique within its neighborhood. This can be realized through, e.g., drawing a random number from a common pool, or simply using nodes' IDs. From time 0, each node i starts a timer with length t_i = s_i, which is decremented by 1 at each time instant as long as it is greater than 0. If node i's timer expires, it becomes a cluster head and broadcasts a "cluster initialize" message to all its neighbors. Each of its neighbors with a timer greater than 0 signals its intention to join the cluster by replying with a "cluster join" message, and also sets its timer to 0. If a node receives more than one "cluster initialize" message at the same time, it randomly chooses one cluster head. At the end, clusters are formed such that every node belongs to one and only one cluster. Note that the uniqueness of seeds within the neighborhood ensures that cluster heads are at least distance t_n from each other. Therefore, this distributed approach forms clusters satisfying our assumptions.
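A minimal simulation of this timer mechanism, assuming globally unique integer seeds and a connectivity graph given as adjacency sets. The function name, and the simplification that simultaneous "cluster initialize" ties are resolved in iteration order rather than randomly, are ours.

```python
def timer_clustering(seeds, neighbors):
    """Simulate the timer-based clustering of [28]: when a node's timer
    (initialized to its seed) expires, it becomes a cluster head, and
    all neighbors whose timers are still running join it and stop."""
    head_of = {}                                # node -> its cluster head
    for v, _ in sorted(seeds.items(), key=lambda kv: kv[1]):
        if v not in head_of:                    # timer expired: new head
            head_of[v] = v
            for u in neighbors[v]:              # running neighbors join
                if u not in head_of:
                    head_of[u] = v
    return head_of

# path a-b-c-d: a and c become heads; b joins a, d joins c
nbrs = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'c'}}
print(timer_clustering({'a': 1, 'b': 2, 'c': 3, 'd': 4}, nbrs))
# → {'a': 'a', 'b': 'a', 'c': 'c', 'd': 'c'}
```

A node becomes a head only if no neighbor expired earlier, so no two heads are adjacent; with seeds unique per neighborhood this yields heads mutually at least t_n apart, matching assumption (3).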

APPENDIX B
PROOF OF COROLLARY 1
The proof follows the lines of Observation 2.1 in [9]. According to Lemma 2, for some fixed γ ∈ (0, 1/2), we can encode the m-bit integer by a codeword v ∈ Γ_m with length Cm, where C ≥ C_1(γ) will be further specified below. A receiver gets a noisy copy v′ = v XOR n of v and decodes it as the w ∈ Γ_m which minimizes d(v′, w). The decoding error is upper bounded by

Pr(d(v′, v) ≥ (1/2)γCm) = Pr(|n| ≥ (1/2)γCm),

where |·| is the l_1 norm of the noise vector n. By Lemma 1, this probability is bounded above by exp(−2(γ/2 − ϵ)²Cm). This value is at most e^{−δm} for C ≥ δ/(γ/2 − ϵ)², ∀δ > 0.
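The displayed bound is a Hoeffding-type tail bound on the Binomial noise weight |n|. A small Monte Carlo sketch can confirm that the empirical tail sits below it; the parameter values (eps, gamma, C, m) are illustrative, chosen so that γ/2 > ϵ, and are not from the paper.

```python
import math
import random

random.seed(1)

# Illustrative parameters: crossover bound eps, relative distance gamma
# with gamma/2 > eps, and codeword length L = C*m.
eps, gamma, C, m = 0.1, 0.4, 20, 8
L = C * m
thresh = 0.5 * gamma * L  # decoding can fail only if |n| >= gamma*C*m/2

# Empirical probability that the Binomial(L, eps) noise weight reaches the threshold.
trials = 20000
fails = sum(
    sum(random.random() < eps for _ in range(L)) >= thresh
    for _ in range(trials)
)
empirical = fails / trials

# The bound used in the proof: exp(-2*(gamma/2 - eps)^2 * C * m).
bound = math.exp(-2 * (gamma / 2 - eps) ** 2 * L)
```

With these numbers the bound evaluates to roughly 0.04, while the simulated tail probability is orders of magnitude smaller, as expected from a Hoeffding bound.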

APPENDIX C
PROOF OF PROPOSITION 1
We analyze the complexity first. It is clear that in step (i) O(max(Nm, log n_i)) · n_i bits in total are transmitted for N instances. In step (ii) each node outputs O(max(N log r^{n_i}, n_i)/n_i) bits, which leads to overall O(max(N log r^{n_i}, n_i)) transmitted bits. Since max log r^{n_i} = mn_i, the latter is upper bounded by O(Nmn_i). Considering both steps (i) and (ii), the complexity of the intra-cluster protocol for one instance is O(max(Nm, log n_i) · n_i · (1/N)).
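To see how block computation amortizes this per-instance cost, the formula can be tabulated directly; the helper below sets the constants hidden by O(·) to 1 and uses illustrative parameter values, so it is a toy sketch rather than an exact cost model.

```python
import math

def per_instance_bits(N, m, ni):
    """Illustrative per-instance cost max(N*m, log2(ni)) * ni / N
    (hidden O() constants taken as 1)."""
    return max(N * m, math.log2(ni)) * ni / N
```

For constant m and growing block length N, the N·m term dominates and the cost per instance settles at about m·n_i bits, whereas one-shot computation (N = 1) pays max(m, log n_i)·n_i per instance.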

Then we turn to the analysis of the error probability. Assume n_i = a log n, with a constant a > 0, and set β_i = 2/a. In phase 1, denote U_i as the set of nodes in cluster i which do not correctly receive all the other data relayed by the cluster head. According to Corollary 1, each individual data reaches some node I with error probability 2e^{−δ max(mN, log n_i)} ≤ 2/n_i^δ, ∀δ > 0, where the factor 2 comes from the fact that data is relayed once by the cluster head. By the union bound,

Pr[I ∈ U_i] ≤ n_i · (2/n_i^δ) = 2n_i^{1−δ}.

Therefore, ∀α_i > 0,

Pr[|U_i| ≥ (α_i/2)ϵn_i] ≤ Σ_{s=(α_i/2)ϵn_i}^{n_i} (2n_i^{1−δ})^s · binom(n_i, s)    (9)
≤ (2n_i^{1−δ})^{(α_i/2)ϵn_i} · 2^{n_i} < 2^{−β_i n_i},

where the second inequality is due to the fact that the total number of node subsets in cluster i is no more than 2^{n_i}, and the last inequality follows from the fact that 2n_i^{1−δ} < 2^{−2(1+β_i)/(α_i ϵ)} for large enough δ.

In phase 2, each node outputs an N log r^{n_i}-bit word after some calculation. Without noise, these words are identical and can be encoded into a codeword v_2 ∈ Γ′_n with length Cl_i, where l_i = max(N log r^{n_i}, n_i), according to Lemma 2. In the noisy environment, the cluster head receives a noisy copy v′_2 of v_2. The decoding error is upper bounded by Pr(d(v′_2, v_2) ≥ (1/2)γCl_i) = Pr(d(v′_2, v_2) ≥ (1 + α_i)ϵCl_i) for γ = 2(1 + α_i)ϵ, and occurs due to two reasons: 1) the blocks of bits from some node j ∈ U_i are erroneous; 2) bits may be corrupted by noise during transmission. We hence have

Pr(decoding error)    (10)
= Pr(d(v′_2, v_2) ≥ (1 + α_i)ϵCl_i)
≤ Pr((Cl_i/n_i)|U_i| ≥ (Cl_i/n_i)(α_i/2)ϵn_i) + Pr(N_e ≥ (1 + α_i/2)ϵCl_i)
≤ 2^{−β_i n_i} + exp(−(α_i²/2)Cϵ²n_i)
< 2 · 2^{−β_i n_i} = 2/n² for C ≥ 2β_i/(α_i²ϵ²),

where N_e is the number of bits corrupted by noise, and according to Lemma 3,

Pr(N_e ≥ (1 + α_i/2)ϵCl_i)    (11)
≤ exp(−2((1 + α_i/2)ϵCl_i − Cl_iϵ)²/(Cl_i))
≤ exp(−(α_i²/2)Cϵ²n_i).

APPENDIX D
PROOF OF PROPOSITION 3
We begin with the analysis of complexity. For the case K = Ω(1), in each group (of size 4K), each node transmits O(log log K) bits, and K nodes are chosen into


the next round. Therefore, in cluster i the number of bits transmitted from the nodes to h_i during the recursion is

C_{n_i} = O(log log K) · n_i + 3C_{n_i/4}    (12)
= O(log log K)(n_i + 3n_i/4 + 9n_i/16 + ...)
≤ 4 · O(n_i log log K),

where n_i = Θ(log n) is the number of nodes in cluster i. The complexity of the index-broadcasting in step (2) is O(n_i + n_i/4 + n_i/16 + ...) = O(n_i), and the complexity in step (4) is O(n_i). Therefore the communication complexity in cluster i is O(n_i log log K). Since there are Θ(n/log n) clusters, the total complexity of the intra-cluster protocol is O(n log log K).
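Reading the recursion as C(n) = c·n + 3C(n/4), a short sketch confirms the geometric-series bound C(n) ≤ 4cn numerically. The constant c is set to 1 and the stopping size to 4 purely for illustration.

```python
def total_bits(n, c=1.0):
    """Unroll the recursion C(n) = c*n + 3*C(n/4), reading of Eq. (12),
    down to constant-size groups."""
    if n < 4:
        return 0.0
    return c * n + 3 * total_bits(n / 4, c)
```

Each level contributes c·n·(3/4)^k, so the total is bounded by the geometric series c·n·(1 + 3/4 + 9/16 + ...) = 4cn, matching the bound in Eq. (12).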

For the K = O(1) case, each node transmits O(max(m, log(K/Q))) bits in each group. Then following the same proof above we conclude that the complexity of the intra-cluster protocol is O(n max(m, log(K/Q))) = O(n) (since K, Q and m are all constants).

We then turn to the analysis of the error probability. For ease of description, the nodes holding the (globally) K largest values are called target nodes. Note that 1) the target nodes could be distributed in at most K groups, located in at most K clusters; 2) errors do not occur if wrong nodes are selected in groups without target nodes. Therefore we only need to guarantee that the protocol is executed correctly at the groups containing the target nodes, and that all the selected nodes (at the end of the recursion) transmit correct information to their corresponding cluster heads, with high probability.

An error may occur if either of the following two cases happens:

e_1: not all the target nodes are selected by the cluster heads (corresponding to the first three steps);

e_2: the cluster heads receive incorrect information from the selected nodes (corresponding to step (4)).

To analyze e_1, we define the following two events:

e_{1,1}: in the groups containing one or more target nodes, at least one cluster head obtains wrong data from the remaining nodes in its group at step (1);

e_{1,2}: the nodes receive incorrect indices broadcast by the cluster heads at steps (2) and (4).

Essentially we would like to show that Pr(e_1) remains bounded during the recursion. However, e_{1,2} leads to error propagation in the recursion and renders the analysis intractable. Therefore, we delineate the effect of e_{1,2} and calculate Pr(e_1) in two steps by

Pr(e_1) < Pr(e_1 | I_{e_{1,2}} = 0) + Pr(e_{1,2}),    (13)

where I is the indicator function. First, it is assumed that all the nodes receive the indices broadcast by the cluster heads in step (2) correctly. The information from a group (of 4K nodes) is received incorrectly with probability at most 2/((4K)^{β−1} log(4K)) (see Footnote 15) (resp. Q^β/K^β) for certain β > 1 and K = Ω(1) (resp. K = O(1)). Due to the fact that the K target nodes come from at most K groups,

Pr(e_{1,1}) ≤ 2K/((4K)^{β−1} log(4K)) < Q/2

(resp. Pr(e_{1,1}) ≤ Q^β/K^{β−1} < Q/2) for sufficiently large β. For the probability Pr(e_1 | I_{e_{1,2}} = 0), it is easy to check that Pr(e_1 | I_{e_{1,2}} = 0) < Q when n = 4K. Assume that for n′/4 nodes Pr(e_1 | I_{e_{1,2}} = 0) < Q. Executing the protocol on the n′/4 nodes three times results in an error probability of at most 3Q². By induction, for n′ nodes Pr(e_1 | I_{e_{1,2}} = 0) ≤ Q/2 + 3Q² < Q for small Q. Therefore Pr(e_1 | I_{e_{1,2}} = 0) remains bounded during the recursion.
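The induction step can be checked numerically. The update q ↦ Q/2 + 3q² below is our reading of that step (Q/2 from the Pr(e_{1,1}) bound at the current level, plus the contribution of the three executions on n′/4 nodes), and Q = 0.05 is an illustrative target value; the step closes whenever Q is small enough that Q/2 + 3Q² < Q, i.e. Q < 1/6.

```python
def next_level_error(Q, q):
    # One recursion level: Q/2 bounds Pr(e_{1,1}) at this level,
    # plus at most 3*q^2 from the three sub-executions (reading of the proof).
    return Q / 2 + 3 * q * q

Q = 0.05   # illustrative target error bound; the induction needs Q < 1/6
q = Q      # base case n = 4K: Pr(e_1 | I_{e_{1,2}} = 0) < Q
for _ in range(50):
    q = next_level_error(Q, q)  # stays below Q at every level
```

Iterating the update shows q remains below Q at every recursion depth (it converges to the smaller fixed point of q = Q/2 + 3q²), which is exactly the "remains bounded during the recursion" claim.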

Next we examine the probability of e_{1,2}. For cluster i, there are at most O(log n_i) recursions in total, since the number of nodes in each recursion is reduced exponentially. In each recursion, the probability that not all the nodes receive the broadcast indices correctly is n_i/e^{2n_i}, according to Corollary 1 and the union bound. Therefore, nodes in cluster i may receive wrong indices during the protocol's execution with error probability at most n_i log n_i/e^{2n_i}. Since there are Θ(n/log n) clusters,

Pr(e_{1,2}) = (n/log n) · (n_i log n_i/e^{2n_i}) = (log log n)/n.
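As a quick numeric sanity check of this identity, one can plug in n_i = log n (i.e., the constant a = 1 in n_i = a log n, chosen here purely for illustration) and compare against (log log n)/n directly:

```python
import math

def pr_e12(n):
    ni = math.log(n)  # cluster size n_i = log n (illustrative choice a = 1)
    per_cluster = ni * math.log(ni) / math.exp(2 * ni)  # n_i*log(n_i)/e^{2*n_i}
    return (n / math.log(n)) * per_cluster              # times Θ(n/log n) clusters
```

With a = 1 the product collapses algebraically to (log log n)/n, since e^{2 log n} = n² cancels against the cluster count.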

Thus according to Eq. (13),

Pr(e_1) < Q + (log log n)/n.

Finally let us check the probability of e_2, i.e., the probability that the cluster heads receive incorrect information from the selected nodes. Θ(Kn/log n) nodes in total are selected after execution of the intra-cluster protocol, and each selected node transmits its data to its corresponding cluster head correctly with probability at least 1 − 1/n², according to Corollary 1. By the union bound, the error probability of e_2 is at most K/(n log n). Therefore, the total error probability of the intra-cluster protocol is Q + (log log n)/n + K/(n log n).

15. According to Prop. 1, the cluster head receives the data from each subgroup incorrectly with probability at most 2/(4K)^β. Since there are ⌈4K/log(4K)⌉ subgroups, by the union bound, the error probability is at most 2/((4K)^{β−1} log(4K)).