[ieee 2008 42nd annual conference on information sciences and systems (ciss) - princeton, nj, usa...

This work was supported in part by NSF Award CCF-0515253, NSF Award CNS-0430436, MEDC Grant GR-296, and an unrestricted gift from Microsoft Research.

Natural Growth Codes: Partial Recovery under Random Network Coding

Shirish Karande, Kiran Misra, Hayder Radha

Michigan State University, East Lansing, MI-48824, USA

Email: {karandes, misrakir, radha}@egr.msu.edu

Abstract- Growth Codes (GC) improve the disruption tolerance of zero-configuration sensor networks by providing graceful data recovery. Here, we highlight the existence of periphery monitoring topologies which are conducive to graceful recovery. In such networks, the performance of Random Network Coding (RNC) is observed to be superior to that of GCs. RNC increases the data persistence, while maintaining a lower delay.

Index Terms: Network Coding, Sensor Networks

I. INTRODUCTION

Sensor networks are expected to play an important role in assisting disaster relief. These networks, when deployed in response to emergency situations, will often be forced to operate in a zero-configuration mode [1]. In such a mode the nodes are required to collect and transmit data without detailed topological information, so some variant of epidemic or randomized routing has to be employed for data gathering. Random linear network coding (RLNC) ([2], [4]) can be efficiently combined with epidemic routing to realize throughput gains. However, under a completely randomized setting, random linear coding (RLC) can create substantial interdependencies in the data. In principle, such interdependencies could lead to a “cliff effect”, i.e. no data can be recovered unless K independent linear combinations are received by the sink, where K is equal to the total number of sensor readings. Such recovery behavior is undesirable on account of two important reasons:

In a catastrophic scenario a large number of sensors can get destroyed, making it impossible for a sink to receive a sufficient number of independent combinations.

In sensor networks, complete recovery of all the information is often not essential for inference.

In [1] Kamra et al. used the above observations to motivate the design of Growth Codes. Growth codes present a novel way of employing network coding to improve the “persistence” of sensed data. Persistence refers to the proportion of sensed data that is eventually recovered by the sink. The key concept behind growth codes is to gradually increase the interdependency in the data, so that a sink can recover parts of the sensed data even when K independent linear combinations have not been received. In [1] an “open loop” algorithm has been carefully designed to explicitly control the growth of the codes (i.e. the degree of the encoded symbols) so that partial recovery of information can be facilitated. This partial recovery improves the disruption tolerance of the network.

The work presented in this article is a discourse on Growth Codes (GC) [1]. In [1] the “open loop” algorithm has been designed such that the probability of recovering new information in each step of an (at sink) LT decoding [3] is maximized. In order to induce a code structure that is optimized for LT decoding, GCs employ a sub-optimal sharing of the network bandwidth. Specifically, the communication between the nodes takes place in a one-to-one fashion, which does not embrace the inherent broadcast nature of wireless transmissions. Thus, the objective of this paper is to highlight the existence of sensor network topologies where RNC can outperform GCs without compromising on the gracefulness of the recovery. In this work, RNC refers to a protocol under which each sensor opportunistically broadcasts linear combinations of received/sensed packets without explicitly controlling the degree of such combinations. Hence, we refer to this method as Natural Growth Codes (NGC).

Figure 1 provides a motivational example for the work presented here. In Figure 1 it can be observed that the topology of the network is such that, for reasonably large values of q, the sink can recover two information symbols in each round. We show that such a phenomenon can be generalized to practically important perimeter monitoring (PM) network topologies. Our interest in the PM topology is motivated by scenarios where it is impossible to access the region over which the sensors have been sprayed. In such circumstances, it may often be the case that information is gathered by overhearing the transmissions of sensors close to the periphery of the network. With the example of PM topologies, we show that the NGCs are more opportunistic and hence provide superior partial recovery performance.

978-1-4244-2247-0/08/$25.00 ©2008 IEEE.

The organization of the remainder of the paper is as follows: Section II describes the network model. Section III presents the results and analysis. Section IV summarizes the key conclusions.

Figure 1 An example of Natural Growth Codes in a sensor network. The figure shows six sensors (1 to 6) and a sink S.

Schedule for each round: 1, 2, 3, 4, 5, 6.

Protocol: Multiply every overheard transmission with a random coefficient in $F_q$ and then add it to the stored symbol. When a transmission slot is available, transmit the stored data after multiplying it with a random coefficient in $F_q$.

The accompanying table lists, for rounds 1 to 3, the combinations overheard by the sink and the readings recovered. Each coefficient is drawn from $F_q$ and indexed by ijk, corresponding to round i, transmission from sensor j and reading from sensor k.

II. NETWORK MODEL

We assume that K sensors $X_1, \ldots, X_K$ are randomly distributed within a 2-D closed periphery. Each sensor has a transmission range of $R$ and hence all transmissions from a distance less than $R$ from the periphery can be overheard by the sink. Each sensor is assumed to have a storage space $B_0, \ldots, B_L$ for L+1 symbols. The storage location $B_0$ is given a protocol dependent special status. At an epoch, each sensor reads a symbol in $F_q$, multiplies this symbol with L+1 random coefficients chosen from $F_q \setminus \{0\}$ and stores the resulting values in the L+1 storage slots. We assume slotted communication and thus the data is gathered on the basis of rounds.

A. Routing and Buffer Management

A.1 No Coding

At a transmission opportunity, a sensor randomly picks a value from one of the L + 1 storage locations and broadcasts it. The reception takes place in an opportunistic fashion. Thus a sensor overhearing the transmission over-writes one of the storage locations from $B_1, \ldots, B_L$.

A.2 Natural Growth Codes

At the start of a round, each sensor randomly picks a value from one of the storage locations $B_1, \ldots, B_L$ and overwrites the content of $B_0$. At a transmission opportunity the sensor broadcasts the contents of $B_0$. Similar to the non-coding case, the reception takes place in an opportunistic fashion. A sensor multiplies each overheard transmission with L randomly chosen coefficients from $F_q \setminus \{0\}$ and the L values so obtained are added to the storage locations $B_1, \ldots, B_L$.

Each encoded symbol $e$ received by the sink can be described by an index set $I_e$ and coefficients $f_e : I_e \to F_q \setminus \{0\}$, such that $e = \sum_{i \in I_e} f_e(i)\, a_i$. We refer to $|I_e|$ as the degree of the symbol. It should be appreciated that the above protocol ensures that the degree of the encoded symbols increases with each round and hence Natural Growth Codes is a suitable descriptive term.
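The buffer handling of A.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a prime field GF(257) stands in for $F_q$, each stored symbol is tracked as a map from reading index to coefficient (so the index set and degree are directly visible), and the names `NGCSensor` and `degree` are hypothetical.

```python
import random

P = 257  # assumption: arithmetic modulo a prime stands in for the field F_q


class NGCSensor:
    """One sensor with storage B_0..B_L; a slot maps reading index -> coefficient."""

    def __init__(self, index, L, rng):
        self.rng = rng
        # Epoch: the own reading, scaled by a nonzero coefficient, fills all L+1 slots.
        self.slots = [{index: rng.randrange(1, P)} for _ in range(L + 1)]

    def start_round(self):
        # Copy a randomly chosen slot among B_1..B_L into the transmit slot B_0.
        self.slots[0] = dict(self.rng.choice(self.slots[1:]))

    def broadcast(self):
        return dict(self.slots[0])

    def overhear(self, symbol):
        # Add an independently scaled copy of the overheard symbol to each of B_1..B_L.
        for slot in self.slots[1:]:
            c = self.rng.randrange(1, P)
            for k, v in symbol.items():
                s = (slot.get(k, 0) + c * v) % P
                if s:
                    slot[k] = s
                else:
                    slot.pop(k, None)  # coefficient cancelled to zero


def degree(symbol):
    """Size of the index set of a stored symbol."""
    return len(symbol)
```

In this sketch a sensor that overhears a neighbor ends the round with degree-2 symbols in $B_1, \ldots, B_L$, which mirrors the round-by-round growth described above.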

A.3 Growth Codes

The network model we consider in this paper is distinct from [1] and hence the original GC protocol should not be directly employed. In order to deduce a suitable variant, we have to give GCs a few unfair advantages. Hence, in the variant employed here, we assume that at the start of each round the sink can broadcast a value $d_{\max}$ to all the sensors.

In accordance with the discussion in [1], we set $d_{\max}$ to the degree $d \ge 1$ that maximizes, for the current values of $Z$ and $K$, the probability of recovering new information in a decoding step, where $Z$ represents the number of symbols already recovered by the sink. The original GC scheme does not utilize any feedback and therefore performs significantly worse than the variant we use for comparison.
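A minimal sketch of how $d_{\max}$ could be computed at the sink. It assumes that the maximized quantity is the probability that a symbol built from d distinct readings, chosen uniformly at random, contains exactly d-1 of the Z recovered readings, i.e. $\binom{Z}{d-1}(K-Z)/\binom{K}{d}$. This closed form and the names `decodable_probability` and `d_max` are assumptions made for illustration.

```python
from math import comb


def decodable_probability(d, Z, K):
    """Assumed objective: chance that a random degree-d symbol has exactly one unknown reading."""
    return comb(Z, d - 1) * (K - Z) / comb(K, d)


def d_max(Z, K):
    """Smallest degree d >= 1 that maximizes the assumed objective."""
    # max() returns the first maximizer, so ties resolve to the smaller degree.
    return max(range(1, K + 1), key=lambda d: decodable_probability(d, Z, K))
```

Under this assumption the degree stays at 1 until roughly half of the readings have been recovered and grows afterwards, which matches the gradual growth that motivates GCs.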

At the start of each round, each sensor updates the storage locations $B_1, \ldots, B_L$ in the following manner: If the degree of an encoded symbol $e_j$ stored at location $B_j$ is less than $d_{\max}$ and $I_{e_0} \not\subseteq I_{e_j}$, then multiply the encoded symbol $e_0$ stored at $B_0$ with a random coefficient from $F_q \setminus \{0\}$ and add this value to $e_j$.

At each transmission opportunity a sensor $X_a$ randomly chooses a neighbor $X_b$. Nodes $X_a$ and $X_b$ randomly choose two numbers $i, j \in \{1, \ldots, L\}$; the contents of $B_i$ in $X_a$ are exchanged with $B_j$ in $X_b$. Note that each exchange consists of a FORWARD transmission from $X_a$ and a RETURN transmission from $X_b$. Hence, for a fair comparison each round of communication under the GC protocols is considered to be equivalent to two rounds.

B. Medium Access Control

In each round, sensors randomly get an opportunity to transmit in a non-interfering fashion. We employ the disk model to describe the interference in the network. Consequently, every sensor $X_i$ can have a maximum of one transmitter $X_j$ within a distance $d(X_i, X_j) < R$. Thus the transmitters in each round are determined in the following manner:

Choose some random permutation $\pi$ of $\{1, \ldots, K\}$.
FOR i = 1:K, set t(i) := 0, g(i) := 0
FOR x = 1:K, i = $\pi(x)$, IF g(i) = 0, set t(i) := 1 and DO
    FOR j = 1:K, IF $d(X_i, X_j) < R$, set g(j) := 1 and DO
        FOR l = 1:K, IF $d(X_j, X_l) < R$, set g(l) := 1
All $X_i$ with t(i) = 1 are declared as transmitters.
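The selection procedure above can be sketched in Python as follows. This is an illustrative reading of the pseudocode: the function name `select_transmitters`, the use of Euclidean distance via `math.dist`, and the strict comparison against the range are assumptions.

```python
import math
import random


def select_transmitters(positions, R, rng=random):
    """Return the indices of the sensors that may transmit in one round."""
    K = len(positions)
    order = list(range(K))
    rng.shuffle(order)        # random permutation of the sensors
    blocked = [False] * K     # g(i): sensor may no longer become a transmitter
    transmit = [False] * K    # t(i): sensor is a transmitter this round

    def near(a, b):
        return math.dist(positions[a], positions[b]) < R

    for i in order:
        if blocked[i]:
            continue
        transmit[i] = True
        for j in range(K):
            if near(i, j):                # every receiver of i (and i itself)
                blocked[j] = True
                for l in range(K):
                    if near(j, l):        # anyone who could also reach receiver j
                        blocked[l] = True
    return [i for i in range(K) if transmit[i]]
```

Blocking the two-hop neighborhood of each chosen transmitter is what ensures that no sensor has more than one transmitter within range in the same round.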

The above algorithm does not guarantee that the exchanges required for GCs can be conducted in a non-interfering manner. In particular, the RETURN transmissions may interfere with each other. However, for the sake of comparison we ignore this interference.

C. At Sink Decoding Algorithm

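The sink discards all symbols that can be expressed as a linear combination of previously received symbols. In addition, the sink maintains an index list $\mathcal{R}$ of recovered symbols and a list $\mathcal{E}$ of non-discarded encoding symbols. At the end of each round the sink does the following:

Phase 1: LT Decoding

WHILE $\exists\, e \in \mathcal{E}$ s.t. $|I_e| = 1$ DO, FOR $e \in \mathcal{E}$
    IF $|I_e| = 0$, set $\mathcal{E} := \mathcal{E} \setminus \{e\}$
    IF $|I_e| = 1$, with appropriate scaling recover the symbol belonging to $I_e$, set $\mathcal{R} := \mathcal{R} \cup I_e$, set $\mathcal{E} := \mathcal{E} \setminus \{e\}$

Phase 2: Gaussian Elimination

IF $\left|\bigcup_{e \in \mathcal{E}} I_e\right| = |\mathcal{E}|$, use Gaussian Elimination to recover all symbols in $\bigcup_{e \in \mathcal{E}} I_e$, set $\mathcal{R} := \mathcal{R} \cup \bigcup_{e \in \mathcal{E}} I_e$, set $\mathcal{E} := \emptyset$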
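The two decoding phases can be sketched as below. This is a simplified illustration under stated assumptions: a prime field GF(257) replaces $F_q$, each encoding symbol is a pair (coefficient map, value), and the names `peel` and `eliminate` are hypothetical. The sketch does not filter linearly dependent symbols on arrival; when the square system turns out to be singular it simply leaves the symbols in place.

```python
P = 257  # assumption: arithmetic modulo a prime stands in for the field F_q


def peel(symbols, recovered):
    """Phase 1 (LT decoding): repeatedly resolve symbols with one unknown reading."""
    progress = True
    while progress:
        progress = False
        remaining = []
        for coeffs, value in symbols:
            # Substitute readings that are already known.
            for k in [k for k in coeffs if k in recovered]:
                value = (value - coeffs.pop(k) * recovered[k]) % P
            if len(coeffs) == 1:
                (k, c), = coeffs.items()
                recovered[k] = value * pow(c, -1, P) % P  # scale by the inverse coefficient
                progress = True
            elif coeffs:
                remaining.append((coeffs, value))
            # symbols reduced to degree 0 carry no information and are dropped
        symbols[:] = remaining


def eliminate(symbols, recovered):
    """Phase 2: Gauss-Jordan elimination when unknowns and symbols are equal in number.

    Returns n**3 as the cost charged for this run, or 0 if nothing was solved.
    """
    unknowns = sorted({k for coeffs, _ in symbols for k in coeffs})
    n = len(unknowns)
    if n == 0 or n != len(symbols):
        return 0
    rows = [[c.get(k, 0) for k in unknowns] + [v] for c, v in symbols]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col]), None)
        if pivot is None:
            return 0  # singular system: wait for more symbols
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)
        rows[col] = [x * inv % P for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % P for x, y in zip(rows[r], rows[col])]
    for k, row in zip(unknowns, rows):
        recovered[k] = row[n]
    symbols.clear()
    return n ** 3
```

Calling `peel` and then `eliminate` at the end of each round reproduces the order of the two phases; the value returned by `eliminate` is the per-run cost that a cubic complexity model would charge.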

III. RESULTS & ANALYSIS

In an extensive simulation study, we observed that, for a given periphery shape and network size K, there exists a threshold $R_{th}$ such that for all transmission ranges $R \ge R_{th}$, the average number of symbols recovered by NGC at the end of each round is greater than that recovered by GC. We have verified this phenomenon on a number of network shapes and sizes; for brevity, however, we illustrate it by considering in detail a network consisting of K = 256 sensors, distributed randomly within a 1 x 1 square area. Also note that all the results presented here use $q = 2^8 - 1$ and have been averaged over 200 repetitions.

Consider some useful terminology: For a given transmission range $R$, let the recovery profile $\Phi_{P,R}(T)$ represent the average number of sensor readings recovered at the end of round $T$ by protocol $P$. We say that scheme A dominates scheme B, for the range $R$, iff $\Phi_{A,R}(T) \ge \Phi_{B,R}(T)$ for all $T$. Let $S^{R}_{A,B} = \{T \mid \Phi_{A,R}(T) < \Phi_{B,R}(T)\}$. We refer to the size of $S^{R}_{A,B}$ as the duration of non-dominance of A over B. Figure 2 plots the duration of non-dominance of NGC. The number of instances when the recovery of NGC is inferior to GC or No-Coding decreases rapidly with an increase in transmission range. For $R \ge R_{th} = 0.19$, NGC are dominant, i.e. at the end of any round NGC have provided the greatest partial recovery.
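The dominance test and the duration of non-dominance are simple to compute from two recovery profiles. The sketch below assumes each profile is a list holding the average number of recovered readings at the end of rounds 1, 2, ...; the function names and the sample numbers are hypothetical.

```python
def dominates(profile_a, profile_b):
    """True if scheme A has recovered at least as much as scheme B after every round."""
    return all(a >= b for a, b in zip(profile_a, profile_b))


def non_dominance_rounds(profile_a, profile_b):
    """Rounds (1-based) at which scheme A has recovered strictly less than scheme B."""
    return [t for t, (a, b) in enumerate(zip(profile_a, profile_b), start=1) if a < b]


# Made-up profiles, for illustration only.
ngc = [2, 4, 6, 9]
gc = [1, 5, 6, 8]
duration = len(non_dominance_rounds(ngc, gc))
```

Here `duration` counts the rounds in which the first scheme trails the second, which is the quantity plotted against the transmission range.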

Figure 2 Duration of non-dominance of Natural Growth Codes with respect to Growth Codes and No-Coding

The cause for observing partial recovery in a PM topology can be explained in the following manner: Unlike typical sensor network topologies, in a PM network the flow-capacity increases as the proximity to the periphery increases. Consequently the rate at which information percolates to the sink is greater than the rate at which information is “mixed”. Therefore, the number of encoding symbols received by a sink is often significantly greater than the number of sensor readings used to define these symbols. In such a scenario, the encoding symbols form rank complete sub-systems even when K independent linear combinations have not been received.

Compared to GCs, NGCs conduct routing and coding in a more opportunistic fashion. Katti et al. have highlighted the virtues of being opportunistic [4]. Consequently the existence of NGC can be exploited to improve the disruption tolerance of the network. This fact is highlighted in Figure 3, which plots the recovery profiles for $R = 0.2$. It can be observed that, at any instant, if all the sensors in the network were to be destroyed, the information recovered by NGC would be the highest. For example, if all sensors were destroyed at round 150 then NGC would have recovered 50 symbols in excess of those recovered by the competing schemes. We also consider scenarios where only part of the network fails. Figure 2 plots the modified recovery profiles for a scenario where at round 100 all sensors in the 0.3 x 0.3 core, at the center of the network, get destroyed. In this case too, NGCs preserve and recover a significantly larger amount of data.

Figure 3 Partial recovery profiles for R = 0.2.

Figure 4 Average normalized complexity of employing Phase 2, Gaussian Elimination, of at sink decoding.


NGCs, despite their utility, increase the processing complexity at the sink. However, the partial recovery of data has a useful side effect. We observed that as the transmission range increases, the size of the sub-systems over which GE gets employed substantially decreases. To demonstrate this effect we experimentally evaluate the complexity of decoding in the following manner: Firstly note that the complexity of GE is $O(n^3)$; hence, each time Phase 2 of the decoding algorithm gets executed, we say that $\left|\bigcup_e I_e\right|^3$ processing has been utilized. We calculate the total complexity of decoding by summing the complexities associated with each run of Phase 2. The expected complexity $E$ is determined by averaging over multiple runs of decoding. We employ normalization to get the measure: complexity $= E / K^3$. Figure 4 plots this measure for various transmission ranges. As $R$ increases the complexity decreases rapidly. For $R \ge R_{th}$, the complexity of decoding is less than 2% of that associated with employing Gaussian elimination on an RLC system defined over K symbols. Furthermore, we observed that beyond a certain threshold of transmission range, Phase 2 of the decoding algorithm rarely gets executed. Thus LT decoding is found to be sufficient even for NGC. It can be easily verified that this is indeed the case if $R \ge 0.5$.
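The normalized measure can be computed as in the following sketch. It assumes that, for every decoding run, the sizes of the sub-systems handed to Phase 2 have been logged; the function name `normalized_complexity` and the input layout are hypothetical.

```python
def normalized_complexity(runs, K):
    """runs: one list per decoding run with the sub-system size n of each Phase 2 call."""
    totals = [sum(n ** 3 for n in sizes) for sizes in runs]  # cubic cost per call, summed per run
    expected = sum(totals) / len(totals)                     # average over the decoding runs
    return expected / K ** 3                                 # relative to one GE over all K symbols
```

A value of 1 would correspond to a single Gaussian elimination over all K symbols, so small values indicate that the elimination only ever touches small sub-systems.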

IV. CONCLUSIONS

We have shown the existence of Natural Growth Codes (NGC) in certain periphery monitoring sensor network topologies. The existence of such codes facilitates partial recovery without compromising on opportunism, thus leading to an improved tolerance to disruptions.

REFERENCES

[1] A. Kamra, V. Misra, J. Feldman, D. Rubenstein, “Growth Codes: Maximizing Sensor Network Data Persistence”, ACM SIGCOMM 2006.

[2] T. Ho, R. Koetter, M. Médard, D. R. Karger, M. Effros, “The Benefits of Coding over Routing in a Randomized Setting”, IEEE ISIT, 2003.

[3] M. Luby, “LT codes”, IEEE FOCS, 2003.

[4] S. Katti, D. Katabi, W. Hu, H. Rahul, M. Médard, “The importance of being opportunistic: Practical network coding for wireless environments”, Allerton, 2005.
