[ieee 2008 42nd annual conference on information sciences and systems (ciss) - princeton, nj, usa...
This work was supported in part by NSF Award CCF-0515253, NSF Award CNS-0430436, MEDC Grant GR-296, and an unrestricted gift from Microsoft Research.
Natural Growth Codes: Partial Recovery under Random Network Coding
Shirish Karande, Kiran Misra, Hayder Radha Michigan State University, East Lansing, MI-48824, USA
Email: {karandes, misrakir, radha}@egr.msu.edu
Abstract- Growth Codes (GC) improve the disruption tolerance of zero-configuration sensor networks by providing graceful data recovery. Here, we highlight the existence of periphery monitoring topologies which are conducive to graceful recovery. In such networks, the performance of Random Network Coding (RNC) is observed to be superior to that of GCs. RNC increases the data persistence, while maintaining a lower delay.
Index Terms: Network Coding, Sensor Networks
I. INTRODUCTION
Sensor networks are expected to play an important role in assisting disaster relief. These networks, when deployed in response to emergency situations, will often be forced to operate in a zero-configuration mode [1]. In such a mode the nodes are required to collect and transmit data without detailed topological information, so some variant of epidemic or randomized routing has to be employed for data gathering. Random linear network coding (RLNC) ([2], [4]) can be efficiently combined with epidemic routing to realize throughput gains. However, under a completely randomized setting, random linear coding (RLC) can create substantial interdependencies in the data. In principle, such interdependencies could lead to a “cliff effect”, i.e. no data can be recovered unless K independent linear combinations are received by the sink, where K is equal to the total number of sensor readings. Such a recovery behavior is undesirable on account of two important reasons:
In a catastrophic scenario a large number of sensors can get destroyed, making it impossible for a sink to receive a sufficient number of independent combinations.
In sensor networks, complete recovery of all the information is often not essential for inference.
In [1] Kamra et al. used the above observations to motivate the design of Growth Codes. Growth codes present a novel way of employing network coding to improve the “persistence” of sensed data. Persistence refers to the proportion of sensed data that is eventually recovered by the sink. The key concept behind growth codes is to gradually increase the interdependency in the data, so that a sink can recover parts of the sensed data even when K independent linear combinations have not been received. In [1] an “open loop” algorithm has been carefully designed to explicitly control the growth of the codes (i.e. the degree of the encoded symbols) so that partial recovery of information can be facilitated. This partial recovery improves the disruption tolerance of the network.
The work presented in this article is a discourse on Growth Codes (GC) [1]. In [1] the “open loop” algorithm has been designed such that the probability of recovering new information in each step of an (at sink) LT decoding [3] is maximized. In order to induce a code structure that is optimized for LT decoding, GCs employ a sub-optimal sharing of the network bandwidth. Specifically, the communication between the nodes takes place in a one-to-one fashion, which does not embrace the inherent broadcast nature of wireless transmissions. Thus, the objective of this paper is to highlight the existence of sensor network topologies where RNC can outperform GCs without compromising on the gracefulness of the recovery. In this work, RNC refers to a protocol under which each sensor opportunistically broadcasts the linear combinations of received/sensed packets without explicitly controlling the degree of such combinations. Hence, we refer to this method as the Natural Growth Codes (NGC).
Figure 1 provides a motivational example for the work presented here. In Figure 1 it can be observed that the topology of the network is such that, for reasonably large values of q, the sink can recover two information symbols in each round. We show that such a phenomenon can be generalized to practically important perimeter monitoring (PM) network topologies. Our interest in the PM topology is motivated by scenarios where it is impossible to access the region over which the sensors have been sprayed. In such circumstances, it may often be the case that information is gathered by overhearing the transmissions of sensors close to the periphery of the network. With the example of PM topologies, we show that the NGCs are more opportunistic and hence provide superior partial recovery performance.
978-1-4244-2247-0/08/$25.00 ©2008 IEEE.
The organization of the remainder of the paper is as follows: Section II describes the network model. Section III presents the results and analysis. Section IV summarizes the key conclusions.
[Figure 1: An example of Natural Growth Codes in a sensor network. The figure shows six sensors, numbered 1 to 6, and a sink S; each sensor holds one sensed reading.
Schedule for each round: 1, 2, 3, 4, 5, 6.
Protocol: Multiply every overheard transmission with a random coefficient in $\mathbb{F}_q$ and then add it to the stored symbol. When a transmission slot is available, transmit the stored data after multiplying it with a random coefficient in $\mathbb{F}_q$.
A table in the figure lists, for rounds 1 to 3, the symbols overheard by the sink and the readings recovered. Each coefficient is drawn from $\mathbb{F}_q$ and is indexed by the round i, the transmitting sensor j, and the sensor k whose reading it multiplies.]
II. NETWORK MODEL
We assume that K sensors $X_1, \ldots, X_K$ are randomly distributed within a 2-D closed periphery. Each sensor has a transmission range of $R$ and hence all transmissions from a distance less than $R$ from the periphery can be overheard by the sink. Each sensor is assumed to have a storage space $B_0, \ldots, B_L$ for $L+1$ symbols. The storage location $B_0$ is given a protocol dependent special status. At an epoch, each sensor reads a symbol in $\mathbb{F}_q$, multiplies this symbol with $L+1$ random coefficients chosen from $\mathbb{F}_q \setminus \{0\}$ and stores the resulting values in the $L+1$ storage slots. We assume slotted communication and thus the data is gathered on the basis of rounds.
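The geometry of this model can be illustrated with a minimal sketch. It assumes the unit-square periphery used later in the results; the function names (`place_sensors`, `distance_to_periphery`, `heard_by_sink`) are hypothetical and are not taken from the paper.

```python
import random

def distance_to_periphery(x, y, side=1.0):
    """Distance from point (x, y) to the nearest edge of a side x side square."""
    return min(x, y, side - x, side - y)

def place_sensors(k, side=1.0, seed=None):
    """Drop k sensors uniformly at random inside the square (assumed periphery shape)."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(k)]

def heard_by_sink(sensors, r, side=1.0):
    """Indices of sensors whose transmissions the sink can overhear,
    i.e. sensors closer than r to the periphery."""
    return [i for i, (x, y) in enumerate(sensors)
            if distance_to_periphery(x, y, side) < r]
```

For example, `heard_by_sink(place_sensors(256, seed=1), 0.2)` lists the sensors in the outer band of width 0.2; sensors deeper inside must rely on relaying through that band.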
A. Routing and Buffer Management
A.1 No Coding
At a transmission opportunity, a sensor randomly picks a value from one of the $L+1$ storage locations and broadcasts it. The reception takes place in an opportunistic fashion. Thus a sensor overhearing the transmission overwrites one of the storage locations from $B_1, \ldots, B_L$.
A.2 Natural Growth Codes
At the start of a round, each sensor randomly picks a value from one of the storage locations $B_1, \ldots, B_L$ and overwrites the content of $B_0$. At a transmission opportunity the sensor broadcasts the contents of $B_0$. Similar to the non-coding case, the reception takes place in an opportunistic fashion. A sensor multiplies each overheard transmission with $L$ randomly chosen coefficients from $\mathbb{F}_q \setminus \{0\}$ and the $L$ values so obtained are added to the storage locations $B_1, \ldots, B_L$.
Each encoded symbol $e$ received by the sink can be described by an index set $I_e$ and coefficients $f_e : I_e \to \mathbb{F}_q \setminus \{0\}$, such that $e = \sum_{i \in I_e} f_e(i)\, a_i$, where $a_i$ denotes the reading of sensor $X_i$. We refer to $|I_e|$ as the degree of the symbol. It should be appreciated that the above protocol ensures that the degree of the encoded symbols increases with each round and hence Natural Growth Codes is a suitable descriptive term.
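The buffer handling described for Natural Growth Codes can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it tracks only the coefficient map of each encoded symbol (sensor index to coefficient), and it assumes a prime field size `Q = 257` so that field arithmetic reduces to integer arithmetic modulo `Q`. The class and helper names are hypothetical.

```python
import random

Q = 257  # assumed prime field size, so arithmetic is plain integer math mod Q

def scale(sym, c):
    """Multiply an encoded symbol (index -> coefficient map) by scalar c."""
    return {i: (c * f) % Q for i, f in sym.items()}

def add(a, b):
    """Add two encoded symbols, dropping coefficients that cancel to zero."""
    out = dict(a)
    for i, f in b.items():
        v = (out.get(i, 0) + f) % Q
        if v:
            out[i] = v
        else:
            out.pop(i, None)
    return out

class Sensor:
    def __init__(self, index, L, rng):
        self.rng = rng
        # Epoch: own reading times L+1 random nonzero coefficients, one per slot.
        self.B = [{index: rng.randrange(1, Q)} for _ in range(L + 1)]

    def start_round(self):
        # B_0 is overwritten by a randomly chosen slot among B_1..B_L.
        self.B[0] = dict(self.rng.choice(self.B[1:]))

    def transmit(self):
        # Broadcast the contents of B_0.
        return dict(self.B[0])

    def overhear(self, sym):
        # Add a freshly scaled copy of the overheard symbol to each of B_1..B_L.
        for j in range(1, len(self.B)):
            self.B[j] = add(self.B[j], scale(sym, self.rng.randrange(1, Q)))
```

In this representation the degree of a stored symbol is simply `len(sym)`, so the growth of the degree from round to round can be observed directly.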
A.3 Growth Codes
The network model we consider in this paper is distinct from [1] and hence the original GC protocol should not be directly employed. In order to deduce a suitable variant, we have to give GCs a few unfair advantages. Hence, in the variant employed here, we assume that at the start of each round the sink can broadcast a value $d_{\max}$ to all the sensors. In accordance with the discussion in [1], we set $d_{\max} = \arg\max_{d} \binom{Z}{d-1}(K - Z) \big/ \binom{K}{d}$, where $Z$ represents the number of symbols already recovered by the sink. The original GC scheme does not utilize any feedback and therefore performs significantly worse than the variant we use for comparison.
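As a hedged illustration of how the sink could compute the broadcast degree, the sketch below assumes that $d_{\max}$ maximizes the probability that a randomly formed degree-$d$ symbol is immediately decodable when $Z$ of $K$ readings are known, taken here to be $\binom{Z}{d-1}(K-Z)/\binom{K}{d}$ following the Growth Codes analysis. The function names are hypothetical.

```python
from math import comb

def immediate_decode_prob(d, z, k):
    """Assumed probability that a degree-d symbol reveals exactly one new
    reading when z of the k readings are already recovered."""
    # comb(z, d - 1) is 0 when d - 1 > z, so oversized degrees score 0.
    return comb(z, d - 1) * (k - z) / comb(k, d)

def d_max(z, k):
    """Degree that maximizes the immediate-decoding probability
    (the smallest such degree in case of ties)."""
    return max(range(1, k + 1), key=lambda d: immediate_decode_prob(d, z, k))
```

Under this assumption the optimal degree stays at 1 until roughly half of the readings are recovered and grows afterwards, which matches the gradual degree growth that the Growth Codes design aims for.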
At the start of each round, each sensor updates the storage locations $B_1, \ldots, B_L$ in the following manner: If the degree of an encoded symbol $e_j$ stored at location $B_j$ is less than $d_{\max}$ and $I_{e_0} \not\subseteq I_{e_j}$, then multiply the encoded symbol $e_0$ stored at $B_0$ with a random coefficient from $\mathbb{F}_q \setminus \{0\}$ and add this value to $e_j$.
At each transmission opportunity a sensor $X_a$ randomly chooses a neighbor $X_b$. Nodes $X_a$ and $X_b$ randomly choose two numbers $i, j \in \{1, \ldots, L\}$, and the contents of $B_i$ in $X_a$ are exchanged with $B_j$ in $X_b$. Note that each exchange consists of a FORWARD transmission from $X_a$ and a RETURN transmission from $X_b$. Hence, for a fair comparison each round of communication under the GC protocols is considered to be equivalent to two rounds.
B. Medium Access Control
In each round, sensors randomly get an opportunity to transmit in a non-interfering fashion. We employ the disk model to describe the interference in the network. Consequently, every sensor $X_i$ can have a maximum of one transmitter $X_j$ within a distance $d(X_i, X_j) < R$. Thus the transmitters in each round are determined in the following manner:
Choose some random permutation $\pi$ of $\{1, \ldots, K\}$.
FOR i = 1:K, set $t(i) := 0$, $g(i) := 0$.
FOR x = 1:K, $i = \pi(x)$, IF $g(i) = 0$, set $t(i) := 1$ and DO
    FOR j = 1:K, IF $d(X_i, X_j) < R$, set $g(j) = 1$ and DO
        FOR l = 1:K, IF $d(X_j, X_l) < R$, set $g(l) = 1$
All $X_i$ with $t(i) = 1$ are declared as transmitters.
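The transmitter selection above can be sketched in a few lines. This is a direct, unoptimized rendering under the disk model; the function name `select_transmitters` and the representation of sensors as `(x, y)` tuples are assumptions made for illustration.

```python
import math
import random

def select_transmitters(sensors, r, rng=random):
    """Pick a set of transmitters such that no sensor has more than one
    transmitter within distance r (disk interference model)."""
    k = len(sensors)

    def near(a, b):
        return math.dist(sensors[a], sensors[b]) < r

    t = [False] * k   # t[i]: sensor i transmits in this round
    g = [False] * k   # g[i]: sensor i is blocked from transmitting
    order = list(range(k))
    rng.shuffle(order)          # random permutation of the sensors
    for i in order:
        if g[i]:
            continue
        t[i] = True
        for j in range(k):
            if near(i, j):      # j can hear i (includes i itself) ...
                g[j] = True
                for l in range(k):
                    if near(j, l):   # ... so anyone j can hear must stay silent
                        g[l] = True
    return [i for i in range(k) if t[i]]
```

Blocking the two-hop neighborhood is what guarantees that a listening sensor never hears two transmitters in the same round. The triple loop costs O(K^3) distance checks in the worst case, which is acceptable for a few hundred sensors; a spatial grid would be the natural refinement for larger networks.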
The above algorithm does not guarantee that the exchanges required for GCs can be conducted in a non-interfering manner. In particular, the RETURN transmissions may interfere with each other. However, for the sake of comparison we ignore this interference.
C. At Sink Decoding Algorithm
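The sink discards all symbols that can be expressed as a linear combination of previously received symbols. In addition, the sink maintains an index list of recovered symbols and a list of non-discarded encoding symbols. At the end of each round the sink does the following:
Phase 1: LT Decoding
WHILE there exists a stored encoding symbol $e$ s.t. $|I_e| \leq 1$ DO, FOR each stored encoding symbol $e$
    IF $|I_e| = 0$, remove $e$ from the list of encoding symbols
    IF $|I_e| = 1$, with appropriate scaling recover the symbol belonging to $I_e$, add $I_e$ to the list of recovered symbols, and remove $e$ from the list of encoding symbols
Phase 2: Gaussian Elimination
IF $|\bigcup_e I_e|$ equals the number of stored encoding symbols, use Gaussian Elimination to recover all symbols in $\bigcup_e I_e$, add $\bigcup_e I_e$ to the list of recovered symbols, and empty the list of encoding symbols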
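Phase 1 of the sink decoder can be illustrated with a small peeling routine. The sketch assumes a prime field size `Q = 257` (so inverses come from Python's three-argument `pow`) and represents each encoding symbol as a pair of a coefficient map and a value; both choices, and the name `lt_decode`, are assumptions for illustration. Phase 2 (Gaussian elimination) is deliberately left out.

```python
Q = 257  # assumed prime field size

def lt_decode(symbols, recovered=None):
    """Peel degree-1 symbols until none remain.

    symbols:   list of (coeffs, value), coeffs maps reading index -> nonzero
               coefficient mod Q, value is the encoded value mod Q.
    recovered: optional dict of already known readings (index -> value).
    Returns (recovered readings, symbols still undecoded).
    """
    recovered = dict(recovered or {})
    pending = [(dict(c), v % Q) for c, v in symbols]
    progress = True
    while progress:
        progress = False
        remaining = []
        for coeffs, value in pending:
            # Subtract every reading that is already known.
            for i in [i for i in coeffs if i in recovered]:
                value = (value - coeffs.pop(i) * recovered[i]) % Q
            if len(coeffs) == 0:
                continue                      # no new information: discard
            if len(coeffs) == 1:
                (i, f), = coeffs.items()
                recovered[i] = (value * pow(f, -1, Q)) % Q   # undo the scaling
                progress = True
                continue
            remaining.append((coeffs, value))
        pending = remaining
    return recovered, pending
```

Symbols left in the returned `pending` list are exactly those that would be handed to Phase 2 once they form a square, full-rank system.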
III. RESULTS & ANALYSIS
In an extensive simulation study, we observed that, for a given periphery shape and network size K, there exists a threshold $R_{th}$ such that for all transmission ranges $R \geq R_{th}$, the average number of symbols recovered by NGC at the end of each round is greater than the number recovered by GC. We have verified this phenomenon on a number of network shapes and sizes; for brevity, however, we illustrate it by considering in detail a network consisting of K = 256 sensors, distributed randomly within a 1 x 1 square area. Also note that all the results presented here use $q = 2^8 - 1$ and have been averaged over 200 repetitions.
Consider some useful terminology: For a given transmission range $R$, let the recovery profile $P_{\pi,R}(T)$ represent the average number of sensor readings recovered at the end of round $T$ by protocol $\pi$. We say that scheme A dominates scheme B, for the range $R$, iff $P_{A,R}(T) \geq P_{B,R}(T)$ for all $T$. Let $S_{A,B,R} = \{T \mid P_{A,R}(T) < P_{B,R}(T)\}$. We refer to $|S_{A,B,R}|$ as the duration of non-dominance of A over B. Figure 2 plots the duration of non-dominance of NGC. It can be seen that the number of instances when the recovery of NGC is inferior to GC or No-Coding decreases rapidly with an increase in transmission range. It can be clearly seen that for $R \geq R_{th} = 0.19$, NGCs are dominant, i.e. at the end of any round NGCs have provided the greatest partial recovery.
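The duration of non-dominance is straightforward to compute from two recovery profiles. In the sketch below each profile is assumed to be a list whose entry for round T (1-indexed) holds the average number of readings recovered by the end of that round; the function names are hypothetical.

```python
def non_dominance_rounds(profile_a, profile_b):
    """Rounds T (1-indexed) at which scheme A has recovered fewer readings than B."""
    return [t for t, (a, b) in enumerate(zip(profile_a, profile_b), start=1)
            if a < b]

def duration_of_non_dominance(profile_a, profile_b):
    """Number of rounds in which A is strictly behind B."""
    return len(non_dominance_rounds(profile_a, profile_b))

def dominates(profile_a, profile_b):
    """A dominates B iff A is never strictly behind B."""
    return duration_of_non_dominance(profile_a, profile_b) == 0
```

With the made-up profiles `[2, 4, 5, 9]` and `[1, 5, 5, 8]`, the first scheme is behind only in round 2, so its duration of non-dominance over the second is 1.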
Figure 2 Duration of non-dominance of Natural Growth Codes with respect to Growth Codes and No-Coding
The cause for observing partial recovery in a PM topology can be explained in the following manner: Unlike typical sensor network topologies, in a PM network the flow-capacity increases as the proximity to the periphery increases. Consequently the rate at which information percolates to the sink is greater than the rate at which information is “mixed”. Therefore, the number of encoding symbols received by a sink is often significantly greater than the number of sensor readings used to define these symbols. In such a scenario, the encoding symbols form rank complete sub-systems even when K independent linear combinations have not been received.
Compared to GCs, NGCs conduct routing and coding in a more opportunistic fashion. Katti et al. have highlighted the virtues of being opportunistic [4]. Consequently the existence of NGC can be exploited to improve the disruption tolerance of the network. This fact is highlighted in Figure 3, which plots the recovery profiles for $R = 0.2$. It can be observed that, at any instance, if all the sensors in a network were to be destroyed, the information recovered by NGC will be the highest. For example, if all sensors were destroyed at round 150 then NGC would have recovered 50 symbols in excess of those recovered by the competing schemes. We also consider scenarios where only part of the network fails. Figure 2 plots the modified recovery profiles for a scenario where at round 100 all sensors in the 0.3 x 0.3 core, at the center of the network, get destroyed. In this case too, NGCs preserve and recover a significantly larger amount of data.
Figure 3 Partial recovery profiles for R = 0.2.
Figure 4 Average normalized complexity of employing Phase 2, Gaussian Elimination, of at sink decoding.
NGCs, despite their utility, increase the processing complexity at the sink. However, the partial recovery of data has a useful side effect. We observed that as the transmission range increases, the size of the sub-systems over which GE is employed decreases substantially. To demonstrate this effect we experimentally evaluate the complexity of decoding in the following manner: First, note that the complexity of GE is $O(n^3)$; hence, each time Phase 2 of the decoding algorithm is executed, we say that $|\bigcup_e I_e|^3$ processing has been utilized. We calculate the total complexity of decoding by summing the complexities associated with each run of Phase 2. The expected complexity $E$ is determined by averaging over multiple runs of decoding. We employ normalization to get the measure: complexity $= E / K^3$. Figure 4 plots this measure for various transmission ranges. It can be clearly seen that as $R$ increases the complexity decreases rapidly. It can be observed that for $R \geq R_{th}$, the complexity of decoding is less than 2% of that associated with employing Gaussian elimination on an RLC system defined over K symbols. Furthermore, we observed that beyond a certain threshold of transmission range, Phase 2 of the decoding algorithm rarely gets executed. Thus LT decoding is found to be sufficient even for NGC. It can be easily verified that this is indeed the case if $R \geq 0.5$.
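The normalized complexity measure can be written out as a short calculation. In this sketch each decoding run is assumed to be logged as a list of the sub-system sizes on which Phase 2 was executed; the function name and the logging format are assumptions for illustration.

```python
def normalized_complexity(runs, k):
    """Average cubic cost of Phase 2 per decoding run, normalized by k**3.

    runs: list of decoding runs; each run is a list of sub-system sizes n,
          one per execution of Gaussian elimination (cost counted as n**3).
    k:    total number of sensor readings.
    """
    totals = [sum(n ** 3 for n in run) for run in runs]
    expected = sum(totals) / len(totals)
    return expected / k ** 3
```

A single elimination over all K readings gives a value of 1, so the measure reads directly as a fraction of the cost of decoding a full RLC system. For instance, two eliminations over sub-systems of size K/4 give 2 * (1/4)^3, about 3% of that cost.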
IV. CONCLUSIONS
We have shown the existence of Natural Growth Codes (NGC) in certain periphery monitoring sensor network topologies. The existence of such codes facilitates partial recovery without compromising on opportunism, thus leading to an improved tolerance to disruptions.
REFERENCES
[1] A. Kamra, V. Misra, J. Feldman, D. Rubenstein, “Growth Codes: Maximizing Sensor Network Data Persistence”, ACM SIGCOMM 2006.
[2] T. Ho, R. Koetter, M. Médard, D. R. Karger and M. Effros, "The Benefits of Coding over Routing in a Randomized Setting" IEEE ISIT, 2003.
[3] M. Luby, “LT codes”, IEEE FOCS, 2003.
[4] S. Katti, D. Katabi, W. Hu, H. Rahul and M. Medard, “The importance of being opportunistic: Practical network coding for wireless environments,” Allerton, 2005.