Localization and Positioning in Wireless Sensor Networks
Laboratory Project
Author: Áron Virginás-Tar
Master of Information Technology, Year 1
Faculty of Automation and Computers
"Politehnica" University of Timisoara

Part I. General Overview

1. Introduction
Wireless sensor networks (WSNs) are often deployed in an ad hoc fashion, so node locations are not known a priori. In many applications, being aware of the positions of sensors is essential in order to provide a physical context to sensor readings. In environmental monitoring, for example, sensor readings without location information are useless. Similarly, location awareness is necessary for services such as intrusion detection, inventory and supply chain management, and surveillance. Finally, localization is fundamental for sensor network services that rely on the knowledge of sensor positions, including geographic routing and coverage area management.

The location of a sensor node can be expressed as a global or relative metric. A global metric positions nodes within a general global reference frame; the Global Positioning System (GPS) and the Universal Transverse Mercator (UTM) coordinate systems are common examples of such global reference frames. In contrast, relative metrics are based on arbitrary coordinate systems and reference frames. For example, a sensor's location can be expressed as distances to other sensors without any relationship to global coordinates.

It is usually infeasible for all sensor nodes within a WSN to have knowledge of their global coordinates. Sensor networks usually rely on a subset of location-aware nodes, called anchors, which are then used to estimate the positions of the other nodes. Localization techniques that rely on anchor nodes are referred to as anchor-based localization. A large number of localization schemes are based on range measurements, which are estimations of distances between sensor nodes. These techniques, known collectively as range-based localization, require sensors to measure certain characteristics of wireless communications, such as received signal strength or the time difference of arrival of ultrasound pulses.
The most common ranging approaches, together with the basic location estimation techniques, are presented in the following pages. Later sections discuss different localization schemes proposed in recent years, organized under several categories.

2. Range-Based Localization
Numerous localization methods rely on the estimation of physical distances between sensor nodes (range estimates). Such techniques are collectively referred to as range-based localization.
2.1 Formal Problem Definition

Consider a set of m sensors called anchor nodes, each with a known position expressed as l-dimensional coordinates $a_k \in \mathbb{R}^l$, $k = 1, \dots, m$, and n sensors $x_i \in \mathbb{R}^l$, $i = 1, \dots, n$, with unknown locations. For each pair of nodes in the network we introduce the Euclidean distance, denoted $d_{ik} = \|x_i - a_k\|$ between non-anchors and anchors, and $d_{ij} = \|x_i - x_j\|$, $i \neq j$, between two non-anchor nodes. Note that in practice $l \in \{2, 3\}$, the localization problem being defined in a two- or three-dimensional space. Let $\hat{d}_{ik}$ and $\hat{d}_{ij}$ be the measured values of the true physical distances $d_{ik}$ and $d_{ij}$ corrupted with noise, so that $\hat{d}_{ik} = d_{ik} + \xi_{ik}$ and $\hat{d}_{ij} = d_{ij} + \xi_{ij}$, where $\xi_{ik}$ and $\xi_{ij}$ denote measurement errors.
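As an illustration of this measurement model, the following sketch (with hypothetical node coordinates and a hypothetical noise level sigma) generates noisy anchor-to-node range measurements $\hat{d}_{ik} = d_{ik} + \xi_{ik}$:

```python
import math
import random

def noisy_ranges(anchors, nodes, sigma=0.1):
    """Simulate the measurement model d_hat = d + xi, with
    Gaussian noise xi ~ N(0, sigma^2) on each true distance."""
    out = {}
    for i, (x, y) in enumerate(nodes):
        for k, (xk, yk) in enumerate(anchors):
            d = math.hypot(x - xk, y - yk)  # true Euclidean distance
            out[(i, k)] = d + random.gauss(0.0, sigma)
    return out

random.seed(1)  # deterministic noise for the example
meas = noisy_ranges([(0.0, 0.0), (10.0, 0.0)], [(3.0, 4.0)])
```

Here `noisy_ranges` and its arguments are illustrative names, not part of any cited scheme; real deployments obtain the noisy distances from the ranging metrics described next.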
Thus the general WSN localization problem can be formulated in mathematical terms as follows: given the noisy distance measurements $\hat{d}_{ik}$, $\hat{d}_{ij}$ and the positions of the anchor nodes $a_k \in \mathbb{R}^l$, $k = 1, \dots, m$, estimate the locations of the nodes $x_i \in \mathbb{R}^l$, $i = 1, \dots, n$, with initially unknown positions. [5]

2.2 Ranging Metrics

There are several techniques to obtain range estimates based on measurements of certain characteristics of the radio signals exchanged between the sensors. These are collectively referred to as ranging metrics and are presented below, without claiming completeness.

2.2.1 Received Signal Strength

The concept behind the received signal strength (RSS) metric is that the transmitted signal experiences an energy loss proportional to the distance it travels. A simple model based on single-path radio propagation is given by
$$P_r = P_t - 10\,\alpha \log_{10} R \quad (1)$$

where $P_r$ and $P_t$ denote the received and transmitted signal strengths in dB and R is the distance between the transmitter and the receiver. $\alpha$ is a coefficient regarded as the distance-power gradient, and it depends on the propagation environment. [4] The relation between $P_r$ and $P_t$ can be expressed in a different form through the Friis transmission equation, as follows:

$$P_r = P_t\, G_t G_r \left(\frac{\lambda}{4\pi R}\right)^2 \quad (2)$$

where $G_t$ and $G_r$ are the antenna gains of the transmitting and receiving antennas, while $\lambda$ denotes the wavelength. In practice, the actual attenuation depends on multipath propagation effects, reflections, noise, etc.; therefore a more realistic model replaces $R^2$ with $R^n$, where n is typically in the range of 3 to 5. [1] Due to this fluctuating behavior of received signal power, accurate ranging measurements are not always possible. This impacts the accuracy of location estimation, which is lower bounded by the Cramér-Rao lower bound (CRLB), expressed as follows for the simplistic RSS model presented earlier:

$$\sigma_{RSS} \ge \frac{\ln 10}{10}\,\frac{\sigma_{sh}}{\alpha}\,R \quad (3)$$
Here, $\sigma_{RSS}$ is the standard deviation of the RSS-based range estimate and $\sigma_{sh}^2$ is the variance of shadow fading. [4]

The IEEE 802.11 standard defines a mechanism by which signal energy is to be measured by the circuitry on a wireless device. It is expressed as an integer value with an allowable range of 0 to 255, called the Received Signal Strength Indicator (RSSI). The meaning of RSSI values usually differs from vendor to vendor, and so does the maximum admissible value, RSSI_Max, which rarely reaches 255. [1]

2.2.2 Time of Arrival

The time of arrival (TOA) distance estimation method relies on the time the signal spends traveling from the transmitter to the receiver. Since the speed of radio signal propagation is known both in free space and in air, it is straightforward to estimate the distance between the transmitter and the receiver once the travel time is measured. The basic equation needed to obtain the distance is given as

$$d = \Delta\tau \cdot c \quad (4)$$

where $\Delta\tau$ is the time elapsed from the moment of transmission to the moment of reception and c is the speed of the radio signal (the speed of light). [4] The one-way version of the TOA method measures the one-way propagation time and requires highly accurate synchronization of the clocks of the transmitter and the receiver. In order to minimize measurement errors, a two-way TOA method is often preferred, where the round-trip time of a signal is measured at the sender device. Figure 1 illustrates the difference between the two approaches.
Figure 1: One-way (a) and two-way (b) TOA

For one-way measurements, the distance between two nodes i and j can be determined as

$$d_{ij} = (\tau_2 - \tau_1)\,c \quad (5)$$

where $\tau_1$ and $\tau_2$ are the sending and receiving times of the signal. For the two-way approach, the distance is calculated as

$$d_{ij} = \frac{(\tau_2' - \tau_1) - (\tau_1' - \tau_2)}{2}\,c \quad (6)$$

where $\tau_1'$ and $\tau_2'$ denote the sending and receiving times of the response signal. [1] As already noted above, the accuracy of the TOA method depends on the synchronization between the clocks of the transmitter and the receiver. This can be achieved by regular data exchange between the two nodes or by introducing additional anchor nodes and measurement redundancy for correcting the clock bias. [4]
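A minimal sketch of the two-way computation of eq. (6), assuming ideal timestamps (all values in seconds; the example exchange is hypothetical):

```python
C = 299_792_458.0  # speed of light, m/s

def two_way_toa_distance(t_send, t_recv, t_reply, t_back):
    """Eq. (6): subtract the responder's turnaround time from the
    measured round-trip time, halve it, and scale by the speed."""
    round_trip = (t_back - t_send) - (t_reply - t_recv)
    return round_trip / 2.0 * C

# hypothetical exchange: 100 m one-way flight, 1 ms responder turnaround
tof = 100.0 / C
d = two_way_toa_distance(0.0, tof, tof + 1e-3, 2 * tof + 1e-3)
print(d)  # ~100.0
```

Note that the turnaround time (t_reply - t_recv) cancels, which is why the responder's clock offset does not matter here.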
2.2.3 Time Difference of Arrival

Time difference of arrival (TDOA) is a method whereby the receiver calculates the differences in the TOAs of different signals. This method has the advantage that the clock biases between the transmitters and receivers are automatically removed, since only the differences between the TOAs are considered. [4] The TDOA technique can be applied if two transmission media with very different propagation speeds are used: for example, radio waves propagating at the speed of light and ultrasound travelling at the speed of sound. [2] Given two different signals issued by the transmitter at $\tau_{t1}$ and $\tau_{t2}$ and received by the receiver at $\tau_{r1}$ and $\tau_{r2}$, the distance between the two nodes can be calculated as follows:

$$d = \frac{v_1 v_2}{v_1 - v_2}\,(\Delta\tau_2 - \Delta\tau_1) \quad (7)$$

In the above equation, $v_1$ and $v_2$ represent the velocities of the two signals, while $\Delta\tau_1 = \tau_{r1} - \tau_{t1}$ and $\Delta\tau_2 = \tau_{r2} - \tau_{t2}$ denote the differences between the receiving and transmission times.

There is another variant of TDOA that relies on a single signal to estimate the location of the sender using multiple receivers with known locations. The propagation delay $t_i$ of the signal to receiver i depends on the distance between the sender and receiver i. Through correlation analysis we obtain a time delay $\delta = t_i - t_j$, which corresponds to the difference in path length to receivers i and j. The main disadvantage of this second approach is that the clocks of the receivers must be tightly synchronized, hence there is little advantage compared to simple TOA. [1]

2.2.4 Angle of Arrival
The angle of arrival (AOA) ranging technique is used to determine the direction of signal propagation. [1] A common approach for AOA bearing estimation is to use special structures called uniform linear arrays (ULAs), which contain n elements placed evenly along an axis with spacing d. With such a structure, the signal's direction of arrival can be estimated based on the following formula:

$$\theta = \arcsin\frac{c\,\Delta\tau}{d} \quad (8)$$

where $\theta$ is the angle at which the signal impinges upon the ULA, c is the speed of light and $\Delta\tau$ is the time difference between the arrivals of the signal at consecutive array elements. Similarly to RSS, the performance bounds for AOA estimation can also be expressed based on the CRLB, as follows:

$$\sigma_{AOA} \ge \frac{c\,\sqrt{2 B N_0}}{2\pi\, n d\, f_0\, A A_0 \sin\theta} \quad (9)$$

where $\sigma_{AOA}$ is the standard deviation of the AOA estimate, B is the bandwidth of the signal, $N_0$ is the noise spectral density, A is the channel coefficient, while $A_0$ and $f_0$ are the amplitude and carrier frequency of the source signal. Knowing that the signal-to-noise ratio (SNR) of the signal can be expressed as

$$SNR = \frac{A^2 A_0^2}{B N_0} \quad (10)$$

relation (9) can be rewritten in the following form:

$$\sigma_{AOA} \ge \frac{c\,\sqrt{2}}{2\pi\, n d\, f_0\, \sqrt{SNR}\, \sin\theta} \quad (11)$$
From (11) it can be seen that the CRLB is inversely proportional to the number of array elements n, the carrier frequency $f_0$ and $\sqrt{SNR}$. Thus a high-frequency signal, a high number of array elements and a high SNR will yield a high-resolution AOA estimate. [4]

2.3 Techniques for Location Estimation

Once the distance or bearing estimates measured by the metrics presented above are available, there are several different approaches for estimating the location of a node in the sensor network. Two such fundamental techniques are presented in this section: trilateration and triangulation.
2.3.1 Triangulation
Triangulation is a method for estimating the location of a sensor node based on the geometric properties of triangles. It relies on bearing (angle) measurements from a number of anchor nodes. In two-dimensional space, at least two bearing lines and the locations of the associated anchor nodes are required to localize a sensor node. Figure 2 illustrates the concept of triangulation using three anchors. The angles are expressed relative to a fixed baseline in the coordinate system.
Figure 2: Triangulation
Let the vector $\mathbf{x} = (x, y)$ represent the unknown location of a receiver node and $\mathbf{x}_i = (x_i, y_i)$, $i = 1, \dots, n$, the positions of n anchor points. The bearing measurements from the anchors are expressed as $\beta = (\beta_1, \beta_2, \dots, \beta_n)$. The measured bearings do not perfectly match the actual angles, denoted $\theta(\mathbf{x}) = (\theta_1(\mathbf{x}), \theta_2(\mathbf{x}), \dots, \theta_n(\mathbf{x}))$, due to measurement uncertainty. The relationship between them is the following:

$$\beta = \theta(\mathbf{x}) + \Delta\theta \quad (12)$$

where $\Delta\theta = (\Delta\theta_1, \Delta\theta_2, \dots, \Delta\theta_n)$ denotes the measurement errors. $\Delta\theta$ can be assumed to be Gaussian noise with covariance matrix $\mathbf{S} = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2)$. In two-dimensional space, the relationship between the bearings of the n anchors and their locations can be expressed as:

$$\tan\theta_i(\mathbf{x}) = \frac{y_i - y}{x_i - x} \quad (13)$$
Because we have to take the measurement noise into account, we must consider statistical approaches for estimating the location of the receiver node. A common example is the maximum-likelihood estimator (MLE) of the following form:

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}}\; \frac{1}{2}\,(\theta(\mathbf{x}) - \beta)^T \mathbf{S}^{-1} (\theta(\mathbf{x}) - \beta) = \arg\min_{\mathbf{x}}\; \frac{1}{2} \sum_{i=1}^{n} \frac{(\theta_i(\mathbf{x}) - \beta_i)^2}{\sigma_i^2} \quad (14)$$

This is a least-squares minimization problem that can be solved using the Gauss-Newton algorithm. [1]
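A sketch of such a Gauss-Newton solver for the bearing-only estimator above, assuming unit weights ($\sigma_i = 1$) and, for the check at the end, noise-free bearings generated from a hypothetical true position:

```python
import math

def triangulate(anchors, bearings, x0, y0, iters=20):
    """Gauss-Newton for min sum_i (theta_i(x) - beta_i)^2, where
    theta_i(x) = atan2(y_i - y, x_i - x) as in eq. (13)."""
    x, y = x0, y0
    for _ in range(iters):
        a00 = a01 = a11 = g0 = g1 = 0.0
        for (xi, yi), beta in zip(anchors, bearings):
            u, v = xi - x, yi - y
            r2 = u * u + v * v
            diff = math.atan2(v, u) - beta
            res = math.atan2(math.sin(diff), math.cos(diff))  # wrap to (-pi, pi]
            jx, jy = v / r2, -u / r2  # partial derivatives of theta_i
            a00 += jx * jx; a01 += jx * jy; a11 += jy * jy
            g0 += jx * res; g1 += jy * res
        det = a00 * a11 - a01 * a01
        # solve the 2x2 normal equations (J^T J) delta = -J^T r
        x += (-g0 * a11 + g1 * a01) / det
        y += (-g1 * a00 + g0 * a01) / det
    return x, y

# bearings generated without noise from a true position of (3, 4)
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
bearings = [math.atan2(yi - 4.0, xi - 3.0) for xi, yi in anchors]
x, y = triangulate(anchors, bearings, 5.0, 5.0)
```

With exact bearings the iteration converges to the true position; with noisy bearings it returns the least-squares estimate instead.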
2.3.2 Trilateration

Trilateration is among the most popular basic techniques for positioning in wireless sensor networks and serves as a building block for more complex solutions. [2] It refers to the process of calculating a node's position based on measured distances between itself and a number of anchor points with known locations. Given the estimated distance between a sensor node and an anchor, it is known that the sensor must be positioned somewhere along the circumference of a circle centered at the anchor's position with a radius equal to the distance. In two-dimensional space, distance measurements from at least three (non-collinear) anchors are required to obtain a unique location. In three dimensions, the distances from at least four (non-coplanar) anchors are required. [1] Figure 3 illustrates the concept.
Figure 3: Trilateration
Let us consider three anchor nodes with their locations given as vectors $\mathbf{x}_i = (x_i, y_i)$, $i = 1, 2, 3$. The distances between these anchor nodes and an unknown sensor location $\mathbf{x} = (x, y)$ are known as $r_i$, $i = 1, 2, 3$. The following set of equations can be constructed based on the Pythagorean theorem:

$$(x_i - x)^2 + (y_i - y)^2 = r_i^2, \quad i = 1, 2, 3 \quad (15)$$

By subtracting the third equation from the first two and rearranging the terms, we obtain a set of linear equations:

$$2(x_3 - x_1)x + 2(y_3 - y_1)y = r_1^2 - r_3^2 - x_1^2 + x_3^2 - y_1^2 + y_3^2$$
$$2(x_3 - x_2)x + 2(y_3 - y_2)y = r_2^2 - r_3^2 - x_2^2 + x_3^2 - y_2^2 + y_3^2 \quad (16)$$

This can be transformed into matrix form as

$$2\begin{pmatrix} x_3 - x_1 & y_3 - y_1 \\ x_3 - x_2 & y_3 - y_2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} r_1^2 - r_3^2 - x_1^2 + x_3^2 - y_1^2 + y_3^2 \\ r_2^2 - r_3^2 - x_2^2 + x_3^2 - y_2^2 + y_3^2 \end{pmatrix} \quad (17)$$
where the right-hand side vector consists only of known constants. In practice, the distances that form the basis of the trilateration technique are not perfect but only estimates. Thus, instead of $r_i$, we must consider $\hat{r}_i = r_i + \varepsilon_i$, where $\varepsilon_i$ denotes the estimation error. Solving the above equation with $\hat{r}_i$ will generally not yield the correct values for the unknown position $(x, y)$.
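The 2x2 linear system of eq. (17) can be solved directly, for instance by Cramer's rule; a small sketch with hypothetical anchor positions and exact distances:

```python
import math

def trilaterate3(a1, a2, a3, r1, r2, r3):
    """Solve the linearized system of eq. (17) for three anchors."""
    (x1, y1), (x2, y2), (x3, y3) = a1, a2, a3
    m11, m12 = 2 * (x3 - x1), 2 * (y3 - y1)
    m21, m22 = 2 * (x3 - x2), 2 * (y3 - y2)
    b1 = r1**2 - r3**2 - x1**2 + x3**2 - y1**2 + y3**2
    b2 = r2**2 - r3**2 - x2**2 + x3**2 - y2**2 + y3**2
    det = m11 * m22 - m12 * m21  # non-zero iff anchors are non-collinear
    return (b1 * m22 - b2 * m12) / det, (b2 * m11 - b1 * m21) / det

# node at (2, 3); exact ranges from three non-collinear anchors
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
r = [math.hypot(ax - 2.0, ay - 3.0) for ax, ay in anchors]
est = trilaterate3(*anchors, *r)
print(est)  # (2.0, 3.0)
```

With noisy ranges this exact solve degrades, which motivates the overdetermined formulation discussed next.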
The solution to this problem is to use $n > 3$ anchors and redundant distance measurements. We thus obtain an overdetermined system of equations for which a solution can be computed that minimizes the mean square error. [2] For n anchors, the following system of equations can be built, which expresses the relationship between positions and distances [1]:
$$(x_i - x)^2 + (y_i - y)^2 = r_i^2, \quad i = 1, \dots, n \quad (18)$$

By subtracting the equation of the n-th anchor from each of the first $n - 1$ equations, this can be transformed into a matrix equation of the form $\mathbf{A}\mathbf{x} = \mathbf{b}$ with the coefficient matrix

$$\mathbf{A} = \begin{pmatrix} 2(x_n - x_1) & 2(y_n - y_1) \\ \vdots & \vdots \\ 2(x_n - x_{n-1}) & 2(y_n - y_{n-1}) \end{pmatrix} \quad (19)$$

and the right-hand side vector

$$\mathbf{b} = \begin{pmatrix} r_1^2 - r_n^2 - x_1^2 + x_n^2 - y_1^2 + y_n^2 \\ \vdots \\ r_{n-1}^2 - r_n^2 - x_{n-1}^2 + x_n^2 - y_{n-1}^2 + y_n^2 \end{pmatrix} \quad (20)$$

The solution of this least-squares system is the pair $(x, y)$ that minimizes the 2-norm $\|\mathbf{A}\mathbf{x} - \mathbf{b}\|_2$, where $\mathbf{x} = (x, y)$ is the vector describing the unknown position. The 2-norm of a vector is defined as

$$\|\mathbf{v}\|_2 = \sqrt{\sum_{i=1}^{n} v_i^2} \quad (21)$$

for any n-dimensional vector $\mathbf{v} = (v_1, v_2, \dots, v_n)$. Observe that the square of the 2-norm can be expressed as $\|\mathbf{v}\|_2^2 = \mathbf{v}^T\mathbf{v}$. Hence,

$$\|\mathbf{A}\mathbf{x} - \mathbf{b}\|_2^2 = (\mathbf{A}\mathbf{x} - \mathbf{b})^T(\mathbf{A}\mathbf{x} - \mathbf{b}) = \mathbf{x}^T\mathbf{A}^T\mathbf{A}\mathbf{x} - 2\mathbf{x}^T\mathbf{A}^T\mathbf{b} + \mathbf{b}^T\mathbf{b} \quad (22)$$

Minimizing this expression is equivalent to setting its gradient equal to zero, as follows:

$$2\mathbf{A}^T\mathbf{A}\mathbf{x} - 2\mathbf{A}^T\mathbf{b} = 0 \iff \mathbf{x} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{b} \quad (23)$$
The above equation is guaranteed to have a unique solution under certain conditions, and it can be solved by various methods, such as Cholesky factorization. [2]

3. Localization Schemes

The previous sections have described the basic techniques that can be applied to determine the location of a node inside a wireless sensor network. These techniques are the foundation of most localization schemes, no matter how complex. Localization schemes must address a wide range of issues, including network management and control, communication, and cost-efficiency. This section presents a survey of several different localization schemes proposed by the research community in recent years. Research on localization in wireless sensor networks can be classified into two broad categories: centralized and distributed localization approaches.

3.1 Centralized Localization Schemes

Centralized localization relies on the migration of inter-node ranging and connectivity data to a sufficiently powerful central base station, followed by the migration of the resulting locations back to the respective nodes. The advantage of centralized algorithms is that they eliminate the need for computation at each node; their limitation lies in the communication cost of moving data back and forth to the base station.

3.1.1 MDS-MAP

In [6] the authors present a centralized localization scheme called MDS-MAP. The technique is based on multidimensional scaling (MDS), a set of related statistical techniques used for exploring similarities or dissimilarities in data. The algorithm consists of the following three steps.

First, the scheme computes the shortest paths between all pairs of nodes in the region of consideration, using an all-pairs shortest path algorithm such as Dijkstra's or Floyd's. The shortest path distances are used to construct the distance matrix for MDS.
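The first step can be sketched with Floyd's (Floyd-Warshall) algorithm over the measured one-hop ranges; the node indices and edge weights below are hypothetical:

```python
import math

def all_pairs_shortest(n, edges):
    """Floyd-Warshall over measured one-hop ranges; the result is
    the distance matrix handed to classical MDS in MDS-MAP."""
    INF = math.inf
    d = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, w in edges:  # symmetric links
        d[i][j] = d[j][i] = min(d[i][j], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# 4 nodes on a line, only adjacent pairs within radio range
dm = all_pairs_shortest(4, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)])
print(dm[0][3])  # 3.0
```

Multi-hop shortest-path distances only approximate true Euclidean distances, which is one source of error in the relative map produced by the next step.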
Next, Torgerson-Gower scaling (the classic version of MDS) is applied to the distance matrix, retaining the two largest eigenvalues and their corresponding eigenvectors to construct a 2D relative map that gives a location for each node. Although these locations may be accurate relative to one another, the entire map will be arbitrarily rotated and flipped relative to the true node positions.

Finally, given a sufficient number of anchor nodes, the relative map is transformed into an absolute map based on the absolute positions of the anchors; this transformation includes scaling, rotation and reflection. The goal is to minimize the sum of squares of the errors between the true positions of the anchors and their transformed positions in the MDS map.

The advantage of this scheme is that it does not need anchor nodes to start with: it builds a relative map of the nodes even without them. Once three or more anchor nodes are available, the relative map is transformed into absolute coordinates. The disadvantage of the MDS-MAP algorithm is that it requires global information about the network and centralized computation to build the map of coordinates. [3]
3.1.2 Centralized Localization Based on RSSI

The following technique, presented in [7], localizes nodes through the radio frequency (RF) attenuation of electromagnetic waves. The scheme consists of three distinct stages.

First, an RF map of the network is obtained by conveying short packets at different power levels through the network and by storing the average RSSI value of the received packets in memory tables.

Second, the ranging model is created by processing all the tuples recorded between anchors. The processing is done at the central unit; it compensates for non-linearity and calibrates the model. Let $(i, j, P_{tx}, P_{rx})$ be a generic tuple obtained in the RF mapping stage, where i is the transmitting node and j is the receiving node. The algorithm first corrects the received power as $P_{rx}' = f(P_{tx}, P_{rx})$, where $f(\cdot)$ is a function that takes modularity effects into account. The estimated distance between the nodes is then $r_{ij}' = m^{-1}(P_{rx}')$, where $m(\cdot)$ is the ranging model previously created.

In the last stage, the node positions are calculated as the solution of an optimization problem. The final result can be obtained by minimizing the following constrained function:
$$E = \sum_{i=1}^{N}\sum_{j=1}^{N} k_{ij}\, a_{ij}\, (r_{ij} - r_{ij}')^2, \qquad r_{ij} = d(i, j) \text{ when } i, j \text{ are anchors} \quad (24)$$
where N is the number of nodes, $a_{ij}$ is an indicator variable (1 when the link is present, 0 otherwise) and $d(i, j)$ is the known distance between anchors. The advantage of this technique is that it is practical, self-organizing and can be deployed in outdoor environments. The major limitations are its high power consumption and the need for each node to forward a large amount of information to the central unit. [3]

3.1.3 Localization Based on Simulated Annealing

Another scheme that addresses wireless sensor network localization as an optimization problem is presented in [8]. The authors propose an approach based on simulated annealing (SA) to localize the sensor nodes in a centralized manner. Let us consider a sensor network of m anchor nodes and $n - m$ sensor nodes with unknown locations. The proposed algorithm is implemented in a centralized manner, so it has access to the estimated locations and neighborhood information of all localizable nodes in the system. The proposed scheme consists of two stages.

In the first stage, SA is used to estimate the locations of the localizable sensor nodes using distance constraints. Considering $N_i$ as the set containing all one-hop neighbors of node i, the authors formulate the localization problem as follows:
$$\min_{(x_i, y_i)}\; CF = \sum_{i=m+1}^{n} \sum_{j \in N_i} (\hat{d}_{ij} - d_{ij})^2 \quad (25)$$
where $\hat{d}_{ij}$ is the measured distance between neighboring nodes i and j, while $d_{ij}$ is the estimated distance. Then, according to the SA algorithm, a small displacement in a random direction is applied to the coordinate estimate $(x_i, y_i)$ of a chosen node i, and the new value of the cost function is calculated for the new location estimate. If $\Delta CF = CF_{new} - CF_{old} \le 0$, the perturbation is accepted and the new location estimate is used as the starting point of the next step. If $\Delta CF > 0$, the probability of the solution being accepted is $P(\Delta CF) = e^{-\Delta CF / T}$, where T is a control parameter (temperature).

In the next stage of the algorithm, the authors eliminate the error caused by flip ambiguity. Flip ambiguity occurs when a node's neighbors are placed in positions such that they are approximately collinear. Such a node can be reflected across the line of best fit produced by its neighbors with essentially no change in the cost function. Figure 4 illustrates flip ambiguity: the neighbors of node A are nodes B, C, D and E, which are almost collinear, so node A could be flipped across their line of best fit to location A' with almost no change in the cost function.
Figure 4: Flip ambiguity

In order to overcome this problem, the authors define the complement set $\bar{N}_i$ of $N_i$ as the set containing all nodes that are not neighbors of i. If R is the transmission range of a sensor node and a node $j \in \bar{N}_i$ is estimated such that $d_{ij} < R$, then j has been placed in the wrong neighborhood of i. The minimum error due to the flip is therefore $d_{ij} - R$, and the localization problem can be reformulated as:

$$\min_{(x_i, y_i)}\; CF = \sum_{i=m+1}^{n} \left[ \sum_{j \in N_i} (\hat{d}_{ij} - d_{ij})^2 + \sum_{j \in \bar{N}_i} (d_{ij} - R)^2 \right] \quad (26)$$
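The annealing stage can be sketched as the following generic Metropolis loop; the cost callback stands in for eq. (25) or (26), and the schedule parameters (initial temperature, cooling rate, step size) as well as the toy cost at the end are illustrative assumptions:

```python
import math
import random

def anneal(nodes, cost, t0=1.0, cooling=0.95, steps=2000, step=0.5):
    """Perturb one node estimate at a time; accept worse moves with
    probability exp(-delta_cf / t), as described above."""
    est = [list(p) for p in nodes]
    cur, t = cost(est), t0
    for _ in range(steps):
        i = random.randrange(len(est))
        saved = list(est[i])
        ang = random.uniform(0.0, 2.0 * math.pi)  # random direction
        est[i][0] += step * math.cos(ang)
        est[i][1] += step * math.sin(ang)
        delta = cost(est) - cur
        if delta <= 0 or random.random() < math.exp(-delta / t):
            cur += delta      # accept the perturbation
        else:
            est[i] = saved    # reject and restore
        t *= cooling          # geometric cooling schedule
    return est, cur

# toy cost: a single free node whose range to the origin should be 5
random.seed(0)
def cf(pts):
    return (math.hypot(pts[0][0], pts[0][1]) - 5.0) ** 2

pts, final_cf = anneal([(0.0, 0.0)], cf)
```

In the real scheme the cost sums over all free nodes and their neighborhoods; the acceptance rule is the same.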
Through simulation results the authors show that this algorithm gives better results than localization schemes based on semidefinite programming (SDP). Localization schemes based on stochastic optimization will be presented in more detail in Part II. The advantage of the proposed method is that it does not propagate localization errors. The proposed flip ambiguity mitigation method works well in sensor networks with medium to high node density. However, when the node density is low, it is possible that a node is flipped and still maintains the correct neighborhood. [3]

3.2 Distributed Localization Schemes

In distributed localization schemes all the relevant computations are done on the sensor nodes themselves. The nodes communicate with each other instead of a base station to determine their positions in the network.

3.2.1 Collaborative Multilateration

The trilateration method presented earlier has an obvious shortcoming: the location of a sensor node can only be estimated in the presence of at least three neighboring anchor nodes. However, the technique can be extended to determine the locations of nodes with fewer than three anchor neighbors. Collaborative multilateration, presented in [9], consists of a set of mechanisms that enables nodes located several hops away from location-aware anchor nodes to collaborate with each other in estimating their locations. The technique consists of three phases.

In the first phase, collaborative subtrees are formed. A subtree constitutes a configuration of non-anchor and anchor nodes for which the solution of the location estimates can be uniquely determined. The requirement of one-hop multilateration for an unknown node is that it is within the range of at least three anchors. Two-hop multilateration represents the case when the anchors are not always directly connected to the nodes but are within a two-hop radius of the unknown node. Figure 5 illustrates the difference.
Figure 5: One-hop and two-hop multilateration

In phase two, initial estimates are obtained, as illustrated by Figure 6. In this figure A and B are anchors and C is the node with unknown location. If the distance between C and A is a, then the x coordinate of C is bounded by a to the left and to the right of the x coordinate of A: $x_A - a$ and $x_A + a$. The anchor node B is two hops away from C and bounds the x coordinate of C within $x_B - (b + c)$ and $x_B + (b + c)$. Knowing this, node C can intersect these intervals to determine the bounds of its x coordinate with respect to anchors A and B. The same applies to the y coordinate. C then combines its bounds on the x and y coordinates to obtain a bounding box of the region where it lies.
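The interval intersection described above can be sketched as follows (the anchor coordinates and hop-distance reaches are hypothetical):

```python
def bounding_box(constraints):
    """Intersect per-anchor boxes: each entry is (x_a, y_a, reach),
    where reach is the summed hop distance to that anchor."""
    xmin = max(x - r for x, y, r in constraints)
    xmax = min(x + r for x, y, r in constraints)
    ymin = max(y - r for x, y, r in constraints)
    ymax = min(y + r for x, y, r in constraints)
    return xmin, xmax, ymin, ymax

# two anchors: A at (0, 0) with reach 5, B at (8, 0) with reach 5
box = bounding_box([(0.0, 0.0, 5.0), (8.0, 0.0, 5.0)])
# center of the box serves as the initial position estimate
est = ((box[0] + box[1]) / 2, (box[2] + box[3]) / 2)
print(est)  # (4.0, 0.0)
```

The box center is only a crude starting point; the third phase refines it iteratively.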
Figure 6: Obtaining initial estimates through a bounding box

In the third phase, the initial node positions are refined using a Kalman filter implementation. Since most unknown nodes are not directly connected to anchors, they
use the initial estimates of their neighbors as reference points for estimating their own locations. As soon as a node computes a new estimate, it broadcasts it to its neighbors, which then use it to update their own position estimates. The advantage of the collaborative multilateration method lies in the fact that it allows sensor nodes to accurately estimate their locations by using known anchor locations that are not in their immediate vicinity. On the other hand, it is computationally expensive.

3.2.2 Cluster-Based Approach

In [10], the authors propose a distributed algorithm for locating nodes in a sensor network in which the nodes have the ability to estimate the distance to nearby nodes. Since this technique relies heavily on graph theory, the distinction between non-rigid and rigid graphs must be discussed first. Non-rigid graphs can be continuously deformed to produce an infinite number of different realizations, while rigid graphs cannot. However, there are two types of discontinuous deformations that can prevent a realization of a rigid graph from being unique: flip ambiguities and discontinuous flex ambiguities. Flip ambiguities occur for a graph in a d-dimensional space when the positions of all the neighbors of some vertex span a (d - 1)-dimensional subspace. In this case, the neighbors create a mirror through which the vertex can be reflected. Discontinuous flex ambiguities occur when the removal of one edge allows part of the graph to be flexed to a different configuration and the removed edge reinserted with the same length. Figure 7 illustrates discontinuous flex ambiguity.
Figure 7: Discontinuous flex ambiguity

The algorithm itself consists of two phases: cluster localization and cluster transformation. In the former phase, each node becomes the center of a cluster and estimates the relative locations of those of its neighbors that can be unambiguously localized. For each cluster, all the robust quadrilaterals are identified, as well as the largest subgraph composed solely of overlapping robust quadrilaterals. The authors define robust triangles as triangles that satisfy the following inequality:

$$b \sin^2\theta > d_{min} \quad (27)$$

where b is the length of the shortest side, $\theta$ is the smallest angle and $d_{min}$ is a threshold based on the measurement noise. A quadrilateral is considered robust if its four sub-triangles are robust. The algorithm starts with a robust quadrilateral; when two quadrilaterals have three nodes in common and the first is fully localized, the second can be localized by trilateration from the three known positions. In the cluster transformation phase, the positions of each node in each local coordinate system are shared. As long as there are at least three non-collinear nodes in common between two localizations, the transformation between them can be computed by rotation, translation and reflection.
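The robustness test of eq. (27) can be sketched as follows; the side lengths and the threshold value are arbitrary assumptions:

```python
import math

def is_robust_triangle(sides, d_min):
    """Eq. (27): the shortest side times sin^2 of the smallest
    angle must exceed the noise-based threshold d_min."""
    a, b, c = sorted(sides)  # a = shortest side
    # law of cosines: the smallest angle lies opposite the shortest side
    cos_t = (b * b + c * c - a * a) / (2.0 * b * c)
    return a * math.sin(math.acos(cos_t)) ** 2 > d_min

print(is_robust_triangle((3.0, 4.0, 5.0), 0.5))  # True: 3 * 0.36 = 1.08
print(is_robust_triangle((0.2, 4.9, 5.0), 0.5))  # False: sliver triangle
```

Thin "sliver" triangles fail the test precisely because they are the ones vulnerable to flip ambiguity under measurement noise.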
The advantage of this scheme is that cluster-based localization supports dynamic node insertion and mobility. However, it has the disadvantage that in case of low node connectivity or high measurement noise, the algorithm may be unable to localize a useful number of nodes. [3]

3.2.3 Interferometric Ranging Based Localization

The idea behind the Radio Interferometric Positioning System (RIPS) proposed in [11] is to utilize two transmitters to create an interference signal directly. If the frequencies of the two emitters are almost the same, the composite signal will have a low-frequency envelope that can be measured by the cheap and simple hardware readily available on a WSN node. Due to the lack of synchronization between the nodes, there will be a relative phase offset of the signal at two receivers, which is a function of the relative positions of the four nodes involved and of the carrier frequency. By making multiple measurements it is possible to estimate the relative locations of the nodes. Localization using interferometric ranging is an NP-complete problem; the authors propose a solution based on a genetic algorithm (GA) approach, with the search space further reduced by additional RSSI readings. Compared to the more common techniques based on RSS, TOA and AOA ranging, interferometric ranging has the advantage of higher precision. However, as it requires a considerably larger set of measurements, its applicability is limited to small networks.

4. Range-Free Localization

The localization approaches presented so far are based on distance estimations obtained using ranging techniques and therefore belong to the class of range-based localization algorithms. Some localization schemes do not rely on distance or angle measurements, but instead estimate node locations based on connectivity information. Hence, they are collectively referred to as range-free. Such localization techniques do not require additional hardware and are therefore a cost-effective alternative to range-based techniques. [1] This section describes several different range-free localization approaches, without claiming completeness.

4.1 Ad Hoc Positioning System

The Ad Hoc Positioning System (APS) presented in [12] is an example of a distributed connectivity-based localization algorithm that estimates node locations with the support of at least three anchor nodes; localization errors can be reduced by increasing the number of anchors. Each anchor node propagates its location to all other nodes in the network using the concept of distance vector (DV) exchange: nodes periodically exchange their routing tables with their one-hop neighbors. In the most basic scheme of APS, called DV-hop, each node maintains a table $\{X_i, Y_i, h_i\}$, where $(X_i, Y_i)$ is the location of anchor i and $h_i$ is the distance in hops to anchor i. When an anchor node obtains distances to the other anchors, it determines an average size for one hop, referred to as the correction factor, which is then propagated throughout the network. The correction factor $c_i$ of anchor i is defined as:
c_i = ( Σ_{j≠i} √((X_i − X_j)² + (Y_i − Y_j)²) ) / ( Σ_{j≠i} h_j )    (28)

where the sums run over all other anchors j ≠ i. Given the locations of the anchors and the correction factor, a node is then able to perform trilateration to estimate its own location. Corrections are propagated via controlled flooding to ensure that each node will only use the correction factor from the closest anchor. Figure 8 illustrates the concept for node S.
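Equation (28) and the subsequent distance estimation translate directly into code. The sketch below is a minimal illustration with hypothetical coordinates and hop counts; it computes an anchor's correction factor as the total anchor-to-anchor distance divided by the total hop count:

```python
import math

def correction_factor(anchor_i, other_anchors, hops):
    """Average per-hop distance (eq. 28) for anchor i, given the known
    positions of the other anchors and the hop counts to each of them."""
    total_dist = sum(math.dist(anchor_i, a) for a in other_anchors)
    total_hops = sum(hops)
    return total_dist / total_hops

# Hypothetical 2-D layout: anchor at (0, 0) and two other anchors,
# reachable in 3 and 4 hops respectively.
c = correction_factor((0.0, 0.0), [(3.0, 0.0), (0.0, 4.0)], [3, 4])
# A node S can now turn its hop count h to the closest anchor into a
# distance estimate c * h and run trilateration with >= 3 anchors.
print(c)  # -> 1.0 (7 units of distance over 7 hops)
```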
Figure 8: Example of DV-hop localization

4.2 Approximate Point in Triangulation

The approximate point in triangulation (APIT) approach is an area-based range-free localization scheme similar to APS. It relies on the presence of several anchor nodes that know their own location. Any combination of three anchors forms a triangular region, and a node's presence inside or outside such a region allows it to narrow down its possible locations. The key step in APIT is the point-in-triangulation (PIT) test that allows a node to determine the set of triangles within which it resides. After a node M has received location messages from a set of anchors, it evaluates all possible anchor triangles. A node is outside a given triangle ΔABC formed by anchors A, B, and C if there exists a direction such that a point adjacent to M is either further from or closer to all points A, B, and C simultaneously. Otherwise, M is inside the triangle, and ΔABC can be added to the set of triangles in which M resides. This concept is illustrated in Figure 9.
Figure 9: Localization estimation based on the intersection of anchor triangles
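For reference, the geometric membership condition that the PIT test approximates can be written down exactly when coordinates are known (which, of course, they are not at the node being localized; hence the APIT emulation discussed next). A minimal sketch using signed areas:

```python
def inside_triangle(p, a, b, c):
    """Exact point-in-triangle test via signed areas. This is the ideal
    geometric condition that the (A)PIT test approximates using only
    neighbor signal-strength comparisons."""
    def cross(o, u, v):
        # z-component of the cross product (u - o) x (v - o)
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    s1, s2, s3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = min(s1, s2, s3) < 0
    has_pos = max(s1, s2, s3) > 0
    # p is inside (or on the boundary) iff the signs do not disagree
    return not (has_neg and has_pos)

# M at (1, 1) lies inside the triangle A(0,0), B(4,0), C(0,4):
print(inside_triangle((1, 1), (0, 0), (4, 0), (0, 4)))  # -> True
```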
Unfortunately, the PIT test is infeasible in practice, since it would require that nodes can be moved in any direction. However, an approximate PIT (APIT) test can be used in networks with sufficient node density. The idea is to emulate the node movement of the perfect PIT test using exchanged neighbor information. For example, signal strengths between nodes and an anchor can be used to estimate which node is closer to the anchor. Then, if no neighbor of node M is further from or closer to the three anchors A, B, and C simultaneously, M assumes that it is inside the triangle ΔABC. If this condition is not satisfied, M assumes that it is outside the triangle. Once the APIT test completes, a position estimate can be computed as the center of gravity of the intersection of all triangles in which M resides. [1]

5. Event-Driven Localization

A third category of localization schemes is based on events that can be utilized to determine distances, angles, and positions. Such events can be the arrival of radio waves, beams of light, or acoustic signals at a sensor node. [1] This section presents a single event-based localization scheme in some detail.

5.1 The Lighthouse Approach

In the lighthouse location system, sensor nodes can estimate their location with high accuracy without the need for additional infrastructure components, relying only on a base station equipped with a light emitter. Figure 10 illustrates the concept using an idealized light source with the property that the emitted beam of light is parallel, so its width b remains constant.
Figure 10: The lighthouse localization technique

The light source rotates, and when the parallel beam passes by a sensor, the sensor detects the flash of light for a certain period of time τ_beam. This duration varies with the distance between the sensor and the light source. The distance d between the sensor and the light emitter can be computed as:

d = b / (2 sin(α/2))    (29)

where α is the angle under which the sensor sees the beam of light, defined as:

α = 2π · τ_beam / τ_turn    (30)

Here, τ_turn denotes the time the light source takes to perform a complete rotation. Since b remains constant, a sensor can calculate τ_beam = τ₂ − τ₁ and τ_turn = τ₃ − τ₁, where τ₁ is the time the sensor detects the light for the first time, τ₂ is the time the sensor loses sight of the light, and τ₃ is the time the sensor sees the light again. That said, perfectly parallel light beams are difficult to realize in practice, and even small beam spreads can result in significant localization errors. To minimize measurement errors, the beam width should be as large as possible. [1]
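Equations (29) and (30) combine into a few lines of code. A minimal sketch with illustrative numbers (the beam width and timestamps below are made up):

```python
import math

def lighthouse_distance(b, t1, t2, t3):
    """Distance to the rotating light source (eqs. 29-30).
    b: beam width; t1: first sighting; t2: beam lost; t3: next sighting."""
    t_beam = t2 - t1                       # time the sensor sees the beam
    t_turn = t3 - t1                       # duration of one full rotation
    alpha = 2 * math.pi * t_beam / t_turn  # eq. (30)
    return b / (2 * math.sin(alpha / 2))   # eq. (29)

# A 0.1 m wide beam seen for 1% of a 1 s rotation:
# d = 0.1 / (2 sin(0.01 * pi)), roughly 1.59 m
print(round(lighthouse_distance(0.1, 0.0, 0.01, 1.0), 2))
```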
Part II. Optimization Schemes for Wireless Sensor Network Localization 1. Introduction In [5] the authors compare four centralized distance-‐based localization algorithms, utilizing semidefinite programming (SDP), simulated annealing (SA), trilateration and simulated annealing (TSA) and trilateration and genetic algorithm (TGA). They first present all these methods in formal terms; afterwards the performance of each is discussed based on the results of simulation experiments. Performance is presented as a function of several sensor network characteristics, such as network size, node density and deployment. 2. Problem Formulation The authors extend the formal problem definition presented in the first part of this
project with a model of the optimization problem that minimizes the sum of errors in the sensor position estimates. For simplicity, they assume that all sensors are placed on a plane, which yields a two-dimensional localization problem. They consider two optimization approaches, which are presented below.
2.1 Formulation for Semidefinite Programming
Semidefinite programming minimizes a linear function subject to the constraint that an affine combination of symmetric matrices is positive semidefinite. Such a constraint is nonlinear and non-smooth, but convex, so semidefinite programs are convex optimization problems. [13] The distance-based localization problem can be formulated as a quadratic optimization problem and transformed into a semidefinite program. Since semidefinite programs are a generalization of linear programs, the authors propose to use existing solvers on the transformed problem. Let us consider a network of n non-anchor nodes and m anchors. For each pair of nodes, we introduce an upper bound d_{ki}^{max} and a lower bound d_{ki}^{min} on the distance between a_k and x_i, and an upper bound d_{ij}^{max} and a lower bound d_{ij}^{min} on the distance between x_i and x_j. The authors note that these bounds should be determined by measurements, but do not present more details.
They redefine the model of the localization problem as follows:
min J_SDP = Σ_{i=1}^{n} Σ_{k∈N_a} e_{ki} + Σ_{i=1}^{n} Σ_{j∈N_x} e_{ij}    (1)

subject to

(d_{ij}^{min})² − e_{ij} ≤ ‖x_i − x_j‖² ≤ (d_{ij}^{max})² + e_{ij},  ∀ i ≠ j
(d_{ki}^{min})² − e_{ki} ≤ ‖x_i − a_k‖² ≤ (d_{ki}^{max})² + e_{ki},  ∀ k, i    (2)
where e_{ij} ≥ 0 and e_{ki} ≥ 0 represent errors in the sensor position estimates, while x_i and x_j are the estimated positions of nodes i and j. N_a and N_x are sets of neighboring node pairs with regard to the transmission ranges, defined as follows:

N_a = {(k, i) : d_{ki} ≤ r_a, i = 1, …, n}
N_x = {(i, j) : d_{ij} ≤ r_x, i = 1, …, n}    (3)

where r_a and r_x are parameters representing the maximal transmission ranges.
This problem can be transformed into matrix form. The authors propose an approach to convert the quadratic distance constraints presented above into linear constraints by introducing a relaxation. Let X = [x₁ x₂ ⋯ xₙ] be the 2 × n matrix that needs to be determined and Y = XᵀX. This can be relaxed to Y ⪰ XᵀX, which is equivalent to the linear matrix inequality

Z = [ I  X ; Xᵀ Y ] ⪰ 0    (4)

where Z ⪰ 0 means that Z is positive semidefinite. The problem can then be reformulated as a standard SDP problem with objective (1) subject to

(1; 0; 0)ᵀ Z (1; 0; 0) = 1
(0; 1; 0)ᵀ Z (0; 1; 0) = 1
(1; 1; 0)ᵀ Z (1; 1; 0) = 2
(d_{ij}^{min})² − e_{ij} ≤ (0; v_{ij})ᵀ Z (0; v_{ij}) ≤ (d_{ij}^{max})² + e_{ij},  ∀ i ≠ j
(d_{ki}^{min})² − e_{ki} ≤ (a_k; v_i)ᵀ Z (a_k; v_i) ≤ (d_{ki}^{max})² + e_{ki},  ∀ k, i
Z ⪰ 0    (5)
where v_{ij} is the vector with 1 in the i-th position, −1 in the j-th position and zeros everywhere else, while v_i is the vector of zeros except for −1 in the i-th position. (The first three equality constraints fix the top-left 2 × 2 block of Z to the identity matrix.) The optimization problem described above can be solved efficiently by the interior-point-method-based SDP solvers available to the optimization research community.
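The effect of this construction can be checked numerically: when Y = XᵀX exactly (before relaxation), the quadratic forms appearing in (5) reproduce the squared inter-node and node-anchor distances. A small NumPy sketch with random illustrative positions:

```python
import numpy as np

# With Y = X^T X and Z = [[I, X], [X^T, Y]], the quadratic forms of (5)
# equal the squared distances that the original constraints bound.
rng = np.random.default_rng(0)
n = 4
X = rng.random((2, n))                       # illustrative node positions
Z = np.block([[np.eye(2), X], [X.T, X.T @ X]])

i, j = 0, 2
v_ij = np.zeros(n); v_ij[i], v_ij[j] = 1.0, -1.0
u = np.concatenate(([0.0, 0.0], v_ij))       # the vector (0; v_ij)
assert np.isclose(u @ Z @ u, np.sum((X[:, i] - X[:, j]) ** 2))

a_k = np.array([0.3, 0.7])                   # a hypothetical anchor
v_i = np.zeros(n); v_i[i] = -1.0
w = np.concatenate((a_k, v_i))               # the vector (a_k; v_i)
assert np.isclose(w @ Z @ w, np.sum((X[:, i] - a_k) ** 2))
print("quadratic forms match the squared distances")
```

Relaxing Y ⪰ XᵀX turns these exact equalities into the convex constraints of (5).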
2.2 Formulation for Stochastic Optimization

Another approach presented by the authors is based on stochastic methods applied to a nonlinear performance function:

min J_ST = Σ_{i=1}^{n} Σ_{k∈N_a} (‖x_i − a_k‖ − d_{ki})² + Σ_{i=1}^{n} Σ_{j∈N_x} (‖x_i − x_j‖ − d_{ij})²    (6)
Three stochastic methods are used to solve the optimization problem: simulated
annealing (SA), trilateration and simulated annealing (TSA) and trilateration and genetic algorithm (TGA). The last two are hybrid localization techniques designed by the authors, which use the trilateration method in combination with stochastic optimization.
2.2.1 Simulated Annealing Method
Simulated annealing is a heuristic inspired by the annealing process in metallurgy. It is widely used for global optimization problems, and it can be applied successfully to the localization problem as well. It is implemented as a computer simulation of a stochastic process, performing point-to-point transformations. For their research, the authors implemented the classic form of SA with one modification: the cooling process is slowed down. At each value of the controlling temperature parameter T, not one but P · n non-anchor nodes are selected for modification, where P is a reasonably large parameter meant to lead the system into thermal equilibrium. In each iteration, the algorithm calculates a new solution, which in our case is a vector of new node positions. A given node is randomly selected and moved in a random direction by a distance Δd. The value of Δd depends on the control parameter T and is restricted by the so-called shrinking factor β < 1, so that Δd′ = β · Δd. The authors propose a simple cooling scheme: T′ = α · T, where T′ is the new value of the temperature parameter. The coordinating parameters of the SA method, such as the initial temperature T₀, the starting distance Δd₀ and the coefficients α and β, have to be tuned experimentally.
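The modified SA loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; `cost` stands for the performance function (6) evaluated on a full vector of node positions, and the Metropolis acceptance rule is the standard one for SA:

```python
import math
import random

def simulated_annealing(positions, cost, T0=0.1, T_min=1e-6,
                        alpha=0.8, beta=0.94, d0=0.1, P=4):
    """Sketch of the modified SA loop: at each temperature, P * n randomly
    chosen non-anchor nodes are perturbed by a step that shrinks with the
    cooling schedule T' = alpha*T, step' = beta*step."""
    T, step = T0, d0
    best = [list(p) for p in positions]   # current solution (copied)
    J = cost(best)
    n = len(best)
    while T > T_min:
        for _ in range(P * n):            # P * n moves per temperature
            i = random.randrange(n)
            phi = random.uniform(0.0, 2.0 * math.pi)
            old = best[i][:]
            best[i][0] += step * math.cos(phi)   # move node i by `step`
            best[i][1] += step * math.sin(phi)   # in a random direction
            J_new = cost(best)
            # Metropolis rule: accept worse moves with prob e^{-dJ/T}
            if J_new > J and random.random() >= math.exp((J - J_new) / T):
                best[i] = old                    # reject the move
            else:
                J = J_new
        T *= alpha       # cooling: T' = alpha * T
        step *= beta     # shrinking step: d' = beta * d
    return best, J
```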
2.2.2 TSA and TGA Methods

These are hybrid localization techniques proposed by the authors that combine the trilateration method with a stochastic optimization heuristic: simulated annealing or a genetic algorithm. The methods operate in two phases: first an auxiliary solution (initial localization) is determined, which is later adjusted by applying the stochastic optimization methods. In the first phase, a simple method of determining the relative positions of sensor nodes is applied. Trilateration uses the known locations of anchor nodes a_k, k = 1, …, m and the measured distances between pairs of nodes. The authors describe the method in more detail; however, I will lay less emphasis on this phase, as trilateration has already been presented in the first part of the project. Due to distance measurement uncertainty, the coordinates obtained in the trilateration phase are estimated with non-zero errors. In addition, the positions of nodes that have fewer than three localized neighbors cannot be estimated. In the second phase, stochastic optimization is applied to increase the accuracy of the initial location estimates. The authors implement and compare two variants: TSA and TGA. In TGA, the abstract representations of candidate solutions (known as chromosomes in evolutionary computing) are vectors of the coordinates of all non-anchor nodes, of the form (x₁ y₁ x₂ y₂ ⋯ xₙ yₙ), x_i, y_i ∈ ℝ. The initial population consists of a set of such chromosomes. The fitness function used is the one defined in (6). For the reproduction stage of the genetic algorithm, the authors propose tournament selection of size two. The crossover operation is defined as discrete recombination, similar to element swapping applied to binary vectors; both coordinates of a node are recombined simultaneously. The mutation operator alters the components of a selected chromosome by adding a vector of 2·n random variables drawn from a Gaussian distribution.
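The three TGA operators can be sketched as below. This is a simplified illustration: chromosomes are flat coordinate vectors [x₁, y₁, …, xₙ, yₙ], and the per-pair crossover probability of 0.5 is an assumption not stated in the source:

```python
import random

def tournament(pop, fitness):
    """Tournament selection of size two (lower cost (6) wins)."""
    a, b = random.sample(pop, 2)
    return a if fitness(a) < fitness(b) else b

def crossover(p1, p2):
    """Discrete recombination: swap whole (x, y) pairs between parents,
    so both coordinates of a node are recombined simultaneously."""
    c1, c2 = p1[:], p2[:]
    for i in range(0, len(c1), 2):
        if random.random() < 0.5:            # assumed swap probability
            c1[i:i + 2], c2[i:i + 2] = p2[i:i + 2], p1[i:i + 2]
    return c1, c2

def mutate(chrom, sigma=0.05):
    """Add zero-mean Gaussian noise to every coordinate (2n variables)."""
    return [g + random.gauss(0.0, sigma) for g in chrom]
```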
From the simulation experiments, the authors observed that an increased localization error is driven by incorrect location estimates calculated for a few nodes. Since the number of such incorrect results is small, their influence on the fitness function is not significant; hence the optimization algorithm ignores them. To mitigate this phenomenon, the authors added a correction operation to the second phase of the algorithm. Its objective is to correct the location estimates, and it is triggered each time the value of the performance function J_ST defined in (6) is less than the threshold value θ.
The correction algorithm works as follows: three nodes are selected at random from the group of neighbors of a given node i, favoring neighbors that violate fewer of the constraints (3) than the others; roulette-wheel selection is used for this purpose. If the new location obtained by trilateration based on the selected nodes is more accurate, it replaces the previous one. This operation is applied repeatedly until all constraints are fulfilled or a predefined iteration limit is reached.
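The re-trilateration at the heart of this correction step reduces to solving a 2×2 linear system obtained by subtracting circle equations. A minimal sketch (the reference positions and distances below are illustrative):

```python
def trilaterate(p1, d1, p2, d2, p3, d3):
    """Position from three non-collinear reference points and measured
    distances, by linearizing the three circle equations (subtracting
    the first equation from the other two)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21      # zero iff the references are collinear
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

# A node actually at (1, 1), measured from references (0,0), (4,0), (0,4):
print(trilaterate((0, 0), 2**0.5, (4, 0), 10**0.5, (0, 4), 10**0.5))
```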
The threshold value θ depends on several factors, such as the number of anchor nodes m, the deployment of the nodes in the area, the power of the radio devices and the expected measurement noise factor nf. The authors calculate it according to the following formula:
θ = μ · nf · s²,  if m/(n + m) < γ
θ = λ · nf · s²,  if m/(n + m) ≥ γ    (7)

where μ, λ and γ are experimentally tuned parameters. The variable s represents the average number of neighbors over all nodes of the network:

s = (1/(n + m)) Σ_{j=1}^{n+m} Σ_{i∈N_j} c_{ij},   c_{ij} = 1 if i ∈ N_j, 0 if i ∉ N_j    (8)
where c_{ij} is the connectivity between nodes i and j, while N_j is the set of neighbors of node j as defined in (3). In their original versions presented above, the SDP, SA, TSA and TGA schemes require centralized computation. In a later section, the authors also discuss a distributed variant of the two-phase methods.

3. Simulation

In order to evaluate and compare the efficiency of the proposed optimization schemes, the authors have performed numerous tests on multi-hop network topologies. The key metric used for evaluating the localization schemes was the accuracy of the location estimates versus the costs of deployment, equipment, communication and computation. For the simulation, the authors considered sensor networks composed of 200-10000 nodes with randomly generated positions in the square region [0,1] × [0,1]. The number of anchor nodes was 10% of all nodes. The authors assumed a fixed transmission range r for all devices in each considered network, inversely proportional to the number of nodes. The parameters used by the SA algorithm were set to the following values: α = 0.8, β = 0.94, T₀ = 0.1, a very small stopping temperature, Δd₀ = 0.1 and P = 4. The parameters used in (7) to calculate the threshold value in the correction operation were μ = 0.2, λ = 0.1 and γ = 0.05. The distance measurement errors ξ_{ki} and ξ_{ij} described in the formal definition of the localization problem were assumed to be independent zero-mean random variables
with a normal distribution. Hence, the measured noisy distances d̂_{ki} and d̂_{ij} took the following form:

d̂_{ki} = d_{ki} (1 + randn() · nf)
d̂_{ij} = d_{ij} (1 + randn() · nf)    (9)

where d_{ki} and d_{ij} stand for the true physical distances between pairs of nodes, and nf = 10% is a noise factor. Due to measurement uncertainty, it is difficult to find a good metric to compare the results of the different localization methods. The authors have used as a metric the mean error between the estimated and the true locations of the non-anchor nodes in the network, defined as follows:

LE = (1/n) Σ_{i=1}^{n} (‖x̂_i − x_i‖ / r_i) · 100%    (10)
where LE denotes the localization error, x_i is the true position of sensor node i, x̂_i is the estimated position of node i, and r_i is the radio transmission range of node i. The localization error is expressed as a percentage and is normalized with respect to the transmission range in order to allow a comparison of results obtained for networks of different sizes and ranges.

3.1 Evaluation

The authors present the performance analysis of the two-phase technique based on results obtained for the TSA variant of the localization scheme. For a sensor network consisting of 200 nodes (20 anchor nodes and 180 regular nodes), LE = 10.64% was obtained after the trilateration phase and LE = 0.14% after the simulated annealing phase. Figure 1 shows the estimates of the positions of non-anchor nodes calculated in both phases of the TSA method. The estimated positions are marked with stars, and the localization error is illustrated by lines connecting the real and estimated locations.
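The noise model (9) and error metric (10) translate directly into code. A minimal sketch with made-up positions and a made-up range:

```python
import math
import random

def noisy_distance(d_true, nf=0.10):
    """Eq. (9): multiplicative zero-mean Gaussian measurement noise."""
    return d_true * (1.0 + random.gauss(0.0, 1.0) * nf)

def localization_error(true_pos, est_pos, r):
    """Eq. (10): mean position error normalized by the radio range r,
    expressed as a percentage (here r is assumed equal for all nodes)."""
    n = len(true_pos)
    return 100.0 / n * sum(
        math.dist(t, e) / r for t, e in zip(true_pos, est_pos))

true_pos = [(0.2, 0.2), (0.8, 0.5)]
est_pos  = [(0.25, 0.2), (0.8, 0.45)]
# Both estimates are off by 0.05, i.e. 10% of the range r = 0.5:
print(localization_error(true_pos, est_pos, r=0.5))  # about 10.0
```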
Figure 1: Localization results for phases I and II

As expected, the correction operator has also impacted the localization error, reducing it by up to half, depending on the nodes' deployment. The authors have also analyzed the effect of varying the number of anchor nodes. Figure 2 summarizes the results for the TSA method, showing that the
localization error decreases as the number of anchor nodes increases. We must note, however, that introducing more anchor nodes increases the network size and the related costs, so a tradeoff is necessary.
Figure 2: Impact of network size on localization accuracy

The value of the transmission range r determines the number of neighbors of each node in the network, so it has a substantial impact on the accuracy of distance-based localization techniques. For inadequately low values of r, the trilateration phase of the TSA algorithm may perform poorly. Figure 3 illustrates the impact of the value of r on the localization error.
Figure 3: Impact of the radio range on localization accuracy

The accuracy of localization techniques based on inter-node distance measurements strongly depends on measurement errors. Figure 4 shows the impact of the value of the noise factor nf on the localization accuracy.
Figure 4: Impact of distance measurement errors on localization accuracy
In the described variant of the TSA method, the whole computation is performed on a single machine, in a centralized manner. Experimental results show that the computation time increases proportionally to the square of the number of nodes.

3.2 Comparison
The authors compared the results obtained from the TSA and TGA schemes with those obtained applying the SDP and SA methods. The localization errors and computation times obtained for all the considered methods are presented in Table 1. From this table it can be seen that the TSA and TGA methods estimate the location of the nodes quite accurately, the localization error being below 3%. On the other hand, the localization errors for the SDP and SA algorithms applied to the same network are ten times higher, about 30%.
Method | Localization error [%] | Computation time [s]
-------|------------------------|---------------------
SDP    | 32.04                  | 6.21
SA     | 34.34                  | 2.94
TSA    |  0.16                  | 0.44
TGA    |  2.34                  | 2.91
Table 1: Localization error and computation time for different methods

Localization accuracy strongly depends on the position of the nodes inside the network. In a first series of experiments, the authors analyzed networks with evenly distributed non-anchor nodes, and with evenly and unevenly distributed anchors. If the anchor nodes are evenly distributed, all four methods offer quite accurate solutions; otherwise, the results of location estimation are much worse and in some cases unsatisfactory. Table 2 shows that the TSA method is more robust with respect to anchor distribution than SDP and SA. From the results of their experiments, the authors conclude that it is not advisable to apply distance-based localization methods to a network with unevenly distributed non-anchor and anchor nodes.
Scenario                     | Method | Localization error [%] | Computation time [s]
-----------------------------|--------|------------------------|---------------------
anchors evenly distributed   | SDP    |   0.18                 | 6.95
                             | SA     |   2.76                 | 3.04
                             | TSA    |   0.13                 | 0.46
                             | TGA    |   3.80                 | 2.85
anchors unevenly distributed | SDP    | 174.91                 | 5.51
                             | SA     | 233.89                 | 2.84
                             | TSA    |   1.78                 | 0.44
                             | TGA    |  20.61                 | 2.34

Table 2: Localization error and computation time for various deployments of anchor nodes
4. Distributed Version The localization schemes presented above are all centralized. This means that one has to gather measurements between all pairs of network nodes in a single computer to solve the optimization problem. The data transmission to the central unit involves time delays, high communication cost and high energy consumption. Because of
these disadvantages, there are many applications where centralized techniques cannot be employed. This is the motivation behind the research on distributed localization schemes. The authors of this paper also propose a fully distributed version of their two-phase method. All nodes are involved in the computation process, each being responsible for determining its own position using information about its neighbors. This approach offers a significant reduction in computation requirements, because the set of neighbors usually consists of only a few nodes. Thus the number of connections is orders of magnitude smaller than in the case of the centralized algorithm. The distributed computation model is also more tolerant to node failures, as it distributes the communication cost evenly across the nodes. On the other hand, in the case of a distributed variant of the two-phase method, one must consider two scenarios that may lead to a loss of information: loss of information due to parallel computation and loss of information due to an incomplete network map. The loss of information due to parallel computation is connected with how the optimization in the second phase of the method is performed. In TSA, at each value of the temperature parameter T, P · n non-anchor nodes are randomly selected for modification, and the coordinate estimates of the chosen nodes are perturbed by a small distance Δd in a random direction. Modifications are done sequentially: the location of the current node is determined based on the previous transformations. In a distributed algorithm, each node has to perform P small displacements. This is done in parallel, and information is exchanged between neighboring nodes every P iterations. Each node updates its location using information about the previous positions of its neighboring nodes. Because all movements are done independently and in parallel, there is no guarantee that the performance value is better after an iteration.
The loss of information due to an incomplete network map impacts the correction process. The correction operation depends on the neighborhood constraints (3) determined by the transmission range. In the centralized approach, a complete network state is available at all times. Hence it is possible to detect the situation when the estimated distance between two adjacent nodes is greater than the transmission range. Similarly, it can be detected if the estimated distance between two nodes is less than the transmission range but the two nodes are not part of the same neighborhood. In the distributed approach, the situation is quite different. Each node can detect if the estimated distance between itself and a neighbor is greater than the transmission range, but it cannot find out whether its estimated position places it close to a node outside its neighborhood. In order to overcome this problem, the authors propose to modify the correction operation: while in the centralized scheme all neighbors are considered, in the distributed one only one-hop and two-hop neighbors are taken into account.
Figure 5: Localization error for centralized and distributed variants
Figure 5 shows the quality difference between the centralized and the distributed TSA algorithms. In the distributed version, the two-hop correction was used. The results obtained by the authors confirm that, from the perspective of accuracy, the centralized algorithm performs slightly better than the distributed one in the case of evenly distributed anchors, and much better in the case of unevenly distributed anchors.

5. Summary and Conclusions

The paper provides two methodologies for formulating the wireless sensor network localization problem, as a linear and as a nonlinear optimization task. The authors present the application of semidefinite programming and stochastic algorithms to solve the problem. They describe and investigate novel two-phase methods that combine the geometric approach of trilateration with stochastic heuristics (simulated annealing and a genetic algorithm). The originality of their approach lies in the combination of deterministic and stochastic techniques and in the additional correction operation that improves accuracy. The numerical results they have obtained by simulation confirm that TSA and TGA are efficient and robust localization algorithms. Both methods are centralized in their original form, but the authors have investigated the possibility of a distributed variant. The distributed approach improves scalability and reduces the computational complexity, but it produces less accurate location estimates, especially in the case of an uneven anchor distribution. As a future research direction, they intend to perform experiments on physical test networks and possibly extend their techniques to mobile networks.
Part III. Two-Phase Wireless Sensor Network Localization Scheme Based on Bacterial Foraging Optimization
Abstract

It is often required in a wireless sensor network (WSN) application to be able to estimate the locations of randomly deployed sensor nodes. The most common solution to the localization problem is to deploy a number of anchor nodes that have location awareness and to estimate the locations of non-anchor nodes through trilateration, using noisy distance measurements from three or more non-collinear anchors. This paper presents a centralized optimization scheme for the range-based localization problem based on the bacterial foraging optimization (BFO) algorithm. The localization task is formulated as a multidimensional optimization problem. The proposed localization scheme operates in two phases: first, initial position estimates are determined by the trilateration method; these are later adjusted by the optimization algorithm.
1. Introduction

Bacterial foraging optimization (BFO) is a young member of the family of biology-inspired optimization algorithms. It is a swarm-inspired method similar to particle swarm optimization (PSO) and ant colony optimization (ACO). The key idea behind the algorithm is the application of the group foraging strategy of a swarm of Escherichia coli bacteria to multidimensional function optimization. Bacteria search for nutrients in their environment and seek to maximize the energy obtained per unit time. Locomotion is achieved by a set of tensile flagella, which help the bacterium tumble or swim, the two basic operations performed at the time of foraging. Figure 1 illustrates the actions a bacterium can take in a nutrient solution. Bacteria direct their movement toward nutrients and attempt to avoid noxious environments; this phenomenon is known in biology as chemotaxis. Each individual bacterium can communicate with others by sending signals, and information received from other individuals can influence its foraging decisions. When nutrients are sufficient, a bacterium tends to grow to a fixed size and then reproduce through binary fission, producing an exact replica of itself. Due to random external shocks, the chemotactic progress may be disrupted, and a certain percentage of the bacteria population is dispersed into a new part of the environment or dies. [18]
Figure 1: Swim and tumble of a bacterium
The wireless sensor network (WSN) localization problem refers to determining the locations of all the deployed sensor nodes. The most common approaches are based on noisy distance measurements between the nodes we want to localize and a number of neighboring anchors. In two-dimensional space, given distance measurements from at least three non-collinear anchors within its range, the simplest approach for estimating the location of a sensor node is the trilateration method. However, such a simple method rarely yields an optimal solution; hence, numerous more advanced localization schemes have been developed in search of more accurate location estimates. In this paper, I propose a two-phase localization scheme that applies the BFO algorithm to a set of initial location estimates determined by trilateration. By this I seek to reduce the estimation errors caused by measurement uncertainty.

2. Related Work

It has recently become common to view the WSN localization problem as a multidimensional optimization problem and to address it through population-based techniques. In [5], the authors describe and compare two-phase optimization schemes based on simulated annealing (SA) and a genetic algorithm (GA), hybridized with trilateration. They compare the stochastic optimization techniques to convex optimization based on semidefinite programming (SDP). Such an SDP-based optimization scheme is presented in much more detail in [14]. PSO has also been proposed for centralized localization of WSN nodes in [15] and proven to provide more accurate solutions than the simulated annealing approach described in [16]. In [17], the authors present a distributed single-phase localization approach based on PSO and BFO. [19] presents a general performance analysis of the BFO algorithm compared to PSO.

3. Problem Formulation

The two-dimensional range-based localization problem can be formulated in the following way: let there be a set of m sensors called anchor nodes, each with a known position P_k = (x_k, y_k) ∈ ℝ², k = 1, …, m, and n sensors with unknown locations P_i = (x_i, y_i) ∈ ℝ², i = 1, …, n. Let d̂_{ki} = d_{ki} + ξ_{ki} and d̂_{ij} = d_{ij} + ξ_{ij} be the estimates of the true physical distances d_{ki} and d_{ij}, corrupted with measurement noise. Given the distance measurements d̂_{ki}, d̂_{ij} and the positions of the anchor nodes, we need to estimate the locations of the non-anchor nodes. Each node (both anchor and non-anchor) has a transmission range of r units. We regard nodes i and j as neighbors if d_{ij} ≤ r. Different neighborhoods of node i can be defined as follows:

N_{ia} = {(k, i) : d̂_{ki} ≤ r}, i = 1, …, n
N_{ix} = {(j, i) : d̂_{ij} ≤ r}, i = 1, …, n
N_i = N_{ia} ∪ N_{ix}    (1)
For the optimization problem we consider the following cost function for a single node position:
J(P) = J(x, y) = (1/m) Σ_{k=1}^{m} (√((x − x_k)² + (y − y_k)²) − d̂_k)²    (2)
where P_k = (x_k, y_k) represents the location of anchor k. [10] We consider as an optimal solution a vector of estimated locations P = (P₁, P₂, …, Pₙ) that minimizes the global cost function

J(P) = Σ_{i=1}^{n} J(P_i)    (3)
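Cost functions (2) and (3) can be sketched as follows (the anchor layout below is illustrative, and perfect measurements are used so the cost is exactly zero):

```python
import math

def node_cost(p, anchors, dists):
    """Eq. (2): mean squared mismatch between the estimated anchor
    distances and the measured ones, for a single node position p."""
    x, y = p
    return sum((math.hypot(x - ax, y - ay) - d) ** 2
               for (ax, ay), d in zip(anchors, dists)) / len(anchors)

def global_cost(positions, anchors_of, dists_of):
    """Eq. (3): sum of the per-node costs over all non-anchor nodes."""
    return sum(node_cost(p, anchors_of[i], dists_of[i])
               for i, p in enumerate(positions))

# A hypothetical node at (1, 1) with noise-free anchor distances:
anchors = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
dists = [math.hypot(1 - ax, 1 - ay) for ax, ay in anchors]
print(node_cost((1.0, 1.0), anchors, dists))  # -> 0.0
```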
The total localization error is computed as the mean of squared distances between the actual node locations P_i = (x_i, y_i) and the estimated locations P̂_i = (x̂_i, ŷ_i), as follows:

E_L = (1/n) Σ_{i=1}^{n} ((x_i − x̂_i)² + (y_i − ŷ_i)²)    (4)
It is easy to show that by seeking the minimum of the global cost function (3), we also minimize the localization error (4).

4. Proposed Solution

The localization scheme I propose hybridizes the trilateration method with the BFO algorithm. First I will present the BFO algorithm in general, and then I will point out the particularities regarding the localization problem.

4.1 The BFO Algorithm

The BFO algorithm mimics the four principal mechanisms observed in a real bacterial system (chemotaxis, swarming, reproduction, and elimination-dispersal) to solve a non-gradient optimization problem. Let us define a chemotactic step as a tumble followed by another tumble, or a tumble followed by a run (swimming). Let j be the index of the chemotactic step, k the index of the reproduction step, and l the index of the elimination-dispersal event. Also consider the following parameters: p, the dimension of the search space; S, the total number of bacteria in the population; N_c, the number of chemotactic steps; N_s, the swimming length; N_re, the number of reproduction steps; N_ed, the number of elimination-dispersal events; P_ed, the elimination-dispersal probability; C(i), the size of the step taken in the random direction specified by the tumble. Let P(j, k, l) = {θ^i(j, k, l) | i = 1, …, S} represent the positions of the members of the population of S bacteria at the j-th chemotactic step, k-th reproduction step, and l-th elimination-dispersal event. Let also J(i, j, k, l) denote the cost (or nutrient surface) at the location of the i-th bacterium, θ^i(j, k, l) ∈ ℝ^p. The basic steps of the BFO algorithm are described below:
• Chemotaxis: This process simulates the movement of a bacterium through swimming and tumbling using its flagella. Considering a bacterium at position θ^i(j, k, l), and C(i) the size of the step taken in the direction specified by the tumble, the chemotactic movement of the bacterium can be expressed as

θ^i(j + 1, k, l) = θ^i(j, k, l) + C(i) · Δ(i) / √(Δ^T(i) Δ(i)),

where Δ(i) represents a vector in a random direction with elements drawn from [−1, 1].
• Reproduction: The least healthy bacteria die, while the healthiest individuals each split into two bacteria placed at the same location. This step ensures the demographic stability of the population. In our computational model, the least healthy bacteria are those that yield the highest values of the cost function.
• Elimination-‐dispersal: Changes in the environment may negatively affect bacterial populations. Events can take place in such a fashion that all the bacteria in a region are killed or a group is dispersed into a new location. To simulate this phenomenon in BFO, some bacteria are eliminated at random with a very small probability, while the new replacements are randomly initialized over the search space. [19]
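The tumble-and-move update described under chemotaxis can be sketched as follows; this is a minimal illustration, and the function names are mine.

```python
import math
import random

def tumble_direction(p):
    """Generate a tumble direction: a vector Delta with elements drawn from
    [-1, 1], normalized by sqrt(Delta^T Delta) to unit length."""
    delta = [random.uniform(-1.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(d * d for d in delta))
    return [d / norm for d in delta]

def chemotactic_move(theta, step_size):
    """One chemotactic move: theta(j+1) = theta(j) + C(i) * Delta / ||Delta||,
    i.e. a step of length C(i) in the random direction given by the tumble."""
    direction = tumble_direction(len(theta))
    return [t + step_size * d for t, d in zip(theta, direction)]
```

Because the direction vector is normalized, every move has length exactly C(i) regardless of the raw tumble vector drawn.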
4.2 Pseudocode

[1] Initialize parameters p, S, N_c, N_s, N_re, N_ed, P_ed, C(i), θ^i
[2] Elimination-dispersal loop: l = l + 1
[3] Reproduction loop: k = k + 1
[4] Chemotaxis loop: j = j + 1
  [a] For i = 1, …, S, take a chemotactic step for bacterium i as follows
  [b] Compute the fitness function J(i, j, k, l)
  [c] Let J_last = J(i, j, k, l) to save this value
  [d] Tumble: generate a random vector Δ(i) ∈ R^p with each element a random number on [−1, 1]
  [e] Move: let θ^i(j + 1, k, l) = θ^i(j, k, l) + C(i) · Δ(i) / √(Δ^T(i) Δ(i)). This results in a step of size C(i) in the direction specified by the tumble
  [f] Compute J(i, j + 1, k, l)
  [g] Swim: we consider only the i-th bacterium swimming, while the others remain stationary
    i) Let m = 0 (counter for swim length)
    ii) While m < N_s (have not climbed down too long):
      Let m = m + 1
      If J(i, j + 1, k, l) < J_last (if doing better), let J_last = J(i, j + 1, k, l), let θ^i(j + 1, k, l) = θ^i(j + 1, k, l) + C(i) · Δ(i) / √(Δ^T(i) Δ(i)), and use this location to compute a new J(i, j + 1, k, l)
      Else, let m = N_s (end of the while statement)
  [h] Go to the next bacterium (i + 1) if i < S
[5] If j < N_c, go to step [4] (continue chemotaxis)
[6] Reproduction:
  [a] For the given k and l, and for each i = 1, …, S, let J^i_health = Σ_{j=1}^{N_c+1} J(i, j, k, l) represent the health cost of the bacterium (a measure of how many nutrients it got over its lifetime and how successful it was at avoiding noxious environments). Sort the bacteria in ascending order of J^i_health (a higher value means lower health)
  [b] The S_r = S/2 bacteria with the highest J^i_health values die, and the S_r bacteria with the best values each split into two
[7] If k < N_re, go to step [3]
[8] Elimination-dispersal: for each i = 1, …, S, with probability P_ed, eliminate the bacterium and disperse it to a random location in the search space (this keeps the number of bacteria in the population constant). If l < N_ed, go to step [2]
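The pseudocode above can be condensed into a runnable sketch. This is my own minimal Python rendering under simplifying assumptions: S is even with S_r = S/2, a single scalar step size C is shared by all bacteria, and a global-best tracker is added so the function can return a result. The `init` argument allows seeding the population with preliminary estimates, as the localization scheme in Section 4.3 prescribes. It is a sketch, not a tuned implementation.

```python
import math
import random

def bfo(cost, p, S=20, Nc=30, Ns=4, Nre=4, Ned=2, Ped=0.25, C=0.05,
        bounds=(-1.0, 1.0), init=None):
    """Minimize `cost` over R^p with Bacterial Foraging Optimization,
    following steps [1]-[8]. `init` may carry S starting positions
    (e.g. preliminary trilateration estimates). Returns the best position."""
    lo, hi = bounds

    def random_position():
        return [random.uniform(lo, hi) for _ in range(p)]

    def tumble():
        # Random direction Delta(i) with elements in [-1, 1], normalized.
        d = [random.uniform(-1.0, 1.0) for _ in range(p)]
        n = math.sqrt(sum(x * x for x in d)) or 1.0
        return [x / n for x in d]

    # [1] Initialize the population (seeded estimates or random positions).
    theta = [list(t) for t in init] if init else [random_position() for _ in range(S)]
    best = list(min(theta, key=cost))
    best_J = cost(best)

    for _ in range(Ned):                      # [2] elimination-dispersal loop
        for _ in range(Nre):                  # [3] reproduction loop
            health = [0.0] * S
            for _ in range(Nc):               # [4] chemotaxis loop
                for i in range(S):
                    J_last = cost(theta[i])                       # [b]-[c]
                    health[i] += J_last
                    if J_last < best_J:
                        best_J, best = J_last, list(theta[i])
                    d = tumble()                                  # [d] tumble
                    theta[i] = [t + C * x for t, x in zip(theta[i], d)]  # [e] move
                    J = cost(theta[i])                            # [f]
                    m = 0
                    while m < Ns:             # [g] swim while still improving
                        m += 1
                        if J < J_last:
                            J_last = J
                            if J < best_J:
                                best_J, best = J, list(theta[i])
                            theta[i] = [t + C * x for t, x in zip(theta[i], d)]
                            J = cost(theta[i])
                        else:
                            m = Ns
            # [6] Reproduction: the healthiest half splits, the rest die.
            order = sorted(range(S), key=lambda i: health[i])
            survivors = [theta[i] for i in order[: S // 2]]
            theta = [list(t) for t in survivors] + [list(t) for t in survivors]
        # [8] Elimination-dispersal: disperse some bacteria at random.
        for i in range(S):
            if random.random() < Ped:
                theta[i] = random_position()
    return best
```

Seeding `init` with trilateration estimates rather than random positions is exactly what makes the hybrid scheme of Section 4.3 converge with small N_c, N_s, N_re and N_ed.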
4.3 Localization Scheme

In the first phase of the localization scheme, preliminary estimates of the node locations are determined through trilateration based on the noisy distance measurements d̂_ij. These location estimates P̂_i are then taken as input to the stochastic optimization phase and are used to initialize the bacterium population. Since we consider the two-dimensional localization problem, the dimension of the search space is p = 2. The number of bacteria equals the number of localizable nodes, hence S = n. The positions of the bacteria inside the search space are initialized with the preliminary location estimates: θ^i = P̂_i, i = 1, …, n. Since these initial locations were determined by trilateration, they should yield much better values of the cost function than the classical random initialization. Hence, it is reasonable to expect that the localization error would be kept low even if the parameters N_c, N_s, N_re and N_ed took small values. The distance a bacterium can travel during a chemotactic step is chosen directly proportional to, and much smaller than, the transmission range r of the anchor nodes, which upper-bounds the measured distances (d̂_i ≤ r, ∀i). Hence C(i) = λr, with λ ≪ 1. The actual calibration of these parameters requires simulations, which are beyond the scope of this work but an obvious direction for future research. In such simulations, measurement errors could be introduced as Gaussian white noise. In my proposed localization scheme, all computations are considered centralized for the sake of simplicity. Designing and implementing a distributed version could also be the subject of future work.

5. Summary and Conclusions

Studies [18][19] have proved the efficiency of the BFO algorithm for solving non-linear optimization problems. WSN localization can be formulated as a quadratic optimization problem, hence it can be addressed by stochastic optimization methods, as seen in [5][15-17].
The authors of [5] also present two hybrid localization techniques based on trilateration and stochastic optimization, which can be regarded as the starting point for the scheme I propose. The simulations in [5] prove the efficiency of these combined methods over purely stochastic approaches: in the hybrid methods the population is not initialized randomly, but with the preliminary location estimates. BFO is a relatively new bio-inspired stochastic optimization algorithm that has not been studied as exhaustively with regard to WSN localization as SA, GA or PSO, which makes it an interesting algorithm to study. The purpose of this article was to outline the theoretical basics of a hybrid localization scheme based on trilateration and BFO. Simulations, efficiency tests and comparisons with similar methods will be the subject of future work.
Part IV. Bibliography

[1] W. Dargie, C. Poellabauer – Fundamentals of Wireless Sensor Networks, John Wiley & Sons, 2010
[2] H. Karl, A. Willig – Protocols and Architectures for Wireless Sensor Networks, John Wiley & Sons, 2005
[3] A. Pal – Localization Algorithms in Wireless Sensor Networks: Current Approaches and Future Challenges, in: Network Protocols and Algorithms, Vol. 2, No. 1, 2010
[4] G. Mao, B. Fidan – Localization Algorithms and Strategies for Wireless Sensor Networks, Hershey, 2009
[5] E. Niewiadomska-Szynkiewicz, M. Marks – Optimization Schemes for Wireless Sensor Network Localization, in: International Journal of Applied Mathematics and Computer Science, Vol. 19, No. 2, 2009
[6] Y. Shang, W. Ruml, Y. Zhang, M. Fromherz – Localization from Mere Connectivity, in: Proceedings of the ACM Symposium on Mobile Ad Hoc Networking and Computing, June 2003
[7] C. Alippi, G. Vanini – A RSSI-Based and Calibrated Centralized Localization Technique for Wireless Sensor Networks, in: Proceedings of the 4th IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOMW'06), March 2006
[8] A. A. Kannan, G. Mao, B. Vucetic – Simulated Annealing Based Wireless Sensor Network Localization, in: Journal of Computers, Vol. 1, No. 2, May 2006
[9] A. Savvides, H. Park, M. Srivastava – The Bits and Flops of the N-Hop Multilateration Primitive for Node Localization Problems, in: Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications (WSNA'02), September 2002
[10] D. Moore, J. Leonard, D. Rus, S. Teller – Robust Distributed Network Localization with Noisy Range Measurements, in: Proceedings of the 2nd ACM Conference on Embedded Networked Sensor Systems (SenSys'04), November 2004
[11] M. Maroti, B. Kusy, G. Balogh, P. Volgyesi, A. Nadas, K. Molnar, S. Dora, A. Ledeczi – Radio Interferometric Geolocation, in: Proceedings of the 3rd International Conference on Embedded Networked Sensor Systems (SenSys), November 2005
[12] D. Niculescu, B. Nath – Ad Hoc Positioning System (APS), in: Proceedings of IEEE GLOBECOM, November 2001
[13] L. Vandenberghe, S. Boyd – Semidefinite Programming, in: SIAM Review, Vol. 38, No. 1, 1996
[14] L. Doherty, K. Pister, L. El Ghaoui – Convex Position Estimation in Wireless Sensor Networks, in: Proceedings of the 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2001), Vol. 3, April 2001
[15] A. Gopakumar, L. Jacob – Localization in Wireless Sensor Networks Using Particle Swarm Optimization, in: Proceedings of the IET International Conference on Wireless, Mobile and Multimedia Networks, 2008
[16] A. Kannan, G. Mao, B. Vucetic – Simulated Annealing Based Localization in Wireless Sensor Network, in: Proceedings of the 30th Anniversary IEEE Conference on Local Computer Networks, 2005
[17] R. V. Kulkarni, G. K. Venayagamoorthy, M. X. Cheng – Bio-Inspired Node Localization in Wireless Sensor Networks, in: Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, October 2009
[18] S. Das, A. Biswas, S. Dasgupta, A. Abraham – Bacterial Foraging Optimization Algorithm: Theoretical Foundations, Analysis and Applications, in: Foundations of Computational Intelligence, Vol. 3
[19] A. Biswas, S. Dasgupta, S. Das, A. Abraham – Synergy of PSO and Bacterial Foraging Optimization: A Comparative Study on Numerical Benchmarks, in: Proceedings of the 2nd International Symposium on Hybrid Artificial Intelligent Systems (HAIS), Vol. 44, 2007