
Comparison of Spatial Hashing Algorithms for Mobile Wireless Network Simulations

Carl Hein, Jon Russo

Lockheed Martin Advanced Technology Laboratories
3 Executive Campus
Cherry Hill, NJ 08002
856-792-9893, 856-792-9887

[email protected], [email protected]

Keywords: Mobile Network Modeling, Spatial Hashing

ABSTRACT: Mobile wireless network simulations require efficient methods for determining which nodes are reachable by other nodes. This operation must be performed repeatedly as wireless nodes move. Alternatively, methods are needed to predict when a node will come within, or go beyond, the range of another node. Such propagation calculations can dominate the simulation of large networks and ultimately limit a model's scalability. Several spatial hashing algorithms have been proposed to improve scalability. These algorithms rapidly locate only the nodes within a given range of another node, without involving calculations on all the other nodes. A set of prior and new algorithms is described and surveyed in this paper. A set of criteria is described for comparing the methods based on the relative frequency of operations needed by mobile wireless network simulations. Four algorithms are evaluated, and the results are presented as a function of network size. Their relative cost and scalability are compared.

1. Introduction

Simulations of mobile networks must calculate the locations of, and distances between, moving objects, such as vehicles, and their positions relative to stationary objects. The trajectories of moving objects may be specified by lists of way points. Each way point specifies an object’s position at a given time. The location of the object can be interpolated for any time between way points. Optionally, the way point data may contain higher order components, such as velocity and acceleration, to more accurately interpolate positions. In general, the object management services required to support mobile network simulations are:

A. Instantiating and removing objects
B. Periodically computing their positions (moving objects)
C. Asynchronously interpolating exact positions
D. Finding near-by objects

We will refer to these four operations (A, B, C, D) throughout the remainder of this paper. For moving objects, the calculations must be performed repeatedly. Data structures and algorithms are required that will scale efficiently to support these services over arbitrary distances, time spans, velocities, and numbers of objects. Table 1 shows the intended ranges to be handled.

Table 1. Intended Range of Distances, Time Spans, Velocities, and Population

Quantity       Range                                                    Typical
Region sizes   1 km to 10,000 km (urban to whole-hemisphere coverage)   500 km on a side
Time spans     1 minute to several weeks                                several hours
Velocities     0.0 to ~28,000 km/h (orbital velocity)                   50 km/h for ground vehicles, 900 km/h for aircraft
Populations    1 to 100,000 objects                                     1,000 objects
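Operation C, interpolating an exact position between way points, can be illustrated with a minimal sketch. It assumes plain linear interpolation over (time, x, y) way points sorted by strictly increasing time, and it clamps to the first or last way point outside the covered time span. The function name `interpolate_position` and the tuple layout are hypothetical choices made for this example; the higher order components (velocity, acceleration) mentioned above are not modeled.

```python
from bisect import bisect_right

def interpolate_position(waypoints, t):
    """Linearly interpolate (x, y) at time t.

    waypoints: list of (time, x, y) tuples sorted by strictly increasing time
    (an assumed layout, not taken from the paper).
    """
    times = [w[0] for w in waypoints]
    if t <= times[0]:                 # before the first way point: clamp
        return waypoints[0][1:]
    if t >= times[-1]:                # after the last way point: clamp
        return waypoints[-1][1:]
    i = bisect_right(times, t)        # first way point strictly later than t
    t0, x0, y0 = waypoints[i - 1]
    t1, x1, y1 = waypoints[i]
    f = (t - t0) / (t1 - t0)          # fraction of the segment already travelled
    return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
```

Because the lookup uses a binary search over the way point times, the cost of this operation depends on the length of one object's way point list and not on how the objects are indexed spatially.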

We considered the baseline method to be a single, linear list of all the objects. To locate all objects within a given distance of every other object would require O(N^2) operations with this method, where N is the number of objects, since the whole list must be traversed for each object. Because this operation must be performed repeatedly in mobile network simulations, it could become the dominant computation and effectively limit the scalability of the model for large N. More scalable methods are therefore sought.
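The quadratic growth of the baseline can be made concrete with a short sketch. It assumes objects are stored as (name, x, y) tuples with unique names in a plain Python list standing in for the linked list; the function name `find_all_near` and the comparison counter are hypothetical and exist only to show the N*(N-1) distance checks.

```python
def find_all_near(objects, radius):
    """For every object, scan the whole list for objects within `radius`.

    objects: list of (name, x, y) tuples with unique names (assumed layout).
    Returns (near, comparisons), where near maps each name to its neighbours.
    """
    r2 = radius * radius              # compare squared distances, no sqrt needed
    near = {name: [] for name, _, _ in objects}
    comparisons = 0
    for name_a, xa, ya in objects:
        for name_b, xb, yb in objects:
            if name_a == name_b:      # skip the object itself
                continue
            comparisons += 1
            if (xa - xb) ** 2 + (ya - yb) ** 2 <= r2:
                near[name_a].append(name_b)
    return near, comparisons
```

In this sketch, instantiation and movement are just list appends and coordinate updates, so all of the cost sits in the nearness search.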

Spatial hashing methods generate location-based hash keys that can be used to directly access just the objects located within given sub-regions. Access time can therefore potentially be relatively independent of the total number of objects being simulated, which could enable scaling simulations to much larger numbers of objects (N). To support the requirements above, several data structure/algorithm combinations were considered. Each differs in its scalability and efficiency in performing the above operations, so the selection depends on how often those operations are performed. Previous studies [1-11] have suggested algorithms, but tended to consider one operation, such as distance-based access, without weighting the relative cost of the related maintenance operations based on the expected frequency with which they must be performed. In this study, by contrast, we assumed that for the intended scenarios, objects are instantiated once (A) and persist for long periods while being moved and queried (B, C, D) many thousands or millions of times. The relative frequencies of the remaining operations are uncertain, but we assume that updating positions (B) occurs more frequently for most objects (perhaps every 30 seconds for all objects) than asynchronous queries (C) (perhaps only at specific moments for a specific object). Likewise, finding near objects (D) may occur somewhat less frequently than updating positions (B) and be called by only a subset of objects. So a notional expectation of frequencies per object is:

A. Once
B. Millions of times
C. Thousands of times
D. Thousands of times

Another issue for finding near objects (D) is the span of ranges. Some algorithms work well if the maximum distance horizon is known ahead of time and is relatively constant. However, for radio applications, we know that we need to perform nearness searches within a given simulation over both short (1-2 km) and long (hundreds of km) ranges, as well as in between, especially when combinations of low-power UHF and high-power HF radios are being simulated simultaneously. Table 2 shows the data structures/algorithms that were investigated.

2. Experiment Design

To obtain quantifiable measurements of how the methods compare, the following test scenarios were defined, and the methods were evaluated against them.

Table 2. The Methods, Advantages, and Concerns of Four Data Structures/Algorithms

Method: Linear Linked-List of Objects
Description: Unordered linear linked list.
Advantages: Simple, dynamic, low-cost A, B, C.
Concerns: Finding near objects (D) cost grows as the square of the number of objects tracked.

Method: 2D Hash Matrix
Description: Space is divided into n x m grid cells. All objects located in a given grid cell are attached to a linear list of objects within that cell.
Advantages: Finding near objects (D) cost is basically independent of the number of objects tracked, if cell size can be set optimally for the given scenario.
Concerns: Need to define the region of operation ahead of time. Some cost for maintaining the structure with movement (B). Finding near objects (D) cost goes up for either large ranges or small ranges if there are many nodes. Need to tune cell size to the case, and may not be able to cover all ranges efficiently.

Method: Space Hash Tree
Description: A balanced tree is maintained where each node lists the range of the nodes below it. Search is rapid because the tree typically remains only log(N) deep. The decision on which way to navigate at each node is made by comparing ranges. Technically not a pure hash method as such, but a related accelerated logical data structure.
Advantages: Dynamic. Finding near objects (D) cost goes up only as the log of the number of objects tracked (i.e., almost constant, with growth vanishing for large N). Very general purpose; should be reasonably good at all levels, densities, and sizes, with no bad blow-ups and without a priori information.
Concerns: Some cost for maintaining the structure with movement (B).

Method: 1D Hash Array (proposed after seeing 2D hash matrix results)
Description: Like the 2D Hash Matrix above, but space is divided in one direction only (east-west).
Advantages: Finding near objects (D) cost is basically independent of the number of objects tracked. A mix of the advantages of the pure linear linked list and the 2D hash, possibly with only minimal downsides of both.
Concerns: Need to define the longitude region of operation a priori. Similar concerns as the 2D hash above, but less.
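The 2D Hash Matrix row of Table 2 can be illustrated with a minimal sketch. It is not the implementation that was benchmarked: the class name `GridHash`, the square grid, the clamping of out-of-region positions to edge cells, and the use of Python sets in place of per-cell linked lists are all assumptions made for this example.

```python
from collections import defaultdict

class GridHash:
    """Sketch of a 2D hash matrix over a square region (assumed layout)."""

    def __init__(self, region_size, cells_per_side):
        self.cells_per_side = cells_per_side
        self.cell_size = region_size / cells_per_side
        self.cells = defaultdict(set)   # (col, row) -> names in that cell
        self.positions = {}             # name -> (x, y)

    def _key(self, x, y):
        # Location-based hash key; positions outside the region are clamped.
        last = self.cells_per_side - 1
        col = min(max(int(x // self.cell_size), 0), last)
        row = min(max(int(y // self.cell_size), 0), last)
        return (col, row)

    def insert(self, name, x, y):       # operation A
        self.positions[name] = (x, y)
        self.cells[self._key(x, y)].add(name)

    def move(self, name, x, y):         # operation B: maintenance cost on movement
        old = self._key(*self.positions[name])
        new = self._key(x, y)
        if new != old:
            self.cells[old].discard(name)
            self.cells[new].add(name)
        self.positions[name] = (x, y)

    def find_near(self, x, y, radius):  # operation D
        c0, r0 = self._key(x - radius, y - radius)
        c1, r1 = self._key(x + radius, y + radius)
        r2 = radius * radius
        found = []
        for col in range(c0, c1 + 1):   # every overlapped cell is visited, even if empty
            for row in range(r0, r1 + 1):
                for name in self.cells.get((col, row), ()):
                    px, py = self.positions[name]
                    if (px - x) ** 2 + (py - y) ** 2 <= r2:
                        found.append(name)
        return found
```

The nested loop in `find_near` shows where the concerns in Table 2 come from: a large search radius relative to the cell size means many cells are visited, while a coarse grid means many objects per cell must be distance-checked.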

To determine the effects of scalability for each candidate approach, we ran each test with 2, 10, 100, 1,000, 10,000, 100,000, and 1,000,000 objects. The map region was set to 50,000 meters on a side (North-South/East-West). The objects are initially instantiated randomly within the 50 km x 50 km region. Each movement step randomly selects one-quarter (N/4) of the objects and randomly moves each +/-50 meters North and East. To yield run times long enough to measure accurately, the tests are run iteratively, with the number of iterations scaled inversely to the number of objects tracked so that the run time is reasonable at all scales. Where practical, the tests were run with:

Objects      Iterations (Iter)
2            20,000,000
10           2,000,000
100          200,000
1,000        20,000
10,000       2,000
100,000      200
1,000,000    20

Four tests were run for each candidate approach and scale level:

• Instantiation (A)
• Movement ((N/4) * B)
• Finding Nearby Objects (D)
• All (One A, (N/4) B, and D)

The last case (All) is considered a realistic total scenario with a reasonable mixture of operations. The prior tests help isolate the scalability of the individual operations. Computing exact position (C) is not investigated, since that would be the same for all methods. It is just the interpolation between two way points for the current exact time. All methods could hold a pointer to a given object's way points, or could reach any object in a similar way through a name-lookup hash table. Therefore, the primary concern of this investigation was to determine a method that can find nearest objects (D) while maintaining locations with movement (B).

Experiment conditions: All tests were compiled with “gcc -O” and run on the same AMD PC under Red Hat Fedora Core 4 Linux.

3. Initial Results and Observations

To normalize all results for comparison, the effective microseconds per object per trial is plotted (Figures 1, 2, and 3). Initially, only the first three methods were tested. The fourth method was proposed after analyzing the initial results.
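The movement test and the per-object normalization can be sketched as follows. This is not the benchmark code used for the figures (which was compiled C); it assumes that each selected object is displaced by a uniform random amount in [-50, 50] meters on each axis, that at least one object moves even when N/4 is below one, and that the normalized value is the elapsed time divided by (objects x iterations). All function and constant names are hypothetical.

```python
import random
import time

REGION = 50_000.0   # meters on a side, as in the experiment design
STEP = 50.0         # maximum displacement per axis (assumed uniform)

def instantiate(n, rng):
    """Operation A: place n objects uniformly at random in the region."""
    return [[rng.uniform(0.0, REGION), rng.uniform(0.0, REGION)] for _ in range(n)]

def movement_step(positions, rng):
    """Operation B applied to a random quarter of the objects."""
    count = max(1, len(positions) // 4)          # assumption: at least one object moves
    for i in rng.sample(range(len(positions)), count):
        positions[i][0] += rng.uniform(-STEP, STEP)   # East
        positions[i][1] += rng.uniform(-STEP, STEP)   # North
    return count

def usec_per_object_per_trial(n, iterations, seed=0):
    """Time the movement test and normalize by objects and iterations."""
    rng = random.Random(seed)
    positions = instantiate(n, rng)
    start = time.perf_counter()
    for _ in range(iterations):
        movement_step(positions, rng)
    elapsed = time.perf_counter() - start
    return elapsed * 1e6 / (n * iterations)
```

Dividing by both the object count and the iteration count puts runs of 2 objects x 20,000,000 iterations and 1,000,000 objects x 20 iterations on the same scale, so a flat curve would indicate that the cost per object does not grow with N.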

Key:
Blue = Instantiation (A)
Violet = Move Update (B)
Green = Find Nearest (D)
Red = All (A, B, D), weighted by relative frequency of operation

Figure 1. Linear-List Benchmark (Baseline)

Figure 2. Matrix Hash

Figure 3. Tree-Hashing Algorithm

The initial results were surprising in some respects. The linear linked list was faster and scaled to large cases better than expected. The matrix hash method was originally set to 256 grid cells, which was expected to be “coarse” for 50 km. However, it fared poorly in sparse situations due to excessive visiting of empty grid cells. Therefore, the 2D matrix was collapsed into a 1D array, such that space is hashed in the East-West direction only. All latitudes are stored at a given longitude hash. This was called the 1D “array hash” and was expected to reduce the area (“squared”) search to a “linear” search operation.

4. Secondary Findings

Again, like the matrix hash, the results of the array hash varied widely with density (Figures 4, 5, and 6). Therefore, a much coarser grid was tried: 16 cells instead of 256. Both the array and matrix methods were tried with 16 cells. The array-hash algorithm then began to approach the operation of a simple linear linked list, and the performance improved dramatically but still was not quite as good overall as the linear list.
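The effect of grid coarseness on the array hash can be illustrated with a small sketch. It does not reproduce the measured results; it only counts how many east-west strips a query has to visit for a given cell count. The class name `ArrayHash`, the per-strip dictionaries, and the clamping of out-of-region positions are assumptions made for this example.

```python
class ArrayHash:
    """Sketch of a 1D array hash: space is hashed east-west (x) only."""

    def __init__(self, region_size, num_cells):
        self.num_cells = num_cells
        self.cell_width = region_size / num_cells
        self.cells = [dict() for _ in range(num_cells)]  # per strip: name -> (x, y)
        self.strip_of = {}                               # name -> current strip index

    def _index(self, x):
        # Hash key from the east-west coordinate; out-of-region values are clamped.
        return min(max(int(x // self.cell_width), 0), self.num_cells - 1)

    def place(self, name, x, y):
        """Insert a new object (A) or update an existing one (B)."""
        new = self._index(x)
        old = self.strip_of.get(name)
        if old is not None and old != new:
            del self.cells[old][name]       # object crossed into another strip
        self.cells[new][name] = (x, y)
        self.strip_of[name] = new

    def find_near(self, x, y, radius):
        """Operation D. Returns (names within radius, number of strips visited)."""
        lo = self._index(x - radius)
        hi = self._index(x + radius)
        r2 = radius * radius
        found = [
            name
            for i in range(lo, hi + 1)      # all latitudes in each strip are scanned
            for name, (px, py) in self.cells[i].items()
            if (px - x) ** 2 + (py - y) ** 2 <= r2
        ]
        return found, hi - lo + 1
```

With 16 strips instead of 256 over the same 50 km region, a query of a given radius touches far fewer strips, but each strip holds more objects, so the per-strip scan moves closer to a scan of the plain linear list.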

Figure 4. Array Hash

Figure 5. Matrix Hash with 16 Grids (Coarse)

Figure 6. Array Hash with 16 Grids (Coarse)

5. Conclusion

In this paper, we studied the efficiency of six variant spatial indexing schemes through their instantiation, maintenance, and access processes. Under the expected usage conditions, none of the hash-based methods tested here consistently outperformed the baseline linear linked-list method in scalability by any significant amount. Much of this result is due to the relatively higher cost of maintaining the hashing structures when large numbers of nodes are moving. If the nodes were relatively stationary, the hash-based methods would certainly offer superior access times; but if the nodes were stationary, few accesses would be required. The unique aspects of mobile network simulations require frequent position updates and accesses. Considering its relative simplicity, the linear list method appears to be the best choice of the methods tested so far for the expected conditions of wireless network simulation. Further work should be done in finding and testing other hashing algorithms, or in designing new spatial hashing algorithms that exhibit lower maintenance costs, which might be more advantageous for mobile network simulations.

6. References

[1] Davis, W. A. and C. H. Hwang, “Organizing and Indexing for Spatial Data,” 2nd International Conference on Spatial Data Handling, Seattle, July 1986.

[2] Harle, Robert K., “Spatial Indexing for Location-Aware Systems,” The First International Workshop on Mobile and Ubiquitous Context Aware Systems and Applications, Philadelphia, PA, August 6, 2007.

[3] Erin Hastings, Jaruwan Mesit, and Ratan Guha, “A Scalable Technique for Large Scale, Real-Time Range Monitoring of Heterogeneous Clients,” 3rd International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities, Orlando, Florida, May 2007.

[4] Mao Huaqing and Bian Fuling, “Design and Implementation of QR+Tree Index Algorithms,” International Conference on Wireless Communications, Networking and Mobile Computing, Shanghai, China, Sept. 2007.

[5] Mathias Eitz and Gu Lixu, “Hierarchical Spatial Hashing for Real-time Collision Detection,” Proceedings of the IEEE International Conference on Shape Modeling and Applications 2007, Lyon, France, May 2007.

[6] Hadjieleftheriou, M., E.G. Hoel, and V.J. Tsotras, “SaIL: a Library for Efficient Application Integration of Spatial Indices,” Proceedings of 16th International Conference on Scientific and Statistical Database Management, Santorini Island, Greece, June 2004.

[7] Brain, M. and A. Tharp, “Perfect Hashing Using Sparse Matrix Packing,” Information Systems, 15(3), 281-290, 1990.

[8] Assarsson, U. and T. Möller, “Optimized View Frustum Culling Algorithms for Bounding Boxes,” Journal of Graphics Tools, 2000.

[9] Gross, M., B. Heidelberger, M. Müller, D. Pomeranets, and M. Teschner, “Optimized Spatial Hashing for Collision Detection of Deformable Objects,” Vision, Modeling, and Visualization, 2003.

[10] Hastings, E. and R. Guha, “Real-Time Range Monitoring Queries on Heterogeneous Mobile Objects by Spatial Hashing,” 2005.

[11] Lo, M. and C. Ravishankar, “Spatial Hash-Joins,” ACM SIGMOD International Conference on Management of Data, 1996.
