
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 54, NO. 9, SEPTEMBER 2008 4069

On the Nearest Neighbor of the Nearest Neighbor in Multidimensional Continuous and Quantized Space

Riccardo Rovatti, Senior Member, IEEE, and Gianluca Mazzini, Senior Member, IEEE

Abstract—The probability that an entity in a set of entities uniformly distributed in space is the nearest neighbor of its nearest neighbor is evaluated for generic distances in a multidimensional environment. Such an expression is then specialized for systems with norm-based distances and for systems with quantized norm-based distance. Examples for scalar products and sup-norm are derived. When applicable, invariances with respect to the underlying distance and entities density are highlighted. Dimensionality effects are investigated.

Index Terms—Dimensionality effect, Euclidean distances, nearest neighbor, Poisson point processes, quantized distances.

I. INTRODUCTION

MANY physical, biological, and even social phenomena as well as an increasing number of artificial systems are modeled by the interaction of entities spread in a physical or abstract space in which distance regulates the ease or importance of interaction (see, e.g., [1]–[3]).

A plethora of abstract signal processing methods also entail a population of points in multidimensional spaces where distance implicitly measures statistical dependence or predictability (see, e.g., the classical “nearest-neighbor” rule for classification [4]).

In all these cases, given an entity, locating its “nearest neighbor” (i.e., the entity at minimum distance from it) is of great interest. The nearest neighbor, in turn, has its own nearest neighbor, which may or may not coincide with the original entity.

The situation in which the two do not coincide may be regarded as “critical” in a sense that depends on the application. If we consider entity interaction, an entity prefers to interact with its nearest neighbor, which instead prefers to interact with a third entity. If we consider interference in wireless communication, an entity transmits to its nearest neighbor while suffering the disturbance of a third entity that is closer to the receiver. Yet, when routing a broadcast message, the fact that the two do not coincide implies that it is easier to forward information outside the small cluster formed by an entity and its nearest neighbor [5], [6].

If we consider a similarity-based classifier, a data point can be classified like its nearest neighbor even if other data points are more similar to that neighbor. Hence, the occurrence of the event under consideration may affect nearest-neighbor-based computations (see, e.g., [7]) and their dual (see, e.g., [8]).

If we decide to build a graph having entities as nodes and a directed edge connecting each entity with its nearest neighbor, the “nearest neighbor of the nearest neighbor” question concerns the possibility of partitioning such a graph into disjoint 2-cycles. Though this can clearly be generalized to longer cycles, we will limit ourselves to this case.

Manuscript received April 24, 2007; revised March 11, 2008. Published August 27, 2008 (projected).

R. Rovatti is with the ARCES, University of Bologna, 40125, Bologna, Italy ([email protected]).

G. Mazzini is with the ENDIF, University of Ferrara, 44100, Ferrara, Italy (e-mail: [email protected]).

Communicated by V. A. Vaishampayan, Associate Editor At Large.
Digital Object Identifier 10.1109/TIT.2008.928246

Fig. 1. (A) A two-dimensional entity layout in which every entity is the nearest neighbor of its nearest neighbor. (B) A two-dimensional entity layout in which no entity is the nearest neighbor of its nearest neighbor.

Yet, this graph-theoretic point of view highlights how the “nearest neighbor of the nearest neighbor” question is linked to the clustering features of the neighboring graph as defined and addressed in abstract and applied treatment of small-world problems (see, e.g., [9], [10], and [11]).

More formally, the general problem addressed here is the computation of the probability $p$ that such a critical situation does not appear when the entities are assumed to be randomly and uniformly distributed in an $n$-dimensional space.

We attack this problem from the statistical point of view since, for a given deterministic layout of entities, it can be solved by exhaustive inspection and the two limit cases can be trivially built.

In fact, it is easy to conceive a set of entities grouped in pairs such that the two entities of each pair are very close to each other while all pairs are scattered in space very far from each other. This is a layout (see Fig. 1(A)) in which each entity is the nearest neighbor of its nearest neighbor (in statistical terms, $p = 1$).

Moreover, even in the trivial one-dimensional case, positioning infinitely many entities along a line so that the gaps between consecutive entities keep shrinking gives a layout in which each entity has the next one as its nearest neighbor, while the nearest neighbor of the latter is the one after it. This is a layout (see Fig. 1(B)) in which no entity is the nearest neighbor of its nearest neighbor (in statistical terms, $p = 0$).
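The exhaustive inspection mentioned above can be sketched in a few lines. The following is only an illustration under assumed layouts (the coordinates, the function names, and the use of the Euclidean distance are choices made here, not taken from the paper): a set of tight, well-separated pairs, and a finite chain whose gaps halve at every step. Being finite, the chain cannot reach the limit case exactly, since its last two entities are necessarily mutual nearest neighbors.

```python
from math import dist


def nearest_neighbor(points, i):
    """Index of the point closest to points[i] (ties broken by lowest index)."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: dist(points[i], points[j]))


def mutual_fraction(points):
    """Fraction of points that are the nearest neighbor of their own nearest neighbor."""
    hits = 0
    for i in range(len(points)):
        j = nearest_neighbor(points, i)
        if nearest_neighbor(points, j) == i:
            hits += 1
    return hits / len(points)


# Hypothetical layout in the spirit of Fig. 1(A): tight pairs, far from each other.
pairs = [(100.0 * k + offset, 0.0) for k in range(5) for offset in (0.0, 1.0)]

# Hypothetical layout in the spirit of Fig. 1(B): gaps halve along a line, so the
# nearest neighbor of each entity is the next one; only the last two are mutual.
chain = [(1.0 - 2.0 ** -j,) for j in range(10)]

print(mutual_fraction(pairs))
print(mutual_fraction(chain))
```

Lengthening the chain drives the second fraction toward zero, which is the sense in which the infinite layout realizes the limit case.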


The paper gives a quite general framework for the probability evaluation, depending on a general distance function in an $n$-dimensional Euclidean space that we identify with $\mathbb{R}^n$.

The adoption of different distances can be used to model different applications. Referring to the examples above, in fact, one may want to measure the electromagnetic distance for radio communication but also the number of hops needed to go from one entity to the other in a routing scenario. Also, different measures of similarity can be employed in a nearest neighbor classifier.

The general formula is specialized for the case of distances based on norms and, in turn, for norms based on scalar products and for the sup-norm.

The same general formula is also specialized for the case of distances obtained by quantizing a norm with a certain step and, in turn, for quantized norms based on scalar products and for the quantized sup-norm.

In all cases, the dimensionality effect is investigated by letting the number of dimensions grow to infinity. As a result, it is highlighted how the presence of a quantization leads to a completely different asymptotic behavior.

The paper is structured as follows. In Section II, some common definitions about metric spaces are recalled with a particular emphasis on the use of norms and quantized norms to define distance functions. Section III recalls the well-known link between uniform distribution over space and Poisson distribution.

The theoretical evaluation of $p$ is done in Section IV by means of an abstract and general approach based on the definition of proper metric events. Results of this section are given in terms of a multidimensional averaging operator and need further specialization to be applicable. At the end of the section, a discussion on the role of the distance function and its interaction with the definition of the events is given to highlight the key points that govern the dependence of $p$ on the problem parameters.

The specialization for distances based on norms is presented in Section V, where the generic results of Section IV are recast in terms of a single $(n-1)$-dimensional averaging operator. In the same section, we specialize this for a norm implied by a scalar product and for the sup-norm. Asymptotic cases are analyzed, revealing that in both cases $p$ tends to $1/2$.

A similar path is taken in Section VI, where distances obtained by quantizing a norm are analyzed. In this case, the analytical expression cannot be simplified further than a series of integrals of $(n-1)$-dimensional averaging operators for which convergence is analyzed. In the same section, we specialize this for a norm implied by a scalar product and for the sup-norm. Asymptotic cases are analyzed, revealing that in both cases $p$ tends to $1$.

Finally, in Section VII, some conclusions are drawn. Note that a few details regarding derivations are reported in the two appendices.

II. DISTANCES

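We assume that the distance function $d : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^+$ satisfies the classical distance-defining axioms for all $x, y, z \in \mathbb{R}^n$:

• $d(x,y) \ge 0$ and $d(x,y) = 0$ iff $x = y$;
• $d(x,y) = d(y,x)$;
• $d(x,z) \le d(x,y) + d(y,z)$;

as well as the following translation-invariance equality:

$$d(x + z, y + z) = d(x, y)$$

We indicate with $B(x,r)$ the ball centered on $x$ and radius $r$, i.e.,

$$B(x,r) = \{ y \in \mathbb{R}^n : d(x,y) \le r \}$$

and with $\mathring{B}(x,r)$ the inner ball of the same radius, i.e.,

$$\mathring{B}(x,r) = \{ y \in \mathbb{R}^n : d(x,y) < r \}$$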

The special case with deserves a special symbol.

In the following, we assume to deal with subsets of $\mathbb{R}^n$ that feature only an integer number of dimensions. For any such subset $A$ we define $\mu(A)$ to be its unique finite and nonvanishing Hausdorff measure, i.e., the length for one-dimensional subsets, the area for two-dimensional ones, etc., independently of the space it belongs to.

A. Norm-Based Distances

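A class of distances satisfying the above requirements is that of norm-based distances. To define them we may resort to a norm $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}^+$ that must satisfy the norm-defining axioms for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$:

• $\|x\| \ge 0$ and $\|x\| = 0$ iff $x = 0$;
• $\|\alpha x\| = |\alpha| \, \|x\|$;
• $\|x + y\| \le \|x\| + \|y\|$;

and define

$$d(x,y) = \|x - y\|$$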

From the scaling property of the norm we get. If we indicate with the frontier of ,

the same scaling property gives .

B. Quantized Distances

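In this case, we assume that a generic distance $d$ is quantized with step $\Delta$, i.e., that we measure the distance between two entities $x$ and $y$ as

$$d_\Delta(x,y) = \left\lceil \frac{d(x,y)}{\Delta} \right\rceil$$

where $\lceil \cdot \rceil$ gives the least integer not smaller than its argument.

We may prove that $d_\Delta$ is a proper distance by verifying the various axioms.

• $d_\Delta(x,y) = 0$ iff $x = y$. In fact, $d_\Delta(x,y) = 0$ iff $\lceil d(x,y)/\Delta \rceil = 0$ iff $-1 < d(x,y)/\Delta \le 0$. Since $d$ is a distance it is nonnegative and we must have $d(x,y) = 0$, which is equivalent to $x = y$.

• $d_\Delta(x,y) = d_\Delta(y,x)$ trivially follows from the definition and from the property of the underlying distance $d$.

• $d_\Delta(x,z) \le d_\Delta(x,y) + d_\Delta(y,z)$. In fact,

$$\left\lceil \frac{d(x,z)}{\Delta} \right\rceil \le \left\lceil \frac{d(x,y) + d(y,z)}{\Delta} \right\rceil \le \left\lceil \frac{d(x,y)}{\Delta} \right\rceil + \left\lceil \frac{d(y,z)}{\Delta} \right\rceil$$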


From now on, only one distance function will be considered at each time and we will drop the notation for quantized distance that will be defined and indicated simply by $d$.
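The axiom check above can also be probed numerically. The sketch below assumes that quantization means rounding the ratio between the underlying distance and the step up to the next integer; the function names, the step value, and the choice of the sup-norm as underlying distance are illustrative assumptions rather than the paper's notation.

```python
import math
import random


def quantized(distance, step):
    """Wrap a distance so that it returns the number of steps, rounded up (assumed form)."""
    def d_q(x, y):
        return math.ceil(distance(x, y) / step)
    return d_q


def sup_norm(x, y):
    """Underlying distance used for the illustration: largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


def check_axioms(d, points):
    """Check identity, symmetry, and triangle inequality on every triple of sample points."""
    for x in points:
        for y in points:
            if (d(x, y) == 0) != (x == y) or d(x, y) != d(y, x):
                return False
            for z in points:
                if d(x, z) > d(x, y) + d(y, z):
                    return False
    return True


rng = random.Random(0)
sample = [tuple(rng.uniform(-3, 3) for _ in range(2)) for _ in range(25)]
d_q = quantized(sup_norm, step=0.7)
print(check_axioms(d_q, sample))
```

A random sample cannot prove the axioms, of course; it only gives a quick sanity check of the argument on concrete points.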

III. UNIFORM DISTRIBUTION OF ENTITIES

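We assume that the entities are distributed according to a Poisson point process with density $\rho$, i.e., that the probability of having $k$ entities in any region $A$ with measure $\mu(A)$ is

$$\Pr\{k \text{ entities in } A\} = \frac{(\rho\,\mu(A))^k}{k!}\, e^{-\rho\,\mu(A)} \qquad (1)$$

Intuitively speaking, this is equivalent to assuming that the entities are “uniformly and independently distributed” in $\mathbb{R}^n$. In fact, assume that $N$ entities are distributed on a domain $D$ with measure $\mu(D)$ and consider a region $A \subseteq D$, with measure $\mu(A)$. The probability that exactly $k$ entities can be found in $A$ is the binomial

$$\binom{N}{k} \left(\frac{\mu(A)}{\mu(D)}\right)^{k} \left(1 - \frac{\mu(A)}{\mu(D)}\right)^{N-k} \qquad (2)$$

Let now $N \to \infty$ and $\mu(D) \to \infty$ while keeping $N/\mu(D) = \rho$. The limiting behavior of (2) is (1).

Once (1) is assumed, the probability of having no entity in $A$ is $e^{-\rho\,\mu(A)}$, whose complement $1 - e^{-\rho\,\mu(A)}$ is the probability of having at least one entity in $A$.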
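The passage from the binomial count to the Poisson law can be observed numerically. The following sketch uses hypothetical values for the density, the region measure, and the count; it grows the number of entities together with the domain measure so that their ratio stays fixed.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n_entities, domain_measure, region_measure):
    """Probability that k of n_entities uniform entities fall in the region."""
    q = region_measure / domain_measure
    return comb(n_entities, k) * q ** k * (1.0 - q) ** (n_entities - k)


def poisson_pmf(k, density, region_measure):
    """Probability of k entities in the region under a Poisson point process."""
    mean = density * region_measure
    return mean ** k / factorial(k) * exp(-mean)


# Illustrative values, not taken from the paper.
density, region_measure, k = 2.0, 1.5, 3
for n_entities in (10, 100, 10_000):
    domain_measure = n_entities / density  # keeps entities per unit measure fixed
    print(n_entities, binomial_pmf(k, n_entities, domain_measure, region_measure))
print("Poisson", poisson_pmf(k, density, region_measure))
```

The binomial values approach the Poisson one as the domain grows, which is the limiting behavior stated in the text.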

As a final remark, note that from the fact that the distribution of entities is given by a Poisson point process, it follows that, without loss of generality, we may assume that the entity for which we want to know if it is the nearest neighbor of its nearest neighbor is located at the origin. In fact, assuming that an entity is at the origin would imply conditioning the Poisson distribution to that event, i.e., the use of the corresponding Palm distribution. Yet, from Slivnyak's theorem [12] we know that for Poisson point processes, such a distribution coincides with the original one.

IV. EVALUATION OF $p$

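Since we assume that distances are invariant by translation and thanks to the above-mentioned property of Poisson point processes, we may assume without loss of generality that the first entity coincides with the origin to write the formal definition

$$p = \Pr\left\{ \nexists\, x' \ \text{s.t.}\ \left[\nexists\, y \ \text{s.t.}\ d(0,y) < d(0,x')\right] \wedge \left[\exists\, x'' \ \text{s.t.}\ d(x',x'') < d(x',0)\right] \right\} \qquad (3)$$

where $x'$, $x''$, and $y$ range over the entities other than the origin. This can be read as meaning that there is no nearest neighbor $x'$ of the origin that has a nearest neighbor $x''$ which is not the origin (since $d(x',x'') < d(x',0)$). Note that the use of strict inequality implies that, if multiple entities exist at the same minimum distance from the reference one, all of them are labeled as “nearest neighbors.”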
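The role of ties can be made concrete with a small sketch. It assumes integer coordinates and the sup-norm, so that equal distances actually occur; the point set and the function names are hypothetical. An entity passes the check when none of its nearest neighbors has some other entity strictly closer than the entity itself.

```python
def chebyshev(x, y):
    """Sup-norm distance, which takes integer values on integer coordinates."""
    return max(abs(a - b) for a, b in zip(x, y))


def nearest_set(points, i, d):
    """All indices at the minimum distance from points[i] (ties included)."""
    others = [j for j in range(len(points)) if j != i]
    best = min(d(points[i], points[j]) for j in others)
    return [j for j in others if d(points[i], points[j]) == best]


def is_nn_of_all_its_nn(points, i, d):
    """True if no nearest neighbor of i has an entity strictly closer than i."""
    return all(i in nearest_set(points, j, d) for j in nearest_set(points, i, d))


# Three mutually tied entities plus one outlier (illustrative layout).
grid = [(0, 0), (1, 0), (0, 1), (3, 0)]
print(nearest_set(grid, 0, chebyshev))
print([is_nn_of_all_its_nn(grid, i, chebyshev) for i in range(len(grid))])
```

In this layout the first three entities are tied nearest neighbors of one another, while the outlier fails because its only nearest neighbor has closer entities.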

Assume now that a nearest neighbor $x'$ of the origin is given. For the origin to be the nearest neighbor of $x'$ it is necessary and sufficient that no entity appears in the inner ball $\mathring{B}(x', d(x',0))$ centered on $x'$ with radius $d(x',0)$. Yet, since $x'$ is a nearest neighbor of the origin we already know that no entity exists in the inner ball $\mathring{B}(0, d(0,x'))$. Summarizing, we get that, conditioned to the knowledge of a nearest neighbor $x'$, the event whose probability is $p$ is to have no entities in $\mathring{B}(x', d(x',0)) \setminus \mathring{B}(0, d(0,x'))$, where the use of inner balls follows from the strict inequality in the definition (3). The general situation is exemplified in Fig. 2, where distance-induced balls are represented as circles for the sake of simplicity, and a shaded area highlights the domain in which an entity could appear to be the nearest neighbor of $x'$ instead of the origin.

Fig. 2. The balls involved in the definition of $x'$ as a nearest neighbor of the origin and the domain in which no entity can appear if we want the origin to be the nearest neighbor of $x'$.

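As discussed above, our assumption of a Poisson point process implies that such a probability can be computed as

$$e^{-\rho\,\mu\left(\mathring{B}(x', d(x',0)) \setminus \mathring{B}(0, d(0,x'))\right)}$$

Since the conditioning $x'$ is not known but is a random entity we get

$$p = \int_{\mathbb{R}^n} e^{-\rho\,\mu\left(\mathring{B}(x', d(x',0)) \setminus \mathring{B}(0, d(0,x'))\right)} f_X(x')\, dx'$$

where $f_X(x')$ is the probability density function of the nearest neighbors of the origin. To compute it we further condition it as $f_X(x') = \int_0^\infty f_{X|R}(x' \mid r)\, f_R(r)\, dr$ where $f_R(r)$ is the probability density function of the distance of the nearest neighbor of the origin from the origin itself.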

As far as the distribution of this distance is concerned, let $F_R(r)$ be the probability that the distance of the nearest neighbor of the origin from the origin itself is not larger than $r$. Since for such an event to occur it is necessary and sufficient that at least one entity has a distance from the origin that is not larger than $r$, we may write

$$F_R(r) = 1 - e^{-\rho\,\mu(B(0,r))}$$

and thus

$$f_R(r) = \frac{d F_R(r)}{dr} = \rho\, \frac{d\mu(B(0,r))}{dr}\, e^{-\rho\,\mu(B(0,r))}$$

i.e., the well-known probability density function of nearest neighbors in Poisson point processes in which the dependency on the measure of the distance-induced ball is highlighted.

As far as is concerned, conditioning ensuresthat . Yet, since neighborhooddepends only on distance, all points in have the sameprobability of being a nearest neighbor of the origin. Hence,

What we get putting all together is given in the equation atthe top of the following page.

To highlight the geometric nature of the inner integral wedefine the averaging operator that acts on a generic function

over a domain to yield


With this

(4)

This expression is the starting point of all subsequent derivations that specialize it and investigate its asymptotic trend when $n \to \infty$.

Nevertheless, it is here possible to give an immediate and completely general property of $p$. In fact, we may exploit the monotonic behavior of the average operator, of the exponential, and of measures to bound

Since the measure of a ball is independent of its center we get

that can be brought further noting that and thus

which indicates that, independently of distance function and dimensionality, the chance that an entity is the nearest neighbor of its nearest neighbor is larger than $1/2$.
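A quick Monte Carlo experiment gives a feel for this probability in the plane. The sketch below makes several simplifying assumptions that are not in the paper: a fixed number of uniform points stands in for the Poisson process, distances are Euclidean, and the unit square is wrapped into a torus to avoid boundary effects. Function names and parameter values are illustrative.

```python
import random


def torus_distance_sq(a, b):
    """Squared Euclidean distance on the unit torus (wrap-around edges)."""
    total = 0.0
    for u, v in zip(a, b):
        delta = abs(u - v)
        delta = min(delta, 1.0 - delta)
        total += delta * delta
    return total


def estimate_mutual_probability(n_points, n_dims, seed=0):
    """Fraction of sampled points that are the nearest neighbor of their nearest neighbor."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    nn = []
    for i, p in enumerate(pts):
        nn.append(min((j for j in range(n_points) if j != i),
                      key=lambda j: torus_distance_sq(p, pts[j])))
    mutual = sum(1 for i in range(n_points) if nn[nn[i]] == i)
    return mutual / n_points


estimate = estimate_mutual_probability(n_points=800, n_dims=2)
print(estimate)
```

For planar Euclidean distance the estimate should fluctuate around the classical value $6\pi/(8\pi + 3\sqrt{3}) \approx 0.6215$, comfortably above one half.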

V. NORM-BASED DISTANCES

In this case, the scaling property of norms implies that all values in $[0, \infty)$ can be assumed by the distance function. Since the entities are uniformly distributed, they lie in all possible relative positions and the support of the nearest-neighbor distance density is the same interval $[0, \infty)$.

As far as the balls are concerned, the fact that a distance can assume any real value in $[0, \infty)$ implies that . We also have that is the frontier of .

From the scaling property of the norm we also have

With this we get

and thus

(5)

Symmetries in the shape of the ball (and thus of its frontier) may cause the average to simplify substantially. In the Euclidean distance case, for example, the measure of the intersection is independent of the point chosen on the frontier and the average disappears. Distances that are invariant for permutation of axes also simplify the computation.

A. Norms Induced by Scalar Products

Let us restrict our considerations to norms that are implied byscalar products as in

Since , the scalar product is defined by a symmetric,positive-definite matrix such that .

Since can be diagonalized by an orthonormal change of co-ordinates that, in turn, does not affect the uniform distributionof points , we may focus on a diagonal whose positive diag-onal entries ( ) are such that

With this, is nothing but an -dimensional ellipsoidwith semi-axes parallel to the coordinate axes and with length

.Hence, given such that , we must compute

in which we may apply the substitutions andto obtain


Fig. 3. $p$ as a function of $n$ for distances based on scalar products in continuous space.

which is the measure of the intersection of two -dimensionalEuclidean spheres with radius whose centers are placed at unitdistance one from the other.

Exploiting the result in the Appendix we get

where .Since in this case we have that

Hence, the average over entailed by (5) can be identifiedwith the value of the function to be averaged at any point toyield

(6)

that is independent of the actual scalar product considered. As an example and . In Fig. 3, the plot of $p$ as a function of $n$ is reported.

Note the asymptotic trend toward $1/2$, whose theoretical explanation and interpretation are given in Section V-C.
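The dimensional trend can be reproduced with a short computation for the plain Euclidean distance. This is a sketch derived here from the general argument rather than a transcription of (6): it assumes that the probability equals $1/(2-\gamma_n)$, where $\gamma_n$ is the measure of the lens shared by two unit balls whose centers are at unit distance, divided by the measure of one ball. The lens is computed as two spherical caps of height one half, and all names are illustrative.

```python
from math import exp, lgamma, pi, sqrt


def cap_integral(n_dims, intervals=20000):
    """Simpson estimate of the integral of (1 - t^2)^((n-1)/2) over [1/2, 1]."""
    a, b = 0.5, 1.0
    h = (b - a) / intervals
    power = (n_dims - 1) / 2.0

    def g(t):
        # Clamp at zero so rounding near t = 1 never yields a negative base.
        return max(0.0, 1.0 - t * t) ** power

    total = g(a) + g(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def overlap_ratio(n_dims):
    """Measure of the lens of two unit balls at unit distance, over one ball."""
    # V_{n-1} / V_n for unit Euclidean balls, via log-gamma for numerical safety.
    ratio = exp(lgamma(n_dims / 2 + 1) - lgamma((n_dims + 1) / 2)) / sqrt(pi)
    return 2.0 * ratio * cap_integral(n_dims)


def mutual_probability(n_dims):
    """Assumed closed form: 1 / (2 - overlap ratio)."""
    return 1.0 / (2.0 - overlap_ratio(n_dims))


for n in (1, 2, 3, 10, 50):
    print(n, round(mutual_probability(n), 4))
```

Under these assumptions the values start at $2/3$ on the line and decrease toward $1/2$, since the lens becomes negligible with respect to the ball as the dimension grows.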

B. Sup-Norm Distance

As an example, consider the distance

$$d(x,y) = \max_{i=1,\dots,n} |x_i - y_i|$$

where $x_i$ is the $i$th coordinate of $x$ and $y_i$ the $i$th coordinate of $y$.

In this case, is a hypercube of side centered at the origin . The frontier is its external surface which is made of $2n$ faces, each of which is an $(n-1)$-dimensional hypercube of side centered in a point of the kind .

All faces are indistinguishable and thus we may limit our integral average to only one of them, namely, the one centered at . Hence, we will have .

The intersection is always a parallelepiped whose first side has length while the other sides have length . With this we have

where we have substituted for in the last equality.Note now that, given the polylogarithmic function

that is, such that then wemay define

(7)


Fig. 4. $p$ as a function of $n$ for distances based on the sup-norm, compared with scalar product, in continuous space.

such that

With this, we may consider the generic -binary tupleits height to

write

From (7) we get• for

;•

andfor ;

• for .Considering that there are binary -tuples of

height we finally obtain

and thus

(8)

As an example, and . In Fig. 4, the case with is reported and compared to the scalar product. The asymptotic trend toward $1/2$ is discussed in Section V-C.

C. Asymptotic Behavior

Since both the scalar product and the sup-norm implicitly allow the definition of a coherent norm for any $n$, we may investigate the limit for $n \to \infty$.

In the scalar-product case we have

that can be plugged into (6) to yield $p \to 1/2$.

In the sup-norm case, we begin by noting that

that can be compared with (8) to yield, for any finite


Let us now assume and note that each of the summandsin the first (finite) sum vanishes for . Hence

In essence

$$\lim_{n \to \infty} p = \frac{1}{2} \qquad (9)$$

Note how this result gives ground to the empirical observations we made when commenting on Figs. 3 and 4.

VI. QUANTIZED NORM-BASED DISTANCES

We assume that is derived from a norm-based distance .In this case

(10)

where is the probability that the distance between and itsnearest neighbor is exactly , i.e.,

(11)

where we have highlighted which is noneother than the average number of entities in the sphere of radiusequal to the quantization step.

From (10) we also get that the only values of that appearin the integral (4) are . For those values we have

so that

We may then recast (4) into

(12)

where and, for , we may expand the average as

Note now that, due to the scaling property of norm-inducedballs, . Hence

in which we may substitute to obtain

(13)

Plugging (11) and (13) into (12) we finally obtain (14) shown atthe top of the following page.


(14)

When concerned with the number of terms to consider to compute (14) one may observe that, if indicates the result obtained considering only the first terms of the series, then (12) implies as well as

where we exploited the fact that is the average of a quantity that is always less than unity and thus that . If we further resort to (11), we finally get .

With this, if we want to compute $p$ numerically with an error not exceeding a prescribed we may choose as

A. Quantized Scalar–Product Distances

In the following, we specialize (14) when the underlying dis-tance is induced by a scalar product that, according to the abovediscussion, we assume defined by the diagonal matrix withnonvanishing entries .

To compute , we may start with the numeratorand, given such that (and thus such that

), we compute

Following the same path as in the nonquantized case wereadily obtain

and thus, since

which is independent of and . Hence, the whole average in(14) is equal to .

This considered we get (15), shown at the bottom of the page, again independent of the actual scalar product employed.

To simplify the plot, we consider , where . Fig. 5 reports $p$ as a function of $n$ by varying in the range from to . It is interesting to observe that the performance tends to $1$. The theoretical discussion of this behavior is addressed in Section VI-C.

B. Quantized Sup-Norm Distance

Assume now

where is the th coordinate of and the th coordinate of.In this case, is a hypercube of side centered at the origin

and .As far as the average is concerned, is the external

surface of a hypercube of side which is made of faces,each of which is an -dimensional hypercube of the sameside centered in a point of the kind .

All faces are indistinguishable and thus the integral overis times the integral over the one centered in

on which . For such an ,the intersection is always a parallelepiped whosefirst side has length while the other sides have length

.With this we have

where we have set to proceed with

where we have substituted for in the last equality.The preceding integral can be computed by defining the spe-

cialized hypergeometric function andnoting that

(15)


Fig. 5. $p$ as a function of $n$ for a quantized Euclidean distance, for several values of the quantization-dependent parameter.

can be directly checked to be such that

With this, we may consider the generic -binary tuple, its height , and

the corresponding two-values tupleifif

to write

Since depends only on the product of itsarguments, depends only on so that

Plugging all this into (14) we have

Note now that and that, by the definition ofthe hypergeometric function and by the finiteness of the integral,we can write


Fig. 6. $p$ as a function of $n$ for a quantized sup-norm distance, for several values of the quantization-dependent parameter.

where a relationship between and is present anddue to the integral application.

The final expression of results in

(16)

C. Asymptotic Behavior

The most noteworthy asymptotic feature of the quantized case is that, when the number of dimensions grows to infinity, $p$ grows to unity. This is apparent from the plots in Figs. 5 and 6 and, though quantized cases depend on and , this trend seems to be a general property. Note in (14) that the effect of and is concentrated in that, depending on the underlying norm and on the quantization step, exhibits different asymptotic trends when $n \to \infty$.

If we consider a scalar-product norm we have

so that independently of .If we consider a norm we have that

and thus that if and if .

1) The Case : In this case from (14) we get

that can be bounded from above discarding negative quantities on the right, so that

Since we also have this implies .

2) The Case : Let us start from (14) and observe that

though, in this case, it vanishes when . With this we may bound from below as

Moreover, since the function to be aver-

aged is always larger than and so is its average. There-fore

(17)

where we have implicitly defined .


Let us now define and as the integer such that . Since all summands in (17) are positive we may write

(18)

Define now to be the best integer approximation of and its error.

Note that

(i.e., tends to be an integer) only when we adopt a sup-norm and for some positive integer . In the following, we will exclude this case and assume that either the limit above does not exist or it is non-null.

Hence, an increasing sequence of indices must exist such that , and the subsequence for any . For that subsequence we may compute

where we have first exploited the fact that, since is the integerpart of then for any , and then we have usedthe properties of the subsequence .

Continuing we have

where we have noted that in our two cases is either asymp-totically constant or vanishes at most as . Following exactlythe same path, we may prove that the same subsequence is suchthat

so that, by substitution into (18)

from which we finally get

for a wide range of settings.In essence, under our slightly restricting conditions on the

asymptotics of

(19)

in which the is strengthened to a pure when we dealwith a sup-norm with .

VII. CONCLUSION

In summary, the results we obtained on $p$ are the following:

• a general formula taking into account general distance functions (4);

• the specialization of the above formula in the case of distances based on norms (5) that is independent of the density with which entities are deployed;

• the specialization of (5) for norms implied by scalar products (6) that is independent of the actual scalar product considered;

• the specialization of (5) for the sup-norm (8);

• the fact that the effect of dimensionality makes both (6) and (8) tend to $1/2$;

• the specialization of the above formula in the case of distances based on quantizing norms with a given step (14) that depends on a compound quantity which is nothing but the average number of entities in the ball of radius equal to the quantization step;

• the specialization of (14) for norms implied by scalar products (15) that depends on that quantity but is independent of the actual scalar product considered;

• the specialization of (14) for the sup-norm (16); and

• the fact that the effect of dimensionality makes both (15) and (16) tend to $1$, though paths to reach this limit may differ: either each ball of unit radius grows to infinity and engulfs all entities, which happen to be only one step away from each other, or it shrinks, causing all points to be infinitely and equally distant.

Further to these formal results, and partially supported by what we got so far, we may advance a few conjectures.

First, we expect that, given entities distributed according to a Poisson point process, the asymptotic results distinguishing quantized distances from nonquantized distances hold independently of the distance function itself. Second, assume that the point process describing the entity distribution is modeled with a generic function giving the probability that $k$ entities are present in the region $A$. Among all possible models, we may choose those in which this probability depends only on $\mu(A)$ and not on the specific $A$ (this is obviously the case for Poisson point processes as modeled in (1)).

In this framework, dimensionality effects are concentrated in the evaluation of the Hausdorff measure and we conjecture that this is enough to preserve the asymptotic trend we found in the Poisson case.

APPENDIX

We compute here the measure of the intersection of two $n$-dimensional Euclidean spheres of radius whose centers are apart.

Without loss of generality, we assume that one of the spheres is centered in the origin, while the other is centered in .

We should compute


If we set to be the measure of the-dimensional Euclidean sphere of unit radius, we have that the

inner integral is nothing but . Hence

in which we may substitute to obtain

with the implicit definition

with the noteworthy particular cases

and the general recursion on

ACKNOWLEDGMENT

The authors are grateful to one of the anonymous reviewers whose comments and suggestions helped them to substantially improve the correctness and the readability of the manuscript.

REFERENCES

[1] G. Csányi and B. Szendroi, “Structure of a large social network,” Phys. Rev. E, vol. 69, pp. 036131–036131, 2004.

[2] M. Mauve, J. Widmer, and H. Hartenstein, “A survey on position-based routing in mobile Ad Hoc networks,” IEEE Network Mag., vol. 15, no. 6, pp. 30–39, Nov./Dec. 2001.

[3] T. Vicsek, A. Czirok, E. B. Jacob, I. Cohen, and O. Schochet, “Novel type of phase transitions in a system of self-driven particles,” Phys. Rev. Lett., vol. 75, pp. 1226–1229, 1995.

[4] T. M. Cover and P. E. Hart, “Nearest neighbor pattern classification,” IEEE Trans. Inf. Theory, vol. IT-13, no. 1, pp. 21–27, Jan. 1967.

[5] G. Jakllari, S. V. Krishnamurthy, M. Faloutsos, and P. V. Krishnamurthy, “On broadcasting with cooperative diversity in multi-hop wireless networks,” IEEE J. Sel. Areas Commun., vol. 25, no. 2, pp. 484–496, Feb. 2007.

[6] A. Helmy, “Small worlds in wireless networks,” IEEE Commun. Lett., vol. 7, no. 10, pp. 490–492, Oct. 2003.

[7] H. Wang, “Nearest neighbors by neighborhood counting,” IEEE Trans. Pattern Anal. Machine Intell., vol. 28, no. 6, pp. 942–953, Jun. 2006.

[8] Y. Tao, M. L. Yiu, and N. Mamoulis, “Reverse nearest neighbor search in metric spaces,” IEEE Trans. Knowledge and Data Eng., vol. 18, no. 9, pp. 1239–1252, Sep. 2006.

[9] X. F. Wang and G. Chen, “Complex networks: Small-world, scale-free and beyond,” IEEE Circuits Syst. Mag., pp. 6–20, First Quarter, 2003.

[10] P. Androutsos, D. Androutsos, and A. N. Venetsanopoulos, “Small world distributed access of multimedia data,” IEEE Signal Process. Mag., pp. 142–153, Mar. 2006.

[11] P. Holme and B. J. Kim, “Growing scale-free networks with tunable clustering,” Phys. Rev. E, vol. 65, pp. 026107–026107, 2001.

[12] I. M. Slivnyak, “Some properties of stationary flows of homogeneous random events,” Theory Probab. Its Applic., vol. 7, pp. 336–341, 1962.
