Master Thesis Hierarchical Data Structure and Algorithms for Range Data Management Hoang-Phong Thuy Nguyen Human Computer Interaction and Robotics UNIVERSITY OF SCIENCE AND TECHNOLOGY February 2013


Master Thesis

Hierarchical Data Structure and Algorithms for Range Data Management

Hoang-Phong Thuy Nguyen

Human Computer Interaction and Robotics

UNIVERSITY OF SCIENCE AND TECHNOLOGY

February 2013

Hierarchical Data Structure and Algorithms for Range Data Management

Hoang-Phong Thuy Nguyen

A Thesis Submitted in Partial Fulfillment of

Requirements for the Degree of Master

February 2013

UNIVERSITY OF SCIENCE AND TECHNOLOGY

Human Computer Interaction and Robotics

Supervisor: Dr. Jinwook Kim


We hereby approve the M.S. thesis of “Hoang-Phong Thuy Nguyen”

February 2013

Dr. Heedong Ko

Chairman of Thesis Committee

Dr. Jinwook Kim

Thesis Committee Member

Dr. Hwasup Lim

Thesis Committee Member

UNIVERSITY OF SCIENCE AND TECHNOLOGY

Acknowledgments

I would like to express my gratitude to my advisor, Dr. Jinwook Kim, who gave me the chance to work in his research group and offered much useful advice on both research and conduct.

I would also like to thank Dr. Heedong Ko and Dr. Hwasup Lim for agreeing to serve on my thesis examination committee and for their many suggestions about my work.

I would like to express my deep regard for my labmates, Sungkuk Chun, HeyongRae Choi, Irina Rogoza and Eyob Tarekgn, for their help and many fruitful discussions. I am especially indebted to Seungpyo Hong for the work we did together.

I am also grateful to the KIST-UST lecturers who motivated me throughout my master's studies.

The financial support of the International R&D Academy, KIST, and the assistance of its staff are gratefully acknowledged.

Many special thanks go to my Vietnamese and foreign friends at KIST, who turned my master's years into a pleasant time.

Last but not least, I would like to thank my parents, my elder sister and her little family, as well as my darling girlfriend, Ms. Le Hai Yen, for their lifelong love and support.

ABSTRACT*

Hierarchical Data Structure and Algorithms

for Range Data Management

RGBD cameras, which have appeared in recent years, provide rich visual and range information about indoor environments at common frame rates. In 3D reconstruction with large-scale data, full pair-wise matching and conventional data structures such as the k-d tree become inefficient, at least in terms of computational time. To solve that problem, the thesis proposes a combined hierarchical data structure: a feature descriptor hierarchical tree for RGB images and a bounding sphere hierarchical tree for depth images. The former is built from all visual feature descriptors, while the latter is constructed from the spheres minimally bounding the 3D point sets converted from range images. This data management is expected to handle loop detection effectively, support global registration, accelerate rendering, and be usable in on-the-fly applications.

* A thesis submitted to the committee of the University of Science and Technology in partial fulfillment of the requirements for the degree of Master conferred in February 2013

Contents

1. Introduction
2. Background
2.1 Input Data
2.2 3D Modeling Procedure
2.2.1 Acquisition
2.2.2 Registration
3. Large RGBD Data Management
3.1 Related Works
3.2 Overview of Proposed Data Structure
3.3 Feature Descriptor Hierarchical Tree
3.3.1 Visual Feature Extraction and Description
3.3.2 Visual Feature Matching
3.3.3 Rough Transform Computation
3.4 Bounding Sphere Hierarchical Tree
3.4.1 Bounding Sphere
3.4.2 Construction of BSHT
4. Experiment Results and Conclusions
4.1 RGBD Datasets and Benchmark
4.2 Results
4.3 Conclusion
Bibliography


List of Figures

Figure 1 An example of 3D reconstruction
Figure 2 A sample of input data
Figure 3 A sample of RGBD frame
Figure 4 An example of using bilateral filter
Figure 5 A pose-graph representation of a SLAM process
Figure 6 An example pipeline using the proposed data structure
Figure 7 An example of back-projection equation usage
Figure 8 SIFT features of a color image
Figure 9 An example of visual feature matching
Figure 10 Bounding sphere of a 3D point cloud
Figure 11 An example of BSHT insertion construction
Figure 12 5 different construction algorithms: volume and time
Figure 13 An example of 3-level BSHT

1. Introduction

Figure 1: Point cloud reconstruction of Trafalgar Square from several thousand Internet photos. The reconstructed cameras are shown as black wire-frame pyramids. Inset: one of the input photos taken from approximately the same viewpoint. Courtesy of [32]

Have you seen anything like Figure 1 before? It is a three-dimensional (3D) reconstruction. Models range from small objects to large scenes, and many approaches using different devices have been proposed to suit the characteristics of each object or scene. Active methods interfere with the object: a laser rangefinder or another device emits light, microwaves or ultrasound and measures the reflected part. Passive methods only use sensors to measure the radiance reflected or emitted by the object, most commonly visible light. Good 3D reconstruction often requires input data that includes range images from the former and color images from the latter. Recently, both have been provided simultaneously by inexpensive RGBD (red-green-blue-depth) cameras such as the Microsoft Kinect and the Asus Xtion Pro Live.

The thesis deals with intelligent large-scale data management for modeling static indoor environments. 3D object or scene modeling is widely used in localization, recognition and context-aware human interaction. Many stages of the reconstruction procedure require data management, yet few open-access reference documents are available. Moreover, the computational cost of traditional management methods increases dramatically with data size [32, 11]. The thesis proposes a new data management scheme to address this. It aims to show a way to handle loop detection efficiently, support global registration and accelerate rendering. The management is intended to provide a smaller, more appropriate dataset for specific applications or algorithms.

The thesis is organized as follows: the next chapter gives a brief overview of the necessary background, and Chapter 3 describes the data structure and algorithms for large RGBD data management. Chapter 4 presents experimental results and the conclusion of the thesis.

2. Background

2.1 Input Data

This thesis uses a large dataset of 640×480 RGB color images and depth images captured with a consumer RGBD camera. The RGBD camera acquires color and depth information simultaneously for each pixel. Its measuring technique, working conditions, advantages and disadvantages are discussed further in [17, 39]. The intrinsic parameters (focal length, optical center, distortion, etc.) of the color and depth sensors, as well as the registration from the frame of the depth camera to that of the color camera, are assumed to be known through careful calibration [16]. The extrinsic parameters (rotation and translation), i.e. the camera pose, need not be known in advance: if they are unknown, they are estimated during later processing. Figure 2 shows a sample dataset with 100 color and depth images captured by a Kinect mounted on a Pioneer robot moving around a room. The data source is described in Section 4.1.

(a) color images

(b) depth images

Figure 2: 100 color and depth image samples captured from a Kinect mounted on a Pioneer robot moving around a room. Courtesy of [34]

2.2 3D Modeling Procedure

State-of-the-art 3D modeling or reconstruction with an RGBD camera [15, 25] is becoming more and more complicated but, to the author's knowledge, always contains two main stages: acquisition and registration.

2.2.1 Acquisition

(a) A color image (b) A depth image

Figure 3: An RGBD frame includes one RGB image and one corresponding depth image. The depth image, which is registered to the frame of the color camera, has some black areas caused by occlusion from the viewpoint of the IR camera and by IR light absorption

Normally, the camera image stream is captured at or below the nominal frame rate. For each frame, an RGBD camera generates one RGB image and one corresponding depth image. Figure 3 shows one RGBD frame, consisting of one color image and one depth image. Some black areas are easy to recognize in the depth image: the black image borders, the irregularly shaped valid region left after undistorting, and black object boundaries caused by occlusion between the IR emitter, the RGB camera and the IR receiving sensor. These devices are located at slightly different horizontal positions. The phenomena above appear when the depth image is registered to the color image in order to obtain the correct depth value for each color pixel.

The next, optional step is to reduce the various kinds of noise: distance-dependent noise, depth value fluctuation, pixels without a depth measurement, incoherent neighborhoods and noisy edges [6]. The thesis does not yet use any kind of noise removal because it mainly deals with data management, not transform computation. However, noise reduction is very important for achieving better accuracy in most applications. Newcombe et al. (2011) [25] applied the conventional bilateral filter [36] to each received depth image and improved quality further by accumulating many consecutive registered depth images in a volumetric data structure. Meanwhile, Camplani and Salgado (2012) [6] used an extended version of the bilateral filter, called the Adaptive Joint Bilateral Filter, to reduce spatial noise and to preserve and improve depth measurements at edges, and applied an Adaptive Kalman Filter to decrease the temporal fluctuation of pixel values in order to obtain consistent depth maps. Figure 4 shows an example of using the bilateral filter to reduce noise, illustrated with surface normal images. The surface normal at a pixel is computed as the cross product of two vectors formed from its four neighboring pixels.
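The normal computation mentioned above can be sketched as follows. This is only an illustration under assumptions: the function name `surface_normal`, the grid layout `points[row][col]` and the use of `None` for pixels without depth are hypothetical and not taken from the thesis.

```python
import math


def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def surface_normal(points, row, col):
    """Unit normal at an interior pixel from its four neighbors.

    points[row][col] holds a back-projected (x, y, z) tuple, or None
    where the depth image has no measurement (assumed convention).
    Returns None if any neighbor is missing or the normal is degenerate.
    """
    left, right = points[row][col - 1], points[row][col + 1]
    up, down = points[row - 1][col], points[row + 1][col]
    if any(p is None for p in (left, right, up, down)):
        return None
    # Horizontal and vertical difference vectors across the pixel.
    dx = tuple(r - l for r, l in zip(right, left))
    dy = tuple(d - u for d, u in zip(down, up))
    n = cross(dx, dy)
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        return None
    return tuple(c / length for c in n)
```

For a flat patch facing the camera, for example points `(col, row, 1.0)`, the function returns a normal along the z-axis. Noise in the raw depth makes neighboring normals differ strongly, which is why normal images are a convenient way to visualize the effect of filtering.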

2.2.2 Registration

When images are captured, it is even more important to estimate the camera extrinsic parameters at that time. In robotics, early robots used accelerometers or the movement of gears to estimate their position. This mechanical feedback is now improved with computer vision techniques in which the robot or camera position is computed directly from the acquired visual information. The following describes a typical approach to registration. Due to time constraints, the initial camera pose is often found online by frame-to-frame tracking, which compares the new frame directly with the previously registered frame, or by frame-to-model tracking, which compares it with the existing registered model as in [25]. The RGB image, the depth image or both can be used. After that, the Iterative Closest Point (ICP) algorithm [4, 38] is usually used to refine the pose. Starting from a good initial guess of the relative transformation, the algorithm iteratively generates corresponding point pairs and minimizes an error metric. ICP has many variants [28] that differ in how they compute the error metric, select corresponding points, and weight and reject pairs. Finally, global registration [32, 12, 19] calculates the final camera pose from the relative transformations to some or all of the other frames. Global registration is similar to ICP but runs over several images at once.

(a) (b)

Figure 4: Noise removal in a depth image using the bilateral filter: (a) surface normals of the raw depth image; (b) surface normals of the bilateral-filtered depth image

3. Large RGBD Data Management

3.1 Related Works

As mentioned in Chapter 1, data management is important in 3D modeling. An RGBD camera has a quite limited view: the Kinect, for example, covers 57° horizontally and 43° vertically, works at distances from approximately 40 centimeters to 4 meters, and delivers images at 640×480 resolution [17, 39]. A single frame or a handful of frames can therefore hardly model any object properly. Imagine that a small space such as a working table needs around 500 snapshots, while an indoor environment is usually a living room or a floor of a building, with a dataset of ten to fifty thousand color and depth images or many more. K-nearest-neighbor (k-nn) search [18] is a simple algorithm with practical significance in many applications. Naive search uses pairwise comparison between the test image and every other image. The computational complexity of this search is O(N), where N is the number of frames; the management algorithm proposed in this thesis reduces it to O(log N) in the average case.

There are some typical data management approaches in 3D modeling and related fields such as Simultaneous Localization And Mapping (SLAM) [11]. The octree [22] in 3D space, the quadtree [8] in 2D and their generalization, the k-d tree [3], are probably the most popular structures for the k-nn search mentioned above. However, these data structures are suited to 2D or 3D points, not to 640×480 images. The dimension of the space in this case is very large, more than 300,000; consequently, representative points tend to scatter and it becomes difficult to pick a good splitting dimension. Applying the above trees is therefore impractical and inefficient [18]. Another approach manages data with a graph [11]. It uses only the 6D space of the camera pose (three parameters for rotation and three for translation). A frame with its pose is considered a node, and the relationship between two nodes is estimated by some distance such as the Euclidean distance. This method is probably good for visualizing data, but reducing the computational cost requires defining a hierarchical graph construction, which is not an easy task.

Figure 5: A pose-graph representation of a SLAM process. Every node in the graph corresponds to a robot pose. Nearby poses are connected by edges that model spatial constraints between robot poses arising from measurements. Courtesy of [11]

3.2 Overview of Proposed Data Structure

Figure 6: A scenario that uses the proposed data structure to handle a newly arriving frame. Input data from [34]

In this thesis, a hierarchical data structure and algorithms for large RGBD data management are proposed. They exploit the advantages of space-partitioning trees, in the form of the Feature Descriptor Hierarchical Tree (FDHT), to manage color images, and use one kind of bounding volume hierarchy, the Bounding Sphere Hierarchical Tree (BSHT), for depth images. The usage and advantages of the proposed data structure are illustrated by a scenario in which a new frame arrives. Before a new frame can be used, for instance to update the model and the two trees, it needs to be registered. If its pose is unknown or not accurate enough, the pipeline in Figure 6 can help compute and refine it. The FDHT finds an initial transformation for the new frame, while the BSHT accelerates global registration, which refines the transformation, by searching for frames related to the new one. Global registration can run over all existing frames plus the new one, but this is slower than using only the proper subset of registered images that are related to, or more precisely overlap with, the new frame.

Suppose that there are hundreds or thousands of frames, each consisting of a color image and a corresponding aligned depth image. First, each depth image is converted into 3D coordinates through the back-projection equation (see the book “Multiple View Geometry in Computer Vision” [13] for more information):

\[
\begin{aligned}
z &= \text{depth value} \\
x &= \frac{(u - c_x)\, z}{f_x} \\
y &= \frac{(v - c_y)\, z}{f_y}
\end{aligned}
\tag{3.1}
\]

where (u, v) are the column and row indices of a pixel in the depth image, depth value is the value at (u, v), and (c_x, c_y) and (f_x, f_y) are the principal point and focal lengths of the depth camera, respectively. The last two are found in the calibration process.

To work with these 3D points, all of them need to be transformed into a common coordinate frame, such as that of the first image. Pair-wise matching could be used to find the frames that overlap a given frame, but its cost is too high. Therefore, a hierarchical data structure for all depth images is built. Instead of the popular k-d tree or k-means tree, the Bounding Sphere Hierarchical Tree [14] is applied, because the primitive geometric object here is a set of 3D points simplified by a minimal bounding sphere [37] with a center and a radius. The BSHT is constructed with algorithms and rules explained later. Thanks to the BSHT, a depth image can easily find the bounding volumes that intersect its own in O(log N) time in the average case, where N is the number of bounding volumes. Pair-wise matching is still used to verify that the candidate frames really overlap, but the number of comparisons is reduced significantly by the previous step.
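The back-projection of Eq. 3.1 can be sketched in a few lines. The function name `back_project`, the list-of-rows depth layout and the convention that a non-positive depth means "no measurement" are assumptions made for this example; real intrinsics come from calibration.

```python
def back_project(depth, fx, fy, cx, cy):
    """Convert a depth image into a list of 3D points using Eq. 3.1.

    depth is a list of rows; depth[v][u] is the depth at column u, row v.
    Pixels with depth <= 0 are treated as missing (assumed convention).
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # black pixel: no depth measurement
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 "image" with made-up intrinsics, for illustration only.
cloud = back_project([[0.0, 2.0], [1.0, 0.0]], fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

The resulting points are expressed in the camera frame of that image; transforming them with the frame's pose brings them into the common coordinate frame mentioned above.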

Obviously, the poses of some or even all frames may be unavailable, because camera tracking is sometimes lost or because the images were simply captured as snapshots without any processing. In that situation, their poses or transformation matrices must be estimated efficiently by some method. One way is to compute them directly from the depth image or from the set of back-projected and transformed 3D points, called a 3D point cloud, by point feature matching [29] or higher-dimensional feature (line, plane, etc.) matching [20]. Nevertheless, the depth information of each image is unstable and is continuously updated when a new image is added to the tree, so relying on it is quite risky. The color images, which do not change, are used instead. According to the literature, matching with visual information has become well developed in recent years. Snavely (2011) [31] summarized it into three main approaches:

Cluster features: Put all feature descriptors into a global data structure.
Filter image pairs: Compute a very fast similarity measure between images and run visual feature matching on similar images.
Cluster images: Cluster images into groups by some global descriptor and match a set of cluster representatives.

The last method is the global solution to the problem, but the first is simpler and gives acceptable results in this case. Therefore, a new hierarchical tree based on visual features computed from the color images is used. It is called the Feature Descriptor Hierarchical Tree. It helps find a rough transformation between an unregistered new frame and one of the registered frames in the database.

(a) Back-projection with a pinhole camera. C is the camera center and f is the focal length. x is one pixel in the depth image and is the projection of the 3D point X. Courtesy of [13]

(b) A color image (c) A depth image

(d) 3D colored point cloud rendered from a viewpoint similar to that of the 2D image

Figure 7: An example of back-projection usage with input data from [34]

Figure 8: SIFT features of a color image, drawn as yellow circles and cyan arrows indicating the position, radius and orientation of each feature. The input image is from [34]

3.3 Feature Descriptor Hierarchical Tree

3.3.1 Visual Feature Extraction and Description

Many visual feature extractors and descriptors are currently available. Most notably, SIFT [21] computes features that are invariant to scale and rotation, partially invariant to illumination, viewpoint and noise, and distinctive (easy to match with other features), among other advantages. However, SIFT computation is not real-time or near real-time, and its feature description vector is fairly long, with 128 dimensions. Bay et al. (2006) developed SURF [2], inspired by SIFT, with improved running performance and half of its ancestor's vector size. Other later feature extractors and descriptors, for instance NARF [33], BRIEF [5], ORB [27] and DAISY [35], overcome these disadvantages and increase accuracy. Based on comparison and survey papers [23, 30], SURF is chosen for use in this thesis, but other extractors and descriptors work as well.

3.3.2 Visual Feature Matching

To accelerate visual feature matching, the Fast Library for Approximate Nearest Neighbors (FLANN) [24] is applied. All features of the whole color dataset are used to build a hierarchical tree, either randomized k-d trees or a k-means tree, and this tree very quickly finds one nearest neighbor for each feature of a new test frame. The most similar image in the database (tree) is then identified by its number of feature matches. Finally, some optional steps should be applied to refine the matches: set a threshold on the distance (the degree of difference between two features), remove features that have no depth information, and ensure a one-to-one feature correspondence between the two images.

Figure 9: All best matched features between a new image and the most similar image in the tree. Input data are taken from [34]
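The voting and refinement steps described in this section can be sketched as below. The nearest-neighbor search itself is assumed to have been done already (for example by FLANN); the tuple layout of `matches` and the function name `most_similar_image` are hypothetical. Removal of features without depth is left out for brevity.

```python
from collections import Counter


def most_similar_image(matches, max_distance):
    """Pick the database image with the most feature matches.

    matches: list of (query_feature, db_image, db_feature, distance),
    one nearest neighbor per feature of the new frame (assumed layout).
    Returns (best_image, refined_matches) or (None, []) if nothing passes.
    """
    # Step 1: distance threshold.
    good = [m for m in matches if m[3] <= max_distance]
    if not good:
        return None, []

    # Step 2: vote for the image that collected the most matches.
    votes = Counter(m[1] for m in good)
    best_image = votes.most_common(1)[0][0]

    # Step 3: one-to-one mapping; keep the closest query feature
    # for each database feature of the winning image.
    best_per_db_feature = {}
    for match in good:
        _, image, db_feature, distance = match
        if image != best_image:
            continue
        current = best_per_db_feature.get(db_feature)
        if current is None or distance < current[3]:
            best_per_db_feature[db_feature] = match
    return best_image, sorted(best_per_db_feature.values())
```

With real descriptors the threshold depends on the descriptor type and its distance metric, so `max_distance` has to be tuned for the data at hand.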

3.3.3 Rough Transform Computation

2D visual features take their depth values from the corresponding depth image, which is registered to the color image, at the same pixel indices. Note that the registration is not perfect and may have an error of two or more pixels. Moreover, feature positions are not integers, so their depth should either be interpolated from the surrounding known values or simply be read at the position converted to integer, with some error. After that, the matched 2D features with depth are converted into 3D points by Eq. 3.1. Arun et al. (1987) [1] presented a method based on Singular Value Decomposition (SVD) to find the transformation between two sets of 3D points. Although the above matching looks quite robust, the result may contain some outliers, so the computed transformation is not very accurate. It can be refined by an iterative Random Sample Consensus (RANSAC) process [9] over the subset (the list of corresponding 3D points above) to remove outliers and compute a new transformation from the inliers only. However, the proposed pipeline instead exploits the power of ICP, which runs on the two depth images, that is, the two whole 3D point sets, not just the subset.
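A minimal sketch of an SVD-based rigid alignment in the spirit of the method of [1] is given below. It assumes NumPy is available and that the two point lists are already in one-to-one correspondence; the function name `rigid_transform_svd` is hypothetical, and outlier handling (RANSAC or ICP) is not included.

```python
import numpy as np


def rigid_transform_svd(src, dst):
    """Least-squares rotation R and translation t with dst ≈ R @ src + t.

    src, dst: (N, 3) arrays of corresponding 3D points, N >= 3.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)

    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T

    # Guard against a reflection (determinant -1) in degenerate cases.
    if np.linalg.det(R) < 0:
        Vt[-1, :] *= -1
        R = Vt.T @ U.T

    t = dst_centroid - R @ src_centroid
    return R, t
```

Because every correspondence contributes equally to the least-squares fit, a few wrong matches can bias the result, which is why the text follows this step with RANSAC or ICP refinement.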

3.4 Bounding Sphere Hierarchical Tree

Returning to the BSHT, which speeds up the k-intersected-leaf search, several problems have to be solved: how to find the minimal bounding sphere of a set of 3D points, how to construct a hierarchical structure for a group of spheres, and how to insert a sphere into or remove a sphere from the tree. The following sections clarify them.

3.4.1 Bounding Sphere

A bounding sphere, rather than another geometric shape, is chosen to represent a 3D point cloud because it is the most memory-efficient (three floating-point values for the center and one for the radius), rotation-invariant and very easy to translate (only the center moves). The first sub-problem, how to find a bounding sphere for a set of 3D points, can be answered straightforwardly by examining all points twice. The first pass computes a central point by averaging the coordinates of all points. The second pass finds the largest distance from the center to any point, which becomes the radius. Such a sphere encloses all points but is not guaranteed to be the minimal one. For the minimal bounding sphere, this thesis relies on the implementation of Gaertner (2006) [10], which is simple and very fast in low dimensions (three dimensions here).

Figure 10: Bounding sphere of a 3D point cloud with depth image from [34]
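The two-pass procedure described above can be sketched as follows. Note the assumption: this is the simple centroid-based sphere, which always encloses the points but is generally larger than the minimal sphere computed by the implementation of [10]; the function name `bounding_sphere` is hypothetical.

```python
import math


def bounding_sphere(points):
    """Centroid-based bounding sphere of a non-empty list of 3D points.

    Returns (center, radius). The sphere contains every point but is
    not necessarily the smallest possible one.
    """
    n = len(points)
    # Pass 1: the center is the average of all coordinates.
    center = tuple(sum(p[i] for p in points) / n for i in range(3))
    # Pass 2: the radius is the largest distance from the center.
    radius = max(math.dist(center, p) for p in points)
    return center, radius
```

For a symmetric point set such as the four corners of a square the result coincides with the minimal sphere; for unevenly sampled clouds the centroid is pulled toward dense regions and the sphere grows accordingly.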

3.4.2 Construction of BSHT

To decrease the computational cost of searching, the hierarchical data structure needs to be built from a group of minimal bounding spheres, which simplify the huge number of points. Following [7, 26], there are three main categories of construction methods, all of which repeatedly compute the minimal sphere enclosing two other spheres and manage all enclosed and bounding spheres: top-down, bottom-up and insertion. Only the last meets the requirement of using this data structure online (continuously inserting new frames or removing old ones). This method always finds the best sibling in the tree for a new sphere, such that the total tree volume increases by the smallest amount.
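The basic operation shared by all three construction methods, the minimal sphere enclosing two spheres, can be sketched like this. Spheres are represented as `(center, radius)` tuples, which is an assumption of this example rather than the thesis data layout.

```python
import math


def enclose(a, b):
    """Smallest sphere containing spheres a and b, each (center, radius)."""
    (ca, ra), (cb, rb) = a, b
    d = math.dist(ca, cb)
    # One sphere already contains the other.
    if d + rb <= ra:
        return a
    if d + ra <= rb:
        return b
    # Otherwise the diameter spans both spheres along the center line.
    r = (d + ra + rb) / 2.0
    k = (r - ra) / d  # fraction of the way from ca toward cb
    center = tuple(x + k * (y - x) for x, y in zip(ca, cb))
    return center, r


def volume(sphere):
    """Volume of a (center, radius) sphere."""
    return 4.0 / 3.0 * math.pi * sphere[1] ** 3
```

For two unit spheres centered at x = 0 and x = 4, `enclose` returns the sphere centered at x = 2 with radius 3. Comparing `volume` before and after such a merge gives the volume increase used to choose a sibling.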

A simple example of online insertion (incremental construction) is shown in Figure 11. There are 4 depth images (4 point clouds), represented by 4 spheres, which form a 4-level BSHT. When a new depth image, and hence a new sphere, is added, the tree is descended to find the location at which the total volume expands the least. The expansion is the sum of the volume of the new parent node and the total volume expansion of the ancestors above it in the tree. As the search descends, the latter always increases while the former decreases. When the search finds the best insertion location, the result is stored. More than one location may satisfy the requirement. If all prospective locations are ordered by their ancestor expansion and the best of them is chosen as the final result, the method is called the full insertion algorithm, whereas the cheap insertion algorithm uses the first result found.

Figure 11: A new sphere N is inserted into the existing BSHT such that the volume expansion of the BSHT is minimal. It becomes a new node N, which is a new sibling of node A under the new parent node P. Courtesy of [26]

Figure 12: BSHT volume and construction time of 5 different construction algorithms. Full insert (ins) and cheap insert (chp ins) are the two methods used in the thesis. Courtesy of [26]

In Figure 12, the absolute values on the y-axis are less meaningful than their ratios. The full insertion method yields one of the smallest BSHT volumes, and the cheap insertion method is the second best in construction time. This suggests that the latter should always be used in online mode, while the former can be run to rebuild the tree when the system is idle.
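The search for the best sibling can be sketched as a best-first search over the tree, ordered by ancestor expansion. The Python code below is an illustrative sketch in the spirit of the full insertion search described in [26], not the thesis implementation: `Node`, `enclose`, `volume`, `best_sibling` and `insert` are hypothetical names, and node removal is not handled. The cost of a candidate sibling is the volume of the new parent plus the accumulated expansion of its ancestors.

```python
import heapq
import itertools
import math


class Node:
    """A BSHT node: a sphere plus zero or two children."""

    def __init__(self, center, radius, children=()):
        self.center, self.radius = tuple(center), float(radius)
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def enclose(a, b):
    """Return (center, radius) of the smallest sphere containing a and b."""
    d = math.dist(a.center, b.center)
    if d + b.radius <= a.radius:
        return a.center, a.radius
    if d + a.radius <= b.radius:
        return b.center, b.radius
    r = (d + a.radius + b.radius) / 2.0
    t = (r - a.radius) / d
    return tuple(p + t * (q - p) for p, q in zip(a.center, b.center)), r


def volume(radius):
    return 4.0 / 3.0 * math.pi * radius ** 3


def best_sibling(root, new):
    """Find the node whose pairing with `new` grows the tree volume least."""
    tie = itertools.count()  # tie-breaker so the heap never compares Nodes
    heap = [(0.0, next(tie), root)]  # (ancestor expansion, tie, node)
    best, best_cost = None, math.inf
    while heap:
        expansion, _, node = heapq.heappop(heap)
        if expansion >= best_cost:
            break  # every remaining candidate costs at least this much
        parent_volume = volume(enclose(node, new)[1])
        cost = parent_volume + expansion
        if cost < best_cost:
            best, best_cost = node, cost
        # Descending below `node` forces it to grow to hold `new`.
        child_expansion = expansion + parent_volume - volume(node.radius)
        for child in node.children:
            heapq.heappush(heap, (child_expansion, next(tie), child))
    return best


def insert(root, new):
    """Insert leaf `new` into the tree and return the (possibly new) root."""
    sibling = best_sibling(root, new)
    old_parent = sibling.parent
    center, radius = enclose(sibling, new)
    parent = Node(center, radius, [sibling, new])
    parent.parent = old_parent
    if old_parent is None:
        return parent
    old_parent.children[old_parent.children.index(sibling)] = parent
    node = old_parent
    while node is not None:  # refit the ancestors from the bottom up
        node.center, node.radius = enclose(*node.children)
        node = node.parent
    return root
```

Because the ancestor expansion never decreases along a path from the root, the search can stop as soon as the smallest expansion left in the queue is no better than the best cost found so far.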

4. Experiment Results and Conclusions

4.1 RGBD Datasets and Benchmark

Experiments with real data are needed to test the validity and efficiency of the proposed data structure for large RGBD data management. Section 2.1 describes the input data in detail. The RGBD datasets provided by the Computer Vision Group at the Technical University of Munich [34] satisfy these requirements and include ground-truth camera trajectories, obtained with a commercial optical motion tracking system, for evaluation purposes.

4.2 Results

Figure 10 shows a bounding sphere for one depth image in the “pioneer slam3” dataset, rendered with OpenGL. As can be seen, the sphere enclosing the 3D point cloud has minimal volume, which indicates that the bounding sphere computation is valid.

Figure 13 shows an example of a 3-level BSHT for 4 depth images. The 4 leaves, i.e. the bounding spheres of the depth images, are drawn in red. At the upper level, each green sphere encloses two red spheres, and at the top level the blue sphere covers all of them. Moreover, the list of spheres returned as intersecting a test sphere is correct. This indicates that the BSHT construction and intersection test algorithms are correct.

Figure 13: 3-level BSHT for 4 depth images: levels 0, 1, and 2 are in blue, green, and red respectively. Input data are from [34]

Table 1 and Figure 14 show the running time of some algorithms in the thesis. The absolute time values are less important than their ratios.

Figure 14: Computational time of some algorithms
(a) Frame loading (blue line) and bounding sphere computation (green line) time against the number of frames
(b) k-intersected-sphere search using BSHT (blue line), k-intersected-sphere search (pair-wise) (green line) and BSHT construction (red line) time against the number of frames

Table 1: Mean and standard deviation (std) of computational time of some algorithms

Algorithm                                  Mean (ms)   Std (ms)
Frame loading                              142.202     65.608
Bounding sphere computation                 32.055      9.268
BSHT construction                            0.109      0.274
k-intersected-sphere search using BSHT       0.396      0.398
k-intersected-sphere search (pair-wise)      0.334      0.462

Finding the minimal bounding sphere of a 3D point cloud takes a nearly constant time of around 32 ms, while frame loading, which includes loading one color image and one depth image, converting the depth image to a point cloud and computing its surface normals, fluctuates around 142 ms. BSHT construction takes the least time, and the k-intersected-sphere search takes nearly the same time with or without the data management. The hierarchical data structure was expected to reduce the search time. It does not, because each bounding sphere is quite large compared with the total scene volume, so a new sphere intersects most of the existing spheres. The times for initial matching, ICP refinement and global registration were not recorded, but they are considerable, especially for global registration, even though it is accelerated by the proposed data management.
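The effect of large, mutually overlapping spheres on the search can be illustrated with a small experiment. The following Python sketch is an illustration under simplifying assumptions, not the thesis code: `Node`, `merge`, `build`, `search_tree` and `search_pairwise` are hypothetical names, the tree is built by simply pairing neighbouring leaves, and parent spheres are conservative rather than minimal. It counts sphere-sphere tests instead of measuring time.

```python
import math


class Node:
    def __init__(self, center, radius, children=()):
        self.center, self.radius = center, radius
        self.children = list(children)


def merge(a, b):
    """Parent whose sphere encloses both children (conservative, not minimal)."""
    center = tuple((p + q) / 2.0 for p, q in zip(a.center, b.center))
    radius = math.dist(a.center, b.center) / 2.0 + max(a.radius, b.radius)
    return Node(center, radius, [a, b])


def build(leaves):
    """Pair up neighbouring nodes level by level until one root remains."""
    level = list(leaves)
    while len(level) > 1:
        paired = [merge(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])  # odd node moves up unchanged
        level = paired
    return level[0]


def overlaps(a, b):
    return math.dist(a.center, b.center) <= a.radius + b.radius


def search_tree(root, query):
    """Return (intersected leaves, number of sphere tests) using the tree."""
    hits, tests, stack = [], 0, [root]
    while stack:
        node = stack.pop()
        tests += 1
        if not overlaps(node, query):
            continue  # prune the whole subtree
        if node.children:
            stack.extend(node.children)
        else:
            hits.append(node)
    return hits, tests


def search_pairwise(leaves, query):
    """Return (intersected leaves, number of sphere tests) without a tree."""
    return [leaf for leaf in leaves if overlaps(leaf, query)], len(leaves)


# 64 unit spheres spaced along the x-axis (synthetic data for illustration).
leaves = [Node((3.0 * i, 0.0, 0.0), 1.0) for i in range(64)]
root = build(leaves)
small_hits, small_tests = search_tree(root, Node((0.0, 0.0, 0.0), 0.5))
large_hits, large_tests = search_tree(root, Node((94.5, 0.0, 0.0), 1000.0))
```

With 64 leaves, the large query overlaps every leaf and forces the tree search to test all 127 nodes, more than the 64 tests of the pair-wise search, whereas the small local query prunes most subtrees. This is consistent with the observation that a hierarchy brings little benefit when a new sphere intersects most existing spheres.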

4.3 Conclusion

The thesis proposes a combined hierarchical data structure for large RGBD data management: the FDHT for color images and the BSHT for depth images. Thanks to the insertion-based construction of the bounding sphere hierarchical tree, the data structure can be used in on-the-fly applications. Using this structure was expected to be more efficient than using no data structure or a simple one, especially for large-scale data. However, the results above show equivalent performance in both cases. Further experiments should be carried out, for example with a much larger number of frames and with indoor environments ranging from a small space (a working table) to a larger scene (a floor of a building). To be more effective, the data management needs more sophisticated algorithms. They should take the camera viewpoint into account to improve the accuracy of the overlapped-frame search, and they should update the tree (adding, removing or blending a new image into the existing tree) more intelligently.

Bibliography

[1] K. Arun, T. Huang, and S. Blostein. Least-squares fitting of two 3-d point sets. Pattern Analysis and Machine Intelligence, IEEE Transactions on, PAMI-9(5):698–700, 1987.

[2] H. Bay, T. Tuytelaars, and L. Van Gool. Surf: Speeded up robust features. Computer Vision–ECCV 2006, pages 404–417, 2006.

[3] J. Bentley. Multidimensional binary search trees used for associative searching. Communications of the ACM, 18(9):509–517, 1975.

[4] P. Besl and N. McKay. A method for registration of 3-d shapes. IEEE Transactions on pattern analysis and machine intelligence, 14(2):239–256, 1992.

[5] M. Calonder, V. Lepetit, C. Strecha, and P. Fua. Brief: Binary robust independent elementary features. Computer Vision–ECCV 2010, pages 778–792, 2010.

[6] M. Camplani and L. Salgado. Adaptive spatio-temporal filter for low-cost camera depth maps. In Emerging Signal Processing Applications (ESPA), 2012 IEEE International Conference on, pages 33–36. IEEE, 2012.

[7] C. Ericson. Real-time collision detection. Morgan Kaufmann, 2004.

[8] R. Finkel and J. Bentley. Quad trees a data structure for retrieval on composite keys. Acta informatica, 4(1):1–9, 1974.

[9] M. Fischler and R. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.

[10] B. Gartner. Fast and robust smallest enclosing balls. Algorithms-ESA99, pages 693–693, 1999.

[11] G. Grisetti, R. Kummerle, C. Stachniss, and W. Burgard. A tutorial on graph-based slam. Intelligent Transportation Systems Magazine, IEEE, 2(4):31–43, 2010.

[12] G. Grisetti, R. Kummerle, C. Stachniss, U. Frese, and C. Hertzberg. Hierarchical optimization on manifolds for online 2d and 3d mapping. In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages 273–278. IEEE, 2010.

[13] R. Hartley and A. Zisserman. Multiple view geometry in computer vision, volume 2. Cambridge Univ Press, 2000.

[14] H. Haverkort. Results on geometric networks and data structures. PhD thesis, Utrecht University, 2004.

[15] P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox. Rgb-d mapping: Using kinect-style depth cameras for dense 3d modeling of indoor environments. The International Journal of Robotics Research, 31(5):647–663, 2012.

[16] C. Herrera, J. Kannala, et al. Joint depth and color camera calibration with distortion correction. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 34(10):2058–2064, 2012.

[17] K. Khoshelham and S. Elberink. Accuracy and resolution of kinect depth data for indoor mapping applications. Sensors, 12(2):1437–1454, 2012.

[18] A. Kibriya. Fast algorithms for nearest neighbour search. Master’s thesis, The University of Waikato, 2007.

[19] R. Kummerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard. G2o: A general framework for graph optimization. In Robotics and Automation (ICRA), 2011 IEEE International Conference on, pages 3607–3613. IEEE, 2011.

[20] T. Lee, S. Lim, S. Lee, S. An, and S. Oh. Indoor mapping using planes extracted from noisy rgb-d sensors. In Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, pages 1727–1733. IEEE, 2012.

[21] D. Lowe. Distinctive image features from scale-invariant keypoints. International journal of computer vision, 60(2):91–110, 2004.

[22] D. Meagher. Octree encoding: A new technique for the representation, manipulation and display of arbitrary 3-d objects by computer. Electrical and Systems Engineering Department Rensselaer Polytechnic Institute Image Processing Laboratory, 1980.

[23] K. Mikolajczyk and C. Schmid. Scale & affine invariant interest point detectors. International journal of computer vision, 60(1):63–86, 2004.

[24] M. Muja and D. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In International Conference on Computer Vision Theory and Applications (VISAPP09), pages 331–340, 2009.

[25] R. Newcombe, A. Davison, S. Izadi, P. Kohli, O. Hilliges, J. Shotton, D. Molyneaux, S. Hodges, D. Kim, and A. Fitzgibbon. Kinectfusion: Real-time dense surface mapping and tracking. In Mixed and Augmented Reality (ISMAR), 2011 10th IEEE International Symposium on, pages 127–136. IEEE, 2011.

[26] S. Omohundro. Five balltree construction algorithms. Technical report, International Computer Science Institute, 1989.

[27] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski. Orb: an efficient alternative to sift or surf. In Computer Vision (ICCV), 2011 IEEE International Conference on, pages 2564–2571. IEEE, 2011.

[28] S. Rusinkiewicz and M. Levoy. Efficient variants of the icp algorithm. In 3-D Digital Imaging and Modeling, 2001. Proceedings. Third International Conference on, pages 145–152. IEEE, 2001.

[29] R. B. Rusu. Semantic 3D Object Maps for Everyday Manipulation in Human Living Environments. PhD thesis, Technische Universitaet Muenchen, Germany, October 2009.

[30] C. Schmid, R. Mohr, and C. Bauckhage. Evaluation of interest point detectors. International Journal of computer vision, 37(2):151–172, 2000.

[31] N. Snavely. Scene reconstruction and visualization from internet photo collections: A survey. IPSJ Transactions on Computer Vision and Applications, 3:44–66, 2011.

[32] N. Snavely, I. Simon, M. Goesele, R. Szeliski, and S. Seitz. Scene reconstruction and visualization from community photo collections. Proceedings of the IEEE, 98(8):1370–1390, 2010.

[33] B. Steder, R. Rusu, K. Konolige, and W. Burgard. Narf: 3d range image features for object recognition. In Workshop on Defining and Solving Realistic Perception Problems in Personal Robotics at the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), volume 44, 2010.

[34] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers. A benchmark for the evaluation of rgb-d slam systems. In Proc. of the International Conference on Intelligent Robot Systems (IROS), Oct. 2012.

[35] E. Tola, V. Lepetit, and P. Fua. Daisy: An efficient dense descriptor applied to wide-baseline stereo. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 32(5):815–830, 2010.

[36] C. Tomasi and R. Manduchi. Bilateral filtering for gray and color images. In Computer Vision, 1998. Sixth International Conference on, pages 839–846. IEEE, 1998.

[37] E. Welzl. Smallest enclosing disks (balls and ellipsoids). New results and new trends in computer science, pages 359–370, 1991.

[38] Y. Chen and G. Medioni. Object modelling by registration of multiple range images. Image and vision computing, 10(3):145–155, 1992.

[39] Z. Zhang. Microsoft kinect sensor and its effect. Multimedia, IEEE, 19(2):4–10, 2012.
