
Institutionen för systemteknik
Department of Electrical Engineering

Examensarbete

Moving object detection in urban environments

Examensarbete utfört i Reglerteknik vid Tekniska högskolan vid Linköpings universitet

av

David Gillsjö

LiTH-ISY-EX--12/4630--SE

Linköping 2012

Department of Electrical Engineering
Linköpings tekniska högskola
Linköpings universitet
SE-581 83 Linköping, Sweden


Moving object detection in urban environments

Examensarbete utfört i Reglerteknik vid Tekniska högskolan vid Linköpings universitet

av

David Gillsjö

LiTH-ISY-EX--12/4630--SE

Supervisors: James Underwood, ACFR, University of Sydney

Tim Bailey, ACFR, University of Sydney

Johan Dahlin, ISY, Linköpings universitet

Examiner: Thomas Schön, ISY, Linköpings universitet

Linköping, 24 September 2012


Abstract

Successful and high-precision localization is an important feature for autonomous vehicles in an urban environment. GPS solutions are not sufficient on their own, and laser, sonar and radar are often used as complementary sensors. Localization with these sensors requires techniques grouped under the acronym SLAM (Simultaneous Localization And Mapping). These techniques work by comparing the current sensor inputs to either an incrementally built or a known map, while also adding the new information to the map.

Most of the SLAM techniques assume the environment to be static, which means that dynamics and clutter in the environment might cause SLAM to fail. To obtain a more robust algorithm, the dynamics need to be dealt with. This study seeks a solution where measurements from different points in time can be used in pairwise comparisons to detect non-static content in the mapped area. For example, parked cars at a parking lot could be detected by using measurements from several different days.

The method successfully detects most non-static objects in the different test data sets from the sensor. The algorithm can be used in conjunction with pose-SLAM to obtain a better localization estimate and a map for later use. Such a map is well suited for localization with SLAM or other techniques, since only static objects are left in it.


Acknowledgments

I would like to begin by thanking my supervisors at the ACFR, James Underwood and Tim Bailey. Thank you for all the valuable feedback, directions and expertise during our weekly discussions. I'm also grateful for the introduction to backstreet Thai, the workplace and all the helpful people working there; I'd like to thank them for the daily lunch discussions and a friendly welcome to the Australian culture. I will remember the great concept of pizza seminars and use it wisely.

I also thank my Swedish supervisor Johan Dahlin for feedback on numerous editions of this thesis. On the topic of feedback, I'd like to acknowledge my office neighbour and sounding board, Hanna Nyqvist: thank you for all the discussions and lunch walks.

Sincere thanks to my friends and family in Sweden for your support during this exciting and challenging semester. I'd also like to thank my Australian "family", Johan Wågberg and Emanuel Walldén Viklund, for putting up with me and being such nice and understanding roommates.

Linköping, September 2012
David Gillsjö


Contents

1 Introduction
  1.1 Contributions
  1.2 Thesis outline

2 SLAM and LiDAR
  2.1 Feature-based SLAM
  2.2 Pose-SLAM
  2.3 LiDAR
  2.4 Free, occluded, unknown and occupied space

3 Related work in SLAM and motion detection
  3.1 Including moving objects in SLAM estimate
  3.2 Structure of common solutions
    3.2.1 Alignment
    3.2.2 Detection of candidates
    3.2.3 Classifying movement

4 Motion detection using multiple viewpoints
  4.1 Overview
  4.2 Segmentation
  4.3 Spectral registration
  4.4 Selection of scans
  4.5 Motion compensation
  4.6 ICP
  4.7 Map free-space
  4.8 Project scan and detect points

5 Results
  5.1 Data
  5.2 Design choices
    5.2.1 Grid size
    5.2.2 Handling of no returns
    5.2.3 Effects of thresholds
    5.2.4 Thresholding on one comparison or several
  5.3 Result and limitations
    5.3.1 Viewpoint difference
    5.3.2 Lack of background
    5.3.3 Segmentation related detection errors
    5.3.4 Occlusions
    5.3.5 One versus many

6 Concluding remarks
  6.1 Conclusions
  6.2 Future work
    6.2.1 Adaptive threshold
    6.2.2 Alternative ways of choosing scans
    6.2.3 Adjust segmentation
    6.2.4 Handling of no returns from laser
    6.2.5 Final weighting
    6.2.6 Local density
    6.2.7 Implementation details

Bibliography


1 Introduction

Robotic systems that are able to observe and autonomously react in different environments are increasing in numbers and complexity. Systems like these often need a way to localize themselves in their surroundings or even create a map. One type of solution to this is called Simultaneous Localization and Mapping (SLAM).

As the name suggests, the problem is for a system (e.g. a robot) to build a map of its surroundings while at the same time positioning itself in this map. A map can also be given beforehand and updated with information during the session. This enables a robot to be aware of its surroundings and react according to instructions. For example, planning of paths for a robot in a changing environment is made possible, since new information, like a temporary obstacle, can easily be incorporated with already known areas of the map. By treating localization and mapping as one problem instead of two separate ones, the correlation between them can be exploited.

Pose-SLAM is one of the solutions to the SLAM problem and the application considered here. In this technique, each whole sensor measurement is compared with the following measurement to estimate the position and orientation of the robot. The map is all these measurements combined, for example in an occupancy grid.

Most solutions to SLAM assume that the surroundings are static to reduce the complexity of the problem, which makes it easier to solve in real time. This assumption makes the mapping and localization less robust in dynamic environments, since moving objects might cause false associations.


This thesis seeks an algorithm for detection of non-static objects by pairwise comparison of measurements from a 3D LiDAR sensor in urban environments. The algorithm can be integrated with pose-SLAM and will utilize different viewpoints of the same scene. This makes it possible to detect not only changes that have occurred around the time of measuring, but also dynamic objects that are stationary for long periods. Parked cars or a group of people standing still conversing can, for example, be detected and left out of the map.

1.1 Contributions

• A technique using several viewpoints from a 3D LiDAR sensor in pairwise comparisons to successfully detect moving objects. This is done by selecting reference measurements from a larger data set and comparing them with the measurement of interest.

• A greedy optimization algorithm for selecting which measurements to com-pare against.

• A case study showing results achieved and limitations of the technique.
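As a toy illustration of the second contribution, the sketch below greedily picks reference scans by first taking the pose nearest the query and then repeatedly adding the pose farthest from those already chosen. The criterion, the function and the poses are all hypothetical stand-ins for this example; the objective actually optimized is defined in Chapter 4.

```python
import math

def greedy_select(poses, query, k):
    """Illustrative greedy selection of k reference scan poses.

    Start from the pose nearest the query, then repeatedly add the
    pose that maximizes the minimum distance to the poses already
    chosen, so the references view the scene from well-spread
    viewpoints. This is only a sketch of the greedy pattern.
    """
    chosen = [min(poses, key=lambda p: math.dist(p, query))]
    rest = [p for p in poses if p != chosen[0]]
    while rest and len(chosen) < k:
        best = max(rest, key=lambda p: min(math.dist(p, c) for c in chosen))
        chosen.append(best)
        rest.remove(best)
    return chosen

# Hypothetical 2D scan poses; select three well-spread references.
poses = [(0.0, 0.0), (0.5, 0.0), (10.0, 0.0), (10.5, 0.0), (5.0, 5.0)]
refs = greedy_select(poses, query=(0.0, 0.0), k=3)
```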

1.2 Thesis outline

SLAM and LiDAR are explained in Chapter 2. Existing work in the area of SLAM and motion detection is then discussed in Chapter 3, followed by a description of the implemented algorithm in Chapter 4. A case study in Chapter 5 presents the results, with conclusions and future work in Chapter 6.


2 SLAM and LiDAR

This chapter first describes the concept of SLAM (Simultaneous Localization And Mapping) and one of its solutions, pose-SLAM. Then follows a section on LiDAR (Light Detection And Ranging), which is the sensor used in this application. At the end of the chapter, the terminology for the sensor measurements is explained.

2.1 Feature-based SLAM

Feature-based SLAM is one of the first frameworks used. It works by determining landmarks in the environment to which the robot's position can be related through sensor measurements. Depending on the sensor choice, the landmarks can range from simple signal-emitting beacons to feature extractions in camera images. This explanation of feature-based SLAM follows the one given in Durrant-Whyte and Bailey [2006]. It is posed as a probabilistic problem with the following variables at time step k:

• The robot state vector, xk , with position and orientation.

• The robot control signal, uk, which determines the robot state transition into this time step.

• The landmarks, m = (m1, m2, ..., mn), where mi denotes the i-th landmark.

• An observation, zi,k, is the sensor measurement of landmark i at time k. The measurements of all landmarks at time k are denoted zk.

A graphical description of the variables can be seen in Figure 2.1. The green triangles represent the states, which are connected by control signals to illustrate the state transitions. The blue circles are landmarks and the red arrows are observations.



Figure 2.1: This figure explains the variables typically used in probabilistic SLAM with landmarks. The green triangles represent the robot's state in different time steps; the control signals causing a state transition are shown as dashed black lines. The red arrows are observations of the blue landmarks.

For example, the green triangle in the middle, representing state xk, results from applying control signal uk in the former state; this is the dashed line between the states. In state xk, observations are made of the blue landmarks m2 and m3; these observations are shown as red arrows originating from the green triangle.

When the feature-based probabilistic SLAM solution is derived, it is assumed that the robot state transition is a Markov process. This means that the robot state doesn't depend on all the preceding states, only on the previous state and the current control signal. This gives the motion model, formulated as a probability distribution

P(xk | xk−1, uk).    (2.1)

It is also assumed that the landmarks are static, i.e. a landmark's position does not depend on its preceding positions. The final assumption is that the observations are conditionally independent of each other. This means that an observation only depends on the landmarks and the current state. The last two assumptions give the following relations:

mk = mk+1,    (2.2)

P(zk | xk, m).    (2.3)
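A minimal sketch of what a motion model of the form (2.1) can look like: a noise-free 2D velocity model obeying the Markov property, where the new state depends only on the previous state and the current control. The state layout and controls are illustrative, not taken from the thesis.

```python
import math

def motion_model(state, control, dt=1.0):
    """Propagate x_k = f(x_{k-1}, u_k) with a 2D velocity model.

    This deterministic sketch is the mean of the distribution
    P(x_k | x_{k-1}, u_k); a probabilistic version would add noise
    to the speed v and turn rate w.
    """
    x, y, th = state    # position and heading
    v, w = control      # forward speed and turn rate
    return (x + v * dt * math.cos(th),
            y + v * dt * math.sin(th),
            th + w * dt)

# Markov property: only the latest state and control are needed.
s = (0.0, 0.0, 0.0)
s = motion_model(s, (1.0, 0.0))           # drive 1 m straight ahead
s = motion_model(s, (0.0, math.pi / 2))   # turn 90 degrees in place
```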



Figure 2.2: Graphical model of the probabilistic SLAM problem. Arrows illustrate dependencies between the variables for state x, observation z, control signal u and landmarks m. Observation zk, which is the observation made by the robot at time step k, is for example dependent on the state xk and the landmarks m.

These assumptions give the problem a structure with few chained dependencies, which can be solved using recursive solutions, for example a Kalman filter. This can be seen in the graphical model in Figure 2.2, where the arrows symbolize dependencies: the variable that an arrow points at depends on the variable at the start of the arrow.

Since the landmarks are assumed static, the information keeps accumulating and uncertainty can only decrease with time. If the landmarks were to move they would need motion models, and these motion models would need to be very accurate to maintain good localization. This is difficult for objects in most environments since their movements are not predictable enough, and it would also be very computationally demanding [Wang et al., 2003b]. This is why moving objects are often removed from the SLAM estimate rather than accounted for.


2.2 Pose-SLAM

Pose-SLAM is a solution which doesn't estimate the positions of landmarks. Instead it estimates a trajectory of robot poses by setting up constraints between them. Constraints are made by comparing sensor measurements and estimating how the robot moved between them, see Figure 2.3. The map is then constructed by global alignment of the measurements, given the vehicle's estimated trajectory.

The estimation of the trajectory can be made with different methods; apart from common state estimation techniques there exist solutions like Fast-SLAM and graph-SLAM. The former keeps several trajectory hypotheses as particles, estimating the most likely one. The latter is also a state estimation technique but represents the problem with a graph, each node being a scan and each edge a constraint from the measurement matching. More on pose-SLAM and other methods can be found in Bailey and Durrant-Whyte [2006].
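The graph view can be sketched in a few lines: nodes are scan poses, edges are relative transforms estimated by measurement matching, and chaining the edges yields an initial trajectory estimate (before any optimization). The 2D poses and edge values below are invented for the example.

```python
import math

def compose(pose, rel):
    """Apply a relative pose constraint (dx, dy, dth), expressed in
    the frame of `pose`, to obtain the next global pose."""
    x, y, th = pose
    dx, dy, dth = rel
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

# Edges from scan matching: forward 2 m, forward 2 m while turning
# 90 degrees, then forward 2 m again.
edges = [(2.0, 0.0, 0.0), (2.0, 0.0, math.pi / 2), (2.0, 0.0, 0.0)]
trajectory = [(0.0, 0.0, 0.0)]
for e in edges:
    trajectory.append(compose(trajectory[-1], e))
# Final pose: x = 4, y = 2, heading = pi/2
```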

Since whole measurements are used instead of features, this method is good in situations where it is hard to find recognizable and distinguishable features. As long as there are enough static objects around, the dynamics will not affect the pose estimate too much either. It would still be beneficial to have the moving objects removed from the measurements: this makes the localization better, and since the measurements are also used to generate the map, a map generated without the moving objects would match better in later use.

An illustration of how moving objects affect the measurement matching is shown in Figure 2.4. The moving object causes an error which can be seen in the matching of the two static objects. The estimated motion will therefore not be the same as in Figure 2.3, where the two measurements only contained static objects. Dynamic objects might also introduce wrong alignments and therefore false hypotheses; this isn't shown in the example.


Figure 2.3: Illustration of a correct alignment. The two upper pictures illustrate the surroundings in two measurements; only the robot itself is moving. By aligning the two measurements, an estimate of the robot's translation q and rotation R can be calculated.


Figure 2.4: Illustration of an incorrect alignment. The two upper pictures illustrate the surroundings in two measurements; here one object and the robot itself are moving. When the two measurements are aligned, the static objects won't match up because the moving object pulls the alignment, so the translation and rotation estimates will not be correct. The alignment might also turn out completely wrong if the algorithm makes false associations due to moving objects; this is not shown here.


2.3 LiDAR

The sensing technique Light Detection And Ranging (LiDAR) is used for measuring range and light reflectivity. The technique uses focused light, often laser pulses, to illuminate the target. Depending on the application, the light can be in the ultraviolet, visible or infrared spectrum. A sensor close to the light source then measures the returning light intensity and travel time to calculate the distance to the target. With waveform analysis it is possible to increase accuracy and estimate distance from more than one return of the light.
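The ranging principle reduces to one line: the pulse travels out and back, so the range is half the round-trip time multiplied by the speed of light. A small sketch, with the round-trip time chosen to land near the sensor's 120 m maximum range:

```python
C = 299_792_458.0  # speed of light in m/s

def tof_range(t_round_trip):
    """Range from a LiDAR time-of-flight measurement: the pulse
    travels to the target and back, so the one-way distance is
    c * t / 2."""
    return C * t_round_trip / 2.0

r = tof_range(800e-9)  # a return after 800 ns: roughly 120 m
```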

Using light makes it possible to measure various types of targets depending on the wavelength used. Cars, pavement, pedestrians, smoke, clouds and dust can all give large enough reflections. For the detection to be more robust against fog and particles in the air, the infrared spectrum should be used. Lasers with longer range (higher power output) also benefit from longer wavelengths, since these are safer for the eye. On the other hand, current sensing techniques have higher accuracy at shorter wavelengths, so a trade-off has to be made. See Fujii and Fukuchi [2005] for LiDAR details.

The Velodyne HDL-64E S2 sensor (Figure 2.5) used for this application uses light just inside the infrared spectrum, at 905 nm. It has a range of 120 m and its light is safe for the eye, a requirement since it is to be used in the streets.

Figure 2.5: A picture of the sensor used, the HDL-64E S2. The picture is from the press section of Velodyne's website.


2.4 Free, occluded, unknown and occupied space

When reasoning about what is known in the environment given the measurements, the terms free, occluded, unknown and occupied space will be used. This works very well if the range sensor only returns one reflection per ray, as it does in this application. The free space is the area in the environment that the light from the sensor passed through before hitting an object. The endpoints of the light rays that reflected light back to the sensor make up the occupied space. These two types of areas are the ones considered known, and can be reasoned about when deciding whether objects have moved or not.

Unknown parts of the surroundings are where the sensor didn't pick up any returns. Depending on the sensor and algorithm, this could also be classified as free space up to the maximum range of the sensor; more about this in Chapter 4. The occluded space is the area hidden behind the occupied space; since nothing is known about it, it cannot be used when comparing measurements to determine if objects have moved or not. An illustration of this can be seen in Figure 2.6. The figure is a 2D illustration where tiled shapes represent the actual objects; the legend describes the different areas in the measurement.


Figure 2.6: Description of the terms used for the different areas in the environment relative to the measurements. The black tiled objects are objects in the environment; the blue thick lines are the measurements as well as the occupied space. For simplicity it is in 2D and measurements are continuous.
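The four labels can be produced along one sensor ray with a simple sweep, as in this sketch over a 1D slice of a grid (cell counts and names are illustrative):

```python
def classify_ray(n_cells, hit_cell):
    """Label the cells along one ray: cells before the first return
    are free, the return cell is occupied, cells behind it are
    occluded, and a ray with no return leaves everything unknown."""
    labels = ["unknown"] * n_cells
    if hit_cell is None:
        return labels
    labels[:hit_cell] = ["free"] * hit_cell
    labels[hit_cell] = "occupied"
    labels[hit_cell + 1:] = ["occluded"] * (n_cells - hit_cell - 1)
    return labels

ray = classify_ray(6, 3)
# -> ['free', 'free', 'free', 'occupied', 'occluded', 'occluded']
no_return = classify_ray(4, None)   # all cells stay unknown
```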


3 Related work in SLAM and motion detection

This chapter describes different approaches taken in the research community and how this thesis relates to them. The focus is on research with laser as the sensor.

3.1 Including moving objects in SLAM estimate

There are solutions which deal with dynamic environments by including the moving objects in the estimates. Examples include alternative methods of matching measurements which take moving objects into consideration [van de Ven et al., 2010], or SLAM frameworks where the landmarks' state vectors are extended with velocities [Bibby and Reid, 2007, Wang et al., 2003b].

Since the sought application is in conjunction with pose-SLAM, where landmarks aren't used in the same sense, the latter alternative isn't the first option that comes to mind; it could be used, but is computationally expensive and difficult to do. Considering motion during the scan matching is certainly good: it gives more robustness in highly dynamic environments with very few stationary objects. The downside is that the moving objects will not be explicitly marked, which is beneficial if a static map is expected as a result of the pose-SLAM.

Therefore a method that removes dynamic objects will be used, as most of the papers on SLAM solutions for dynamic environments do. Since the measurements representing uncertain or dynamic objects are removed, the environment can be considered static by the SLAM algorithm. This was also how the first method for outdoor SLAM in dynamic environments was constructed [Wang and Thorpe, 2002].


3.2 Structure of common solutions

This section explains in general how moving objects are usually handled with pose-SLAM, and the choices made in this regard. First the process of alignment is discussed, followed by the detection of possible candidates corresponding to motion. Finally, various implementations of the classification deciding what is moving or not are mentioned and compared.

3.2.1 Alignment

Alignment is the procedure of using a registration technique to match two measurements to get the transform relationship between them. For this thesis, a variant of the commonly used algorithm ICP (Iterative Closest Point) [Rusinkiewicz and Levoy, 2001] is a prerequisite. There are a few alternatives to ICP, including RANSAC [Fischler and Bolles, 1981] and the Fourier-Mellin transform [Chen et al., 1994].

ICP minimizes an error metric based on the distance difference between the measurements. As long as the initial guess is good, the estimate will be globally optimal according to that metric. Basic outlier checks are usually made to exclude measurements from the estimate. RANSAC is probabilistic and tries different hypotheses regarding what data to consider outliers. This makes it more robust to noise, but knowing when the best result has been found is hard and runtime may vary between matches. Since outliers, in this case mostly changes or motion, are handled by other methods, ICP is a valid choice for this algorithm. The Fourier-Mellin transform is better than ICP at finding matches without a good initial guess. It is however not as accurate and has therefore been used as a pre-alignment step.
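One ICP iteration can be sketched as follows: associate each source point with its nearest destination point, then solve the resulting rigid least-squares alignment in closed form via SVD. This is the generic textbook step in 2D, not the particular ICP variant used in the thesis.

```python
import numpy as np

def icp_step(src, dst):
    """One ICP iteration for 2D point sets: nearest-neighbour
    association followed by the closed-form (SVD/Kabsch) solution of
    the rigid transform minimizing the summed squared distances."""
    # Brute-force nearest-neighbour association for clarity.
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=2)
    matched = dst[d.argmin(axis=1)]
    # Closed-form least-squares rigid alignment.
    mu_s, mu_m = src.mean(axis=0), matched.mean(axis=0)
    H = (src - mu_s).T @ (matched - mu_m)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:   # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_m - R @ mu_s
    return R, t

# A scan shifted by a small (0.1, 0.2) translation: the initial guess
# is good, every association is correct, and one step recovers it.
dst = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
src = dst - np.array([0.1, 0.2])
R, t = icp_step(src, dst)
```

With a poor initial guess the associations would be wrong and several iterations (or a pre-alignment step, as noted above) would be needed.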

3.2.2 Detection of candidates

A common first step in the detection of candidates is a segmentation of the measurements. This involves filtering out the points representing ground and grouping points that are close to each other. Hopefully each group represents an object in the environment, but this is not always the case. Therefore there exist methods for keeping multiple hypotheses and choosing the most likely at the time, such as the multi-scale algorithm used in Yang and Wang [2011]. This implementation will however use a single-hypothesis algorithm [Douillard et al., 2011].
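The structure of such a step can be sketched with a naive ground filter followed by Euclidean clustering. The thresholds and the point cloud below are invented for the example, and the single-hypothesis method of Douillard et al. [2011] is considerably more sophisticated than this.

```python
def segment(points, ground_z=0.05, radius=1.0):
    """Naive segmentation sketch: drop near-ground points, then group
    the remainder so that a point within `radius` of any member of an
    existing cluster joins that cluster."""
    pts = [p for p in points if p[2] > ground_z]   # ground filter
    clusters = []
    for p in pts:
        home = None
        for c in clusters:
            near = any(sum((a - b) ** 2 for a, b in zip(p, q)) <= radius ** 2
                       for q in c)
            if near:
                if home is None:
                    c.append(p)
                    home = c
                else:          # p links two clusters: merge them
                    home.extend(c)
                    c.clear()
        if home is None:
            clusters.append([p])
    return [c for c in clusters if c]

scan = [(0.0, 0.0, 0.00), (1.0, 1.0, 0.01),   # ground returns
        (0.0, 0.0, 1.00), (0.5, 0.0, 1.20),   # object A
        (5.0, 5.0, 1.00), (5.4, 5.0, 0.80)]   # object B
segments = segment(scan)
# -> two segments with two points each
```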


There are three common actions after the segmentation. The first is to just pass on the non-ground segments; the other two are to either select segments via free-space violation detection [Wang et al., 2003a, 2007, Vu et al., 2011] or select segments based on local alignment errors [Katz et al., 2008]. The first alternative leaves it to a higher level of reasoning, for example comparisons with motion models in the next step. The free-space approach detects objects that have appeared between the sensor and the former closest objects. This approach is simple and also handles possible occlusions, but adds another layer of computation. Using the already calculated alignment errors, on the other hand, requires an additional layer to handle occlusions.

These differences are illustrated in Figure 3.1, where two measurements from a scenario with a moving car are compared. In the first situation a car is driving in between two objects, occluding one of them, which is of similar shape and size. In the second situation the car is out of sight and only the two stationary objects remain. Situation two is searched for moving objects using both methods. We see that using only alignment errors may lead to a false detection of the static object to the left; an occlusion check must be added for this technique.

Since the two methods are not mutually exclusive, using both is also an alternative, as in Miyasaka et al. [2009]. The free-space method is used in this project, partly because occlusions are implicitly handled, but also due to issues with the ICP algorithm used, which was still under development.
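In its simplest per-beam form, the free-space test can be sketched as below: after alignment, a return in the current scan that lands well inside the free space of the reference scan flags an appeared object, while a longer range on its own (the old object gone or occluded) is not evidence of motion. The beam ranges and the margin are illustrative.

```python
def free_space_violation(ranges_ref, ranges_now, margin=0.5):
    """Per-beam free-space check between two aligned scans: flag a
    beam whose current return is more than `margin` shorter than the
    reference return, i.e. something has appeared inside space that
    was previously observed to be free."""
    return [now < ref - margin
            for ref, now in zip(ranges_ref, ranges_now)]

# Beams 2 and 3 now hit something well in front of the old background.
ref = [10.0, 10.0, 10.0, 10.0, 10.0]
now = [10.0, 10.0, 4.0, 4.2, 10.0]
flags = free_space_violation(ref, now)
# -> [False, False, True, True, False]
```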

3.2.3 Classifying movement

In this step, it is decided which segments correspond to moving objects. Most literature uses either motion model prediction [Wang et al., 2003a,b, 2007, Zhao et al., 2008, Miyasaka et al., 2009] or dynamic occupancy grid maps [Wolf and Sukhatme, 2005, Wang and Thorpe, 2002]. A combination is also possible [Vu et al., 2011].

The Kalman filter is often the first choice when using motion models to predict the movement of the objects. Selection is then made based on which segments are consistent with a motion model over some time frame. These methods can be used without candidate selection in the former step (only deleting the ground). Several models are often combined using a technique called Interacting Multiple Models (IMM) [Mazor et al., 1998, Vu et al., 2011], since the motion of arbitrary objects is hard to predict. This technique optimizes the track estimate by weighting estimates from several motion models. It is often combined with a Multiple Hypothesis Tracker (MHT) [Bar-Shalom and Fortmann, 1988, Vu et al., 2011], which keeps several tracks for each target and outputs the most probable one at the time.
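As a concrete, heavily simplified illustration of the motion-model approach, the sketch below tracks a segment centroid in one dimension with a constant-velocity Kalman filter. All parameters are made up for the example; real systems track in 2D or 3D, often with IMM and MHT on top.

```python
import numpy as np

def cv_velocity(positions, dt=0.1, q=1.0, r=0.05):
    """Track a segment centroid with a 1D constant-velocity Kalman
    filter and return the final velocity estimate; a segment whose
    estimated speed stays above some threshold over a time frame
    would be classified as moving."""
    F = np.array([[1.0, dt], [0.0, 1.0]])              # state transition
    H = np.array([[1.0, 0.0]])                         # position observed
    Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                      [dt**2 / 2, dt]])                # process noise
    R = np.array([[r]])                                # measurement noise
    x = np.array([[positions[0]], [0.0]])              # [position, velocity]
    P = np.eye(2)
    for z in positions[1:]:
        x = F @ x                                      # predict
        P = F @ P @ F.T + Q
        y = np.array([[z]]) - H @ x                    # innovation
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
        x = x + K @ y                                  # update
        P = (np.eye(2) - K @ H) @ P
    return float(x[1, 0])

v_moving = cv_velocity([0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2])  # 2 m/s target
v_static = cv_velocity([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```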

Occupancy grids for dynamic objects are based on a model for the probability of a moving object occupying a cell. Cell values in the grid are increased when the cells are deemed occupied by a candidate moving object. If the values of the cells occupied by a segment are above some threshold, the segment is said to be dynamic.
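In sketch form, with cell indices and the threshold invented for the example:

```python
from collections import defaultdict

def dynamic_cells(candidate_hits, threshold=3):
    """Dynamic occupancy grid in its simplest form: each time a
    candidate moving object is observed over a cell, that cell's
    count increases; cells whose count reaches the threshold are
    labelled dynamic, and a segment lying on such cells would be
    classified as moving."""
    grid = defaultdict(int)
    for cell in candidate_hits:
        grid[cell] += 1
    return {cell for cell, n in grid.items() if n >= threshold}

# Cell (4, 2) is repeatedly flagged by candidates; (0, 0) only once.
hits = [(4, 2), (0, 0), (4, 2), (4, 2), (4, 2)]
dyn = dynamic_cells(hits)
# -> {(4, 2)}
```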


Figure 3.1: This figure compares using only errors from the alignment with using free-space detection. The first two boxes show the environments in situations one and two. The next pair of boxes shows the measurements from the sensor. The last pair shows the detection of moving objects in the measurement from situation two, with alignment errors and with free-space violations respectively. The ellipse shows a detection by the alignment-error method, even though this is occluded space in the first measurement; an occlusion check is needed for this method.


These methods, which keep a state of the moving objects, can handle situations where a moving object makes a temporary stop. They can also catch slow-moving targets that may otherwise be difficult to detect, since such targets do not intrude much into free space between measurements. The downside of these methods is that they require correct data association, especially with motion models. For example, a Kalman filter needs to compare segments that actually represent the same object. This can be hard due to occlusions, which cause the object to quickly change shape.

There are a few other types of methods as well, for example modeling objects according to distributions and using gating criteria [Katz et al., 2008], or using weights in an expectation maximization framework [Rogers et al., 2010]. Most are however only concerned with online application, or at least with handling all measurements in sequence. What is sought in this application is similar to Levinson and Thrun [2010], who during a map building phase collect measurements from the area at different times. A grid map is used where each cell is a Gaussian variable describing the mean and variance of the LiDAR return intensity. A high intensity deviation in a grid cell implies a moving object, which can be removed from the map.

The strength of a method like this is that it not only detects objects moving at that time, but also objects that have been stationary for a long time but can move, e.g. a parked car. This makes it possible to create, over time, maps containing only stationary objects, which can be used as models or as more robust maps for localization. The next chapter will present a method using range measurements to detect motion and changes in the surroundings using measurements from several viewpoints and points in time.


4 Motion detection using multiple viewpoints

This chapter describes the algorithm used to detect moving objects in a sensor measurement, also referred to as a scan, using a set of other scans taken in the vicinity. The algorithm is written in MATLAB and contains functions supplied by the ACFR.

4.1 Overview

The overall flow of the algorithm is described in Figure 4.1 and in more detail in Algorithm 1. The algorithm takes one active scan, which is the measurement that needs to have its moving objects removed. It also takes a set of reference scans, which are measurements taken in the proximity of the active scan, but possibly from various different times. The basic idea is to compare the occupied space in the active scan with the free space in a selection of the reference scans.

Together with the measurements, the algorithm also expects that each scan has been segmented. The segmentation algorithm deletes measurements corresponding to ground and creates groups of points that seem to correspond to the same object. This is done with an external algorithm from the ACFR and is further explained in Section 4.2.

Since measurements are collected 20 times per second, the algorithm speeds up considerably if the set is thinned out. This is the first step, and the scans left will be either more than a meter or more than a second apart. The thresholds can probably be increased to higher values without a significant impact on the result.
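A minimal Python sketch of such a thinning step (a hypothetical helper; the actual MATLAB routine is not published here) could look as follows:

```python
import numpy as np

def thin_scans(positions, times, min_dist=1.0, min_dt=1.0):
    """Greedy thinning sketch: keep a scan only if it is more than `min_dist`
    metres OR more than `min_dt` seconds away from the previously kept scan.
    Thresholds match the 'meter or second apart' rule described in the text."""
    keep = [0]
    for i in range(1, len(times)):
        j = keep[-1]
        dist = np.linalg.norm(np.asarray(positions[i]) - np.asarray(positions[j]))
        if dist > min_dist or (times[i] - times[j]) > min_dt:
            keep.append(i)
    return keep
```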


[Figure 4.1 flow diagram: the Active scan and the Reference scans enter; the reference scans are Thinned, aligned by Spectral registration (x1, y1, …, xN, yN), a Selection is made, both are Motion compensated, refined by ICP (q1, R1, …, qM, RM), and finally passed to Detection.]

Figure 4.1: The flow of the algorithm. One active scan and several possible reference scans are the inputs. In the end, a selected few reference scans are compared with the active scan, and moving objects in the active scan are detected.


Algorithm 1 General algorithm for the method used in this thesis.

Require: Active scan Sa, Reference scans Sr, Thresholds Θ, No. ref. scans n
 1: ReadFiles(measurements, segmentation, relativePositions, groundTruth)
 2: Sr ← Thin(Sr)                              ▷ Make scans more sparse in positions and time
 3: for all i ∈ Sr do
 4:   (Rsi, qsi) ← SpectralRegistration(i, Sa) ▷ Supplied by ACFR
 5: end for
 6: Sc ← SelectScans(Sr, n, T, q)
 7: (Sa, Sc) ← MotionCompensate(Sa, Sc)        ▷ Supplied by ACFR
 8: for all i ∈ Sc do
 9:   (R, q) ← ICP(i, Sa, Rsi, qsi)            ▷ Supplied by ACFR
10:   Sap ← R Sa + q
11:   grid ← MapFreeSpace(Si)
12:   Di ← DetectSegments(Sap, grid, Θ)
13: end for
14: movingSegments ← CombineResult(D)

The leftover scans are aligned with the active scan to get their positions. The alignment is done with spectral registration supplied by the ACFR, which is described in Section 4.3. Based on the position and time of the scans relative to the active scan, n scans are then chosen for the active scan to be compared against. This is done with a simple greedy optimization maximizing the distance in space and time between the chosen scans; this is further explained in Section 4.4. The selected scans and the active scan are then motion compensated with an algorithm from the ACFR to account for the movement of the sensor during the measurement. This procedure is further described in Section 4.5.

To detect changes and motion, the chosen reference scans need to be compared with the active scan. The relation between a reference scan's free space and the active scan's occupied space is exploited for this. The free space first needs to be represented in a better data structure; this procedure is described in Section 4.7. To make the comparison, the reference scans need a better alignment to the active scan, especially after the motion compensation. This is handled by an ICP algorithm from the ACFR, which is described in Section 4.6.

The active scan is rotated and translated according to the ICP output to fit each reference scan. This projected version of the active scan is compared with the free-space data structure created earlier. Further details can be found in Section 4.8, which also describes how the results from the different reference scans are finally combined.


4.2 Segmentation

The segmentation part of the algorithm tries to delete points corresponding to the ground in the measurement and then assigns labels to the remaining points. Points that are close to each other are assumed to correspond to the same object and are marked with the same label. A set of points with the same label is called a segment, which hopefully corresponds to an object in the physical world. This is illustrated in Figure 4.2. A brief description of the segmentation algorithm follows; for more details see Douillard et al. [2011].

The algorithm works with the concepts of radial lines and raster lines. Measurements on a radial line are gathered at the same time instance and therefore lie on a line originating from the sensor. Measurements on a raster line stem from the same individual laser during one scan (one rotation) and resemble a ring around the sensor. The algorithm begins by removing the ground. To do this, a terrain mesh is first created based on these lines. This mesh is a graph where each point is a node. Each node has edges to the previous and next measurement taken by the same laser, and to the lasers below and above. The gradient is then calculated for all nodes in all four directions; the gradient with maximum L2 norm is assigned to the node.

Figure 4.2: A segmented measurement with ground and segments marked in separate colours. The picture is reproduced from Douillard et al. [2011] with permission from the authors.


The marking of ground points is initiated by assuming that the longest set of nodes in the closest scan line with a gradient value below a threshold maxgrad corresponds to ground. The ground classification is then propagated throughout the graph with conditions on neighbourhood to ground-labeled nodes and the gradient being less than maxgrad. Transition points are also removed from the measurement by searching for local deviations along each raster line and radial line.

For the segmentation part, a cubic voxel grid is created and populated with all the 3D points not marked as ground or transition in the previous step. Points are grouped based on local voxel adjacency, depending on several parameters. All parameters were used at their default values, except the parameter specifying the minimum number of voxels required for a segment, which was empirically set to eight.
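The voxel grouping can be sketched as a flood fill over 26-connected occupied voxels. This is a Python sketch under stated assumptions (the voxel size and the exact connectivity are illustrative; the thesis only fixes the eight-voxel minimum):

```python
import numpy as np
from collections import deque

def voxel_segments(points, voxel=0.2, min_voxels=8):
    """Quantize points to cubic voxels and flood-fill over 26-connected
    occupied voxels; groups with fewer than `min_voxels` voxels are
    discarded (here the empirically chosen minimum of eight)."""
    keys = {tuple(k) for k in np.floor(np.asarray(points) / voxel).astype(int)}
    offsets = [(dx, dy, dz) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
               for dz in (-1, 0, 1) if (dx, dy, dz) != (0, 0, 0)]
    seen, segments = set(), []
    for start in keys:
        if start in seen:
            continue
        comp, queue = [], deque([start])
        seen.add(start)
        while queue:                      # breadth-first flood fill
            v = queue.popleft()
            comp.append(v)
            for o in offsets:
                nb = (v[0] + o[0], v[1] + o[1], v[2] + o[2])
                if nb in keys and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        if len(comp) >= min_voxels:       # drop tiny groups
            segments.append(comp)
    return segments
```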

4.3 Spectral registration

This algorithm is used as a preprocessing step for the ICP, since the ICP requires an initial guess. The algorithm is fast and can handle large translational and rotational displacements, but does not give as accurate results as the ICP. With some modification it could also handle scaling, but this is not an issue with laser range measurements.

For faster processing, the 3D laser range measurement is projected onto a 2D plane and then fitted to an occupancy grid where each cell has a binary value, occupied or free. A brief explanation of the principle follows. See Checchin et al. [2009] for more details regarding the basis of this particular implementation, or Chen et al. [1994] for more detailed theory.

The algorithm makes use of the shift property of the Fourier transform, meaning that a translation in the variables of a function corresponds to a phase shift in its Fourier transform. Assume we have two 2D range scan grids g1 and g2 that are translated by (Δx, Δy) relative to each other, so that g2(x, y) = g1(x − Δx, y − Δy). The shift property then gives the Fourier transforms the following relation

G2(u, v) = G1(u, v) e^{−2πi(uΔx + vΔy)}. (4.1)

Proceeding by forming the normalized cross power spectrum

F{Corr_n}(u, v) = (G1*/|G1*|) · (G2/|G2|) = (G1* G1)/(|G1*||G1|) e^{−2πi(uΔx + vΔy)} = e^{−2πi(uΔx + vΔy)}, (4.2)

with * denoting the complex conjugate. Taking the inverse transform of this results in a Dirac pulse δ(x − Δx, y − Δy). This function is only nonzero at the coordinates corresponding to the sought translation, so the translation can be expressed as

(Δx, Δy) = argmax_{(x,y)} Corr_n(x, y), (4.3)

where Corr_n denotes the inverse transform of (4.2).
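The phase-correlation principle of Equations (4.1)-(4.3) can be sketched with numpy's FFT. This is a minimal Python sketch for pure translation only; the registration actually used in the thesis is an ACFR implementation that also handles rotation:

```python
import numpy as np

def phase_correlation(g1, g2):
    """Recover the integer translation (dx, dy) between two occupancy grids
    via the normalized cross power spectrum; the inverse transform is ideally
    a Dirac pulse at the sought shift."""
    G1 = np.fft.fft2(g1)
    G2 = np.fft.fft2(g2)
    cross = np.conj(G1) * G2
    cross /= np.abs(cross) + 1e-12        # normalize; guard zero magnitudes
    corr = np.real(np.fft.ifft2(cross))   # near-delta at the translation
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dx, dy
```

For exact circular shifts the peak is sharp; for real scans the peak broadens but its location still gives a usable initial guess for the ICP.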


When rotation is added to the mix, G1 and G2 need a change of basis into polar coordinates to represent the rotation as a translation. This image can then go through the same process as g1 and g2 above. For more details, see the papers referred to earlier in this section.

4.4 Selection of scans

This algorithm tries to spread the chosen scans so that they are as dispersed in time and space as possible. It is not done in a globally optimal way but rather in a greedy fashion. The sequence of actions is described in Algorithm 2.

It starts by removing scans that are too far away to give valuable information; this limit is set to half the sensor's range. Feature vectors are then formed and normalized; in the implementation only x, y and time are used. The normalization is done so that the furthest distance in space equals the furthest distance in time. A distance matrix is then calculated and used to determine the nearest neighbour of each vector and the corresponding distance. The vector with the greatest of these distances is deemed the most distant from all of the other vectors; this is illustrated in Figure 4.3. That vector is chosen and marked as an invalid choice for the remaining iterations. Worth mentioning is that the active scan is an invalid choice from the start. This method is not an optimization of coverage, but of sparseness.

Algorithm 2 Selection algorithm

Require: Active scan position pa and time ta, reference scan positions pr and times tr, no. scans to select n
1: Exclude scans with ‖pr − pa‖ > 60
2: Form feature vector Va = (pa, ta) and vectors Vr = (pr, tr)
3: (Va, Vr) ← Normalize(Va, Vr)
4: Build distance matrix D
5: for i = 1 : n do
6:   Ci ← Max(Min(D))
7:   Mark Ci as not a valid choice in D
8: end for
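The greedy max-min selection of Algorithm 2 can be sketched in Python as follows (variable names are illustrative; the distance matrix and the marking of invalid choices follow the algorithm above):

```python
import numpy as np

def select_scans(features, active_idx, n):
    """Repeatedly pick the candidate whose minimum distance to the already
    chosen scans (and the active scan) is largest. `features` holds the
    normalized (x, y, t) rows; the active scan is never a valid pick."""
    F = np.asarray(features, dtype=float)
    D = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)  # all pairs
    chosen = [active_idx]
    candidates = set(range(len(F))) - {active_idx}
    picked = []
    for _ in range(n):
        # max over candidates of (min distance to anything already chosen)
        best = max(candidates, key=lambda i: min(D[i, j] for j in chosen))
        picked.append(best)
        chosen.append(best)
        candidates.remove(best)
    return picked
```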



Figure 4.3: Illustrates how the most remote point is picked by the selection algorithm. Assume that the algorithm has already chosen the green measurements and is currently choosing between the blue squares. The minimum distance from each candidate measurement to the already chosen and active measurements is calculated; these are marked with arrows in the figure. The candidate measurement with the largest of these minimum distances is chosen by the algorithm. This procedure is repeated until a specified number of measurements have been selected.

4.5 Motion compensation

The laser range scanner is rotating and moving continuously, but the extracted measurements are treated as a point cloud from one given time instance. This assumption introduces errors in the range measurements; the correction of these is what is here called motion compensation. The problem is exaggerated and illustrated in Figure 4.4. The algorithm simply assumes a constant velocity during the measurement and applies part of the rotation and translation to each point; more to those early in the measurement and less to those closer to the end.

Figure 4.4: The left side illustrates the laser's different positions during movement while the laser is rotating. Each line corresponds to a measurement after a small rotation. The right side illustrates the effect of assuming these measurements were taken at the same time. The effect is not this large in reality, but it illustrates the need for compensation.
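Under the constant-velocity assumption, the translational part of the compensation amounts to shifting each point by the motion accumulated between its own timestamp and a common reference time. A minimal Python sketch, with rotation omitted for brevity (the real ACFR algorithm also interpolates the rotation):

```python
import numpy as np

def motion_compensate(points, timestamps, velocity, t_ref):
    """De-skew a scan: shift each point by the sensor translation between its
    timestamp and the reference time t_ref. With t_ref at the scan end,
    early points receive a larger correction, as described in the text."""
    pts = np.asarray(points, dtype=float)
    dt = (t_ref - np.asarray(timestamps, dtype=float))[:, None]
    return pts + dt * np.asarray(velocity, dtype=float)
```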


4.6 ICP

The Iterative Closest Point (ICP) algorithm is commonly used for geometry-based registration between models or measurements in 3D. Since it has been widely used, there are several different variants. They differ in how points are selected, the distance metric, the error metric, etc. More examples can be found in Rusinkiewicz and Levoy [2001]. The algorithm used in this work is currently being developed and its details cannot be disclosed here. It has similarities with the point-to-plane variant of ICP by Yang and Medioni [1992], which will now be briefly explained to describe the principles of ICP.

In this simple description ICP is divided into three steps:

• Calculate the distance between a chosen set of points and their corresponding planes in the reference measurement.

• Calculate a translation and rotation that minimize an error metric for these distances.

• Perform the translation and rotation, and repeat from the first step unless the error is within a specified threshold.

Denote the ith chosen point in the new measurement pi and its corresponding reference measurement point ri, where i ranges from one to the number of chosen points N. Points can be chosen based on, for example, surrounding geometry or uniform sampling. The correspondence between points is also defined differently between ICP versions; options include nearest neighbour in different metrics and normal shooting. In the point-to-plane ICP, the distance between two points is defined as the length of the distance vector projected onto the normalized surface normal ni of the reference point. The transformation matrix T also needs to be introduced, which denotes the translation and rotation of the new measurement. This means that the distance Di is

Di = ‖(ni · (T pi − ri)) ni‖ = |ni · (T pi − ri)|, since ‖ni‖ = 1. (4.4)

See Figure 4.5 for an illustration. An initial estimate of the transformation matrix is needed, because the point pairs need to be decided before forming this distance. For the ICP to converge, the initial estimate also needs to be good, since the algorithm easily falls into a local minimum.

Step two is a parameter estimation of T where the error is minimized. Outliers are often excluded before this stage; the easiest approach is to exclude point pairs with a distance above a specified threshold. In this example, a least squares estimation is used, which can be written as

T ← argmin_T Σ_{i=1}^{N} D_i². (4.5)



Figure 4.5: In this 2D example two lines are to be matched. The distance Di is the distance vector projected onto the normal of the reference point ri.

These steps are then repeated until a small enough average error is achieved. A maximum number of iterations may also be imposed, should the algorithm not converge.
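One linearized point-to-plane iteration can be sketched as below. This is a minimal illustration of Equations (4.4)-(4.5) assuming small rotation angles, not the undisclosed ACFR implementation:

```python
import numpy as np

def point_to_plane_step(p, r, n):
    """Solve for a small rotation (rx, ry, rz) and translation t minimizing
    sum_i (n_i . (R p_i + t - r_i))^2, linearized around the identity.
    Returns the 3x3 small-angle rotation matrix and the translation."""
    p, r, n = (np.asarray(a, dtype=float) for a in (p, r, n))
    A = np.hstack([np.cross(p, n), n])    # N x 6 Jacobian: [p_i x n_i, n_i]
    b = np.einsum('ij,ij->i', n, r - p)   # signed distances along the normals
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    rx, ry, rz = x[:3]
    R = np.array([[1, -rz, ry],
                  [rz, 1, -rx],
                  [-ry, rx, 1]])          # first-order rotation approximation
    return R, x[3:]
```

In a full ICP loop, correspondences are recomputed and this step is repeated until the average error is small enough.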

4.7 Map free-space

To detect points in the active scan that correspond to moving objects, the free space in the reference scans needs to be compared to the occupied space in the active scan. To make this an efficient operation, the free space needs to be represented in an alternative way. In this case a lookup table is used.

The table is a grid over azimuth and elevation angle in spherical coordinates. Each grid cell stores the range of the point closest to the laser among all the points that fall into that cell. A 2D example can be seen in Figure 4.6, where the discretization is made over the azimuth angle. This also illustrates the consequences of too low a resolution: the free space between the laser and the objects in the upper left does not get correctly mapped. Too high a resolution, on the other hand, gives problems in the detection process; this will be shown in the results (Chapter 5).

The figure also illustrates that no measurement in a section yields an unknown range. According to the manual, a measurement without return can be regarded as free space up to 65 meters. During development, no-return measurements were also observed in other cases. These stem from specular reflection, which means that the light did not disperse and reflect back to the sensor.

This can be seen on, for example, car roofs at certain angles, which often give no returns due to this phenomenon. This is unpredictable and often gives a strong indication of motion when the object is in fact standing still. These sections are therefore treated as unknown range and cannot be used for detection of free-space violations. Results showing this can be seen in Chapter 5.
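The lookup-table construction can be sketched in Python as follows, using the Table 4.1 parameters and storing NaN for cells without returns (treated as unknown, per the discussion above; all function and variable names are illustrative):

```python
import numpy as np

def build_freespace_table(points, az_res_deg=0.5, el_res_deg=0.4,
                          el_min_deg=-4.0, el_max_deg=25.0):
    """Build the azimuth/elevation grid: each cell keeps the smallest range
    of any point falling into it. Points are in the sensor frame; cells with
    no return stay at NaN (unknown, not free)."""
    pts = np.asarray(points, dtype=float)
    rng = np.linalg.norm(pts, axis=1)
    az = np.degrees(np.arctan2(pts[:, 1], pts[:, 0])) % 360.0
    el = np.degrees(np.arcsin(pts[:, 2] / rng))
    n_az = int(round(360.0 / az_res_deg))
    n_el = int(round((el_max_deg - el_min_deg) / el_res_deg))
    table = np.full((n_az, n_el), np.nan)
    ai = (az / az_res_deg).astype(int) % n_az
    ei = ((el - el_min_deg) / el_res_deg).astype(int)
    ok = (ei >= 0) & (ei < n_el)          # drop points outside elevation range
    for a, e, r in zip(ai[ok], ei[ok], rng[ok]):
        if np.isnan(table[a, e]) or r < table[a, e]:
            table[a, e] = r               # keep the closest return per cell
    return table
```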


When the grid is created, the active scan is also trimmed to not have points outside the grid's boundaries. Indices mapping all points to their respective cells in the lookup table are also created. In total, the algorithm has four design parameters: the range of elevation angles in the table and the resolution of the grid in both angles. The resolution is set slightly lower than the resolution specified by the Velodyne manual; this is to be on the safe side and not get empty cells at an object's location. The angle ranges are also set with some margin, at least for measurements toward the ground. The parameters are listed in Table 4.1.


Figure 4.6: Illustrative figure displaying the free-space mapping in 2D. The dots are laser measurements; each circle section has a radius equal to the range of the closest point. Question marks indicate that nothing can be said about those sections since no measurements exist. R is the laser position in the reference scan being free-space mapped.

Parameter              Value
Azimuth resolution     0.5 degrees/cell
Elevation resolution   0.4 degrees/cell
Max elevation          25 degrees
Min elevation          -4 degrees

Table 4.1: Parameters for the free-space mapping.


4.8 Project scan and detect points

Since the lookup table is in the coordinate frame of the reference scan, the active scan needs to be projected into the same frame. Projecting the reference scan into the active scan's coordinate frame would be more error prone, since the center of the lookup table would not be the center of the reference scan. This would give a false picture of the free space and cause false detections. Figures 4.7 and 4.8 show examples of these situations. The former uses the reference scan's sensor position as the center for the free-space mapping, while the latter uses the active scan's. The latter case will detect the object which is now further away from the sensor, even though it could also be an object that was occluded before; the occluded space is falsely marked as free space.

In the free-space mapping, the full reference measurement was used to give a good free-space map. When obtaining the rotation R and translation q from the ICP algorithm, the segmented scan was used for alignment. This is because the ground points in both scans tend to pull the alignment so that the sensor positions match, which is not desired. With the rotation R and translation q from the ICP algorithm, the active scan can be aligned with the reference scan, giving the new projected active scan Sap

Sap = RSa + q. (4.6)

Each point in Sap is then compared with the corresponding cell in the free-space lookup table. Points that violate the free space by more than a static threshold are marked as candidate points for moving objects. The threshold is based on the estimated range measurement deviation from earlier experiments and empirical testing. A threshold requiring a 0.3 m difference between the range measurements was found to work sufficiently well. This is a bit higher than what would be necessary if the measurements were taken from the same viewpoint, but this is mostly not the case.

Segments are then marked as moving if they pass two further thresholds. First, a certain percentage of the segment's points are required to be candidate points. This reduces the effects of different viewpoints, discretization and segmentation faults.

The second threshold is a minimum required number of candidate points per segment. This threshold serves to avoid marking segments consisting of very few points, which may arise from dust, strange reflections, etc. Both thresholds are only empirically tested. The standard thresholds used are given in Table 4.2.

This procedure covers one comparison between a reference scan and the active scan. The results from all comparisons then need to be combined. This is simply done by classifying a segment as moving if any of the comparisons yielded that result. Different variants of the final weighting will be shown in the results.
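The per-point comparison and the segment-level thresholds of Table 4.2 could be sketched as follows (Python; the cell indexing mirrors the free-space table of Section 4.7, and all names are illustrative assumptions):

```python
import numpy as np

def detect_candidates(points_proj, table, az_res=0.5, el_res=0.4,
                      el_min=-4.0, range_margin=0.3):
    """Mark projected active-scan points that violate the reference free
    space: a point is a candidate if it lies more than `range_margin` metres
    closer to the sensor than the stored cell range. NaN cells are unknown."""
    pts = np.asarray(points_proj, dtype=float)
    rng = np.linalg.norm(pts, axis=1)
    az = np.degrees(np.arctan2(pts[:, 1], pts[:, 0])) % 360.0
    el = np.degrees(np.arcsin(pts[:, 2] / rng))
    ai = (az / az_res).astype(int) % table.shape[0]
    ei = np.clip(((el - el_min) / el_res).astype(int), 0, table.shape[1] - 1)
    cell = table[ai, ei]
    return ~np.isnan(cell) & (rng < cell - range_margin)

def segment_is_moving(candidate_mask, min_frac=0.5, min_points=5):
    """Segment-level decision: enough of the segment's points must be
    candidates, both as a fraction and as an absolute count (Table 4.2)."""
    n_cand = int(np.count_nonzero(candidate_mask))
    return n_cand >= min_points and n_cand / len(candidate_mask) >= min_frac
```

Combining several reference scans then reduces to OR-ing the per-comparison segment decisions, as described above.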



Figure 4.7: The dots A and R represent the viewpoints from which the active and reference scans were taken. The scans are matched and free space is mapped with R as center. The blue dashed line is the object's position in the reference scan; the blue dashed and dotted line represents the measurement points. None of them are in free space, as it should be.


Figure 4.8: The dots A and R represent the viewpoints from which the active and reference scans were taken. The scans are matched and free space is mapped with A as center. The blue dashed line is the object's position in the reference scan; the blue dashed and dotted line represents the measurement points. We see that some of them are in free space; this is not a valid conclusion, since it could simply be another object in Sr blocking the one represented in Sa.


There is a drawback in this approach that arises when scans from different viewpoints are compared. The laser sensor has uncertainty in both range and angular precision, which in this angular grid approach means that an angular uncertainty can lead to a range difference when there should not be any, and a range uncertainty can make the measurement end up in the wrong grid cell. An illustration of the range uncertainty case can be seen in Figure 4.9.


Figure 4.9: Difference in viewpoint between the active and reference scans might cause measurements of the same spot to fall into different grid cells due to range error.

Threshold               Value
Range difference        0.3 m
Percentage candidates   50%
Minimum candidates      5

Table 4.2: Thresholds used in detection.


5 Results

In this chapter the data sets used are explained. Results that illustrate design choices are also presented, followed by figures showing the limitations of the algorithm.

5.1 Data

The data used is supplied by the ACFR and was collected using a HDL-64E S2 from Velodyne. This is a LiDAR (Light Detection and Ranging) sensor with 64 vertically mounted lasers in a unit rotating 360 degrees. All of the data was collected in Sydney with the sensor doing 20 revolutions per second; Table 5.1 describes the data sets used. There are three sets where the sensor moves and two where it is stationary. Redfern1 is useful for testing performance at standstill in an urban environment and is compared with the moving data sets. The other set with a non-moving sensor, Aikido, is useful for testing detection of slight human motion. Redfern2 has cars driving by at higher speed than Redfern3, which has more cars and pedestrians moving. The Opera set is useful for testing pedestrian motion detection during sensor motion and also features a loop closure, making it possible to use scans from different passes of the same area.


Data set   Laser    Scenario
Redfern1   Still    Laser base standing still by an intersection, truck driving by and pedestrians walking.
Redfern2   Moving   Laser base meets two cars on a road with cars parked on both sides.
Redfern3   Moving   Laser base approaches an intersection just as the lights go green and drives through; cars and people moving around.
Opera      Moving   Laser base drives in by the Sydney Opera House, a large number of people moving around. It also comes back the same way and therefore gives data from different points in time for the same area.
Aikido     Still    Laser base standing in the grass during Aikido practice with a crowd standing mostly still and watching.

Table 5.1: Description of the different data sets used.

5.2 Design choices

This section shows results regarding design choices.

5.2.1 Grid size

Grid size was an important factor for correct detection. Three examples are shown: too small, close to the true resolution, and too large. As seen in Figure 5.1, a small grid size (high resolution) yields fewer true detections, because the points are often compared to empty cells in between the cells that hold a range value. With too large cells (low resolution), the free space in the scan gets smaller than it actually is, and detections might therefore be missed. Since the cells are large, the errors from viewpoint difference are smaller, which can be seen at the wall in the figure. The large cells, however, also make objects larger, so the car in the reference scan that is background to the car in the active scan stretches into the space that is unknown. Therefore a higher detection rate will be seen on objects that have little background in the reference scan; other objects will see a lower detection rate.

5.2.2 Handling of no returns

In the algorithm, areas that do not give a laser measurement are considered unknown. Figure 5.2 shows the result of treating these areas as maximum range measurements, while Figure 5.3 shows the result of treating them as unknown. Assuming a maximum range measurement increases the number of false candidate points.


(a) Too high a resolution yields fewer true positives, since the measurements are often compared with empty grid cells.

(b) A resolution close to the laser's true resolution (slightly lower). Detections along the wall are due to viewpoint difference and discretization.

(c) Too low a resolution enlarges the background where it is sparse, so the number of detection points is higher. In general the free space gets smaller and the detection rate lower.

Figure 5.1: These figures show the effects of different grid cell sizes in the free-space map. Green points are the active scan, magenta points are within the range threshold in free space, and blue asterisks are the centers of grid cells in the reference scan's free-space map.


(a) Both active scan and detected points.

(b) Only detected points.

Figure 5.2: Green points belong to the active scan, and magenta points are within the range threshold in the comparison. The green asterisk is the active scan's laser position; the magenta asterisk is the reference scan's. Black ellipses mark the true moving objects. Treating lack of returns as max range readings gives many points falsely detected by the range threshold.


(a) Both active scan and detected points.

(b) Only detected points.

Figure 5.3: Green points belong to the active scan, and magenta points are within the range threshold in the comparison. The green asterisk is the active scan's laser position; the magenta asterisk is the reference scan's. Black ellipses mark the true moving objects. Treating lack of returns as unknown gives fewer false detections, but also fewer detections for true moving objects missing background.


5.2.3 Effects of thresholds

This section presents figures showing how the results differ when the thresholds are changed. Both the minimum required percentage of detected points in the segment and the minimum required number of detected points are tested in different scenarios.

Of the two, the percentage threshold has the most impact on the results. It is used to lessen the impact of errors from the sensor, the segmentation and viewpoint differences. In Figure 5.4, a scenario is shown where the sensor is in the same position in both of the compared scans. The four persons training Aikido and the passing person are marked by black ellipses and are certainly known to have moved. Only the 10 percent setting manages to detect all of them in this scenario. This is, on the other hand, a difficult situation, where the objects move on almost the same spot and only one second has passed in between.

In the next scenario, the measurements are taken from different viewpoints and the percentage threshold needs to be increased to a higher value. This can be seen in Figure 5.5, where the actual moving objects are also marked with black ellipses. The 10 percent setting gives many false positives and even detects a wall. 30 percent still detects the same correct segments as the former setting and gives 7 false positives. At 50 percent, 2 of the true moving segments that were detected at the 10 percent setting are missed. This setting, however, only gives two false positives and is the one used to generate the other results. The setting is chosen so as not to risk deleting large static segments. Since many viewpoints are used, it still gives a high detection rate.

The threshold specifying the minimum number of required points has not been tested as thoroughly. In the data sets used, from an urban environment, it is not as useful as the percentage threshold. If it is set too high, it might cause pedestrians at the sensor's outer range to be marked as static. Additionally, any small static object that might be detected as moving and excluded from the map won't make much difference for localization with the map. It might, however, be of interest if the goal is to make a model of the static environment.

Most of the results are generated with this threshold set to five; the algorithm still seems to detect far away pedestrians and also avoids detecting some small static objects. Figure 5.6 shows one comparison from the Opera data with the threshold set to 0, 5 and 10. In the case with 10, pedestrians close to the water (top of figure) aren't detected. A threshold of five gives the same true detections as zero, but one less false positive. Looking at the results in general, it seems that this threshold can be skipped in these environments, at least if the percentage threshold is set as high as 50%.
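The two thresholds discussed above can be captured as a single predicate. This is a sketch; the function and parameter names are ours, and the defaults mirror the 50 % and five-point settings used for most of the results.

```python
def segment_is_moving(n_detected, n_total, pct_thresh=0.5, min_points=5):
    """Classify a segment as moving only if enough of its points were
    flagged as candidates: both an absolute minimum (min_points) and a
    relative minimum (pct_thresh) must be met."""
    if n_total == 0:
        return False
    return n_detected >= min_points and n_detected / n_total >= pct_thresh
```

For example, a 15-point segment with 10 candidate points passes both tests, while a distant 4-point pedestrian with 3 candidate points fails the absolute minimum even though 75 % of its points are candidates, which is exactly the outer-range effect described above.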


(a) The two scans compared, moving objects are marked with ellipses.

(b) Percentage threshold set to 10%, detected segments are cyan.

(c) Percentage threshold set to 30%, detected segments are cyan.

(d) Percentage threshold set to 50%, detected segments are cyan.

Figure 5.4: This figure shows the detected segments in a scenario with four persons practicing Aikido and one person passing by. The difference in time between the scans is approximately one second. Only the 10 % setting manages to detect all of them, but a lot of other objects are also detected. The minimum number of points is set to five.


(a) The two scans compared, moving objects are marked with ellipses.

(b) Percentage threshold set to 10%, detected segments are cyan.

(c) Percentage threshold set to 30%, detected segments are cyan.

(d) Percentage threshold set to 50%, detected segments are cyan.

Figure 5.5: This figure shows the detected segments in a scenario where a car drives through an intersection. The difference in time between the scans is approximately one second. The minimum number of points is set to five. In this scenario, the 10 percent threshold gives a lot of falsely detected segments. The 30 percent case gives around 7 false positives but still detects all the segments detected correctly with 10 percent. The 50 percent setting misses 2 of the moving segments but only gives 2 false positives.


(a) The two scans compared.

(b) Minimum detected points threshold set to zero, detected segments are cyan.

(c) Minimum detected points threshold set to five, detected segments are cyan.

(d) Minimum detected points threshold set to 10, detected segments are cyan.

Figure 5.6: This figure shows the detected segments in a scenario at the roundabout close to the Sydney Opera House. There is a large number of people walking alongside the water. If the minimum points threshold is set to five, the only difference in detection compared to using no threshold is a small static object marked by the black ellipse. If the threshold is increased to 10, more objects go undetected; the two at the top of the picture are people that should be detected. The percentage threshold is here set to 50%.


5.2.4 Thresholding on one comparison or several

There are several options for which thresholds to use and how to use them. The two thresholds used when generating these results have been applied in two ways. The approach used the most is to apply the thresholds for each pairwise comparison and then regard a segment detected in any comparison as moving.

The other approach is to use all the candidate points, i.e. points within free space according to the range threshold, from all the pairwise comparisons. This means that a point is a detected point if it was within the range threshold in any of the pairwise comparisons. The percentage and minimum number of points thresholds are then applied to these detected points, collected from all reference scans that were compared with the active scan.

An example is shown in Figure 5.14 and Figure 5.7. The former figure shows the first approach, where any segment that is not green is detected as moving. In the latter figure the second approach is shown, where all the cyan segments are detected as moving. Since both approaches use a minimum of 5 and at least 50% detected points as thresholds, the latter approach, which fuses the candidate points, will give more detections.

Although the latter approach combines the information, it also combines the errors from all the measurements, so it might be necessary to increase the percentage threshold. Doing this will however make detection of objects that can only be seen in one of the reference scans more difficult.
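The two ways of combining the pairwise comparisons can be sketched as follows, with each comparison's candidate points represented as a set of point indices (the names and the set representation are our own):

```python
def moving_any_comparison(candidates_per_ref, n_total, pct=0.5, min_pts=5):
    """First approach: threshold every pairwise comparison on its own;
    the segment is moving if any single comparison detects it."""
    return any(len(c) >= min_pts and len(c) / n_total >= pct
               for c in candidates_per_ref)

def moving_fused(candidates_per_ref, n_total, pct=0.5, min_pts=5):
    """Second approach: pool the candidate points from all comparisons
    (a point counts if any comparison flagged it), then threshold once."""
    fused = set().union(*candidates_per_ref)
    return len(fused) >= min_pts and len(fused) / n_total >= pct
```

With two reference scans each flagging three different points of a ten-point segment, the first approach rejects the segment (each comparison is below both thresholds) while the fused approach detects it (six pooled points, 60 %), which illustrates why fusing gives more detections at the same threshold settings.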


Figure 5.7: The figure shows the result of thresholding combined candidate points from four pairwise comparisons. The thresholds used are 50% and a minimum of 5 points. The ellipses show the segments detected with this way of thresholding that were not detected by thresholding every pairwise comparison. One of them is a correctly detected segment, but the others are false positives.


5.3 Result and limitations

Different cases are presented here to illustrate the capabilities and limitations of the algorithm.

5.3.1 Viewpoint difference

Figure 5.8 shows the points within the range threshold when the same viewpoint is used at different times. Figure 5.9 shows the points within the range threshold when the scans are taken from different viewpoints. In the latter case there are more false positives and they are more clustered. This makes it difficult to distinguish true moving objects from some static objects. This effect is due to the sensor uncertainty in combination with the different viewpoints discussed earlier.

5.3.2 Lack of background

Figure 5.10 shows a case where a car in the active scan is within the sensor's blind spot in the reference scan. Since that is unknown space in the reference scan, the points in the active scan have no grid cells to compare against and the car cannot be detected. Similar cases appear with an empty street as background.

5.3.3 Segmentation-related detection errors

The segmentation algorithm sometimes has problems separating objects when they are too close, which will be referred to as under-segmentation. The opposite case might also occur, when measurements from one object result in several segments; this will be referred to as over-segmentation. In Figure 5.11a a case with a pedestrian is shown. The pedestrian in question is indicated by the black ellipse and most of its points are within the range threshold; these are colored magenta. This pedestrian isn't detected because it is in the same segment as the wall. The wall contains a large number of points that aren't within the free space and the whole segment is therefore marked as static.

A case of over-segmentation is shown in Figure 5.11b. The truck is split into three segments and the one marked by the ellipse doesn't have enough points in the free space. This part of the truck is therefore considered static in that comparison. In this specific case the segment was still detected in comparison with other reference scans.

5.3.4 Occlusions

Some objects will not be detected due to occlusions in the reference scan. These situations are helped by using several reference scans, as this algorithm does. An example is shown in Figure 5.12. In this example, the sensor position in the reference scan makes most of the pedestrian occluded. Only the points marked in magenta are deemed inside free space, which isn't enough to detect it as non-static. Green points belong to the active scan and blue points are the ones the free space grid is based on.


(a) Both active scan, reference scan and detected points.

(b) Showing only detected points.

Figure 5.8: Green points belong to the active scan, red to the reference scan and the magenta points are within the range threshold in the comparison. The green asterisk is the active scan's laser position, the magenta asterisk is the reference scan's. Black ellipses mark the true moving objects. When the same viewpoint is used there are fewer false positives.

(a) Both active scan, reference scan and detected points.

(b) Showing only detected points.

Figure 5.9: Green points belong to the active scan, red to the reference scan and the magenta points are within the range threshold in the comparison. The green asterisk is the active scan's laser position, the magenta asterisk is the reference scan's. Black ellipses mark the true moving objects. When different viewpoints are used there are more false positives than when the same viewpoint is used.


Figure 5.10: The car marked by the ellipse isn't detected since it's in the blind spot of the reference scan. Green is the active scan, blue are the points included in the free space map and magenta are the points within the range threshold.

(a) Pedestrian marked by the ellipse is not detected because it is in the same segment as the wall.

(b) The truck is split into three segments; the one marked by the ellipse isn't detected because not enough points are deemed to be in free space.

Figure 5.11: Figures showing under- and over-segmentation. The former case might result in an object not being detected, while the latter might result in part of an object not being detected.


Figure 5.12: Points forming the free space border are blue, the active scan is green and magenta are points deemed inside free space. The sensor position in this reference scan makes most of the pedestrian occluded, only a bit of its left side is detected.

5.3.5 One versus many

Figure 5.13 shows an example where one scan is used as a reference scan. Figure 5.14 shows the result of using four scans as reference scans instead. Each color is a detection from one scan. More true moving objects are detected this way. The only one missed is a part of the car, indicated with a red ellipse.


Figure 5.13: Detections comparing the active scan and one reference scan. Correctly detected objects are marked with black ellipses, red indicates moving objects not detected. 50% and at least 5 detected points required.

Figure 5.14: Detections using four reference scans. Correctly detected objects are marked with black ellipses, red indicates moving objects not detected. 50% and at least 5 detected points required.


6 Concluding remarks

This chapter first discusses what is achieved by this algorithm and then suggests some improvements that can be made.

6.1 Conclusions

The algorithm developed can detect non-static objects in LiDAR measurements by comparison with other measurements. It can utilize measurements from different points in time and can therefore be used both offline and online. This is in accordance with the specifications made. The detection rate seems high, especially considering that only static thresholds are used, though more testing needs to be done to see how good it actually is. The test cases show a potential for experimentation with different settings: not only the static threshold values, but also how the final result for a scan is decided.

The threshold requiring a minimum number of detected points might very well be dropped in the scenarios tested; a threshold much larger than five is at least not recommended. The threshold requiring a minimum percentage of detected points was found to be crucial for a good balance between true positives and false negatives. Of the two methods for combining the results from the pairwise comparisons, the one that requires a segment to be detected in at least one comparison seems most beneficial. With this setting, a percentage threshold around 30-50% gives good results. This threshold can be varied depending on the requirements regarding false detections.


Quantitative tests have also been tried, such as ROC (Receiver Operating Characteristic) analysis [Metz, 1978] and simple true and false positive counts. These have been difficult to evaluate since it is hard to tell what is good; it depends on the application in mind. These statistics don't reflect the fact that some situations are impossible to detect due to shapes or lack of returns from the laser. Tuning parameters with these statistics might therefore cause the algorithm to perform worse in scenarios it should handle, just to gain a few detections in situations that are more difficult for the algorithm. Therefore the results have been presented as case studies reflecting the difficult situations that appear, together with some examples with different parameters.
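For reference, a single point in ROC space is computed from detection counts as follows. This is a generic sketch, not the evaluation code used in the thesis:

```python
def roc_point(tp, fp, fn, tn):
    """One point in ROC space from per-segment detection counts:
    returns (false positive rate, true positive rate), the axes of an
    ROC plot [Metz, 1978]. Guards avoid division by zero when a class
    is absent."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fpr, tpr
```

Sweeping the percentage threshold and plotting one such point per setting would trace the ROC curve, but as argued above the counts mix detectable and inherently undetectable cases, which is why the case-study presentation was preferred.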

The results show an increase in false positives among candidate points when different viewpoints are used. This problem arises because of sensor uncertainty and the way free space is represented. The algorithm tries to create diversity among the reference scans in both space and time, but this result seems to indicate that no difference in space and a large difference in time would be preferable. Different viewpoints are however necessary to get a variety of background information. An object far away in the active scan might not have anything close enough behind it to give any reflections that reach the laser in reference scans taken from the same sensor position. A reference scan taken closer to this object might however get data from the ground below it, or something else behind the object, and can therefore be used to validate that the object has moved. This is why different viewpoints are still preferable.

The algorithm was developed with Pose-SLAM as the intended application. How the algorithm works together with Pose-SLAM is unfortunately not tested in this thesis. But since many of the steps in this algorithm are also used in Pose-SLAM solutions, it should be possible to extract the detection-specific steps from the presented algorithm and include them in a Pose-SLAM algorithm. This is because Pose-SLAM also works with full sensor measurements and requires alignment of measurements.

6.2 Future work

This section discusses possible improvements to the algorithm. Some are implementation-technical, but most suggest other ways of designing parts of the algorithm.

6.2.1 Adaptive threshold

The threshold used for the range difference when selecting candidate points is static in this implementation. This is however not realistic, since the laser accuracy decreases with the range to the target. If a model were created for this uncertainty, the threshold could be a function of the ranges of the compared measurements. The threshold τR would then be:

τR = σ(Rr) + σ(Ra). (6.1)
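As a sketch, eq. (6.1) could be implemented with an assumed range-dependent noise model. The linear form and the values of sigma0 and k below are placeholders, not calibrated figures for the laser used:

```python
def adaptive_range_threshold(r_ref, r_active, sigma0=0.01, k=0.002):
    """Range-difference threshold of eq. (6.1):
    tau_R = sigma(R_r) + sigma(R_a), where R_r and R_a are the ranges
    of the compared reference and active measurements. The linear model
    sigma(R) = sigma0 + k * R is an illustrative assumption; a real
    model would be fitted to the laser's datasheet or calibration data."""
    def sigma(r):
        return sigma0 + k * r
    return sigma(r_ref) + sigma(r_active)
```

With this model, two returns at 10 m and 20 m would be compared with a looser threshold than two returns at close range, matching the observation that accuracy decreases with range.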


6.2.2 Alternative ways of choosing scans

While this probably wouldn't affect the results greatly in this project, another method of selecting the reference scans might be considered. The current implementation simply creates distance diversity in a combined space and time coordinate system, given a fixed number of scans to choose. An algorithm that measures and optimizes some notion of information gain might be able to make better choices and also decide how many scans are needed, instead of always selecting a predefined number. This could improve performance, since it might reduce the number of scans required for comparison. If occlusions could also be accounted for, it should improve the results further.
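The "distance diversity" selection described above can be sketched as a greedy farthest-point choice in a combined space-time coordinate. The function name, the time weighting and the greedy criterion are our own illustrative assumptions:

```python
import numpy as np

def select_reference_scans(positions, times, active_idx, n_refs=4, time_scale=1.0):
    """Greedily pick n_refs reference scans that are spread out in the
    combined space-time coordinate system. `time_scale` (an assumed
    weighting, in metres per second) trades spatial spread against
    temporal spread."""
    pts = np.column_stack([np.asarray(positions, float),
                           np.asarray(times, float) * time_scale])
    chosen = [active_idx]
    candidates = [i for i in range(len(pts)) if i != active_idx]
    while len(chosen) - 1 < n_refs and candidates:
        # pick the candidate whose nearest already-chosen scan is furthest away
        dists = [min(np.linalg.norm(pts[i] - pts[j]) for j in chosen)
                 for i in candidates]
        best = candidates[int(np.argmax(dists))]
        chosen.append(best)
        candidates.remove(best)
    return chosen[1:]
```

An information-gain criterion, as suggested above, would replace the `min`-distance score with an estimate of how much new background each candidate scan contributes, and would stop once the gain falls below a threshold instead of always returning `n_refs` scans.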

6.2.3 Adjust segmentation

In the results an instance of under-segmentation was presented: a pedestrian walking close to a wall becomes part of the wall segment. An algorithm which detects this and severs the points corresponding to the moving pedestrian from the wall segment would add robustness to the algorithm.

The challenge in creating this function is the similarity between this case and other correct segmentations. Compare the pedestrian and wall case to, for example, a truck moving orthogonally to the laser's movement. Both cases generate points in free space in a specific portion of the segment. In most cases the pedestrian would probably not be the same height as the wall; however, this assumption does not distinguish the cases either. A truck doesn't have to be consistent in height, and even if it mostly is, some of the background behind the truck might be too far away. This would yield a height difference in the detected candidate points, due to the limitations in the handling of no returns from the laser. Shape matching with templates or previously detected segments is an option.

6.2.4 Handling of no returns from laser

As mentioned in the results, the algorithm treats lack of returns from the laser as unknown space. This is because a no-return could either indicate no objects within 65 meters, as the manual says, or an occurrence of specular reflection where the laser isn't diffused against the surface and the light isn't reflected towards the receivers. These are two very different cases and the assumption of unknown space must therefore be made. However, if an algorithm existed for detecting the no-returns caused by specular reflection, the missing points could be interpolated or labeled unknown. The most important gain would be the possibility to correctly classify the former case of no-returns as the manual says: no objects within 65 meters.

This might be solved by shape matching with templates to guess the nature of the unknown areas. A camera could also be used to provide extra information about the object.


6.2.5 Final weighting

Different ways of combining the results of the pairwise comparisons can also be considered. Detected segments can for example be marked with an uncertainty, depending on how far from the sensor the object is and how much background is available for the detection. These can then be combined and checked against some gating criterion. A simpler method is to require detection in, say, at least two scans. This however imposes limitations, since some objects might only be detectable from one reference scan.

6.2.6 Local density

For a segment to be detected, a certain percentage of its points need to be within the free space threshold. Another way could be to require a certain density of points in free space in some part of the segment. By doing this, the effect of different viewpoints might be somewhat mitigated. This may however be hard to implement in a time-effective way. Low-pass filtering in combination with a clustering technique could be a solution, but the points would probably need to be structured in a graph or tree to speed up the computations.

6.2.7 Implementation details

In this implementation, the reference scans are compared with the active scan by spectral registration to get an initial positioning. An improvement would be to use the relative positions, required by the motion compensation, where possible. Spectral registration would then only be used when needed, e.g. to do the first alignment of measurements in different data sets or at completely different points in time. This was tested, but bugs or incremental errors caused it to fail, and there was too little time to determine the cause.


Bibliography

T. Bailey and H. Durrant-Whyte. Simultaneous localization and mapping (SLAM):part II. Robotics & Automation Magazine, IEEE, 13(3):108–117, September2006. ISSN 1070-9932. doi: 10.1109/MRA.2006.1678144. URL http://dx.doi.org/10.1109/MRA.2006.1678144. Cited on page 6.

Yaakov Bar-Shalom and Thomas E. Fortmann. Tracking and data associa-tion. 1988. URL http://books.google.se/books/about/Tracking_and_data_association.html?id=B_FQAAAAMAAJ&redir_esc=y.Cited on page 13.

Charles Bibby and Ian Reid. Simultaneous localisation and mapping in dynamicenvironments (SLAMIDE) with reversible data association. In Robotics: Sci-ence and Systems, 2007. URL http://www.robots.ox.ac.uk/~lav/Papers/bibby_reid_rss07/bibby_reid_rss07.pdf. Cited on page 11.

Paul Checchin, Franck Gérossier, Chritophe Blanc, Roland Chapuis, andLaurent Trassoudaine. Radar Scan Matching SLAM using the Fourier-Mellin Transform. In Field and Service Robotics, July 2009. URLhttp://www.rec.ri.cmu.edu/fsr09/papers/FSR2009_0007_e5691a8c07c4146597ec6f4b92004434.pdf. Cited on page 21.

Qin-Sheng Chen, M. Defrise, and F. Deconinck. Symmetric phase-only matchedfiltering of Fourier-Mellin transforms for image registration and recognition.IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(12):1156–1168, December 1994. ISSN 01628828. doi: 10.1109/34.387491. URL http://dx.doi.org/10.1109/34.387491. Cited on pages 12 and 21.

B. Douillard, J. Underwood, N. Kuntz, V. Vlaskine, A. Quadros, P. Morton, andA. Frenkel. On the segmentation of 3D LIDAR point clouds. In 2011 IEEEInternational Conference on Robotics and Automation, pages 2798–2805. IEEE,May 2011. ISBN 978-1-61284-386-5. doi: 10.1109/ICRA.2011.5979818. URLhttp://dx.doi.org/10.1109/ICRA.2011.5979818. Cited on pages 12and 20.

H. Durrant-Whyte and T. Bailey. Simultaneous localization and mapping: part

51

Page 62: Institutionen för systemteknik - DiVA portalliu.diva-portal.org/smash/get/diva2:561482/FULLTEXT01.pdf · neighbour and sounding board; Hanna Nyqvist, thank you for all the discussions

52 Bibliography

I. Robotics & Automation Magazine, IEEE, 13(2):99–110, June 2006. ISSN1070-9932. doi: 10.1109/MRA.2006.1638022. URL http://dx.doi.org/10.1109/MRA.2006.1638022. Cited on page 3.

Martin A. Fischler and Robert C. Bolles. Random sample consensus: a paradigmfor model fitting with applications to image analysis and automated cartog-raphy. Commun. ACM, 24(6):381–395, June 1981. ISSN 0001-0782. doi:10.1145/358669.358692. URL http://dx.doi.org/10.1145/358669.358692. Cited on page 12.

Takashi Fujii and Tetsuo Fukuchi. Laser remote sensing. Marcel Dekker,2005. ISBN 9780824742560. URL http://www.worldcat.org/isbn/9780824742560. Cited on page 9.

R. Katz, J. Nieto, and E. Nebot. Probabilistic scheme for laser based motion de-tection. In Intelligent Robots and Systems, 2008. IROS 2008. IEEE/RSJ Inter-national Conference on, pages 161–166. IEEE, 2008. ISBN 978-1-4244-2057-5. doi: 10.1109/IROS.2008.4650636. URL http://dx.doi.org/10.1109/IROS.2008.4650636. Cited on pages 13 and 15.

Jesse Levinson and Sebastian Thrun. Robust vehicle localization in urban envi-ronments using probabilistic maps. In 2010 IEEE International Conferenceon Robotics and Automation, pages 4372–4378. IEEE, May 2010. ISBN 978-1-4244-5038-1. doi: 10.1109/ROBOT.2010.5509700. URL http://dx.doi.org/10.1109/ROBOT.2010.5509700. Cited on page 15.

E. Mazor, A. Averbuch, Y. Bar-Shalom, and J. Dayan. Interacting multiple modelmethods in target tracking: a survey. Aerospace and Electronic Systems, IEEETransactions on, 34(1):103–123, January 1998. ISSN 0018-9251. doi: 10.1109/7.640267. URL http://dx.doi.org/10.1109/7.640267. Cited on page13.

C. E. Metz. Basic principles of ROC analysis. Seminars in nuclear medicine, 8(4):283–298, October 1978. ISSN 0001-2998. URL http://view.ncbi.nlm.nih.gov/pubmed/112681. Cited on page 48.

T. Miyasaka, Y. Ohama, and Y. Ninomiya. Ego-motion estimation and movingobject tracking using multi-layer LIDAR. In Intelligent Vehicles Symposium,2009 IEEE, pages 151–156. IEEE, June 2009. ISBN 978-1-4244-3503-6. doi: 10.1109/IVS.2009.5164269. URL http://dx.doi.org/10.1109/IVS.2009.5164269. Cited on page 13.

John G. Rogers, Alexander J. B. Trevor, Carlos Nieto-Granda, and Henrik I. Chris-tensen. SLAM with Expectation Maximization for moveable object tracking.In 2010 IEEE/RSJ International Conference on Intelligent Robots and Sys-tems, pages 2077–2082. IEEE, October 2010. ISBN 978-1-4244-6674-0. doi:10.1109/IROS.2010.5652091. URL http://dx.doi.org/10.1109/IROS.2010.5652091. Cited on page 15.

S. Rusinkiewicz and M. Levoy. Efficient variants of the ICP algorithm. In 3-D

Page 63: Institutionen för systemteknik - DiVA portalliu.diva-portal.org/smash/get/diva2:561482/FULLTEXT01.pdf · neighbour and sounding board; Hanna Nyqvist, thank you for all the discussions

Bibliography 53

Digital Imaging and Modeling, 2001. Proceedings. Third International Con-ference on, pages 145–152. IEEE, 2001. ISBN 0-7695-0984-3. doi: 10.1109/IM.2001.924423. URL http://dx.doi.org/10.1109/IM.2001.924423.Cited on pages 12 and 24.

J. van de Ven, F. Ramos, and G. D. Tipaldi. An integrated probabilistic model forscan-matching, moving object detection and motion estimation. In Roboticsand Automation (ICRA), 2010 IEEE International Conference on, pages 887–894. IEEE, May 2010. ISBN 978-1-4244-5038-1. doi: 10.1109/ROBOT.2010.5509586. URL http://dx.doi.org/10.1109/ROBOT.2010.5509586.Cited on page 11.

Trung-Dung Vu, Julien Burlet, and Olivier Aycard. Grid-based localization andlocal mapping with moving object detection and tracking. Information Fusion,12(1):58–69, January 2011. ISSN 15662535. doi: 10.1016/j.inffus.2010.01.004.URL http://dx.doi.org/10.1016/j.inffus.2010.01.004. Cited onpage 13.

Chieh-Chih Wang and C. Thorpe. Simultaneous localization and mapping withdetection and tracking of moving objects. In Robotics and Automation, 2002.Proceedings. ICRA ’02. IEEE International Conference on, volume 3, pages2918–2924. IEEE, 2002. ISBN 0-7803-7272-7. doi: 10.1109/ROBOT.2002.1013675. URL http://dx.doi.org/10.1109/ROBOT.2002.1013675.Cited on pages 11 and 13.

Chieh-Chih Wang, C. Thorpe, and A. Suppe. LADAR-based detection andtracking of moving objects from a ground vehicle at high speeds. In Intel-ligent Vehicles Symposium, 2003. Proceedings. IEEE, pages 416–421. IEEE,June 2003a. ISBN 0-7803-7848-2. doi: 10.1109/IVS.2003.1212947. URLhttp://dx.doi.org/10.1109/IVS.2003.1212947. Cited on page 13.

Chieh-Chih Wang, C. Thorpe, and S. Thrun. Online simultaneous localiza-tion and mapping with detection and tracking of moving objects: theoryand results from a ground vehicle in crowded urban areas. In Roboticsand Automation, 2003. Proceedings. ICRA ’03. IEEE International Confer-ence on, volume 1, pages 842–849 vol.1. IEEE, 2003b. ISBN 0-7803-7736-2.doi: 10.1109/ROBOT.2003.1241698. URL http://dx.doi.org/10.1109/ROBOT.2003.1241698. Cited on pages 5, 11, and 13.

Chieh-Chih Wang, Charles Thorpe, Sebastian Thrun, Martial Hebert, and HughDurrant-Whyte. Simultaneous Localization, Mapping and Moving ObjectTracking. The International Journal of Robotics Research, 26(9):889–916,September 2007. ISSN 1741-3176. doi: 10.1177/0278364907081229. URLhttp://dx.doi.org/10.1177/0278364907081229. Cited on page 13.

Denis F. Wolf and Gaurav S. Sukhatme. Mobile Robot Simultaneous Localizationand Mapping in Dynamic Environments. Autonomous Robots, 19(1):53–65,July 2005. ISSN 0929-5593. doi: 10.1007/s10514-005-0606-4. URL http://dx.doi.org/10.1007/s10514-005-0606-4. Cited on page 13.

Page 64: Institutionen för systemteknik - DiVA portalliu.diva-portal.org/smash/get/diva2:561482/FULLTEXT01.pdf · neighbour and sounding board; Hanna Nyqvist, thank you for all the discussions

54 Bibliography

Chen Yang and Gérard Medioni. Object modelling by registration of multiplerange images. Image and Vision Computing, 10(3):145–155, April 1992. ISSN02628856. doi: 10.1016/0262-8856(92)90066-C. URL http://dx.doi.org/10.1016/0262-8856(92)90066-C. Cited on page 24.

Shao-Wen Yang and Chieh-Chih Wang. Simultaneous egomotion estimation, seg-mentation, and moving object detection. J. Field Robotics, 28(4):565–588, 2011.doi: 10.1002/rob.20392. URL http://dx.doi.org/10.1002/rob.20392.Cited on page 12.

Huijing Zhao, M. Chiba, R. Shibasaki, Xiaowei Shao, Jinshi Cui, and Hongbin Zha. SLAM in a dynamic large outdoor environment using a laser scanner. In Robotics and Automation, 2008. ICRA 2008. IEEE International Conference on, pages 1455–1462. IEEE, May 2008. ISBN 978-1-4244-1646-2. doi: 10.1109/ROBOT.2008.4543407. URL http://dx.doi.org/10.1109/ROBOT.2008.4543407. Cited on page 13.


Copyright

The publishers will keep this document online on the Internet — or its possible replacement — for a period of 25 years from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for his/her own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/

© David Gillsjö