large-scale point cloud classification benchmark · 2016. 8. 15. · igp & cvg, eth zürich...
IGP & CVG, ETH Zürich
www.semantic3d.net
7/6/2016
Large-Scale Point Cloud Classification
Benchmark
People
Timo Hackel
Nikolay Savinov
Ľubor Ladický
Jan Dirk Wegner
Konrad Schindler
Marc Pollefeys
Terrestrial Laser Scanning
Applications: architecture, archaeology, city modelling, mining, monitoring, …
~200.000.000 $ spent annually on terrestrial laser scanners
Measure up to 2.000.000 3D points per second
Laser scanner
Objective
Assign a class label to each 3D point individually
Related Benchmark Tests
iQmulus & TerraMobilita [1]
NYU [2]
[1] Vallet, Bruno, et al. "TerraMobilita/iQmulus urban point cloud analysis benchmark." Computers & Graphics 49 (2015): 126-133.
[2] Silberman, Nathan, et al. "Indoor segmentation and support inference from RGBD images." Computer Vision–ECCV 2012. Springer Berlin Heidelberg, 2012. 746-760.
Data
{x, y, z, intensity, r, g, b}
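As an illustration of the record layout above, here is a minimal sketch of parsing one point. It assumes one whitespace-separated ASCII line per point and integer intensity and color values; the delimiter, the field types and the names `Point` and `parse_point` are assumptions made for this example, not details taken from the slides.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float
    z: float
    intensity: int
    r: int
    g: int
    b: int


def parse_point(line: str) -> Point:
    """Parse one 'x y z intensity r g b' record (assumed whitespace-separated)."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    x, y, z = (float(v) for v in fields[:3])
    # int(float(...)) also accepts values written as '120.0'
    intensity, r, g, b = (int(float(v)) for v in fields[3:])
    return Point(x, y, z, intensity, r, g, b)


p = parse_point("1.25 -3.50 0.75 -120 200 180 90")
```

Reading line by line like this keeps memory use flat, which matters for scans with hundreds of millions of points.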
Intensity
An external illumination sensor is used to remove the effect of external illumination (sun, etc.)
Value range: -2047 to +2048
Color
Color was recorded after scanning
Cube maps
Scanning Artefacts
Classes
Man-made terrain
Natural terrain
High vegetation
Low vegetation
Buildings
Clutter (hard scape)
Scanning artefacts
Cars
Training and test set
Over 3 billion labelled points
Concept
Training set (shared by both challenges): 1.660.295.483 points, 15 scans, unfiltered
“reduced-8” (reduced challenge) test set: 78.699.329 points, 3 scans, voxel grid filter with 1 cm resolution
“semantic-8” (full challenge) test set: 2.348.935.300 points, 15 scans, unfiltered
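The voxel grid filter used for the reduced test set can be sketched as follows. This is only an illustration: the slides give the resolution (1 cm) but not how the representative point of a voxel is chosen, so taking the centroid is an assumption, as is the function name `voxel_grid_filter`.

```python
import math
from typing import Dict, List, Tuple

Point3 = Tuple[float, float, float]


def voxel_grid_filter(points: List[Point3], voxel_size: float = 0.01) -> List[Point3]:
    """Keep one point per occupied voxel (here: the centroid, an assumption)."""
    cells: Dict[Tuple[int, int, int], List[float]] = {}
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        acc = cells.setdefault(key, [0.0, 0.0, 0.0, 0.0])
        acc[0] += x
        acc[1] += y
        acc[2] += z
        acc[3] += 1.0
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in cells.values()]


cloud = [(0.001, 0.001, 0.001), (0.003, 0.003, 0.003), (0.5, 0.5, 0.5)]
filtered = voxel_grid_filter(cloud)  # the first two points share one 1 cm voxel
```

The dictionary acts as a sparse grid, so memory grows with the number of occupied voxels rather than with the bounding box of the scan.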
How to submit your results: www.semantic3d.net
1. Create a zip file containing the results as described in “submit”
2. Log in
3. Create a new classifier
4. Upload data
5. Choose whether you want to make the results public
Class distribution
[Bar chart: number of points per class, in millions, for the training and test sets; axis from 0 to 900]
Evaluation measures
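Main: intersection over union (IoU), averaged over all classes
Well-established for segmentation, e.g. PASCAL VOC
If $c_{ij}$ = count [class i predicted as j], then for $N$ classes:
$IoU_i = \frac{c_{ii}}{c_{ii} + \sum_{j \neq i} c_{ij} + \sum_{k \neq i} c_{ki}}, \qquad \overline{IoU} = \frac{1}{N} \sum_{i=1}^{N} IoU_i$
Auxiliary measure: accuracy
$OA = \frac{\sum_{i} c_{ii}}{\sum_{j} \sum_{k} c_{jk}}$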
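A small sketch of both measures, computed from a confusion matrix where entry `c[i][j]` counts points of class i predicted as class j. The function name `iou_and_accuracy` and the choice to report 0 for a class that never occurs are assumptions made for this example.

```python
from typing import List, Tuple


def iou_and_accuracy(c: List[List[int]]) -> Tuple[List[float], float, float]:
    """Return per-class IoU, mean IoU and overall accuracy."""
    n = len(c)
    total = sum(sum(row) for row in c)
    ious = []
    for i in range(n):
        tp = c[i][i]
        fn = sum(c[i]) - tp                       # class i predicted as another class
        fp = sum(c[k][i] for k in range(n)) - tp  # other classes predicted as i
        denom = tp + fn + fp
        ious.append(tp / denom if denom else 0.0)  # empty class: 0 by assumption
    mean_iou = sum(ious) / n
    accuracy = sum(c[i][i] for i in range(n)) / total if total else 0.0
    return ious, mean_iou, accuracy


confusion = [[8, 2],
             [1, 9]]
ious, mean_iou, accuracy = iou_and_accuracy(confusion)
```

For this two-class matrix the per-class IoU values are 8/11 and 9/12, and the accuracy is 17/20. Unlike accuracy, the mean IoU gives every class the same weight, so rare classes are not drowned out by frequent ones.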
Human labeller disagreement ≤ 4%
[Bar chart: 1 - (intersection over union) between human labellers; axis from 0 to 0.045]
Baseline methods
Covariance Features
Pure Color
Deep Learning
Cube mapping
Result
Reduced challenge
Full challenge
Baseline II
Deep Learning on voxelized neighbourhood
Goal: classify the center of a neighbourhood (sliding 3D window)
Voxel occupancy grid (any point inside voxel? 0/1) → 3D Convolutional Neural Network → Semantic label
Baseline II
Deep Learning on voxelized neighbourhood
Prior work (classification & detection, no segmentation):
3D ShapeNets [CVPR’15]
VoxNet [IROS’15]
Deep Sliding Shapes [CVPR’16]
…
Voxelization details
Neighbourhood of 16x16x16 voxels
5 scales considered: voxel size from 2.5 cm to 0.8 m
Scales concatenated
Batch constructed on the C++ side
Batch transferred to Torch via LuaJIT
occupancy hash table (sparse)
classified x, y, z query
neighbourhood (dense)
…
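The voxelization step can be sketched as below: a sparse set of occupied voxels plays the role of the occupancy hash table, and a query point is turned into a dense 16x16x16 grid of 0/1 flags at each scale. The slides only give the smallest and largest voxel size, so the geometric spacing of the five scales is an assumption, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Set, Tuple

Voxel = Tuple[int, int, int]
GRID = 16  # neighbourhood of 16x16x16 voxels


def scale_sizes(smallest: float = 0.025, largest: float = 0.8, count: int = 5) -> List[float]:
    """Voxel sizes in metres; the geometric spacing is an assumption."""
    ratio = (largest / smallest) ** (1.0 / (count - 1))
    return [smallest * ratio ** s for s in range(count)]


def build_occupancy(points: Sequence[Sequence[float]], voxel_size: float) -> Set[Voxel]:
    """Sparse occupancy: the set of voxel indices that contain at least one point."""
    return {tuple(math.floor(c / voxel_size) for c in p) for p in points}


def dense_neighbourhood(occupied: Set[Voxel], query: Sequence[float], voxel_size: float) -> List[int]:
    """Flattened GRID^3 list of 0/1 flags centred on the query point."""
    cx, cy, cz = (math.floor(c / voxel_size) for c in query)
    half = GRID // 2
    return [
        1 if (cx + dx, cy + dy, cz + dz) in occupied else 0
        for dx in range(-half, half)
        for dy in range(-half, half)
        for dz in range(-half, half)
    ]


points = [(0.0, 0.0, 0.0), (0.03, 0.0, 0.0), (1.0, 1.0, 1.0)]
sizes = scale_sizes()
grids = [dense_neighbourhood(build_occupancy(points, s), (0.0, 0.0, 0.0), s) for s in sizes]
features = [flag for grid in grids for flag in grid]  # scales concatenated
```

At the finest scale the far point falls outside the window, while at the coarsest scale it is inside but the two nearby points merge into one voxel, which is the trade-off the multi-scale input is meant to cover.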
Deep net details
VGG-like
Multi-scale
Torch
Input scale 1 … Input scale 5 → VGG-like 1 … VGG-like 5 (3x3x3 kernels) → Concatenation → FC → FC → Predictions
Training details & results
xy-rotation augmentation is necessary
z direction is aligned with gravity
Batch size 100; a new random xy-rotation every 100 batches
Trained on 1 point cloud with 259 million points
Classes sampled randomly with equal probabilities
Test results: DeepNet performs better than 2D color classification
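The two sampling ideas above, rotation about the gravity-aligned z axis and class-balanced sampling, can be sketched as follows. This is not the authors' C++/Lua code; the function names, the toy class index lists and the way the angle is stored with each batch are assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def rotate_xy(points: Sequence[Point3], angle: float) -> List[Point3]:
    """Rotate points about the z axis; z (gravity direction) is unchanged."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def sample_balanced(indices_by_class: Dict[int, List[int]], batch_size: int,
                    rng: random.Random) -> List[int]:
    """Pick a class uniformly at random, then a random point of that class."""
    classes = [k for k, idx in indices_by_class.items() if idx]
    return [rng.choice(indices_by_class[rng.choice(classes)]) for _ in range(batch_size)]


rng = random.Random(0)
indices_by_class = {1: [0, 1, 2, 3, 4, 5, 6, 7], 2: [8]}  # toy, imbalanced labels
angle = 0.0
batches = []
for step in range(300):
    if step % 100 == 0:  # new random xy-rotation every 100 batches
        angle = rng.uniform(0.0, 2.0 * math.pi)
    batches.append((angle, sample_balanced(indices_by_class, 100, rng)))

rotated = rotate_xy([(1.0, 0.0, 5.0)], math.pi / 2)
```

With balanced sampling the single point of class 2 appears in roughly half of the drawn samples, even though it makes up only one ninth of the toy data.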
Code available soon!
C++/Lua
You could build your algorithm on top of that.
Will be on benchmark website.
Subscribe to our newsletter!
Baseline III
traditional machine learning with multiscale features
Compute neighborhoods → extract features → classify semantically (RF)
[Figure: example decision tree on features X and Y with splits X=2, Y=3.1 and Y=1.5 and leaves p1 to p4]
Neighborhood approximation
KD-trees are slow
Further approximation is needed
Multiscale voxel-grid filtering
KD-tree pyramid
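The pyramid idea can be illustrated with a short sketch: the cloud is voxel-grid filtered at several resolutions, and the same small number of nearest neighbours is queried at every level, so coarser levels cover a larger radius at constant cost. To stay self-contained the sketch uses a brute-force search as a stand-in for the per-level KD-tree, and the doubling of the voxel size per level is an assumption; all names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point3 = Tuple[float, float, float]


def downsample(points: List[Point3], voxel_size: float) -> List[Point3]:
    """Voxel-grid filter: replace the points in each voxel by their centroid."""
    cells: Dict[Tuple[int, int, int], List[float]] = {}
    for x, y, z in points:
        key = (math.floor(x / voxel_size), math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        acc = cells.setdefault(key, [0.0, 0.0, 0.0, 0.0])
        acc[0] += x
        acc[1] += y
        acc[2] += z
        acc[3] += 1.0
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in cells.values()]


def build_pyramid(points: List[Point3], base_size: float, levels: int) -> List[List[Point3]]:
    """Level l is filtered with voxel size base_size * 2**l (doubling is an assumption)."""
    return [downsample(points, base_size * 2 ** level) for level in range(levels)]


def k_nearest(cloud: List[Point3], query: Point3, k: int) -> List[Point3]:
    """Brute-force stand-in for a KD-tree query on one pyramid level."""
    return sorted(cloud, key=lambda p: math.dist(p, query))[:k]


points = [(i * 0.01, 0.0, 0.0) for i in range(100)]
pyramid = build_pyramid(points, base_size=0.02, levels=3)
query = (0.5, 0.0, 0.0)
radii = [math.dist(k_nearest(level, query, 3)[-1], query) for level in pyramid]
```

In a real implementation each level would get its own precomputed KD-tree; the point of the sketch is only that a fixed k reaches further on the coarser levels.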
Feature extraction
Spherical neighborhood
Cylindrical neighborhood
Training
1. Subsample the training set x to generate x’
2. Grid search with cross-validation for the depth d of the random forest
3. Train the random forest using d and x’
4. Detect and remove unused features
Class frequencies per scan do not necessarily represent the prior distribution over class labels
Error weights depend on the distance to the scanner
Subsampling when the scanner origin is unknown
Classification & Implementation details
Precompute KD-trees and keep them in RAM
Evaluate features on the fly (nearly no RAM needed)
Implementation in C++ with OpenMP
Solve eigenvalues and eigenvectors of the 3x3 matrix analytically
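As an illustration of the analytic solution, the sketch below computes the eigenvalues of a symmetric 3x3 matrix, such as the covariance of a point neighbourhood, with the standard trigonometric closed form. It is written in Python for brevity rather than in the C++ of the actual implementation, it covers eigenvalues only (eigenvectors are omitted), and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Matrix3 = List[List[float]]


def covariance(points: Sequence[Tuple[float, float, float]]) -> Matrix3:
    """3x3 covariance matrix of a point neighbourhood."""
    n = len(points)
    mean = [sum(p[a] for p in points) / n for a in range(3)]
    return [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points) / n
             for b in range(3)] for a in range(3)]


def det3(m: Matrix3) -> float:
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def eigenvalues_sym3(m: Matrix3) -> Tuple[float, float, float]:
    """Eigenvalues of a symmetric 3x3 matrix, sorted from largest to smallest."""
    off = m[0][1] ** 2 + m[0][2] ** 2 + m[1][2] ** 2
    if off == 0.0:  # already diagonal
        l1, l2, l3 = sorted((m[0][0], m[1][1], m[2][2]), reverse=True)
        return l1, l2, l3
    q = (m[0][0] + m[1][1] + m[2][2]) / 3.0
    p = math.sqrt((sum((m[i][i] - q) ** 2 for i in range(3)) + 2.0 * off) / 6.0)
    b = [[(m[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    r = max(-1.0, min(1.0, det3(b) / 2.0))  # clamp against rounding errors
    phi = math.acos(r) / 3.0
    l1 = q + 2.0 * p * math.cos(phi)
    l3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    l2 = 3.0 * q - l1 - l3  # the trace equals the sum of the eigenvalues
    return l1, l2, l3


values = eigenvalues_sym3([[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 5.0]])
line_values = eigenvalues_sym3(covariance([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```

For the first matrix the eigenvalues are 5, 3 and 1. A closed form like this avoids an iterative solver per point, which matters when features are evaluated on the fly for billions of points.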
Results on Mobile Mapping Data
Result
Reduced challenge
Full challenge
Questions and answers
Why does the confusion matrix not contain all points from the submitted result?
Why ASCII and not LAZ?
Why is the training set of semantic-8 and reduced-8 the same?
What are the next steps?
Demo of baseline III