Super Resolution on 3D Point Clouds using Deep Learning
Belén Luque López
About me
Bachelor's Degree in Audiovisual Systems Engineering
Master in Computer Vision
Master's Thesis: Super Resolution on 3D Point Clouds using Deep Learning
Javier Ruiz, Béatrice Pesquet
May - September 2017
Motivation
Telepresence project
Internship at the Image Processing Group of the UPC (September 2016 - April 2017)
Capture room / Visualization room
Motivation
Low resolution of point clouds!
Possible solutions:
● Use meshes
● Increase the number of points
Telepresence project
Working with 3D:
● 2.5D (RGB-D)
● Multi-view projection
● Voxelization
● Point cloud: unorganized list of XYZ coordinates
Super resolution
Super resolution in 2D: upsample the image without losing spatial detail
Super resolution in 2.5D: super resolution on the depth map, then construct the point cloud
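As a rough sketch of the 2.5D route mentioned above, a (super-resolved) depth map can be back-projected into an XYZ point cloud. This is a minimal NumPy illustration assuming a simple pinhole camera; the function name and intrinsics are hypothetical, not from the slides.

```python
import numpy as np

def depth_to_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map into an XYZ point cloud,
    assuming a pinhole camera with focal lengths (fx, fy)
    and principal point (cx, cy)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

depth = np.ones((4, 4))                            # toy 4x4 depth map
cloud = depth_to_cloud(depth, 1.0, 1.0, 2.0, 2.0)  # 16 XYZ points
```

The resolution of the resulting cloud is tied to the depth map's pixel grid, which is why upsampling the depth map first increases the point count.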
Super resolution
Super resolution in 3D: create more points, fill the holes. Multiple options, not that easy!
Super resolution in 2D: the position of the new points is already given.
C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation,” CoRR, vol. abs/1612.00593, 2016.
Our first approach
PointNet
Novel deep net architecture that directly consumes point clouds (unordered point sets)
Our first approach
PointNet
The input is an unordered list of XYZ coordinates!
Our first approach
PointNet
Mini-network to learn an affine transformation matrix, making the point cloud invariant to rotation/translation
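The effect of that learned transformation can be sketched in NumPy. This is a toy illustration, not PointNet code: in PointNet the 3x3 matrix is predicted by the mini-network (T-Net); here a fixed Z-axis rotation stands in for the learned matrix.

```python
import numpy as np

def apply_transform(points, transform):
    """points: (n, 3) XYZ coordinates; transform: (3, 3) matrix.
    Applies the transform to every point in the cloud."""
    return points @ transform.T

# Stand-in for a learned transform: a 45-degree rotation about Z.
theta = np.pi / 4
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])

cloud = np.random.rand(1024, 3)          # toy point cloud, n = 1024
aligned = apply_transform(cloud, rot_z)  # cloud in the canonical pose
```

Because the same matrix multiplies every point, the operation is independent of point order, which fits PointNet's unordered-input design.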
Our first approach
PointNet
Our first approach
PointNet
[Architecture figure: each of the n input XYZ points is mapped by a shared MLP to a 64-dimensional feature vector, giving an n × 64 feature matrix]
Our first approach
PointNet
Our first approach
PointNet
Per-point feature matrix (5 points × 4 features):
10 5 1 3
 1 8 5 9
 6 2 3 5
 9 1 6 3
 4 9 3 5
Column-wise max pooling gives the global feature: 10 9 6 9
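The max pooling on the slide can be reproduced in a few lines of NumPy; max is a symmetric function, so the global feature does not depend on the order of the points, which is the key property PointNet relies on.

```python
import numpy as np

# The per-point feature matrix from the slide: 5 points x 4 features.
features = np.array([[10, 5, 1, 3],
                     [ 1, 8, 5, 9],
                     [ 6, 2, 3, 5],
                     [ 9, 1, 6, 3],
                     [ 4, 9, 3, 5]])

# Column-wise max collapses the n points into one global feature vector.
global_feature = features.max(axis=0)
print(global_feature.tolist())  # [10, 9, 6, 9]

# Max is symmetric: shuffling the points leaves the result unchanged.
shuffled = features[np.random.permutation(len(features))]
assert np.array_equal(shuffled.max(axis=0), global_feature)
```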
Our first approach
PointNet
Our first approach
Dataset: 2D-3D-Semantics (http://buildingparser.stanford.edu/dataset.html)
Over 6,000 m² of indoor spaces, almost 700,000,000 points + meshes
I. Armeni, A. Sax, A. R. Zamir, and S. Savarese, "Joint 2D-3D-Semantic Data for Indoor Scene Understanding," ArXiv e-prints, 2017.
Our first approach
Residual training to obtain double the number of points:
● Original point cloud (2N points) → 50% sampling → input data (N points)
● Input (N points) → PointNet → output (N points)
● Ground truth: the original point cloud (2N points)
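The training-pair construction above can be sketched with a random 50% subsampling in NumPy. This is a minimal illustration of the idea, not the thesis code; the PCL-based downsampling the deck mentions later would replace the random selection here.

```python
import numpy as np

def make_training_pair(cloud, rng):
    """Split a 2N-point cloud into an N-point input and keep the full
    cloud as ground truth, as in the residual training setup."""
    idx = rng.permutation(len(cloud))[: len(cloud) // 2]  # keep 50%
    return cloud[idx], cloud

rng = np.random.default_rng(0)
original = np.random.rand(2048, 3)  # original cloud, 2N = 2048 points
inp, gt = make_training_pair(original, rng)
# The network then predicts N new points; input + output (2N points in
# total) are compared against the 2N-point ground truth.
```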
Our first approach
PointNet output: new XYZ coordinates (n × 3)
Loss:
1. For each output point, find the closest point in the ground truth (nearest neighbor search with a kd-tree)
2. Minimize the RMS distance
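The two loss steps above can be sketched with SciPy's kd-tree. This is a NumPy/SciPy stand-in for evaluation purposes (the deck mentions python-pcl for the nearest neighbor search in the actual pipeline); the function name is hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def rms_nn_distance(output_pts, gt_pts):
    """Step 1: for each output point, find its nearest ground-truth
    neighbor with a kd-tree. Step 2: return the RMS of those distances,
    which is the quantity being minimized."""
    tree = cKDTree(gt_pts)
    dists, _ = tree.query(output_pts)  # nearest-neighbor distance per point
    return np.sqrt(np.mean(dists ** 2))

rng = np.random.default_rng(1)
gt = rng.random((2048, 3))
pred = gt[:1024] + 0.01 * rng.standard_normal((1024, 3))  # noisy fake output
loss = rms_nn_distance(pred, gt)  # small, since pred lies near gt
```

Note this one-directional distance is zero for any output that sits on ground-truth points, so on its own it does not reward covering the whole cloud; the "Some thoughts" slide returns to this question.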
Tools
● Data preparation (point cloud downsampling): PCL library
● PointNet model: TensorFlow (or third-party implementation in PyTorch)
  ○ + integration with python-pcl for point cloud processing (nearest neighbor search)
● Data ingestion: h5py library (dataset stored in HDF5 format)
● Data augmentation (rotation/jittering): NumPy library
● Visualization of results: MeshLab software; NumPy + SciPy for 2D projection
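The rotation/jittering augmentation listed above can be sketched in NumPy. A minimal illustration under the assumption that rotation is applied about the up (Z) axis and jitter is small Gaussian noise; the exact parameters used in the thesis are not stated in the slides.

```python
import numpy as np

def augment(points, rng):
    """Rotate the cloud about the Z axis by a random angle and add
    small Gaussian jitter to every coordinate."""
    theta = rng.uniform(0, 2 * np.pi)
    rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                    [np.sin(theta),  np.cos(theta), 0.0],
                    [0.0,            0.0,           1.0]])
    jitter = rng.normal(scale=0.01, size=points.shape)  # small noise
    return points @ rot.T + jitter

rng = np.random.default_rng(0)
cloud = np.random.rand(1024, 3)
augmented = augment(cloud, rng)  # same shape, rotated and perturbed
```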
Some thoughts
Can we combine the input points (with convolutions, like in 2D images)?
The input is an unorganized list of 3D points, so neighborhoods are not defined…
→ Semantic segmentation as a previous step?
Multiple solutions exist for the creation of new points. How do we train the network?
- Minimize the distance to the closest point in the ground truth
- Minimize the distance to the surface (the closest face in a mesh)
Do we want the network to create more points in the less populated areas? How?
Thank you!
Belén Luque López