Super Resolution on 3D Point Clouds using Deep Learning
Belén Luque López
About me
Bachelor's Degree in Audiovisual Systems Engineering
Master in Computer Vision
Master's Thesis: Super Resolution on 3D Point Clouds using Deep Learning
Javier Ruiz, Béatrice Pesquet
May - September 2017
Motivation
Telepresence project
Internship at the Image Processing Group of the UPC (September 2016 - April 2017)
Capture room / Visualization room
Motivation
Low resolution of point clouds!
Possible solutions:
● Use meshes
● Increase the number of points
Telepresence project
Working with 3D:
● 2.5D (RGB-D)
● Multi-view projection
● Voxelization
● Point cloud: unorganized list of XYZ coordinates
Super resolution
Super resolution in 2D: upsample the image without losing spatial detail
Super resolution in 2.5D: super resolution on the depth map, then construct the point cloud
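As a rough sketch of the 2.5D route mentioned above, a (super-resolved) depth map can be back-projected into an XYZ point cloud. This is a minimal NumPy illustration assuming a simple pinhole camera; the function name and intrinsics are hypothetical, not from the slides.

```python
import numpy as np

def depth_to_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map into an XYZ point cloud,
    assuming a pinhole camera with focal lengths (fx, fy)
    and principal point (cx, cy)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

depth = np.ones((4, 4))                            # toy 4x4 depth map
cloud = depth_to_cloud(depth, 1.0, 1.0, 2.0, 2.0)  # 16 XYZ points
```

The resolution of the resulting cloud is tied to the depth map's pixel grid, which is why upsampling the depth map first increases the point count.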
Super resolution
Super resolution in 3D: create more points, fill the holes. Multiple options, not that easy!
Super resolution in 2D: the position of the new points is already given.
C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation,” CoRR, vol. abs/1612.00593, 2016.
Our first approach
PointNet
Novel deep net architecture that directly consumes point clouds (unordered point sets)
Our first approach
PointNet
The input is an unordered list of XYZ coordinates!
Our first approach
PointNet
Mini-network to learn an affine transformation matrix, making the point cloud invariant to rotation/translation
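The effect of that learned transformation can be sketched in NumPy. This is a toy illustration, not PointNet code: in PointNet the 3x3 matrix is predicted by the mini-network (T-Net); here a fixed Z-axis rotation stands in for the learned matrix.

```python
import numpy as np

def apply_transform(points, transform):
    """points: (n, 3) XYZ coordinates; transform: (3, 3) matrix.
    Applies the transform to every point in the cloud."""
    return points @ transform.T

# Stand-in for a learned transform: a 45-degree rotation about Z.
theta = np.pi / 4
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])

cloud = np.random.rand(1024, 3)          # toy point cloud, n = 1024
aligned = apply_transform(cloud, rot_z)  # cloud in the canonical pose
```

Because the same matrix multiplies every point, the operation is independent of point order, which fits PointNet's unordered-input design.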
Our first approach
PointNet
Our first approach
PointNet
[Architecture figure: each of the n input XYZ points is mapped by a shared MLP to a 64-dimensional feature vector, giving an n × 64 feature matrix]
Our first approach
PointNet
Our first approach
PointNet
Per-point feature matrix (5 points × 4 features):
10 5 1 3
 1 8 5 9
 6 2 3 5
 9 1 6 3
 4 9 3 5
Column-wise max pooling gives the global feature: 10 9 6 9
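The max pooling on the slide can be reproduced in a few lines of NumPy; max is a symmetric function, so the global feature does not depend on the order of the points, which is the key property PointNet relies on.

```python
import numpy as np

# The per-point feature matrix from the slide: 5 points x 4 features.
features = np.array([[10, 5, 1, 3],
                     [ 1, 8, 5, 9],
                     [ 6, 2, 3, 5],
                     [ 9, 1, 6, 3],
                     [ 4, 9, 3, 5]])

# Column-wise max collapses the n points into one global feature vector.
global_feature = features.max(axis=0)
print(global_feature.tolist())  # [10, 9, 6, 9]

# Max is symmetric: shuffling the points leaves the result unchanged.
shuffled = features[np.random.permutation(len(features))]
assert np.array_equal(shuffled.max(axis=0), global_feature)
```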
Our first approach
PointNet
Our first approach
Dataset: 2D-3D-Semantics (http://buildingparser.stanford.edu/dataset.html)
Over 6,000 m² of indoor spaces, almost 700,000,000 points + meshes
I. Armeni, A. Sax, A. R. Zamir, and S. Savarese, "Joint 2D-3D-Semantic Data for Indoor Scene Understanding," ArXiv e-prints, 2017.
Our first approach
Residual training to obtain double the number of points:
● Original point cloud (2N points) → 50% sampling → input data (N points)
● Input (N points) → PointNet → output (N points)
● Ground truth: the original point cloud (2N points)
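The training-pair construction above can be sketched with a random 50% subsampling in NumPy. This is a minimal illustration of the idea, not the thesis code; the PCL-based downsampling the deck mentions later would replace the random selection here.

```python
import numpy as np

def make_training_pair(cloud, rng):
    """Split a 2N-point cloud into an N-point input and keep the full
    cloud as ground truth, as in the residual training setup."""
    idx = rng.permutation(len(cloud))[: len(cloud) // 2]  # keep 50%
    return cloud[idx], cloud

rng = np.random.default_rng(0)
original = np.random.rand(2048, 3)  # original cloud, 2N = 2048 points
inp, gt = make_training_pair(original, rng)
# The network then predicts N new points; input + output (2N points in
# total) are compared against the 2N-point ground truth.
```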
Our first approach
PointNet output: new XYZ coordinates (n × 3)
Loss:
1. For each output point, find the closest point in the ground truth (nearest neighbor search with a kd-tree)
2. Minimize the RMS distance
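The two loss steps above can be sketched with SciPy's kd-tree. This is a NumPy/SciPy stand-in for evaluation purposes (the deck mentions python-pcl for the nearest neighbor search in the actual pipeline); the function name is hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def rms_nn_distance(output_pts, gt_pts):
    """Step 1: for each output point, find its nearest ground-truth
    neighbor with a kd-tree. Step 2: return the RMS of those distances,
    which is the quantity being minimized."""
    tree = cKDTree(gt_pts)
    dists, _ = tree.query(output_pts)  # nearest-neighbor distance per point
    return np.sqrt(np.mean(dists ** 2))

rng = np.random.default_rng(1)
gt = rng.random((2048, 3))
pred = gt[:1024] + 0.01 * rng.standard_normal((1024, 3))  # noisy fake output
loss = rms_nn_distance(pred, gt)  # small, since pred lies near gt
```

Note this one-directional distance is zero for any output that sits on ground-truth points, so on its own it does not reward covering the whole cloud; the "Some thoughts" slide returns to this question.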
Tools
● Data preparation (point cloud downsampling): PCL library
● PointNet model: TensorFlow (or third-party implementation in PyTorch)
  ○ + integration with python-pcl for point cloud processing (nearest neighbor search)
● Data ingestion: h5py library (dataset stored in HDF5 format)
● Data augmentation (rotation/jittering): NumPy library
● Visualization of results: MeshLab software; NumPy + SciPy for 2D projection
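The rotation/jittering augmentation listed above can be sketched in NumPy. A minimal illustration under the assumption that rotation is applied about the up (Z) axis and jitter is small Gaussian noise; the exact parameters used in the thesis are not stated in the slides.

```python
import numpy as np

def augment(points, rng):
    """Rotate the cloud about the Z axis by a random angle and add
    small Gaussian jitter to every coordinate."""
    theta = rng.uniform(0, 2 * np.pi)
    rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                    [np.sin(theta),  np.cos(theta), 0.0],
                    [0.0,            0.0,           1.0]])
    jitter = rng.normal(scale=0.01, size=points.shape)  # small noise
    return points @ rot.T + jitter

rng = np.random.default_rng(0)
cloud = np.random.rand(1024, 3)
augmented = augment(cloud, rng)  # same shape, rotated and perturbed
```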
Some thoughts
Can we combine the input points (with convolutions, like in 2D images)?
The input is an unorganized list of 3D points, so neighborhoods are not defined…
→ Semantic segmentation as a previous step?
Multiple solutions exist for the creation of new points. How do we train the network?
- Minimize the distance to the closest point in the ground truth
- Minimize the distance to the surface (the closest face in a mesh)
Do we want the network to create more points in the less populated areas? How?
Thank you!
Belén Luque López