Projective Analysis for 3D Shape Segmentation
Yunhai Wang1 Minglun Gong1,2 Tianhua Wang1,3 Hao (Richard) Zhang4
Daniel Cohen-Or5 Baoquan Chen1,6
1 Shenzhen Institutes of Advanced Technology
2 Memorial University of Newfoundland
3 Jilin University
4 Simon Fraser University
5 Tel-Aviv University
6 Shandong University
2/40
Segmentation of 3D shapes
One of the most fundamental tasks in shape analysis
Low-level cues (minima rule; convexity) alone are insufficient
3/40
Learning segmentation
[Kalogerakis et al. 10]
Active co-analysis [Wang et al. 2012]
Unsupervised co-analysis [Sidi et al. 2011]
Knowledge-driven approach
Joint segmentation [Huang et al. 2011]
Keys to success: amount and quality of labelled or unlabelled 3D data
4/40
3D data challenge: amount
380 labeled meshes over 19 object categories
How many 3D models of strollers, golf carts, gazebos, …?
Not enough 3D models = insufficient knowledge
Labeling 3D shapes is also a non-trivial task
5/40
Many many more images
About 14 million images across almost 22,000 object categories
Labeling images is quite a bit easier than labeling 3D shapes
6/40
3D data challenge: quality
Real-world 3D models (e.g., those from Trimble Warehouse) are often imperfect
Incomplete
Self-intersecting; non-manifold
7/40
Projective shape analysis (PSA)
Treat a 3D shape as a set of projected binary images
Label these images by learning from the vast amount of image data
Then propagate the image labels to the 3D shape
Alleviate various data artifacts in 3D, e.g., self-intersections
8/40
Contributions
Joint image-shape analysis via projective analysis for semantic 3D segmentation
Utilize the vast amount of available image data
Allowing us to analyze imperfect 3D shapes
9/40
Contributions
Bi-class Symmetric Hausdorff distance = BiSH
Designed for matching 1D binary images
More sensitive to topology changes (holes)
Caters to our needs: part-aware label transfer
10/40
Related works on shape-image hybrid processing
Image-guided 3D modeling
[Xu et al. 11]
Many works on 2D-3D fusion, e.g., for reconstruction
[Li et al. 11]
11/40
Related works on projective shape analysis
Light field descriptor for 3D shape retrieval
[Chen et al. 03]
Image-space simplification error
[Lindstrom and Turk 10]
We deal with the higher-level and more delicate task of semantic 3D segmentation
12/40
Outline
PSA for 3D shape segmentation
Region-based binary shape matching
Results and conclusion
14/40
Projecting input 3D shape
Assume all objects are upright oriented; they mostly are!
Project an input 3D shape from multiple pre-set viewpoints
15/40
Retrieve labeled images
For each projection of the input 3D shape, retrieve top matches from the set of labelled images
16/40
Select projections for label transfer
Select top (non-adjacent) projections with the smallest average matching costs for label transfer
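The selection rule on this slide can be sketched as a greedy pick. This is only an illustration: the slides do not say how viewpoint adjacency is defined, so the sketch assumes the pre-set viewpoints are indexed along a ring and treats neighbouring indices as adjacent. The function name `select_projections` and its parameters are hypothetical.

```python
def _adjacent(a, b, n, circular):
    """True if view indices a and b are neighbours (assumed ring layout)."""
    d = abs(a - b)
    if circular:
        d = min(d, n - d)
    return d == 1


def select_projections(avg_costs, k, circular=True):
    """Greedily pick up to k views with the smallest average matching cost,
    skipping any view adjacent to one that was already picked.

    avg_costs[i] is the average matching cost of view i against its
    retrieved labeled images (lower is better).
    """
    n = len(avg_costs)
    chosen = []
    for i in sorted(range(n), key=lambda idx: avg_costs[idx]):
        if len(chosen) == k:
            break
        if not any(_adjacent(i, j, n, circular) for j in chosen):
            chosen.append(i)
    return chosen


# View 2 has the second-lowest cost but is adjacent to view 1, so it is skipped.
print(select_projections([0.9, 0.2, 0.25, 0.8, 0.3, 0.7], k=2))  # [1, 4]
```

Fewer than k views may be returned when the adjacency constraint rules out the remaining candidates.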
17/40
Label transfer
Label transfer is done between corresponding horizontal slabs
Pixel correspondence is straightforward
Later …
18/40
Confidence map
Label transfer is weighted by a confidence value per pixel
Three terms based on image-level, slab-level, and pixel-level similarity: more similar = higher confidence
19/40
Back projection (2D-to-3D)
Probabilistic map over input 3D shape: computed by integrating per-pixel confidence values over each shape primitive
One primitive projects to multiple pixels in multiple images
Per-pixel confidence gathered over multiple retrieved images
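A minimal sketch of the accumulation described on this slide, under stated assumptions: each foreground pixel is assumed to know which shape primitive it came from, which label was transferred to it, and with what confidence. Normalizing the accumulated scores into a per-primitive distribution is an assumption of this sketch; the slides only say the values are integrated. The function name `back_project` and the data layout are hypothetical.

```python
from collections import defaultdict


def back_project(views, num_labels):
    """Accumulate per-pixel label confidences onto shape primitives.

    views: list of (primitive_ids, labels, confidences) triples, one triple
           per (projection, retrieved image) pair; the three lists are
           parallel, with one entry per foreground pixel.
    Returns {primitive_id: [p_0, ..., p_{num_labels-1}]}.
    """
    scores = defaultdict(lambda: [0.0] * num_labels)
    for prim_ids, labels, confs in views:
        for pid, label, conf in zip(prim_ids, labels, confs):
            scores[pid][label] += conf

    probs = {}
    for pid, s in scores.items():
        total = sum(s)
        # Assumed normalization; fall back to uniform if nothing was gathered.
        probs[pid] = [v / total for v in s] if total > 0 else [1.0 / num_labels] * num_labels
    return probs


views = [
    ([0, 0, 1], [0, 1, 1], [0.5, 0.5, 1.0]),  # first projection
    ([0, 1], [0, 1], [1.0, 1.0]),             # second projection
]
print(back_project(views, num_labels=2))  # {0: [0.75, 0.25], 1: [0.0, 1.0]}
```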
20/40
Graph cuts optimization
Final labeling of 3D shape: multi-label alpha expansion graph cuts based on the probabilistic map
21/40
Outline
PSA for 3D shape segmentation
Region-based binary shape matching
Results and conclusion
22/40
The matching problem
Projections of input 3D shape …
Database of (labeled) images …
Characteristics of the data to be matched
Possibly complex topology (lots of holes), not just a contour
All upright oriented: to be exploited
Goal: find shapes most suitable for label transfer, and FAST!
Not a global visual similarity based retrieval
Want part-aware label transfer but cannot reliably segment
Classical descriptors, e.g., shape context, inner-distance shape context (IDSC), GIST, Zernike moments, Fourier descriptors, etc., do not quite fulfill our needs
24/40
Classical choice for distance: symmetric Hausdorff (SH)
But not sensitive to topology changes; not part-aware
Scan-lines to slabs
Cluster scan-lines into a smaller number of slabs for efficiency
Hierarchical clustering by a distance between adjacent slabs
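The scan-line clustering can be sketched as bottom-up merging of vertically adjacent slabs. The slides do not specify the distance between adjacent slabs, so this sketch substitutes a simple stand-in: the number of differing pixels between the last row of the upper slab and the first row of the lower slab. The function name `cluster_scanlines` is hypothetical.

```python
def cluster_scanlines(rows, num_slabs):
    """Merge adjacent scan-lines of a binary image into num_slabs slabs.

    rows: list of equal-length lists of 0/1 values, top to bottom.
    Returns a list of (start_row, end_row) ranges, end exclusive.
    """
    slabs = [(i, i + 1) for i in range(len(rows))]

    def gap(upper, lower):
        # Stand-in distance (an assumption): Hamming distance between the
        # two scan-lines that touch at the slab boundary.
        a, b = rows[upper[1] - 1], rows[lower[0]]
        return sum(x != y for x, y in zip(a, b))

    while len(slabs) > max(num_slabs, 1):
        # Merge the most similar pair of neighbouring slabs.
        i = min(range(len(slabs) - 1), key=lambda j: gap(slabs[j], slabs[j + 1]))
        slabs[i:i + 2] = [(slabs[i][0], slabs[i + 1][1])]
    return slabs


image = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
]
print(cluster_scanlines(image, num_slabs=3))  # [(0, 2), (2, 4), (4, 5)]
```

Matching then operates on a few slabs per image rather than on every scan-line, which is where the efficiency gain comes from.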
25/40
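Extend SH to consider both black and white pixels!
SH for only one class may not be topology-sensitive
A bi-class SH distance is!
[Figure: binary shapes A, B, and C; A and C are each compared against B. The superscript c denotes the complement.]
SH(A,B)=2, SH(A^c, B^c)=10
SH(C,B)=2, SH(C^c, B^c)=2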
26/40
Bi-class symmetric Hausdorff (BiSH)
BiSH(A,B) = 10
BiSH(C,B) = 2
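A small sketch of the idea on 1D binary scan-lines. The symmetric Hausdorff distance is the standard one, taken over pixel positions. For BiSH, the sketch assumes the larger of the foreground and background distances is used; this agrees with the values quoted on the slides (10 and 2) but is not stated there. The empty-set convention and the function names are also assumptions.

```python
def directed_hausdorff(src, dst):
    """Largest distance from a position in src to its nearest position in dst."""
    return max(min(abs(s - d) for d in dst) for s in src)


def symmetric_hausdorff(a, b, value=1):
    """Symmetric Hausdorff distance between the pixels equal to `value`
    in two 1D binary sequences a and b."""
    pa = [i for i, v in enumerate(a) if v == value]
    pb = [i for i, v in enumerate(b) if v == value]
    if not pa and not pb:
        return 0
    if not pa or not pb:
        return max(len(a), len(b))  # assumed convention for an empty class
    return max(directed_hausdorff(pa, pb), directed_hausdorff(pb, pa))


def bish(a, b):
    """Bi-class SH, assumed here to be the max over foreground and background."""
    return max(symmetric_hausdorff(a, b, 1), symmetric_hausdorff(a, b, 0))


solid = [0] + [1] * 12 + [0]                        # one solid run
holed = [0] + [1] * 4 + [0] * 4 + [1] * 4 + [0]     # same extent, with a hole

print(symmetric_hausdorff(solid, holed))  # 2: foreground alone barely notices the hole
print(bish(solid, holed))                 # 6: the background class exposes it
```

In this toy case the hole changes the foreground distance only slightly, while the background distance grows with how far the hole sits from the nearest background pixel of the other shape.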
28/40
Piecewise linear warping
Slabs are scaled/warped vertically for better alignment
Another measure to encourage part-aware label transfer
Slabs of labeled image warped to better align with slabs in projected image
Slabs recolored: many-to-one slab matching possible
[Figure labels: Warp, Recolor]
29/40
Slab matching and image similarity
Dissimilarity between slabs: BiSH scaled by slab height
Slab matching allows linear warp: optimized by a dynamic time warping (DTW) algorithm
Dissimilarity between images: sum over slab dissimilarity after warped slab matching
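The slab matching step can be illustrated with the textbook DTW recurrence, which aligns two ordered sequences while allowing many-to-one matches. This is a generic sketch, not the authors' implementation: `cost` stands in for the slab dissimilarity (BiSH scaled by slab height), and the toy call below uses plain numbers with absolute difference instead of real slabs.

```python
def dtw(seq_a, seq_b, cost):
    """Minimum total cost of a monotone alignment between two sequences.

    Each element must be matched at least once; one element may be matched
    to several consecutive elements of the other sequence (many-to-one).
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(seq_a[i - 1], seq_b[j - 1])
            d[i][j] = c + min(d[i - 1][j],      # seq_b[j-1] also matches the previous a
                              d[i][j - 1],      # seq_a[i-1] also matches the previous b
                              d[i - 1][j - 1])  # one-to-one step
    return d[n][m]


# Toy stand-in for slabs: numbers compared by absolute difference.
print(dtw([1, 2, 3], [1, 2, 2, 3], lambda x, y: abs(x - y)))  # 0.0
print(dtw([1, 3], [2], lambda x, y: abs(x - y)))              # 2.0
```

With slabs in place of numbers, the returned total corresponds to the image dissimilarity described above: the sum of slab dissimilarities along the best warped matching.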
30/40
Outline
PSA for 3D shape segmentation
Region-based binary shape matching
Results and conclusion
31/40
vs. learning mesh segmentation
Same inputs, training data (we project), and experimental setting
Models in [Kalogerakis et al. 10]: manifold, complete, no self-intersections
PSA allows us to handle any category and imperfect shapes
32/40
Data
11 object categories; about 2600 labeled images
All input 3D shapes tested have self-intersections as well as other data artifacts
35/40
Timing
Matching two images (512 x 512) takes 0.06 seconds
Label transfer (2D-to-2D then to 3D): about 1 minute for a 20K-triangle mesh
Number of selected projections: 5-10
Number of retrieved images per projection: 2
36/40
Conclusions
Projective shape analysis (PSA): semantic 3D segmentation by learning from labeled 2D images
Demonstrated potential in labeling 3D models: imperfect, complex topology, over any category
37/40
Main advantages
No strong requirements on quality of 3D model
Utilize the rich availability and ease of processing of photos for 3D shape analysis
38/40
Limitations
Inherent limitation of 2D projections: they do not fully capture 3D info
Inherent to data-driven: knowledge has to be in data
Assuming upright; not designed for articulated shapes
Relying on spatial and not feature-space analysis
39/40
Future work
Labeling 2D images is still tedious: unsupervised projective analysis
Additional cues from images and projections, e.g., color, depth, etc.
Apply PSA for other knowledge-driven analyses
40/40
Thank you!
More results and data can be found at
http://web.siat.ac.cn/~yunhai/psa.html