![Page 1: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/1.jpg)
CSC2457 3D & Geometric Deep Learning
Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations
Vincent Sitzmann, Michael Zollhöfer, and Gordon Wetzstein
Feb 23rd
Presenter: Shayan Shekarforoush
Instructor: Animesh Garg
![Page 2: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/2.jpg)
Learning Scene Representation
• With 3D Bias:
• Or not:
![Page 3: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/3.jpg)
Applications
• Downstream tasks
![Page 4: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/4.jpg)
Challenges
3D supervision
![Page 5: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/5.jpg)
Challenges
3D supervision
![Page 6: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/6.jpg)
Challenges
3D supervision
2D image + Camera pose
![Page 7: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/7.jpg)
Challenges
Geometry
Geometry + Appearance
![Page 8: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/8.jpg)
Challenges
Multi-view consistency
![Page 9: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/9.jpg)
Challenges
Voxel resolutions
![Page 10: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/10.jpg)
Challenges
Voxel resolutions
Point cloud sparsity
![Page 11: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/11.jpg)
Contributions
• A continuous, 3D-structure-aware neural scene representation that encodes geometry and appearance in a multi-view consistent manner.
• Along with a differentiable ray marching algorithm for rendering.
• End-to-end training without explicit 3D supervision.
• Generalizable to other geometry or appearance.
• Evaluation in:
• Novel view synthesis.
• Few-shot reconstruction.
• ...
![Page 12: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/12.jpg)
Problem Setting
Input data:
2D image:
![Page 13: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/13.jpg)
Problem Setting
Input data:
2D image:
Extrinsic matrix:
Intrinsic matrix:
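To make the role of the two camera matrices concrete, here is a minimal pure-Python sketch of a pinhole projection. It assumes the common convention that the extrinsic matrix E = [R | t] maps world to camera coordinates and that K holds focal lengths and the principal point; the function name `project` and all numeric values are illustrative, not taken from the slides.

```python
def project(point_world, R, t, K):
    """Project a 3D world point to pixel coordinates (u, v) plus depth.

    Assumes x_cam = R @ x_world + t (extrinsics) and a pinhole
    K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] (intrinsics).
    """
    # World -> camera coordinates.
    x_cam = [sum(R[i][j] * point_world[j] for j in range(3)) + t[i]
             for i in range(3)]
    # Camera -> homogeneous image coordinates.
    h = [sum(K[i][j] * x_cam[j] for j in range(3)) for i in range(3)]
    depth = h[2]
    return h[0] / depth, h[1] / depth, depth


# Illustrative camera: no rotation, world origin 2 units in front of it.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 2.0]
K = [[100.0, 0.0, 64.0], [0.0, 100.0, 64.0], [0.0, 0.0, 1.0]]

u, v, depth = project([0.0, 0.0, 0.0], R, t, K)
```

With these values the world origin lands on the principal point (64, 64) at depth 2.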
![Page 14: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/14.jpg)
Implicit Scene Function
![Page 15: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/15.jpg)
Implicit Scene Function
Feature vector: [0.2, -0.5, 1.7, ..., 2.4, -0.8]
![Page 16: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/16.jpg)
Implicit Scene Function
Feature vector: [0.2, -0.5, 1.7, ..., 2.4, -0.8]
Visual
Geometry
![Page 17: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/17.jpg)
Implicit Scene Function
![Page 18: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/18.jpg)
Implicit Scene Function
Higher Resolution
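The idea of a continuous scene function can be sketched as a small MLP that maps any 3D coordinate to a feature vector. The snippet below is a toy with random, untrained weights; the layer sizes, `make_scene_function`, and `sample_line` are hypothetical names chosen for illustration. It shows that nothing is stored on a grid, so the sampling resolution is chosen at query time.

```python
import math
import random


def make_scene_function(feature_dim=8, hidden=16, seed=0):
    """Build a toy MLP phi: R^3 -> R^feature_dim with random weights."""
    rng = random.Random(seed)

    def init(n_in, n_out):
        scale = 1.0 / math.sqrt(n_in)
        W = [[rng.uniform(-scale, scale) for _ in range(n_in)]
             for _ in range(n_out)]
        return W, [0.0] * n_out

    W1, b1 = init(3, hidden)
    W2, b2 = init(hidden, feature_dim)

    def phi(x):
        # One hidden ReLU layer followed by a linear output layer.
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        return [sum(w * hi for w, hi in zip(row, h)) + b
                for row, b in zip(W2, b2)]

    return phi


def sample_line(phi, n):
    """Query phi at n evenly spaced points from (0,0,0) to (1,0,0)."""
    return [phi([k / (n - 1), 0.0, 0.0]) for k in range(n)]


phi = make_scene_function()
coarse = sample_line(phi, 4)   # low sampling resolution
fine = sample_line(phi, 64)    # higher resolution, same function
```

Both calls reuse the same parameters; only the number of query points changes.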
![Page 19: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/19.jpg)
Neural Rendering
Neural Renderer
E, K
2D Image
![Page 20: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/20.jpg)
Neural Rendering
• Ray Marching
• Pixel Generator
![Page 21: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/21.jpg)
Ray Marching
Parametrize ray marching out of pixel (u, v):
![Page 22: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/22.jpg)
Ray Marching
Parametrize ray marching out of pixel (u, v):
Intersection as optimization:
Intersection
Surface
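One way to write the ray through pixel (u, v) is to back-project the pixel to depth d in camera space and then undo the extrinsics. The sketch below assumes a skew-free pinhole K and the world-to-camera convention x_cam = R x_world + t; `pixel_ray_point` is a hypothetical helper, and the slide's own notation may differ. Under this view, the intersection is the smallest positive depth d whose point lies on the surface.

```python
def pixel_ray_point(u, v, d, R, t, K):
    """World-space point at depth d along the ray through pixel (u, v)."""
    fx, fy = K[0][0], K[1][1]
    cx, cy = K[0][2], K[1][2]
    # Back-project the pixel into camera coordinates at depth d.
    x_cam = [d * (u - cx) / fx, d * (v - cy) / fy, d]
    # Camera -> world: x_world = R^T (x_cam - t).
    diff = [x_cam[i] - t[i] for i in range(3)]
    return [sum(R[j][i] * diff[j] for j in range(3)) for i in range(3)]


# Illustrative camera: no rotation, world origin 2 units in front of it.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 2.0]
K = [[100.0, 0.0, 64.0], [0.0, 100.0, 64.0], [0.0, 0.0, 1.0]]

center_point = pixel_ray_point(64.0, 64.0, 2.0, R, t, K)
offset_point = pixel_ray_point(164.0, 64.0, 2.0, R, t, K)
```

Varying d for a fixed (u, v) traces out the ray that the marcher walks along.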
![Page 23: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/23.jpg)
Ray Marching
Parametrize ray marching out of pixel (u, v):
![Page 24: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/24.jpg)
Ray Marching
Parametrize ray marching out of pixel (u, v):
Feature vector: [0.2, -0.5, 1.7, ..., 2.4, -0.8]
![Page 25: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/25.jpg)
Ray Marching
Parametrize ray marching out of pixel (u, v):
Feature vector: [0.2, -0.5, 1.7, ..., 2.4, -0.8]
LSTM
![Page 26: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/26.jpg)
Ray Marching
Parametrize ray marching out of pixel (u, v):
Feature vector: [0.2, -0.5, 1.7, ..., 2.4, -0.8]
LSTM
![Page 27: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/27.jpg)
Ray Marching
Parametrize ray marching out of pixel (u, v):
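The marching loop itself can be sketched as repeated "look up the scene at the current point, predict a step, advance". In the snippet below a plain callable `step_fn` stands in for the LSTM; it is not a learned model. To keep the example runnable, the demo step is the signed distance to a sphere, which is classical sphere tracing rather than the learned marcher. All names and values are illustrative assumptions.

```python
import math


def march_ray(origin, direction, step_fn, n_steps=10, d0=0.05):
    """Advance depth d along x = origin + d * direction for n_steps."""
    d = d0
    state = None  # placeholder for a recurrent state (e.g. LSTM cell state)
    for _ in range(n_steps):
        x = [o + d * w for o, w in zip(origin, direction)]
        step, state = step_fn(x, state)
        d += step
    x_final = [o + d * w for o, w in zip(origin, direction)]
    return d, x_final


def sphere_step(x, state, center=(0.0, 0.0, 3.0), radius=1.0):
    """Stand-in step predictor: signed distance to a sphere."""
    return math.dist(x, center) - radius, state


depth, hit = march_ray([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], sphere_step)
```

Here the ray starts at the origin, looks down +z, and settles on the near side of the sphere at depth 2. In a learned version the step would instead come from a network fed with the feature vector at the current point.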
![Page 28: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/28.jpg)
Pixel Generator
Per Pixel:
Feature vector: [0.2, -0.5, 1.7, ..., 2.4, -0.8]
![Page 29: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/29.jpg)
Pixel Generator
Per Pixel:
Feature vector: [0.2, -0.5, 1.7, ..., 2.4, -0.8]
MLP
RGB
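A per-pixel generator can be sketched as the same small MLP applied independently to each pixel's feature vector. The weights below are hand-picked toy values, and the sigmoid output is an assumption made only to keep colors in [0, 1]; the slides do not specify the output activation.

```python
import math


def pixel_generator(feature, W1, b1, W2, b2):
    """Map one feature vector to an (r, g, b) triple in [0, 1]."""
    hidden = [max(0.0, sum(w * f for w, f in zip(row, feature)) + b)
              for row, b in zip(W1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(W2, b2)]
    # Sigmoid output is an assumption, not taken from the slides.
    return tuple(1.0 / (1.0 + math.exp(-x)) for x in logits)


def render(feature_grid, params):
    """Apply the same generator independently to every pixel."""
    return [[pixel_generator(f, *params) for f in row]
            for row in feature_grid]


# Hypothetical weights: 2-dim feature -> 2 hidden units -> RGB.
params = (
    [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0],
)
features = [[[0.0, 0.0], [2.0, -1.0]]]  # a 1 x 2 grid of feature vectors
image = render(features, params)
```

Because no pixel sees its neighbors' features, the output at each pixel depends only on the scene feature found along its own ray.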
![Page 30: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/30.jpg)
General Framework
![Page 31: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/31.jpg)
General Framework
![Page 32: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/32.jpg)
Generalization over Scenes
![Page 33: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/33.jpg)
Generalization over Scenes
![Page 34: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/34.jpg)
Generalization over Scenes
MLP Weights
![Page 35: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/35.jpg)
Generalization over Scenes
MLP Weights
Latent code
![Page 36: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/36.jpg)
Generalization over Scenes
MLP Weights
Latent code
Hypernetwork (MLP)
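The latent-code-to-weights mapping can be sketched as a small MLP whose output is reshaped into the parameters of the scene function. Everything below (layer sizes, `make_hypernetwork`, `hyper_forward`, the latent codes) is a hypothetical toy with random weights, meant only to show the data flow.

```python
import math
import random

# Target scene function 3 -> 4 -> 2, as (n_out, n_in) per layer (toy sizes).
TARGET_SHAPES = [(4, 3), (2, 4)]
N_PARAMS = sum(o * i + o for o, i in TARGET_SHAPES)


def linear(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]


def make_hypernetwork(latent_dim, hidden=8, seed=0):
    rng = random.Random(seed)

    def init(n_in, n_out):
        W = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
             for _ in range(n_out)]
        return W, [0.0] * n_out

    return [init(latent_dim, hidden), init(hidden, N_PARAMS)]


def hyper_forward(hyper, z):
    """Map latent code z to a list of (W, b) layers for the scene MLP."""
    (W1, b1), (W2, b2) = hyper
    flat = linear(W2, b2, [max(0.0, v) for v in linear(W1, b1, z)])
    layers, k = [], 0
    for n_out, n_in in TARGET_SHAPES:
        W = [flat[k + r * n_in:k + (r + 1) * n_in] for r in range(n_out)]
        k += n_out * n_in
        layers.append((W, flat[k:k + n_out]))
        k += n_out
    return layers


def scene_function(layers, x):
    (W1, b1), (W2, b2) = layers
    return linear(W2, b2, [max(0.0, v) for v in linear(W1, b1, x)])


hyper = make_hypernetwork(latent_dim=5)
phi_a = hyper_forward(hyper, [0.3, -1.2, 0.8, 0.1, -0.5])  # scene A
phi_b = hyper_forward(hyper, [-0.7, 0.4, 0.0, 1.5, 0.9])   # scene B
feat_a = scene_function(phi_a, [0.1, -0.3, 0.7])
```

One set of hypernetwork parameters is shared across scenes; each scene contributes only its latent code.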
![Page 37: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/37.jpg)
Optimization
Joint optimization:
𝜽: Neural Renderer
![Page 38: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/38.jpg)
Optimization
Joint optimization using SGD:
Instances
Viewpoints
![Page 39: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/39.jpg)
Optimization
Joint optimization using SGD:
L2 reconstruction loss
![Page 40: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/40.jpg)
Optimization
Joint optimization using SGD:
L2 reconstruction loss
Positive depth
![Page 41: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/41.jpg)
Optimization
Joint optimization using SGD:
L2 reconstruction loss
Positive depth
Gaussian prior
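The equation on this slide did not survive extraction. A plausible reconstruction, based on the three term labels above and the form of the objective in the SRN paper, is given below. The symbol choices are assumptions about notation: hypernetwork parameters ψ, per-instance latent codes z_j, weights λ_dep and λ_lat, M instances, and N viewpoints per instance.

```latex
\underset{\theta,\,\psi,\,\{z_j\}}{\arg\min}\;
\sum_{j=1}^{M} \sum_{i=1}^{N}
\underbrace{\bigl\lVert \Theta_{\theta}\bigl(\Phi_{\Psi_{\psi}(z_j)},\, E_i^{j},\, K_i^{j}\bigr) - I_i^{j} \bigr\rVert_2^2}_{\text{L2 reconstruction loss}}
+ \lambda_{\text{dep}}\,
\underbrace{\bigl\lVert \min\bigl(d_{i,\text{final}}^{j},\, 0\bigr) \bigr\rVert_2^2}_{\text{positive depth}}
+ \lambda_{\text{lat}}\,
\underbrace{\lVert z_j \rVert_2^2}_{\text{Gaussian prior}}
```

Here Θ_θ is the neural renderer, Φ is the scene function whose weights come from the hypernetwork Ψ, and d_final is the depth reached at the end of ray marching. The outer sum runs over instances and the inner sum over viewpoints.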
![Page 42: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/42.jpg)
Optimization
1st shot
2nd shot
Trained beforehand
![Page 43: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/43.jpg)
Shepard Metzler
• 7-element objects
• Novel view synthesis on:
• Training set
• Few-shot on 100 test objects
![Page 44: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/44.jpg)
ShapeNet
• Cars and Chairs
• Novel view synthesis on:
• Training set.
• Few-shot on official test objects.
![Page 45: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/45.jpg)
ShapeNet
![Page 46: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/46.jpg)
ShapeNet
![Page 47: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/47.jpg)
Latent space interpolation
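Interpolating between two scenes in latent space can be sketched as a linear blend of their latent codes, with each blended code then decoded and rendered as usual. The helper `interpolate_latents` and the 2-dimensional codes are illustrative assumptions; linear blending is the simplest choice and the slides do not say which scheme was used.

```python
def interpolate_latents(z_a, z_b, n_steps):
    """Return n_steps codes evenly spaced from z_a to z_b, inclusive.

    Requires n_steps >= 2.
    """
    codes = []
    for k in range(n_steps):
        alpha = k / (n_steps - 1)
        codes.append([(1.0 - alpha) * a + alpha * b
                      for a, b in zip(z_a, z_b)])
    return codes


codes = interpolate_latents([0.0, 2.0], [1.0, 0.0], n_steps=5)
```

The first and last entries reproduce the two endpoint codes, and the middle entry is their midpoint.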
![Page 48: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/48.jpg)
Camera pose extrapolation
Camera zoom
Camera rotation
![Page 49: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/49.jpg)
Basel face model
• Available disentangled latent:
• Identity
• Expression
![Page 50: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/50.jpg)
Minecraft room
Room scale scene
![Page 51: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/51.jpg)
Critique / Limitations / Open Issues
• Availability of camera pose?
• Effects of view or lighting?
• Failure cases.
![Page 52: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/52.jpg)
Critique / Limitations / Open Issues
![Page 53: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/53.jpg)
Contributions (Recap)
• A continuous, 3D-structure-aware neural scene representation that encodes geometry and appearance in a multi-view consistent manner.
• Along with a differentiable ray marching algorithm for rendering.
• End-to-end training without explicit 3D supervision.
• Generalizable to other geometry or appearance.
• Evaluation in:
• Novel view synthesis.
• Few-shot reconstruction.
• ...
![Page 54: CSC2457 3D & Geometric Deep Presenter: Shayan](https://reader035.vdocuments.site/reader035/viewer/2022062222/62a432bef3e39b36505fda67/html5/thumbnails/54.jpg)
Thank you!