video-to-video synthesis - developer.download.nvidia.com · 2 generative adversarial networks...
TRANSCRIPT
Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro
VIDEO-TO-VIDEO SYNTHESIS
GENERATIVE ADVERSARIAL NETWORKS
Unconditional GANs
[Diagram: Generator → Discriminator → False; Discriminator → True]
Image credit: Celebrity dataset, Jensen Huang, Founder and CEO of NVIDIA, Ian Goodfellow, Father of GANs.
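The True/False game in the diagram can be sketched numerically. This is a minimal illustration, not the training code behind the talk: it assumes the standard binary cross-entropy GAN losses, and the names `bce`, `discriminator_loss` and `generator_loss` are hypothetical.

```python
import numpy as np

def bce(prob, target):
    """Binary cross-entropy between predicted probabilities and 0/1 targets."""
    prob = np.clip(prob, 1e-7, 1 - 1e-7)
    return float(-np.mean(target * np.log(prob) + (1 - target) * np.log(1 - prob)))

def discriminator_loss(d_real, d_fake):
    # The discriminator should say True (1) for real images and False (0) for generated ones.
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    # Assumed non-saturating form: the generator wants the discriminator to say True.
    return bce(d_fake, np.ones_like(d_fake))

d_real = np.array([0.9, 0.8])  # discriminator scores on real images
d_fake = np.array([0.1, 0.2])  # discriminator scores on generated images
good_d = discriminator_loss(d_real, d_fake)    # discriminator is mostly right
fooled_d = discriminator_loss(d_fake, d_real)  # discriminator is mostly wrong
```

A discriminator that separates the two sets gets a low loss, while the generator's loss drops as its samples are scored closer to True.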
After training for a while using NVIDIA DGX-1 machines
Fun sampling time begins
Generator
Image credit: NVIDIA StyleGAN
CONDITIONAL GANS
Allow the user more control over the sampling process
Modeling (training): generated result, given info (e.g. image, text)
Sampling (testing): output style, given info (e.g. image, text)
SKETCH-CONDITIONAL GANS
Generator
Image credit: NVIDIA pix2pixHD
IMAGE-CONDITIONAL GANS
Image credit: NVIDIA MUNIT
MASK-CONDITIONAL GANS
Semantic Image Synthesis
LIVE DEMO
I need to get an RTX Ready Laptop (https://www.nvidia.com/en-us/geforce/gaming-laptops/20-series/)
It is running live at GTC
It will be online for everyone to try out on the NVIDIA AI Playground website (https://www.nvidia.com/en-us/research/ai-playground/)
Interface
PROBLEM WITH PREVIOUS METHODS
[Figure panels: input, result]
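PROBLEM WITH PREVIOUS METHODS
Batch Norm (Ioffe et al. 2015)
$y = \frac{x - \mu}{\sigma} \cdot \gamma + \beta$
• Normalization: $\frac{x - \mu}{\sigma}$
• Affine transform (de-normalization): $\cdot\, \gamma + \beta$
Normalization removes label information: $x = [0, 1, 0, 0]$ and $x = [0, 0, 0, 1]$ give the same output!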
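The "same output" effect can be reproduced in a few lines. The sketch below rests on assumptions not stated on the slide: every pixel of the label map carries the same one-hot label, the first layer is a random 1x1 convolution (the hypothetical `conv1x1`), and the normalization statistics are taken per channel over the spatial positions of a single sample.

```python
import numpy as np

def normalize(features, eps=1e-5):
    """Parameter-free normalization per channel; features is (channels, height, width)."""
    mu = features.mean(axis=(1, 2), keepdims=True)
    sigma = features.std(axis=(1, 2), keepdims=True)
    return (features - mu) / (sigma + eps)

def uniform_label_map(one_hot, height=4, width=4):
    """Label map in which every pixel carries the same one-hot label."""
    one_hot = np.asarray(one_hot, dtype=float)
    return np.broadcast_to(one_hot[:, None, None], (one_hot.size, height, width)).copy()

rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 4))  # hypothetical 1x1 conv: 4 label channels -> 8 features

def conv1x1(label_map):
    return np.einsum("oc,chw->ohw", weights, label_map)

features_a = conv1x1(uniform_label_map([0, 1, 0, 0]))
features_b = conv1x1(uniform_label_map([0, 0, 0, 1]))
out_a = normalize(features_a)  # every channel is constant, so x - mu is zero
out_b = normalize(features_b)
```

Before normalization the two feature maps differ; after it both collapse to zero, so the later layers can no longer tell the two labels apart.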
PROBLEM WITH PREVIOUS METHODS
• Do not feed the label map directly to the network
• Use the label map to generate the normalization layers instead
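SPADE (SPatially Adaptive DEnormalization)
$y = \frac{x - \mu}{\sigma} \cdot \gamma + \beta$
[Diagram: the network input $x$ (label free) goes through a parameter-free Batch Norm; conv layers produce $\gamma$ and $\beta$ from the label map, and these are applied element-wise to give the network output $y$]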
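A minimal NumPy sketch of the SPADE formula follows. It is an illustration under stated assumptions, not the released implementation: the convolutions that produce $\gamma$ and $\beta$ are simplified to random 1x1 convolutions, and the names `w_gamma`, `w_beta`, `param_free_norm` and `spade` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_LABELS, NUM_FEATURES = 4, 8
# Hypothetical 1x1 convolutions that map the label map to gamma and beta.
w_gamma = rng.normal(size=(NUM_FEATURES, NUM_LABELS))
w_beta = rng.normal(size=(NUM_FEATURES, NUM_LABELS))

def param_free_norm(x, eps=1e-5):
    """Normalize each channel over its spatial positions; no learned parameters."""
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    return (x - mu) / (sigma + eps)

def spade(x, label_map):
    """y = (x - mu) / sigma * gamma + beta, with per-pixel gamma and beta."""
    gamma = np.einsum("oc,chw->ohw", w_gamma, label_map)
    beta = np.einsum("oc,chw->ohw", w_beta, label_map)
    return param_free_norm(x) * gamma + beta

def uniform_label_map(class_index, height=4, width=4):
    label_map = np.zeros((NUM_LABELS, height, width))
    label_map[class_index] = 1.0
    return label_map

x = rng.normal(size=(NUM_FEATURES, 4, 4))  # label-free network input
y_a = spade(x, uniform_label_map(1))
y_b = spade(x, uniform_label_map(3))
```

Because the label enters after the normalization, two uniform label maps now lead to different outputs.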
SPADE RESIDUAL BLOCKS
[Diagram: SPADE ResBlk, built from two SPADE → ReLU → 3x3 Conv stages]
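The block structure can be sketched in NumPy. This is a rough illustration under assumptions: the stage order SPADE → ReLU → 3x3 Conv (twice) with an identity skip connection, equal input and output channel counts, random weights, and a single 3x3 convolution for each of $\gamma$ and $\beta$. All function and parameter names are hypothetical.

```python
import numpy as np

def conv3x3(x, w):
    """Zero-padded 3x3 convolution. x: (C_in, H, W), w: (C_out, C_in, 3, 3)."""
    _, height, width = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], height, width))
    for dy in range(3):
        for dx in range(3):
            window = padded[:, dy:dy + height, dx:dx + width]
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], window)
    return out

def spade(x, label_map, w_gamma, w_beta, eps=1e-5):
    """Parameter-free normalization, then per-pixel gamma and beta from the label map."""
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    gamma = conv3x3(label_map, w_gamma)
    beta = conv3x3(label_map, w_beta)
    return (x - mu) / (sigma + eps) * gamma + beta

def relu(x):
    return np.maximum(x, 0.0)

def spade_resblk(x, label_map, params):
    """SPADE -> ReLU -> 3x3 Conv, twice, plus the residual (skip) connection."""
    h = conv3x3(relu(spade(x, label_map, *params["spade1"])), params["conv1"])
    h = conv3x3(relu(spade(h, label_map, *params["spade2"])), params["conv2"])
    return x + h

rng = np.random.default_rng(0)
channels, labels = 6, 4

def random_spade_weights():
    shape = (channels, labels, 3, 3)
    return rng.normal(size=shape), rng.normal(size=shape)

params = {
    "spade1": random_spade_weights(),
    "conv1": rng.normal(size=(channels, channels, 3, 3)),
    "spade2": random_spade_weights(),
    "conv2": rng.normal(size=(channels, channels, 3, 3)),
}
x = rng.normal(size=(channels, 5, 5))
label_map = np.zeros((labels, 5, 5))
label_map[0, :, :2] = 1.0  # left part of the image has class 0
label_map[2, :, 2:] = 1.0  # right part has class 2
y = spade_resblk(x, label_map, params)
```

The label map is fed to every SPADE layer, so the semantic layout is re-injected at each stage instead of only at the network input.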
SPADE GENERATOR
[Diagram: a chain of SPADE ResBlk units]
PROBLEM WITH PREVIOUS METHODS
[Figure panels: input, w/o SPADE, w/ SPADE]
IMAGE RESULTS
Multimodal Results on Flickr
VIDEO-TO-VIDEO SYNTHESIS
IMAGE-TO-IMAGE SYNTHESIS
[Figure labels: Car, Road, Tree, Sidewalk, Building]
VIDEO-TO-VIDEO SYNTHESIS
MOTIVATION
• AI-based rendering
Traditional graphics: geometry, texture, lighting
Machine learning graphics: data
MOTIVATION
• AI-based rendering
• High-level semantic manipulation
[Diagram: Original image → segmentation, keypoint detection, etc. (largely explored) → High-level representation (edit here!) → image/video synthesis (little explored, this work) → New image]
PREVIOUS WORK
Video style transfer
COVST [2017], ArtST [2016]
Unconditional synthesis
MoCoGAN [2018], TGAN [2017], VGAN [2016]
Video prediction
MCNet [2017], PredNet [2017]
Image translation
pix2pixHD [2018], CRN [2017], pix2pix [2017]
PREVIOUS WORK: FRAME-BY-FRAME RESULT
OUR METHOD
• Sequential generator
• Multi-scale temporal discriminator
• Spatio-temporal progressive training procedure
OUR METHOD
Sequential Generator
[Diagram: sequential generator with a block labeled W]
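The idea of generating a video sequentially can be sketched as a loop that feeds earlier outputs back into the generator. The code is a toy under assumptions: `generate_frame` is a hypothetical callable standing in for the network, the window of `num_past` earlier frames is an illustrative choice, and the W block from the diagram is not modeled.

```python
import numpy as np

def synthesize_video(semantic_maps, generate_frame, num_past=2):
    """Generate frames one by one, feeding the last `num_past` outputs back in."""
    outputs = []
    for t, current_map in enumerate(semantic_maps):
        past_maps = semantic_maps[max(0, t - num_past):t]
        past_frames = outputs[-num_past:]
        outputs.append(generate_frame(current_map, past_maps, past_frames))
    return outputs

def toy_generator(current_map, past_maps, past_frames):
    # Stand-in for the network: blend the current map with the previous output.
    if not past_frames:
        return current_map.astype(float)
    return 0.5 * current_map + 0.5 * past_frames[-1]

maps = [np.full((2, 2), float(t)) for t in range(4)]
video = synthesize_video(maps, toy_generator)
```

Since each output depends on the previous outputs, consecutive frames stay related, which a frame-by-frame model cannot guarantee.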
OUR METHOD
Sequential Generator
Multi-scale Discriminators
[Diagram: sequential generator with a block labeled W; Image Discriminator (D1, D2, D3) and Video Discriminator (D1, D2, D3)]
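One common way to give several discriminators different scales is to feed them an image pyramid. The sketch below assumes that reading: D1, D2 and D3 would see the frame at full, half and quarter resolution. The slide does not state the exact scales, and `downsample2x` and `image_pyramid` are hypothetical helpers.

```python
import numpy as np

def downsample2x(image):
    """Average-pool an (H, W) image by a factor of 2 in each dimension."""
    h, w = image.shape
    cropped = image[:h - h % 2, :w - w % 2]
    return cropped.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def image_pyramid(image, num_scales=3):
    """Assumed inputs for D1, D2, D3: full, 1/2 and 1/4 resolution."""
    levels = [image]
    for _ in range(num_scales - 1):
        levels.append(downsample2x(levels[-1]))
    return levels

frame = np.arange(64, dtype=float).reshape(8, 8)
pyramid = image_pyramid(frame)
```

The coarse levels let a discriminator judge global layout, while the full-resolution level is left to check fine detail.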
OUR METHOD
Spatio-temporally Progressive Training
• Spatially progressive
• Temporally progressive
[Diagram: residual blocks; alternating training between S and T stages]
RESULTS
• Semantic → Street view scenes
• Edges → Human faces
• Poses → Human bodies
STREET VIEW: CITYSCAPES
[Figure panels: Semantic map, pix2pixHD, COVST (video style transfer), Ours]
STREET VIEW: BOSTON
STREET VIEW: NYC
FACE SWAPPING (FACE → EDGE → FACE)
[Figure panels: input, edges, output]
FACE SWAPPING (SLIMMER FACE)
[Figure panels: input, (slimmed) edges, (slimmed) output]
MULTI-MODAL EDGE → FACE
[Figure panels: Style 1, Style 2, Style 3]
MOTION TRANSFER (BODY → POSE → BODY)
[Figure panels: input, poses, output]
MOTION TRANSFER
EXTENSION: FRAME PREDICTION
• Goal: predict future frames given past frames
• Our method: decompose prediction into two steps
1. Predict the semantic map for the next frame
2. Synthesize the frame based on the semantic map
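The two-step decomposition can be written as a small function. This is only a structural sketch: `predict_map` and `synthesize` are hypothetical callables standing in for the two networks, and the toy versions below (linear extrapolation and a fixed scaling) exist only to make the example runnable.

```python
import numpy as np

def predict_next_frame(past_frames, past_maps, predict_map, synthesize):
    """Step 1: predict the next semantic map. Step 2: synthesize the frame from it."""
    next_map = predict_map(past_maps)
    next_frame = synthesize(next_map, past_frames)
    return next_frame, next_map

def toy_predict_map(past_maps):
    # Stand-in for step 1: linear extrapolation from the last two maps.
    return 2 * past_maps[-1] - past_maps[-2]

def toy_synthesize(next_map, past_frames):
    # Stand-in for step 2: a fixed mapping from map values to pixel values.
    return next_map * 10.0

maps = [np.array([1.0]), np.array([2.0])]
frames = [m * 10.0 for m in maps]
frame, next_map = predict_next_frame(frames, maps, toy_predict_map, toy_synthesize)
```

Splitting the task this way means the second step is the same map-to-frame synthesis problem addressed in the rest of the talk.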
EXTENSION: FRAME PREDICTION
[Figure panels: Ground truth, PredNet, MCNet, Ours]
INTERACTIVE GRAPHICS
PATH TO INTERACTIVE GRAPHICS
• Real-time inference
• Combining with existing graphics pipeline
• Domain gap between real input and synthetic input
PATH TO INTERACTIVE GRAPHICS
• Real-time inference
• FP16 + TensorRT → ~5× speedup
• 36 ms (27.8 fps) for 1080p inference
• Overall: 15-25 fps
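The latency and frame-rate figures are related by fps = 1000 / latency in milliseconds. The short check below reproduces the 27.8 fps figure and converts the overall 15 to 25 fps range back into a per-frame time budget; the helper names are hypothetical.

```python
def fps_from_latency(latency_ms):
    """Frames per second achievable when one frame takes `latency_ms` milliseconds."""
    return 1000.0 / latency_ms

def latency_from_fps(fps):
    """Milliseconds available per frame at a given frame rate."""
    return 1000.0 / fps

inference_fps = fps_from_latency(36.0)  # network inference only
# Per-frame time at the overall rates of 25 fps and 15 fps.
overall_budget_ms = (latency_from_fps(25.0), latency_from_fps(15.0))
```

The overall range corresponds to roughly 40 to 67 ms per frame, which is more than the 36 ms inference time, so the remainder is presumably spent outside the network.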
PATH TO INTERACTIVE GRAPHICS
• Real-time inference
• Combining with existing graphics pipeline
• CARLA: open-source simulator for autonomous driving research
• Make the game engine render semantic maps
• Pass the maps to the network and display the inference result
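The render, infer, display loop described above can be sketched as follows. Nothing here uses the real CARLA API: `render_semantic_map`, `run_network` and `display` are hypothetical callables, and the lambdas are toy stand-ins so that the loop can run.

```python
def run_interactive_loop(render_semantic_map, run_network, display, num_frames):
    """Per frame: the engine renders a semantic map, the network turns it into an image."""
    shown = 0
    for frame_index in range(num_frames):
        semantic_map = render_semantic_map(frame_index)  # from the game engine
        image = run_network(semantic_map)                # network inference
        display(image)                                   # show the result
        shown += 1
    return shown

displayed = []
count = run_interactive_loop(
    render_semantic_map=lambda i: [[i, i], [i, i]],            # toy 2x2 label map
    run_network=lambda m: [[v * 2 for v in row] for row in m],  # toy "synthesis"
    display=displayed.append,
    num_frames=3,
)
```

In this arrangement the engine only has to produce label maps, and the appearance of each frame comes from the network.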
PATH TO INTERACTIVE GRAPHICS
• Real-time inference
• Combining with existing graphics pipeline
• Domain gap between real input and synthetic input
• The network is trained on real data but tested on synthetic data
• Things that differ: object shapes/edges, density of objects, camera viewpoints, etc.
• Ongoing work
ORIGINAL CARLA IMAGE
RENDERED SEMANTIC MAPS
RECORDED DEMO RESULTS
CONCLUSION
• What can we achieve?
• Synthesize high-res realistic images
• Produce temporally-smooth videos
• Reinvent interactive graphics
• What can it be used for?
• AI-based rendering
• High-level semantic manipulation
THANK YOU
https://github.com/NVIDIA/vid2vid