![Page 1: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/1.jpg)
Crom - CPU/GPU Hybrid Computation Platform for Visual Effects
Nathan Cournia, Casey Vanover, Bill Spitzak, Hans Rijpkema, Josh Tomlinson, Bradley Smith, Nathan Litke
Rhythm and Hues Studios
![Page 2: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/2.jpg)
Who We Are
![Page 3: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/3.jpg)
Motivation
● Modernize lighting/compositing workflows
● Unify user experience
● Workflow evolved across four proprietary packages
● Streamline pipeline
Look Development (Lighthouse)
Light Placement (Voodoo)
Scene Lighting (Lighthouse)
Render (Wren)
LightCmp (Icy)
![Page 4: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/4.jpg)
Requirements
● Rethink our software designed up to 25 years ago
● Multiple cores, multiple GPUs, international locations, cloud
● Decouple interface from computation engines
● Seamless integration with other software:
● Pipelines: R+H, Shotgun, etc.
● Renderers: R+H, Mantra, etc.
● User extensible:
● C++
● Python (new nodes, Qt interfaces)
● Interface builder / Visual Programming
● Easily share networks / interfaces
![Page 5: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/5.jpg)
Main Idea
● Crom is a VFX platform
![Page 6: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/6.jpg)
VFX Platform
● Look Development
● Scene Lighting
● Compositing
● Misc. Tools
![Page 7: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/7.jpg)
General Design
● Core data structure is a dependency graph
● Data passed between dependency graph nodes is strongly typed
● Dependency graph is stateless
● Can hook up anything to anything else
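One way to read the last points together is that any output may be offered to any input, and the type check decides whether the connection is accepted. The sketch below is only an illustration of that reading: `Port`, `Node`, `Graph`, `Image` and `Light` are hypothetical names, not Crom's API.

```python
from dataclasses import dataclass, field


class Image:
    """Placeholder data type (hypothetical)."""


class Light:
    """Placeholder data type (hypothetical)."""


@dataclass(frozen=True)
class Port:
    """A named, typed connection point on a node."""
    name: str
    dtype: type


@dataclass
class Node:
    name: str
    inputs: dict = field(default_factory=dict)   # port name -> Port
    outputs: dict = field(default_factory=dict)  # port name -> Port


class Graph:
    def __init__(self):
        # (destination node, input port) -> (source node, output port)
        self.edges = {}

    def connect(self, src, out_name, dst, in_name):
        out_port = src.outputs[out_name]
        in_port = dst.inputs[in_name]
        # Any two nodes may be wired together, as long as the types agree.
        if not issubclass(out_port.dtype, in_port.dtype):
            raise TypeError(
                f"{src.name}.{out_name} ({out_port.dtype.__name__}) cannot feed "
                f"{dst.name}.{in_name} ({in_port.dtype.__name__})"
            )
        self.edges[(dst.name, in_name)] = (src.name, out_name)


read = Node("ReadImage1", outputs={"output": Port("output", Image)})
key = Node("KeyLight", outputs={"light": Port("light", Light)})
blur = Node("Blur", inputs={"input": Port("input", Image)})

graph = Graph()
graph.connect(read, "output", blur, "input")      # accepted: Image -> Image
try:
    graph.connect(key, "light", blur, "input")    # rejected: Light -> Image
except TypeError as error:
    rejected = str(error)
```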
![Page 8: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/8.jpg)
Stateless Nodes
● Multiple threads can traverse the graph in parallel
● "Global" state is passed up the dependency graph in a "Context / Request" object
● Multiple frames, tiles, layers, etc. can be concurrently computed
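A minimal sketch of the idea, assuming a node exposes a `compute(ctx)` method and that the context carries a frame and a tile (both are assumptions made for illustration, as are the class names). Because nothing is written to the nodes during evaluation, the same graph can be evaluated for several frames at once.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Per-request state (hypothetical fields)."""
    frame: int
    tile: tuple  # (x, y) tile index


class Constant:
    def __init__(self, value):
        self.value = value  # set at construction, never mutated afterwards

    def compute(self, ctx):
        return self.value


class FrameRamp:
    """Result depends only on the context, not on anything stored in the node."""

    def compute(self, ctx):
        return float(ctx.frame)


class Add:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def compute(self, ctx):
        # The context is handed on to upstream nodes; nothing is cached on self.
        return self.a.compute(ctx) + self.b.compute(ctx)


root = Add(Constant(10.0), FrameRamp())
contexts = [Context(frame=f, tile=(0, 0)) for f in range(1, 5)]

# Four frames are requested concurrently from the same graph.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(root.compute, contexts))
```

`pool.map` returns results in the order of `contexts`, so each entry of `results` lines up with its frame even though the evaluations may interleave.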
![Page 9: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/9.jpg)
Data
● Data passed between nodes is stored in a "property graph"
● Data representation is decoupled from programming interface
● An interface, i.e. an Adapter/Wrapper, can be placed onto a property graph to define an object
● A property graph can be adapted to provide multiple interfaces
● Copy-on-write semantics allow for sharing of data
● Heuristics to place subsets of data into a persistent cache
● Property graph is dynamically user extensible yet strongly typed
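The points above can be sketched roughly as follows. This is an illustration under stated assumptions, not Crom's data model: `PropertyGraph`, `ImageView` and `MetadataView` are hypothetical names, and a flat typed dictionary stands in for what is presumably a richer graph structure.

```python
class PropertyGraph:
    """Typed key/value store; every modification returns a new graph."""

    def __init__(self, values=None, types=None):
        self._values = dict(values or {})
        self._types = dict(types or {})

    def declare(self, key, dtype, value):
        """User extension: add a new typed property."""
        if not isinstance(value, dtype):
            raise TypeError(f"{key} expects {dtype.__name__}")
        return PropertyGraph({**self._values, key: value},
                             {**self._types, key: dtype})

    def get(self, key):
        return self._values[key]

    def set(self, key, value):
        """Copy-on-write: the original graph is left untouched."""
        if not isinstance(value, self._types[key]):
            raise TypeError(f"{key} expects {self._types[key].__name__}")
        values = dict(self._values)  # shallow copy, unchanged values are shared
        values[key] = value
        return PropertyGraph(values, self._types)


class ImageView:
    """Adapter that presents a property graph as an image description."""

    def __init__(self, props):
        self._props = props

    @property
    def size(self):
        return (self._props.get("width"), self._props.get("height"))


class MetadataView:
    """A second adapter over the same data."""

    def __init__(self, props):
        self._props = props

    @property
    def label(self):
        return self._props.get("label")


base = (PropertyGraph()
        .declare("width", int, 1920)
        .declare("height", int, 1080)
        .declare("label", str, "plate"))
edited = base.set("width", 960)  # base still reports 1920
```

Here `ImageView(base)` and `ImageView(edited)` see different widths, while both graphs still share the untouched `label` value.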
![Page 10: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/10.jpg)
VFX Compositor
● Compositor: Assembles multiple images into one or more final images.
● Example: Nuke
![Page 11: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/11.jpg)
GPU/CPU Compositor
● Crom implements a hybrid GPU/CPU compositor
● Dependency graph traversal produces two main items in the property graph:
● Instruction Tree: Low-level operations to be performed
● Data Callbacks: Objects that will be invoked to populate the compositing engine with data from the dependency graph
![Page 12: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/12.jpg)
Example cmp Node Graph
![Page 13: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/13.jpg)
Example Instruction Tree
![Page 14: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/14.jpg)
Callbacks
ReadImage1Callback
ReadImage2Callback
RGB1Callback
![Page 15: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/15.jpg)
Callbacks (cont.)
ReadImage1Callback
ReadImage2Callback
RGB1Callback
![Page 16: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/16.jpg)
Instruction Tree (cont.)
● Generic representation of low-level operations that need to be done.
● When working interactively, converted to GLSL.
● When working on the render farm, converted to OpenCL.
![Page 17: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/17.jpg)
Instruction Tree (GLSL)
uniform sampler2D ReadImage1;
uniform sampler2D ReadImage2;
uniform vec4 RGB1;
varying vec2 v0000;
void main(void)
{
    vec4 t0001 = texture2D(ReadImage1, v0000);
    vec4 t0002 = t0001 + (texture2D(ReadImage2, v0000) * (1 - clamp(t0001.w, 0, 1)));
    gl_FragColor = (vec4(t0002.xyz, clamp(t0002.w, 0, 1)) * RGB1);
}
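As a rough illustration of how a generic instruction tree might be lowered to shader text like the above, the sketch below walks a tiny tree and emits a GLSL expression. The node classes and the `emit` and `leaf_names` helpers are hypothetical, and the tree is a simplified add-then-multiply rather than the exact operation shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Leaf: read an input image at the current pixel."""
    image: str


@dataclass(frozen=True)
class Uniform:
    """Leaf: a per-invocation constant."""
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # "+" or "*"
    left: object
    right: object


def emit(node):
    """Lower a tree to a GLSL expression string."""
    if isinstance(node, Sample):
        return f"texture2D({node.image}, v0000)"
    if isinstance(node, Uniform):
        return node.name
    if isinstance(node, BinOp):
        return f"({emit(node.left)} {node.op} {emit(node.right)})"
    raise TypeError(f"unknown node: {node!r}")


def leaf_names(node, seen=None):
    """Collect the inputs that data callbacks would have to populate."""
    seen = [] if seen is None else seen
    if isinstance(node, BinOp):
        leaf_names(node.left, seen)
        leaf_names(node.right, seen)
    else:
        name = node.image if isinstance(node, Sample) else node.name
        if name not in seen:
            seen.append(name)
    return seen


tree = BinOp("*",
             BinOp("+", Sample("ReadImage1"), Sample("ReadImage2")),
             Uniform("RGB1"))
expression = emit(tree)
needed = leaf_names(tree)
```

A second `emit` function that targets OpenCL could walk the same tree, which is the appeal of keeping the tree backend-neutral.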
![Page 18: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/18.jpg)
Per-Pixel Expressions
● Instruction tree nodes can be created not only from the dependency graph but also from Crom's expression language
● Allows for fast per-pixel expressions!
sample(ReadImage1.output, vec2(sin(pos.x), pos.y + cos(pos.x)))
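For readers unfamiliar with per-pixel expressions, here is a slow CPU reference of what the expression above might compute. The slides do not define `sample` or `pos`, so this sketch assumes `pos` is the pixel coordinate, and that `sample` does a nearest-neighbour lookup clamped to the image edge.

```python
import math


def sample(image, x, y):
    """Nearest-neighbour lookup with clamp-to-edge (assumed convention)."""
    height, width = len(image), len(image[0])
    xi = min(max(int(math.floor(x + 0.5)), 0), width - 1)
    yi = min(max(int(math.floor(y + 0.5)), 0), height - 1)
    return image[yi][xi]


def warp(image):
    """Evaluate sample(image, vec2(sin(pos.x), pos.y + cos(pos.x))) per pixel."""
    height, width = len(image), len(image[0])
    return [
        [sample(image, math.sin(x), y + math.cos(x)) for x in range(width)]
        for y in range(height)
    ]


# Each source pixel encodes its own coordinates as 10 * y + x.
source = [[10 * y + x for x in range(4)] for y in range(4)]
warped = warp(source)
```

On the GPU the same expression would become part of the instruction tree and run once per fragment instead of in a Python loop.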
![Page 19: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/19.jpg)
Lazy Programmers
● cmp node library only has around 50 nodes
● Define low-level operations (cmp.Add, cmp.Translate, cmp.Crop, cmp.Text)
● Most nodes are user defined via "macro" nodes!
![Page 20: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/20.jpg)
Macro Node (cmp.Gamma)
![Page 21: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/21.jpg)
Macro Node (cmp.Gamma)
![Page 22: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/22.jpg)
Macro Nodes
● Benefit of macro nodes is that they produce an instruction tree without the user writing any C++ / Python
● Macro nodes can be just as fast as built-in nodes
● Custom interfaces can be created that are indistinguishable from built-in interfaces via the interface builder or Python
● Macro nodes usually contain other macro nodes
● Production scripts contain well over 250k nodes
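One way to picture why a macro can be as fast as a built-in node: if macros are stored as data and inlined before code generation, only primitive operations reach the instruction tree. The sketch below assumes that model. The macro table, the `pow`/`div`/`mul` primitives, the `cmp.GammaGain` macro and the gamma formula are illustrative guesses, not Crom's definitions.

```python
# Macros are plain data here: a parameter list plus a body built from
# primitives and, possibly, other macros.
MACROS = {
    "cmp.Gamma": {
        "params": ["image", "gamma"],
        "body": ("pow", "image", ("div", 1.0, "gamma")),
    },
    "cmp.GammaGain": {  # hypothetical macro that contains another macro
        "params": ["image", "gamma", "gain"],
        "body": ("mul", ("cmp.Gamma", "image", "gamma"), "gain"),
    },
}


def expand(tree, env):
    """Inline macros until only primitive operations remain."""
    if isinstance(tree, str):
        return env.get(tree, tree)  # substitute a macro parameter, if bound
    if not isinstance(tree, tuple):
        return tree                 # numeric constant
    op, *args = tree
    args = [expand(arg, env) for arg in args]
    if op in MACROS:
        macro = MACROS[op]
        return expand(macro["body"], dict(zip(macro["params"], args)))
    return (op, *args)


flat = expand(("cmp.GammaGain", "ReadImage1", 2.2, "RGB1"), {})
```

After expansion `flat` contains no macro names, so a code generator would treat it exactly like a tree built from built-in nodes.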
![Page 23: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/23.jpg)
GPU Saturation
● Dependency graph traversal produces hundreds of GPU API calls
● When scrubbing controls, commands build up in the GPU
● Easy to saturate GPU with tens of thousands of commands with a simple gesture
● GUI quickly becomes unresponsive as GPU tries to process given commands
● A cornerstone of the Crom platform is that sub-tasks can be interrupted/canceled
● Allows for fast feedback
● GPU APIs do not support canceling commands
![Page 24: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/24.jpg)
Dispatch Queue
● Crom uses a global GPU dispatch queue
● All compute communication with the GPU happens on a single context/thread pair
● Compute threads locally queue commands
● Locally queued commands are enqueued to global queue in logical batches
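The arrangement described above can be sketched with a single consumer thread standing in for the GPU context/thread pair. All names here (`GlobalDispatchQueue`, `compute_thread`, the command strings) are hypothetical, and appending to a list replaces the real API call.

```python
import queue
import threading


class GlobalDispatchQueue:
    """One worker thread is the only place that talks to the 'GPU'."""

    def __init__(self):
        self._batches = queue.Queue()
        self.executed = []  # log of commands issued, in order
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, batch):
        # A whole batch is enqueued as one unit, so commands from
        # different compute threads never interleave inside a batch.
        self._batches.put(list(batch))

    def close(self):
        self._batches.put(None)  # sentinel: stop after draining
        self._worker.join()

    def _run(self):
        while True:
            batch = self._batches.get()
            if batch is None:
                return
            for command in batch:
                self.executed.append(command)  # real backend: issue API call


def compute_thread(name, gpu):
    # Commands are first queued locally, then handed over as one batch.
    local = [f"{name}:bind", f"{name}:upload", f"{name}:draw"]
    gpu.submit(local)


gpu = GlobalDispatchQueue()
threads = [threading.Thread(target=compute_thread, args=(n, gpu))
           for n in ("A", "B")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
gpu.close()
```

Which batch runs first depends on thread scheduling, but each batch's three commands always appear back to back in `gpu.executed`.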
![Page 25: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/25.jpg)
Dispatch Queue Observations
● Global queue throttles commands to ensure GPU driver's command buffer is not too deep
● Commands in global dispatch queue can be interrupted
● Easy to support "native kernels" in OpenGL backend
● GPU throughput is not optimal, but the overall system is more responsive
● Tricky to handle errors in dispatch queue
● Must be careful not to interrupt object creation/population commands that are needed for later commands
● Single context/thread pair helps avoid nasty driver bugs
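Throttling and interruption can be illustrated with a small single-threaded simulation. It assumes a fixed limit on commands in flight and a `required` flag for commands that must survive cancellation (for example object creation). Both are assumptions, since the slides only say that care is needed, and `ThrottledQueue` and `Command` are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Command:
    request_id: int
    name: str
    required: bool = False  # e.g. creates an object later commands rely on


class ThrottledQueue:
    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.pending = deque()
        self.in_flight = deque()  # stands in for the driver's command buffer
        self.cancelled = set()

    def enqueue(self, command):
        self.pending.append(command)

    def cancel(self, request_id):
        self.cancelled.add(request_id)

    def pump(self):
        """Issue commands until the depth limit is reached."""
        issued = []
        while self.pending and len(self.in_flight) < self.max_in_flight:
            command = self.pending.popleft()
            if command.request_id in self.cancelled and not command.required:
                continue  # dropped before it ever reaches the GPU
            self.in_flight.append(command)
            issued.append(command.name)
        return issued

    def retire(self, count=1):
        """Pretend the GPU finished `count` commands."""
        for _ in range(min(count, len(self.in_flight))):
            self.in_flight.popleft()


q = ThrottledQueue(max_in_flight=2)
q.enqueue(Command(1, "create_texture", required=True))
q.enqueue(Command(1, "draw_old_1"))
q.enqueue(Command(1, "draw_old_2"))
q.enqueue(Command(2, "draw_new"))

first = q.pump()   # only two commands fit in the command buffer
q.cancel(1)        # the user moved the control again; request 1 is stale
q.retire(2)
second = q.pump()  # the stale draw is skipped, the new one is issued
```

Commands already in flight cannot be recalled, which matches the observation that GPU APIs do not support canceling; only the pending side of the queue is interruptible.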
![Page 26: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/26.jpg)
GPU Limitations
● In practice the GPU has several limitations:
● Memory
● Uniforms
● Varyings
● Image Units
● Instructions
![Page 27: Crom - CPU/GPU Hybrid Computation Platform for …on-demand.gputechconf.com/gtc/2013/presentations/S3476...Crom - CPU/GPU Hybrid Computation Platform for Visual Effects Nathan Cournia,](https://reader033.vdocuments.site/reader033/viewer/2022041612/5e3878f30c045c26467886ea/html5/thumbnails/27.jpg)
Instruction Tree Splitting
● The instruction tree tells us:
● Memory requirements
● Uniform requirements
● Varying requirements
● Number of input images
● Estimate of instructions needed
● We break up the instruction tree into smaller sub-trees that "fit" on the GPU
● Use multiple shader/kernel invocations to composite the image
● Sub-tree output can be cached
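A simplified sketch of the splitting idea, modelling only one budget: the number of input images a single pass may read. The greedy strategy, the tuple tree encoding and the `tmpN` names are assumptions for illustration. A real splitter would also weigh memory, uniforms, varyings and the instruction estimate.

```python
def split_tree(tree, max_inputs):
    """Split a tree into passes that each read at most `max_inputs` images.

    A tree is either an image name (str) or a tuple (op, child, ...).
    Returns a list of (output name, sub-tree) pairs in execution order.
    """
    passes = []

    def visit(node):
        if isinstance(node, str):
            return node, {node}
        op, *children = node
        visited = [visit(child) for child in children]
        # Render the most input-hungry children to intermediate images
        # until this node's inputs fit within the budget.
        while len(set().union(*(inputs for _, inputs in visited))) > max_inputs:
            i = max(range(len(visited)), key=lambda k: len(visited[k][1]))
            if len(visited[i][1]) <= 1:
                raise ValueError("node needs more inputs than the limit allows")
            name = f"tmp{len(passes)}"
            passes.append((name, visited[i][0]))
            visited[i] = (name, {name})
        exprs = tuple(expr for expr, _ in visited)
        inputs = set().union(*(used for _, used in visited))
        return (op, *exprs), inputs

    root, _ = visit(tree)
    passes.append(("out", root))
    return passes


tree = ("add", ("over", "A", "B"), ("over", "C", "D"))
passes = split_tree(tree, max_inputs=2)
```

Because each sub-tree is a hashable tuple, the intermediate images could be kept in a cache keyed by sub-tree, so an unchanged branch would not need to be rendered again.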