superconductor - par labparlab.eecs.berkeley.edu/sites/all/parlab/files... · 2012. 8. 22. ·...
TRANSCRIPT
SUPERCONDUCTOR1 Million Nodes at 30 FPS
Matthew E. Torok, UC Berkeley Par [email protected]
Monday, August 20, 12
New Age of Visualizations
Monday, August 20, 12
Big Data
New Age of Visualizations
speed
Monday, August 20, 12
Big Data - and -
New Age of Visualizations
speed
Monday, August 20, 12
Big Data Small Devices- and -
New Age of Visualizations
speed power
Monday, August 20, 12
Data visualization system
Use GPU for layout and rendering
Fast: 100,000 nodes at 27fps
Power efficient: data parallelism
Web-Friendly: OpenCL today, WebCL tomorrow
Superconductor is...
Monday, August 20, 12
ARCHITECTURE
Monday, August 20, 12
Creating Widgets
Use FTL language and compiler
Simple, declarative syntax
Simplicity of Javascript, speed of GPU
FTL: Synthesizes traversal schedule from layout spec, uses specializers to create tuned OpenCL kernels
Monday, August 20, 12
FTL Data Flow
Monday, August 20, 12
Runtime
Monday, August 20, 12
Runtime
Host code reads data into tree structure
Monday, August 20, 12
Runtime
Host code reads data into tree structure
Tree transferred into GPU memory
Monday, August 20, 12
Runtime
Host code reads data into tree structure
Tree transferred into GPU memory
Layout engine ‘draws’ vertices into shared GPU buffer
Monday, August 20, 12
Runtime
Host code reads data into tree structure
Tree transferred into GPU memory
Layout engine ‘draws’ vertices into shared GPU buffer
OpenGL renders vertex buffer to
screen
Monday, August 20, 12
Architecting for Speed
Data only moved once (we have big data)
Layout engine generated vertices — speaks language of OpenGL
Monday, August 20, 12
Parallel Tree Traversals
Tree laid out as structure-split arrays
Monday, August 20, 12
Parallel Tree Traversals
Tree laid out as structure-split arrays
Monday, August 20, 12
Parallel Tree Traversals
Then traversed level-by-level, synchronous
Monday, August 20, 12
Parallel Tree Traversals
Level 0 (root) In top-down traversal, start with root
Monday, August 20, 12
Parallel Tree Traversals
Level 1 Then process each subsequent level serially
Monday, August 20, 12
Parallel Tree Traversals
Level 2 The nodes within a level are processed in parallel
Monday, August 20, 12
Parallel Tree Traversals
Monday, August 20, 12
DEMO2011 Russian Legislative Election
Monday, August 20, 12
JavaScript Comparison: Treemap
27 FPS
JavaScript2.3GHz Core i7
SuperconductorGeForce GT 650M
500
100,000
Monday, August 20, 12
Weak Scaling on GPU
Nodes Speed
10,000 26 FPS
100,000 26 FPS
1,000,000 4.5 FPS
300 line declarative spec synthesized into 5 efficient tree traversals
Early Results
Monday, August 20, 12
WHAT’S NEXT?
Monday, August 20, 12
Future Extensions
Beyond tree traversals: graphs
Beyond OpenCL: WebCL
Data binding & mutation
Open Source: 2013?
Monday, August 20, 12
Summary
Declarative visualization language
GPU: big data & small devices
Result:
100,000 node interactive animation
200x speedup vs. other high-level language (JS)
Monday, August 20, 12
SUPERCONDUCTORComing Soon
Monday, August 20, 12