![Page 2: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/2.jpg)
What is this about?
DiCE: Software library for writing applications
scaling to many GPUs and CPUs in a cluster
Used since 2003 in our rendering products...
NVIDIA indeX NVIDIA Iray
courtesy of Vyacheslav Serov
courtesy of Rüdiger Raab
courtesy of Thomas Zancker
![Page 4: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/4.jpg)
Why are we presenting this here?
DiCE is a base technology in indeX
— Clustering / networking / distribution based on DiCE
DiCE API exposed by indeX
— Distribute pre-computation of data for indeX
— Do your own computation…
![Page 5: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/5.jpg)
Design Goals
"Provide a software library to be used by rendering
experts to write scalable software for GPU clusters."
— Not required: low-level parallelization / networking knowledge
— High level of abstraction / easy to use...
— Not specific to a particular domain (e.g. rendering)
— High performance, meant for interactive applications
Other solutions...
![Page 6: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/6.jpg)
Unique Combination of Features
Simple programming model
Ease of deployment / commodity hardware
Unified multi-core and cluster parallelization
GPU support
Dynamic clustering
Focus on interactive applications
Multi-user support e.g. for web services
Available on Windows, Linux, Mac OS X
![Page 7: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/7.jpg)
Overview
Networking / Clustering
Datastore
Job System
C++ API
Application
![Page 11: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/11.jpg)
DiCE and indeX
Networking / Clustering
Datastore
Job System
C++ API
Application
indeX
![Page 12: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/12.jpg)
Job System
![Page 13: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/13.jpg)
Parallelization Model
[Figure: grid of 20 fragments, numbered 0 to 19, in 4 rows of 5]
Programmer: split the work into n fragments!
— As independent as possible
— Potentially thousands per "frame"!
No a priori knowledge about resources in the cluster!
Goal: Distribute work over all GPUs / CPUs in cluster
![Page 14: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/14.jpg)
Parallelization Model
Fragmented Job
Roughly similar to a CUDA kernel
Implement a C++ class with:
void execute_fragment(int i, int n) {…}
Called once for every fragment
Ask DiCE to execute the job in n fragments
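The fragmented-job idea can be sketched in plain C++. This is only an illustration under stated assumptions: `Fragmented_job`, `Fill_job` and `execute_fragmented` are hypothetical names, not the DiCE interface, and the toy scheduler runs the fragments one after another instead of spreading them over GPUs and CPUs. The point it shows is that the job only sees `(i, n)` and never needs to know where or in which order a fragment runs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a fragmented job; not the real DiCE interface.
struct Fragmented_job {
    virtual ~Fragmented_job() = default;
    // Called once per fragment: i is the fragment index, n the fragment count.
    virtual void execute_fragment(int i, int n) = 0;
};

// Example job: fragment i fills its own slice of a buffer with the value i + 1.
struct Fill_job : Fragmented_job {
    std::vector<int> data;
    explicit Fill_job(std::size_t size) : data(size, 0) {}

    void execute_fragment(int i, int n) override {
        const std::size_t begin = data.size() * i / n;
        const std::size_t end = data.size() * (i + 1) / n;
        for (std::size_t k = begin; k < end; ++k) data[k] = i + 1;
    }
};

// Toy scheduler (assumption): assigns fragments round-robin to `workers`
// slots and returns the slot chosen for each fragment. A real scheduler would
// run the slots concurrently; here they run sequentially, and deliberately
// not in index order, to stress that fragments must be independent.
std::vector<int> execute_fragmented(Fragmented_job& job, int n, int workers) {
    std::vector<int> placement(n);
    for (int i = n - 1; i >= 0; --i) {
        placement[i] = i % workers;
        job.execute_fragment(i, n);
    }
    return placement;
}
```

Calling `execute_fragmented(job, 20, 2)` on a `Fill_job` corresponds to asking for 20 fragments on a host with two workers; the job code stays the same if the worker count changes.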
![Page 15: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/15.jpg)
Parallelization Model - Cluster
Not a shared memory model!
Idea: Split execution and integration of results
void execute_remote(int i, int n, OUT) {…}   // runs on the remote host
void receive_result(int i, int n, IN) {…}    // runs on the origin host
execute_remote() + receive_result() = execute_fragment()
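A minimal sketch of this split, assuming a plain `std::vector<float>` as a stand-in for the `OUT`/`IN` streams. `Render_job`, `Buffer` and `run_on_fake_cluster` are hypothetical names, and the "remote host" is simulated by a second copy of the job inside the same process. The remote copy carries only the inputs; the framebuffer exists only on the origin.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Buffer = std::vector<float>;  // hypothetical stand-in for the OUT / IN stream

struct Render_job {
    int width = 0;                // input, copied to every host
    std::vector<float> framebuf;  // result, only filled on the origin host

    // Remote host: compute fragment i and write the result to `out`.
    void execute_remote(int i, int n, Buffer& out) const {
        const int begin = width * i / n, end = width * (i + 1) / n;
        for (int x = begin; x < end; ++x) out.push_back(0.5f * x);
    }

    // Origin host: merge the received result of fragment i into the framebuffer.
    void receive_result(int i, int n, const Buffer& in) {
        const int begin = width * i / n;
        for (std::size_t k = 0; k < in.size(); ++k) framebuf[begin + k] = in[k];
    }

    // Local execution is the two steps back to back, with no network in between.
    void execute_fragment(int i, int n) {
        Buffer tmp;
        execute_remote(i, n, tmp);
        receive_result(i, n, tmp);
    }
};

// Simulated cluster run (assumption): every third fragment stays on the
// origin, all others run on the input-only copy and ship their result back.
Render_job run_on_fake_cluster(int width, int n) {
    Render_job origin{width, std::vector<float>(width, -1.0f)};
    Render_job remote{width, {}};  // no framebuffer on the remote side
    for (int i = 0; i < n; ++i) {
        if (i % 3 == 0) { origin.execute_fragment(i, n); continue; }
        Buffer wire;
        remote.execute_remote(i, n, wire);  // would run on another host
        origin.receive_result(i, n, wire);  // after the buffer crossed the network
    }
    return origin;
}
```

Because `execute_fragment()` is written in terms of the other two functions, local and remote fragments produce identical framebuffer contents.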
![Page 17: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/17.jpg)
Parallelization Model – Single Host
My_job
• Scene
• Camera
• Framebuf[ ]
1 Host
2 GPUs
![Page 18: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/18.jpg)
Parallelization Model – Single Host
My_job
• Scene
• Camera
• Framebuf[ ]
1 Host
2 GPUs
Fragment 0: GPU 1
Fragment 1: GPU 1
Fragment 2: GPU 2
Fragment 3: GPU 2
Fragment 4: GPU 1
Fragment 5: GPU 2
![Page 19: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/19.jpg)
Parallelization Model – Single Host
My_job
• Scene
• Camera
• Framebuf[ ]
Fragment 0: GPU 1, execute fragment
Fragment 1: GPU 1, execute fragment
Fragment 2: GPU 2, execute fragment
Fragment 3: GPU 2, execute fragment
Fragment 4: GPU 1, execute fragment
Fragment 5: GPU 2, execute fragment
![Page 23: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/23.jpg)
Parallelization Model
![Page 24: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/24.jpg)
Parallelization Model – 3 Hosts
3 hosts, 2 GPUs each
Host 1: My_job
• Scene
• Camera
• Framebuf[ ]
Host 2
Host 3
Fragment 0: GPU 1, Host 1
Fragment 1: GPU 1, Host 2
Fragment 2: GPU 2, Host 2
Fragment 3: GPU 2, Host 1
Fragment 4: GPU 1, Host 3
Fragment 5: GPU 2, Host 3
![Page 25: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/25.jpg)
Parallelization Model – 3 Hosts
Host 1: My_job
• Scene
• Camera
• Framebuf[ ]
Host 2: My_job
• Scene
• Camera
Host 3: My_job
• Scene
• Camera
Fragment 0: GPU 1, Host 1
Fragment 1: GPU 1, Host 2
Fragment 2: GPU 2, Host 2
Fragment 3: GPU 2, Host 1
Fragment 4: GPU 1, Host 3
Fragment 5: GPU 2, Host 3
![Page 26: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/26.jpg)
Parallelization Model – 3 Hosts
Host 1: My_job
• Scene
• Camera
• Framebuf[ ]
Host 2: My_job
• Scene
• Camera
Host 3: My_job
• Scene
• Camera
Fragment 0: GPU 1, Host 1, execute fragment
Fragment 1: GPU 1, Host 2, execute remote
Fragment 2: GPU 2, Host 2, execute remote
Fragment 3: GPU 2, Host 1, execute fragment
Fragment 4: GPU 1, Host 3, execute remote
Fragment 5: GPU 2, Host 3, execute remote
![Page 28: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/28.jpg)
Parallelization Model – 3 Hosts
Host 1: My_job
• Scene
• Camera
• Framebuf[ ]
Host 2: My_job
• Scene
• Camera
Host 3: My_job
• Scene
• Camera
Fragment 0: GPU 1, Host 1, execute fragment
Fragment 1: GPU 1, Host 2, execute remote, receive result
Fragment 2: GPU 2, Host 2, execute remote, receive result
Fragment 3: GPU 2, Host 1, execute fragment
Fragment 4: GPU 1, Host 3, execute remote, receive result
Fragment 5: GPU 2, Host 3, execute remote, receive result
![Page 29: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/29.jpg)
Parallelization Model – 3 Hosts
![Page 30: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/30.jpg)
Parallelization Model - Hierarchical
Viewer Host
Compositor Host
Render Host
GPUs
Compositor Job
GPU Fragment
Rendering Job
GPU Job
![Page 31: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/31.jpg)
Datastore
![Page 32: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/32.jpg)
Datastore
In-memory NoSQL datastore for arbitrary C++ objects
Store object on some host / retrieve on any host
Data transport (mostly) transparent to application
![Page 33: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/33.jpg)
Datastore Objects
class My_adder                               // Your class
{
    float m_a;
    int m_b;
    float sum() { return m_a + m_b; }
};
![Page 34: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/34.jpg)
Datastore Objects
class My_adder
{
    float m_a;                               // Arbitrary member variables
    int m_b;
    float sum() { return m_a + m_b; }
};
![Page 35: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/35.jpg)
Datastore Objects
class My_adder
{
    float m_a;
    int m_b;
    float sum() { return m_a + m_b; }        // Arbitrary member functions
};
![Page 36: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/36.jpg)
Datastore Objects
class My_adder : public Element< UUID >      // Derive from base class
{
    float m_a;
    int m_b;
    float sum() { return m_a + m_b; }
};
![Page 37: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/37.jpg)
Datastore Objects
class My_adder : public Element< UUID >
{
    float m_a;
    int m_b;
    void serialize(Serializer* serializer)   // Implement serialization
    {
        serializer->write(m_a);
        serializer->write(m_b);
    }
};
![Page 38: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/38.jpg)
Datastore Objects
class My_adder : public Element< UUID >
{
    float m_a;
    int m_b;
    void serialize(Serializer* serializer);
    void deserialize(Deserializer* deserializer)   // Implement deserialization
    {
        deserializer->read(m_a);
        deserializer->read(m_b);
    }
};
![Page 39: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/39.jpg)
Datastore Objects
class My_adder : public Element< UUID >
{
    float m_a;
    int m_b;
    void serialize(ISerializer* serializer);
    void deserialize(IDeserializer* deserializer);
};
register_serializable_class< My_adder >();   // Register class
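The serialization steps above can be tried out with a small self-contained sketch. The `Serializer` and `Deserializer` classes below are hypothetical byte-buffer helpers written for this example, not the DiCE interfaces; the `Element< UUID >` base class and the class registration are left out because their definitions are not shown in the slides. `round_trip()` imitates "store on one host, retrieve on another" inside a single process.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical byte-buffer serializer for plain data types.
class Serializer {
public:
    template <class T> void write(const T& v) {
        static_assert(std::is_trivially_copyable<T>::value, "plain data only");
        const unsigned char* p = reinterpret_cast<const unsigned char*>(&v);
        m_bytes.insert(m_bytes.end(), p, p + sizeof(T));
    }
    const std::vector<unsigned char>& bytes() const { return m_bytes; }
private:
    std::vector<unsigned char> m_bytes;
};

// Hypothetical counterpart that reads values back in the order written.
class Deserializer {
public:
    explicit Deserializer(const std::vector<unsigned char>& b) : m_bytes(b) {}
    template <class T> void read(T& v) {
        static_assert(std::is_trivially_copyable<T>::value, "plain data only");
        assert(m_pos + sizeof(T) <= m_bytes.size());
        std::memcpy(&v, m_bytes.data() + m_pos, sizeof(T));
        m_pos += sizeof(T);
    }
private:
    const std::vector<unsigned char>& m_bytes;  // must outlive the deserializer
    std::size_t m_pos = 0;
};

class My_adder {
public:
    explicit My_adder(float a = 0.0f, int b = 0) : m_a(a), m_b(b) {}
    float sum() const { return m_a + m_b; }
    void serialize(Serializer* s) const { s->write(m_a); s->write(m_b); }
    // Members must be read in exactly the order serialize() wrote them.
    void deserialize(Deserializer* d) { d->read(m_a); d->read(m_b); }
private:
    float m_a;
    int m_b;
};

// Serialize on the "sending host", rebuild a fresh object on the "receiver".
My_adder round_trip(const My_adder& original) {
    Serializer s;
    original.serialize(&s);
    Deserializer d(s.bytes());
    My_adder copy;
    copy.deserialize(&d);
    return copy;
}
```

The copy returned by `round_trip()` shares no memory with the original, which is the property that lets an object be stored on one host and used on another.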
![Page 40: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/40.jpg)
Datastore: Cache
Per-host cache for objects
— Accessing an object ensures it is in the local cache
— If necessary, the object is fetched from another host
If the cache is full: evict objects owned by other hosts (LRU)
— The cluster can store more data than a single host could hold
Configurable redundant storage for handling host failure
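The eviction rule can be illustrated with a small sketch. `Host_cache` is a hypothetical class written for this example; the slides do not show how DiCE implements its cache. It keeps entries in least-recently-used order and, when over capacity, drops only entries this host does not own. A miss returns `nullptr`, which is where a real system would fetch the object from another host.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

class Host_cache {
public:
    explicit Host_cache(std::size_t capacity) : m_capacity(capacity) {}

    // Insert or refresh an object; `owned` marks objects this host must keep.
    void put(int id, std::string value, bool owned) {
        auto it = m_index.find(id);
        if (it != m_index.end()) m_lru.erase(it->second);
        m_lru.push_front(Entry{id, std::move(value), owned});
        m_index[id] = m_lru.begin();
        evict();
    }

    // Returns the cached value, or nullptr on a miss.
    const std::string* get(int id) {
        auto it = m_index.find(id);
        if (it == m_index.end()) return nullptr;
        m_lru.splice(m_lru.begin(), m_lru, it->second);  // now most recently used
        return &it->second->value;
    }

    std::size_t size() const { return m_lru.size(); }

private:
    struct Entry { int id; std::string value; bool owned; };

    // Walk from the least recently used end and drop objects owned by other
    // hosts until the cache fits. Owned objects are never dropped, so the
    // cache can stay above capacity if everything in it is owned.
    void evict() {
        auto it = m_lru.end();
        while (m_lru.size() > m_capacity && it != m_lru.begin()) {
            --it;
            if (it->owned) continue;
            m_index.erase(it->id);
            it = m_lru.erase(it);  // iterator to the element after the erased one
        }
    }

    std::size_t m_capacity;
    std::list<Entry> m_lru;  // front = most recently used
    std::unordered_map<int, std::list<Entry>::iterator> m_index;
};
```

With a capacity of two, inserting one owned object and then two objects owned by other hosts evicts the older foreign object and keeps the owned one.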
![Page 43: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/43.jpg)
Datastore Transactions
Important for multi-user operation
ACID
— Atomicity: Transaction commit, abort
— Consistency: Cluster-wide locks available
— Isolation: Starting transaction “freezes” view on datastore
— Durability: Redundancy
![Page 44: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/44.jpg)
Transaction Isolation
[Figure: datastore objects A and X; transactions T7 and T8]
![Page 45: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/45.jpg)
Transaction Isolation
Isolation based on multi-version capability
[Figure: object versions A5 and X9; transactions T7 and T8]
![Page 46: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/46.jpg)
Transaction Isolation
Isolation based on multi-version capability
Copy-on-write
[Figure: object versions A5, X9 and X10; transactions T7 and T8]
![Page 48: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/48.jpg)
Transaction Isolation
Isolation based on multi-version capability
Copy-on-write
[Figure: object versions A5 and X10; transaction T8]
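Multi-version isolation with copy-on-write can be sketched as follows. `Versioned_store` is a hypothetical class for illustration only: the version numbering, the snapshot rule and the garbage collection are assumptions and do not claim to match how DiCE numbers versions or transactions. A commit never overwrites a value; it adds a new version, and each transaction reads the newest version that already existed when it started.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

class Versioned_store {
public:
    // A transaction remembers the newest version number present at its start.
    struct Transaction { int snapshot; };

    Transaction begin() const { return Transaction{m_latest}; }

    // Copy-on-write: add a new version instead of changing the old one.
    void commit(const std::string& key, int value) {
        m_versions[key][++m_latest] = value;
    }

    // Newest version not newer than the snapshot, or nullptr if none is visible.
    const int* read(const Transaction& t, const std::string& key) const {
        auto obj = m_versions.find(key);
        if (obj == m_versions.end()) return nullptr;
        auto it = obj->second.upper_bound(t.snapshot);  // first version > snapshot
        if (it == obj->second.begin()) return nullptr;
        return &std::prev(it)->second;
    }

    // Drop versions that no live transaction can see any more;
    // `oldest` is the smallest snapshot among running transactions.
    void collect_garbage(int oldest) {
        for (auto& entry : m_versions) {
            auto& versions = entry.second;
            auto it = versions.upper_bound(oldest);
            if (it == versions.begin()) continue;
            versions.erase(versions.begin(), std::prev(it));
        }
    }

    std::size_t version_count(const std::string& key) const {
        auto obj = m_versions.find(key);
        return obj == m_versions.end() ? 0 : obj->second.size();
    }

private:
    int m_latest = 0;
    std::map<std::string, std::map<int, int>> m_versions;  // key -> version -> value
};
```

A transaction started before a commit keeps reading the old value while a later transaction sees the new one; once the older transaction has finished, `collect_garbage()` can drop the superseded version.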
![Page 49: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/49.jpg)
Networking / Clustering
![Page 50: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/50.jpg)
Networking / Clustering
Handles cluster building and data transfers
— Self-organizing, dynamic addition and removal of hosts
— Tested with up to 1000 hosts
— Several networking protocols for different environments…
![Page 52: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/52.jpg)
Network Layer: UDP with Multicast
Unicast: Send to each host
Multicast: Like radio, send once, received by many
![Page 54: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/54.jpg)
Network Layer: UDP with Multicast
Self-organization:
— Multicast address identifies cluster
— Multicast “beacon” packets to announce to other hosts
— “Election” process to elect one synchronizer
Multicast / unicast used for bulk data transfers
— Especially effective for many hosts
![Page 55: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/55.jpg)
Network Layer: TCP
For networks with
— low-bandwidth multicast, or
— no multicast (e.g. Amazon Web Services)
Discovering hosts
— via the UDP multicast layer, or
— via at least one known host
TCP used for all data transport
Still fully dynamic
![Page 58: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/58.jpg)
Network Layer: InfiniBand
Native InfiniBand with RDMA
RDMA used for speeding up bulk data transfer
Fastest transmissions > 30 Gbit/s end-to-end
[Figure: Host 1 and Host 2, each with a CPU and memory; memory addresses 0x1234 and 0x4532]
![Page 59: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/59.jpg)
Other Features
More multi-user capabilities (scopes, ...)
"Futures"
Global logging system
HTTP Server
RTMP Video streaming
Cloud Bridge
...
![Page 60: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/60.jpg)
Summary
DiCE is a library for writing scalable applications
DiCE has been used for 10 years in our rendering products
Currently directly usable if you use indeX
![Page 61: Distributing Computation to Large GPU Clusters | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · Networking / Clustering Handles cluster building and data transfers](https://reader034.vdocuments.site/reader034/viewer/2022042417/5f334d738634a8570754b58c/html5/thumbnails/61.jpg)
Thank you …
Stefan Radig Sr. Manager, NVIDIA Iray and DiCE