![Page 1: Distributed Supercomputing in Java Henri Bal · Distributed Supercomputing in Java Henri Bal Vrije Universiteit Amsterdam Keynote talk 11-th Euromicro Conference on Parallel, Distributed](https://reader033.vdocuments.site/reader033/viewer/2022042309/5ed68423ff0e593c0b63fd62/html5/thumbnails/1.jpg)
Distributed Supercomputing in Java
Henri Bal, Vrije Universiteit Amsterdam
Keynote talk, 11th Euromicro Conference on Parallel, Distributed and Network-based Processing, Genoa, Italy (February 5-7, 2003)
Distributed supercomputing
• Parallel processing on geographically distributed computing systems (grids)
• Examples:
  - SETI@home ( ), RSA-155, Entropia, Cactus
• Currently limited to trivially parallel applications
• Our goals:
  - Generalize this to more HPC applications
  - Provide high-level programming support
Grids versus supercomputers
• Performance/scalability
  - Speedups on geographically distributed systems?
• Heterogeneity
  - Different types of processors, operating systems, etc.
  - Different networks (Ethernet, Myrinet, WANs)
• General grid issues
  - Resource management, co-allocation, firewalls, security, monitoring, authorization, accounting, ...
Our approach
• Performance/scalability
  - Exploit hierarchical structure of grids (Albatross project)
• Heterogeneity
  - Use Java + JVM (Java Virtual Machine) technology
• General grid issues
  - Import knowledge from elsewhere (GGF, GridLab)
Speedups on a grid?
• Grids usually are hierarchical
  - Collections of clusters, supercomputers
  - Fast local links, slow wide-area links
• Can optimize algorithms to exploit this hierarchy
  - Message combining + latency hiding on wide-area links
  - Collective operations for wide-area systems
  - Load balancing
• Successful for many applications
  - Did many experiments on a homogeneous wide-area test bed (DAS) [HPCA 1999, IEEE TPDS 2002]
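Message combining on a slow wide-area link can be sketched as follows. This is a minimal illustration, not the Albatross code: the class name `WideAreaCombiner`, the batch-size threshold `maxBatch`, and the `Consumer<List<String>>` that stands in for the wide-area link are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical sketch: buffers small messages bound for a remote cluster and
 * forwards them as one batch, so the wide-area latency is paid once per batch.
 */
final class WideAreaCombiner {
    private final int maxBatch;                         // assumed tuning parameter
    private final Consumer<List<String>> wideAreaLink;  // stands in for the slow link
    private final List<String> pending = new ArrayList<>();

    WideAreaCombiner(int maxBatch, Consumer<List<String>> wideAreaLink) {
        if (maxBatch < 1) throw new IllegalArgumentException("maxBatch must be >= 1");
        this.maxBatch = maxBatch;
        this.wideAreaLink = wideAreaLink;
    }

    /** Queue a message; transmit once the batch is full. */
    synchronized void send(String message) {
        pending.add(message);
        if (pending.size() >= maxBatch) flush();
    }

    /** Transmit whatever is buffered as a single wide-area message. */
    synchronized void flush() {
        if (pending.isEmpty()) return;
        wideAreaLink.accept(new ArrayList<>(pending)); // copy, so the batch stays stable
        pending.clear();
    }
}
```

Local (intra-cluster) messages would bypass such a combiner, since on fast links the added buffering delay is not worth it.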
The Ibis system
• High-level & efficient programming support for distributed supercomputing on heterogeneous grids
• Use Java-centric approach + JVM technology
  - Inherently more portable than native compilation: “Write once, run everywhere”
  - Requires entire system to be written in Java
• Use special-case (native) optimizations on demand
Outline
• Programming support
• Highly portable & efficient implementation
• Experiences on DAS-2 and EC GridLab testbeds
Ibis programming support
• Ibis provides
  - Remote Method Invocation (RMI)
  - Replicated objects (RepMI) - as in Orca
  - Group/collective communication (GMI) - as in MPI
  - Divide & conquer (Satin) - as in Cilk
• All integrated in a clean, object-oriented way into Java, using special “marker” interfaces
  - Invoking native library (e.g. MPI) would give up Java’s “run everywhere” portability
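The idea of a "marker" interface can be illustrated in plain Java. The sketch below is not Ibis code: `ParallelMarker`, `Counter`, `LocalCounter` and `MarkerScan` are hypothetical names, and reflection stands in here for whatever analysis a tool would actually perform. It only shows how an empty interface can tag which methods deserve special treatment.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical marker: declares no methods, it only tags other interfaces. */
interface ParallelMarker { }

/** Methods declared in an interface that extends the marker count as "marked". */
interface Counter extends ParallelMarker {
    int get();
}

final class LocalCounter implements Counter {
    public int get() { return 42; }
    public void reset() { }   // not declared in a marked interface
}

final class MarkerScan {
    /**
     * Names of the methods a tool would treat specially: those declared in
     * directly implemented interfaces of cls that extend ParallelMarker.
     */
    static List<String> markedMethods(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Class<?> iface : cls.getInterfaces()) {
            if (iface != ParallelMarker.class && ParallelMarker.class.isAssignableFrom(iface)) {
                for (Method m : iface.getDeclaredMethods()) names.add(m.getName());
            }
        }
        return names;
    }
}
```

The standard `java.rmi.Remote` interface follows the same pattern: it is empty, and extending it is what identifies an interface as remotely invocable.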
Compiling Ibis programs
[Figure: source → Java compiler → bytecode → bytecode rewriter → bytecode → JVM (several JVMs)]
GMI (group communication)
• Generalizes Remote Method Invocation
  - Modify how invocations & results are handled
  - Invoke multiple objects, combine/gather results, etc.
  - Expresses many forms of group communication

    interface Example extends GroupInterface {
        public int get() throws ...;
        ….
    }
    Example e = ;                      // get group stub
    m = findMethod("int get()", e);    // method to configure
    m.useGroupInvocation();            // get() will be multicast
    m.useCombineResult(…);             // results are combined
    int result = e.get();              // example invocation
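The "invoke multiple objects, combine results" pattern can be mimicked inside one JVM. The following is a sketch under stated assumptions and is not the GMI API: `LocalGroup` and `invokeAndCombine` are hypothetical names, the members are ordinary local objects rather than remote ones, and the combiner is a plain `IntBinaryOperator`.

```java
import java.util.List;
import java.util.function.IntBinaryOperator;
import java.util.function.ToIntFunction;

/**
 * Hypothetical stand-in for a group stub: forwards one call to every member
 * and folds the results with a user-supplied combiner. Not the GMI API.
 */
final class LocalGroup<T> {
    private final List<T> members;

    LocalGroup(List<T> members) {
        if (members.isEmpty()) throw new IllegalArgumentException("empty group");
        this.members = members;
    }

    /** "Multicast" the call to every member, then combine the results pairwise. */
    int invokeAndCombine(ToIntFunction<T> call, IntBinaryOperator combine) {
        int result = call.applyAsInt(members.get(0));
        for (int i = 1; i < members.size(); i++) {
            result = combine.applyAsInt(result, call.applyAsInt(members.get(i)));
        }
        return result;
    }
}
```

Passing `Integer::sum` as the combiner gives a reduction-style sum, while `Math::max` gives a maximum; in both cases the caller sees a single `int`, as in the `e.get()` call above.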
Divide-and-conquer parallelism
• Divide-and-conquer is inherently hierarchical
• Satin
  - Cilk-like primitives (spawn/sync)
• New load balancing algorithm
  - Cluster-aware random work stealing [PPoPP’01]
[Figure: call tree of fib(5), with its subtrees divided among cpu 1, cpu 2 and cpu 3]
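The slide does not spell out the load-balancing algorithm. As it is commonly described for the cited PPoPP'01 work, an idle node keeps at most one asynchronous steal request outstanding to a random node in another cluster while it makes synchronous steal attempts at random nodes in its own cluster. The victim-selection part of that idea could look roughly like this; `StealPolicy` and all of its method names are hypothetical, and the actual Satin implementation may differ.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical victim selection for cluster-aware random work stealing. */
final class StealPolicy {
    private final List<Integer> localPeers;   // node ids in this node's cluster
    private final List<Integer> remotePeers;  // node ids in other clusters
    private final Random random;
    private boolean wideAreaStealPending = false;

    StealPolicy(List<Integer> localPeers, List<Integer> remotePeers, Random random) {
        this.localPeers = localPeers;
        this.remotePeers = remotePeers;
        this.random = random;
    }

    /**
     * Called when idle. Returns a remote victim for an asynchronous request,
     * or -1 if a request is already in flight or there are no remote peers.
     */
    synchronized int nextRemoteVictim() {
        if (wideAreaStealPending || remotePeers.isEmpty()) return -1;
        wideAreaStealPending = true;
        return remotePeers.get(random.nextInt(remotePeers.size()));
    }

    /** Called when the wide-area reply (work, or "no work") arrives. */
    synchronized void remoteReplyArrived() {
        wideAreaStealPending = false;
    }

    /** Random local victim for a synchronous steal attempt, or -1 if alone. */
    synchronized int nextLocalVictim() {
        if (localPeers.isEmpty()) return -1;
        return localPeers.get(random.nextInt(localPeers.size()));
    }
}
```

Limiting wide-area steals to one outstanding request keeps traffic on the slow links low, while the local attempts hide the wide-area round-trip time.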
Example
    interface FibInter {
        public int fib(int n);
    }

    class Fib implements FibInter {
        // Must be public and take an int to match the interface declaration.
        public int fib(int n) {
            if (n < 2) return n;
            return fib(n - 1) + fib(n - 2);
        }
    }
Single-threaded Java
Example

    interface FibInter extends ibis.satin.Spawnable {
        public int fib(int n);
    }

    class Fib extends ibis.satin.SatinObject implements FibInter {
        public int fib(int n) {
            if (n < 2) return n;
            int x = fib(n - 1);   // spawned: fib is declared in a Spawnable interface
            int y = fib(n - 2);   // spawned
            sync();               // wait until x and y have been computed
            return x + y;
        }
    }

Java + divide&conquer
GridLab testbed
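For readers without a Satin installation, the spawn/sync structure has a close single-JVM analogue in the standard fork/join framework. The sketch below uses only `java.util.concurrent`; it is an analogy, not Satin, and it does not distribute work across machines. The class name `FibTask` is an assumption of the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Fork/join version of fib: fork() plays the role of spawn, join() of sync. */
final class FibTask extends RecursiveTask<Integer> {
    private final int n;

    FibTask(int n) { this.n = n; }

    @Override
    protected Integer compute() {
        if (n < 2) return n;
        FibTask left = new FibTask(n - 1);
        left.fork();                            // schedule fib(n-1); idle workers may steal it
        int y = new FibTask(n - 2).compute();   // compute fib(n-2) in the current thread
        int x = left.join();                    // wait for fib(n-1), like sync()
        return x + y;
    }

    /** Convenience entry point that runs the task on the common pool. */
    static int fib(int n) {
        return ForkJoinPool.commonPool().invoke(new FibTask(n));
    }
}
```

As in the Satin version, the recursion itself is unchanged; only the points where parallelism may start and where results are needed are made explicit.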
Ibis implementation
• Want to exploit Java’s “run everywhere” property, but
  - That requires 100% pure Java implementation, no single line of native code
  - Hard to use native communication (e.g. Myrinet) or native compiler/runtime system
• Ibis approach:
  - Reasonably efficient pure Java solution (for any JVM)
  - Optimized solutions with native code for special cases
Ibis design
Current status
Challenges
• How to make the system flexible enough
  - Run seamlessly on different hardware / protocols
• Make the pure-Java solution efficient enough
  - Need fast local communication even for grid applications
• Special-case optimizations
Flexibility
• Support different communication substrates
• IPL just defines an interface between high-level programming systems and underlying platforms
• Higher levels can ask IPL to load different implementations at runtime, using class loading
  - E.g. FIFO ordering, reliable communication
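Selecting an implementation at runtime through class loading can be sketched with the standard reflection API. The real IPL interfaces are not shown on the slide, so `Transport`, `TcpTransport` and `TransportLoader` below are hypothetical names; only `Class.forName` and `getDeclaredConstructor().newInstance()` are actual library calls.

```java
/** Hypothetical transport interface; the real IPL interfaces are not shown here. */
interface Transport {
    String name();
}

/** Portable fallback that would be available on any JVM. */
final class TcpTransport implements Transport {
    public String name() { return "tcp (pure Java)"; }
}

final class TransportLoader {
    /**
     * Try each class name in order and return the first implementation that
     * loads; later entries serve as portable fallbacks.
     */
    static Transport load(String... classNames) {
        for (String className : classNames) {
            try {
                Class<?> cls = Class.forName(className);
                return (Transport) cls.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException | ClassCastException | LinkageError e) {
                // Not available on this platform: try the next candidate.
            }
        }
        throw new IllegalStateException("no transport implementation could be loaded");
    }
}
```

A caller might list a platform-specific class first and the pure-Java one last, so that the optimized path is used where it exists and the portable one everywhere else.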
Fast communication in pure Java
• Manta system [ACM TOPLAS Nov. 2001]
  - RMI at RPC speed, but using native compiler & RTS
• Ibis does similar optimizations, but in pure Java
  - Compiler-generated serialization at bytecode level
    5-9x faster than using runtime type inspection
  - Reduce copying overhead
    Zero-copy native implementation for primitive arrays
    Pure-Java requires type-conversion (=copy) to bytes
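The difference between runtime type inspection and generated serialization code can be seen in a small example. Ibis generates its serializers at the bytecode level; here a hand-written method stands in for the generated code, and `Vec2`, `writeFields` and `SerializationSketch` are hypothetical names. The sketch makes no claim about the 5-9x figure; it only contrasts the two code shapes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class Vec2 {
    int x;
    int y;

    Vec2(int x, int y) { this.x = x; this.y = y; }

    /** What a generated writer might look like: straight-line code, no reflection. */
    void writeFields(DataOutputStream out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }
}

final class SerializationSketch {
    /** Runtime type inspection: discovers the int fields via reflection on every call. */
    static byte[] reflectiveWrite(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            // Note: the order of getDeclaredFields() is not guaranteed by the specification.
            for (Field f : obj.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.getType() != int.class) continue;
                f.setAccessible(true);
                out.writeInt(f.getInt(obj));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    /** Per-class writer: the field accesses are fixed when the code is produced. */
    static byte[] generatedWrite(Vec2 v) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            v.writeFields(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Both paths still convert each `int` into bytes, which illustrates the type-conversion copy that a pure-Java solution cannot avoid.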
Communication performance on Fast Ethernet
[Figure: bar charts of latency, int[ ] throughput and tree throughput for Ibis, MPI/C, Java RMI, KaRMI and Ibis RMI]
Latency (µs) & throughput (MB/s), measured on 1 GHz Pentium-IIIs (KaRMI = Karlsruhe RMI)
Communication performance on Myrinet
[Figure: bar charts of latency, int[ ] throughput and tree throughput for Ibis, MPI/C, KaRMI and Ibis RMI]
Early Grid experiences with Ibis
• Using Satin divide-and-conquer system
  - Implemented with Ibis in pure Java, using TCP/IP
• Application measurements on
  - DAS-2 (homogeneous)
  - Testbed from EC GridLab project (heterogeneous)
Layers involved
Distributed ASCI Supercomputer (DAS) 2
• Node configuration
  - Dual 1 GHz Pentium-III
  - >= 1 GB memory
  - Myrinet
  - Linux
• Clusters, connected by GigaPort (1-10 Gb)
  - VU (72 nodes)
  - UvA (32)
  - Leiden (32)
  - Delft (32)
  - Utrecht (32)
Satin on wide-area DAS-2
[Figure: speedup (axis 0 to 70) per application, comparing a single cluster of 64 machines with 4 clusters of 16 machines. Applications: Fibonacci, Adaptive integration, Set cover, Fib. threshold, IDA*, Knapsack, Matrix multiply, N choose K, N queens, Prime factors, Raytracer, TSP]
Satin on GridLab: Theory versus practice
• No support for co-allocation yet (done manually)
• Firewall problems everywhere
  - Currently: use a range of open ports
  - Future: use multiplexing & ssh
• Java indeed runs everywhere, modulo bugs in (old) JVMs
  - IBM 1.3.1 JIT: bug in versioning mechanism
  - Origin2000 JDK: bug in thread synchronization
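The "range of open ports" workaround amounts to binding server sockets only to ports the firewall is known to allow. A minimal sketch, assuming the permitted range is known in advance; the class `PortRange`, the method `bindInRange` and any concrete port numbers are illustrative and not taken from Ibis.

```java
import java.io.IOException;
import java.net.ServerSocket;

final class PortRange {
    /**
     * Bind to the first free port in [low, high], a range assumed to have
     * been opened in the firewall. The caller owns and must close the socket.
     */
    static ServerSocket bindInRange(int low, int high) throws IOException {
        for (int port = low; port <= high; port++) {
            try {
                return new ServerSocket(port);
            } catch (IOException busy) {
                // Port already in use (or not bindable): try the next one.
            }
        }
        throw new IOException("no free port in range " + low + "-" + high);
    }
}
```

The drawback, and presumably the motivation for the planned multiplexing approach, is that every site must agree to open such a range.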
But it works ...
• Satin/Ibis program was run simultaneously on:

| Type | OS | CPU | Location |
|---|---|---|---|
| Cluster | Linux | Pentium-3 | VU Amsterdam |
| Server | Solaris | Sparc | VU Amsterdam |
| Origin 2000 | Irix64 | MIPS | AEI Potsdam |
| Cluster | Linux | Pentium-3 | PSNC Poznan |
Performance on GridLab
[Figure: performance relative to DAS2 cluster (axis 0 to 1.6) per application, for the series DAS2, ORIGIN and BOTH. Applications: fib, tsp, cover, ida, fib_threshold, raytracer, nqueens, knapsack, primfac, Mmult, n_over_k, integrate]
16-node DAS-2 cluster + 16-node Origin
Summary
• Ibis: a programming environment for grids
  - RMI, group communication, divide&conquer
• Portable
  - Using Java’s “write once, run everywhere” property
• Efficient
  - Reasonably efficient “run everywhere” solution
  - Optimized solutions for special cases
• Experience with prototype system on GridLab
Acknowledgements
• Rob van Nieuwpoort
• Jason Maassen
• Thilo Kielmann
• Rutger Hofman
• Ceriel Jacobs
• Gosia Wrzesinska
• Olivier Aumage
• Kees Verstoep

Web site: www.cs.vu.nl/ibis
vrije Universiteit