InfoVis Infrastructure Workshop
Chris Mueller
Open Systems Lab, Indiana University
October 9, 2004
chemuell at cs dot indiana dot edu
www.osl.iu.edu
Overview
• Position Paper
  – Repository style infrastructure (SourceForge, GenBank, CPAN)
  – Standard software protocols
  – Guiding policies to help ensure quality
• Current Work – Open Systems Lab, IU
  – High performance components for IVC
    • Boost Graph Library
  – Very large data sets/visualization
• Interests
  – Understand community needs
  – Learn what’s available, where we’re going
• Industry Viewpoint
  – Web-based Scientific Visualization and Analysis products
  – In-house visualization and analysis tools (high-throughput analytical chemistry)
Boost Graph Library Algorithms
• Core Algorithm Patterns
  – breadth_first_search
  – breadth_first_visit
  – depth_first_search
  – depth_first_visit
  – undirected_dfs
• Shortest Paths Algorithms
  – dijkstra_shortest_paths
  – bellman_ford_shortest_paths
  – dag_shortest_paths
  – johnson_all_pairs_shortest_paths
• Minimum Spanning Tree Algorithms
  – kruskal_minimum_spanning_tree
  – prim_minimum_spanning_tree
• connected_components
• strong_components
• Incremental Connected Components
  – initialize_incremental_components
  – incremental_components
  – same_component
  – component_index
• Maximum Flow Algorithms
  – edmunds_karp_max_flow
  – push_relabel_max_flow
• topological_sort
• transitive_closure
• copy_graph
• transpose_graph
• isomorphism
• cuthill_mckee_ordering
• sequential_vertex_coloring*
• minimum_degree_ordering
• sloan_ordering
• ith_wavefront, max_wavefront, aver_wavefront, and rms_wavefront
• Recent Additions
  – Betweenness Centrality
  – Betweenness Centrality clustering
  – A* search
  – Floyd-Warshall all-pairs shortest paths
  – Kamada-Kawai layout
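Betweenness centrality, one of the recent additions named on this slide, counts how many shortest paths pass through each vertex. The sketch below illustrates the idea with Brandes' algorithm for unweighted graphs in plain standard C++. It is not the BGL implementation: the `Graph` alias and the `betweenness` function are hypothetical names chosen for this illustration, and the adjacency-list layout is an assumption.

```cpp
#include <cstddef>
#include <queue>
#include <stack>
#include <vector>

// Hypothetical adjacency-list graph: adj[v] lists the neighbours of vertex v.
using Graph = std::vector<std::vector<std::size_t>>;

// Brandes' algorithm for unweighted graphs (illustrative sketch only).
// Returns one score per vertex. If an undirected graph is stored with both
// edge directions, every vertex pair is counted twice, so halve the scores.
std::vector<double> betweenness(const Graph& adj) {
    const std::size_t n = adj.size();
    std::vector<double> centrality(n, 0.0);

    for (std::size_t s = 0; s < n; ++s) {
        std::vector<std::vector<std::size_t>> preds(n);  // shortest-path predecessors
        std::vector<double> sigma(n, 0.0);               // number of shortest s-v paths
        std::vector<long> dist(n, -1);                   // -1 means "not reached yet"
        std::vector<double> delta(n, 0.0);               // dependency of s on each vertex
        std::stack<std::size_t> order;                   // vertices by non-decreasing distance
        std::queue<std::size_t> frontier;

        sigma[s] = 1.0;
        dist[s] = 0;
        frontier.push(s);

        // Breadth-first search from s, counting shortest paths.
        while (!frontier.empty()) {
            const std::size_t v = frontier.front();
            frontier.pop();
            order.push(v);
            for (std::size_t w : adj[v]) {
                if (dist[w] < 0) {
                    dist[w] = dist[v] + 1;
                    frontier.push(w);
                }
                if (dist[w] == dist[v] + 1) {
                    sigma[w] += sigma[v];
                    preds[w].push_back(v);
                }
            }
        }

        // Accumulate dependencies, farthest vertices first.
        while (!order.empty()) {
            const std::size_t w = order.top();
            order.pop();
            for (std::size_t v : preds[w]) {
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w]);
            }
            if (w != s) {
                centrality[w] += delta[w];
            }
        }
    }
    return centrality;
}
```

For a three-vertex path 0 - 1 - 2 stored with both edge directions, only the middle vertex lies on a shortest path between two other vertices, so it is the only one with a non-zero score.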
BC Clustering Performance: BGL vs. JUNG
[Figure: wall clock time (minutes, axis 0 to 60) vs. # of movies (200 to 400), with one series each for BGL and JUNG. Actor Collaboration Database, Betweenness Centrality Clustering (threshold=0.01), single processor.]
Scalability of Parallel BGL Betweenness Centrality
[Figure: wall clock time (seconds, axis ticks 1, 10, 100, 1000) vs. # of processors (axis ticks 1, 10, 100). Actor Collaboration Database, Betweenness Centrality.]
Dot Plot Performance Results

              Base   SIMD 1   SIMD 2   Thread
Ideal          140     1163     1163     2193
NFS             88      370      400        -
NFS Touch       88        -      446      891
Local            -      500      731        -
Local Touch     90        -      881     1868
• Base is a direct port of the DOTTER algorithm
• SIMD 1 is the SIMD algorithm using a sparse matrix data structure based on STL vectors
• SIMD 2 is the SIMD algorithm using a binary format and memory mapped output files
• Thread is the SIMD 2 algorithm on 2 Processors
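To make the notes above concrete, here is a minimal scalar sketch of a dot plot that stores its hits sparsely in an STL vector. It is an illustration under stated assumptions, not the DOTTER port or the SIMD code measured in these slides: the `Hit` struct, the `dot_plot` function, the fixed window, and the simple match-count score (in place of a substitution matrix) are all hypothetical choices.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One stored cell of the (sparse) dot plot: window start positions and score.
struct Hit {
    std::size_t i;  // window start in the first sequence
    std::size_t j;  // window start in the second sequence
    int score;      // number of matching characters in the window
};

// Naive dot plot: score every pair of windows and keep only the cells whose
// score reaches the threshold, so the result is a sparse list, not a matrix.
std::vector<Hit> dot_plot(const std::string& a, const std::string& b,
                          std::size_t window, int threshold) {
    assert(window > 0);
    std::vector<Hit> hits;
    if (a.size() < window || b.size() < window) {
        return hits;  // no complete window fits
    }
    for (std::size_t i = 0; i + window <= a.size(); ++i) {
        for (std::size_t j = 0; j + window <= b.size(); ++j) {
            int score = 0;
            for (std::size_t k = 0; k < window; ++k) {
                if (a[i + k] == b[j + k]) {
                    ++score;
                }
            }
            if (score >= threshold) {
                hits.push_back({i, j, score});
            }
        }
    }
    return hits;
}
```

Comparing a sequence with itself, for example `dot_plot("ACGT", "ACGT", 2, 2)`, keeps only the cells on the main diagonal. The inner loop over `k` is the part a SIMD version would be expected to vectorize.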
                      Ideal Speedup   Real Speedup   Ideal/Real Throughput
SIMD                           8.3x           9.7x                     75%
Thread                          15x          18.1x                     77%
Thread (large data)           13.3x          21.2x                     85%
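Some of the summary figures can be reproduced from the results table with simple ratios. The sketch below shows one plausible reading, which is an assumption rather than something the slides state: the ideal SIMD speedup is the Ideal row's SIMD value divided by its Base value, and the throughput percentage is the Local Touch value divided by the Ideal value in the same column. The function and constant names are hypothetical.

```cpp
#include <cassert>

// Ratio of two throughput figures given in the same (unstated) units.
double ratio(double numerator, double denominator) {
    assert(denominator > 0.0);
    return numerator / denominator;
}

// Figures copied from the Dot Plot Performance Results table.
const double ideal_base = 140.0;          // Ideal row, Base column
const double ideal_simd = 1163.0;         // Ideal row, SIMD 2 column
const double local_touch_simd2 = 881.0;   // Local Touch row, SIMD 2 column

// Assumed reading: ideal SIMD speedup = Ideal SIMD / Ideal Base (about 8.3).
double ideal_simd_speedup() {
    return ratio(ideal_simd, ideal_base);
}

// Assumed reading: achieved fraction of ideal throughput (about 0.76).
double simd2_fraction_of_ideal() {
    return ratio(local_touch_simd2, ideal_simd);
}
```

Under this reading the first ratio agrees with the 8.3x ideal speedup and the second with the 75% figure in the SIMD row. The remaining entries do not follow as directly from the table, so they are left as reported.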