InfoVis Infrastructure Workshop Chris Mueller Open Systems Lab, Indiana University October 9, 2004 chemuell at cs dot indiana dot edu www.osl.iu.edu


Page 1

InfoVis Infrastructure Workshop

Chris Mueller
Open Systems Lab, Indiana University

October 9, 2004
chemuell at cs dot indiana dot edu

www.osl.iu.edu

Page 2

Overview

• Position Paper
  – Repository style infrastructure (SourceForge, GenBank, CPAN)
  – Standard software protocols
  – Guiding policies to help ensure quality
• Current Work – Open Systems Lab, IU
  – High performance components for IVC
    • Boost Graph Library
  – Very large data sets/visualization
• Interests
  – Understand community needs
  – Learn what's available, where we're going
• Industry Viewpoint
  – Web-based Scientific Visualization and Analysis products
  – In-house visualization and analysis tools (high-throughput analytical chemistry)

Page 3

Boost Graph Library Algorithms

• Core Algorithm Patterns: breadth_first_search, breadth_first_visit, depth_first_search, depth_first_visit, undirected_dfs
• Shortest Paths Algorithms: dijkstra_shortest_paths, bellman_ford_shortest_paths, dag_shortest_paths, johnson_all_pairs_shortest_paths
• Minimum Spanning Tree Algorithms: kruskal_minimum_spanning_tree, prim_minimum_spanning_tree
• connected_components, strong_components
• Incremental Connected Components: initialize_incremental_components, incremental_components, same_component, component_index
• Maximum Flow Algorithms: edmunds_karp_max_flow, push_relabel_max_flow
• topological_sort, transitive_closure, copy_graph, transpose_graph, isomorphism, cuthill_mckee_ordering, sequential_vertex_coloring*, minimum_degree_ordering, sloan_ordering
• ith_wavefront, max_wavefront, aver_wavefront, and rms_wavefront

Recent Additions

• Betweenness Centrality
• Betweenness Centrality clustering
• A* search
• Floyd-Warshall all-pairs shortest paths
• Kamada-Kawai layout
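Betweenness centrality appears among the recent additions, so a small illustration of what such a routine computes may help. The sketch below is a standard-library-only version of Brandes' algorithm for unweighted, undirected graphs. It is not the BGL implementation and does not use the BGL interface; the `Graph` alias and the function name `betweenness_centrality` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <stack>
#include <vector>

// Hypothetical adjacency-list graph: adj[v] lists the neighbors of vertex v.
// An undirected edge u-v is stored in both adj[u] and adj[v].
using Graph = std::vector<std::vector<std::size_t>>;

// Brandes' algorithm for unweighted graphs (illustrative, not BGL code).
std::vector<double> betweenness_centrality(const Graph& adj) {
    const std::size_t n = adj.size();
    std::vector<double> centrality(n, 0.0);

    for (std::size_t s = 0; s < n; ++s) {
        std::vector<std::vector<std::size_t>> preds(n);
        std::vector<double> sigma(n, 0.0);  // number of shortest s-v paths
        std::vector<double> delta(n, 0.0);  // dependency of s on v
        std::vector<long> dist(n, -1);      // BFS distance from s, -1 = unseen
        std::stack<std::size_t> order;      // vertices by non-decreasing distance
        std::queue<std::size_t> frontier;

        sigma[s] = 1.0;
        dist[s] = 0;
        frontier.push(s);

        // Breadth-first search that counts shortest paths.
        while (!frontier.empty()) {
            const std::size_t v = frontier.front();
            frontier.pop();
            order.push(v);
            for (std::size_t w : adj[v]) {
                assert(w < n);
                if (dist[w] < 0) {
                    dist[w] = dist[v] + 1;
                    frontier.push(w);
                }
                if (dist[w] == dist[v] + 1) {
                    sigma[w] += sigma[v];
                    preds[w].push_back(v);
                }
            }
        }

        // Accumulate dependencies, farthest vertices first.
        while (!order.empty()) {
            const std::size_t w = order.top();
            order.pop();
            for (std::size_t v : preds[w]) {
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w]);
            }
            if (w != s) centrality[w] += delta[w];
        }
    }

    // Each unordered pair was counted from both endpoints.
    for (double& c : centrality) c /= 2.0;
    return centrality;
}
```

For the three-vertex path 0-1-2, written as `Graph{{1}, {0, 2}, {1}}`, this returns 1 for the middle vertex and 0 for the two ends.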

Page 4

BC Clustering Performance: BGL vs. JUNG

[Figure: wall clock time (minutes, axis 0 to 60) versus # of movies (200 to 400) for BGL and JUNG. Actor Collaboration Database, Betweenness Centrality Clustering (threshold=0.01), single processor.]
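As a rough illustration of what betweenness centrality clustering with a threshold involves, the sketch below repeatedly removes the edge with the highest edge betweenness until the largest normalized value falls below the threshold. It uses only the standard library and is not the BGL or JUNG code that was benchmarked. The names `Edge`, `edge_betweenness`, and `bc_clustering` are assumptions, and so is the normalization by the number of vertex pairs, which may differ from the one used in the measurements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>;  // hypothetical edge type

// Edge betweenness for an unweighted, undirected graph given as an edge
// list over vertices 0..n-1 (Brandes-style accumulation, illustrative only).
std::vector<double> edge_betweenness(std::size_t n, const std::vector<Edge>& edges) {
    // adj[v] holds (neighbor, edge index) pairs.
    std::vector<std::vector<std::pair<std::size_t, std::size_t>>> adj(n);
    for (std::size_t e = 0; e < edges.size(); ++e) {
        assert(edges[e].first < n && edges[e].second < n);
        adj[edges[e].first].push_back({edges[e].second, e});
        adj[edges[e].second].push_back({edges[e].first, e});
    }
    std::vector<double> score(edges.size(), 0.0);
    for (std::size_t s = 0; s < n; ++s) {
        std::vector<long> dist(n, -1);
        std::vector<double> sigma(n, 0.0), delta(n, 0.0);
        std::vector<std::size_t> order;  // BFS visit order
        std::queue<std::size_t> frontier;
        dist[s] = 0;
        sigma[s] = 1.0;
        frontier.push(s);
        while (!frontier.empty()) {
            const std::size_t v = frontier.front();
            frontier.pop();
            order.push_back(v);
            for (const auto& nb : adj[v]) {
                const std::size_t w = nb.first;
                if (dist[w] < 0) { dist[w] = dist[v] + 1; frontier.push(w); }
                if (dist[w] == dist[v] + 1) sigma[w] += sigma[v];
            }
        }
        // Walk back from the farthest vertices, pushing dependency onto edges.
        for (auto it = order.rbegin(); it != order.rend(); ++it) {
            const std::size_t w = *it;
            for (const auto& nb : adj[w]) {
                const std::size_t v = nb.first;
                if (dist[v] + 1 == dist[w]) {  // v precedes w on a shortest path
                    const double c = sigma[v] / sigma[w] * (1.0 + delta[w]);
                    score[nb.second] += c;
                    delta[v] += c;
                }
            }
        }
    }
    for (double& x : score) x /= 2.0;  // each pair was counted from both ends
    return score;
}

// Removes the highest-betweenness edge until the largest value, divided by
// the number of vertex pairs (an assumed normalization), drops below threshold.
std::vector<Edge> bc_clustering(std::size_t n, std::vector<Edge> edges, double threshold) {
    if (n < 2) return edges;
    const double pairs = static_cast<double>(n) * static_cast<double>(n - 1) / 2.0;
    while (!edges.empty()) {
        const std::vector<double> score = edge_betweenness(n, edges);
        const auto top = std::max_element(score.begin(), score.end());
        if (*top / pairs < threshold) break;
        edges.erase(edges.begin() + (top - score.begin()));
    }
    return edges;
}
```

On two triangles joined by a single bridge edge, a threshold of 0.5 removes only the bridge and leaves the two triangles as clusters. Recomputing betweenness after every removal is the expensive part, which is presumably why wall clock time is the quantity reported.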

Page 5

Scalability of Parallel BGL Betweenness Centrality

[Figure: wall clock time (seconds, axis ticks 1 to 1000) versus # of processors (axis ticks 1 to 100). Actor Collaboration Database, Betweenness Centrality.]

Page 6

Dot Plot Performance Results

              Base   SIMD 1   SIMD 2   Thread
Ideal          140     1163     1163     2193
NFS             88      370      400        -
NFS Touch       88        -      446      891
Local            -      500      731        -
Local Touch     90        -      881     1868

• Base is a direct port of the DOTTER algorithm
• SIMD 1 is the SIMD algorithm using a sparse matrix data structure based on STL vectors
• SIMD 2 is the SIMD algorithm using a binary format and memory mapped output files
• Thread is the SIMD 2 algorithm on 2 processors
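To make the idea of a dot plot stored in a sparse, vector-based structure concrete, here is a minimal scalar sketch. It is not the DOTTER algorithm or the SIMD code measured here: it scores windows by simple character identity, and the `DotEntry` layout, the function name `dot_plot`, and the `window` and `threshold` parameters are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One retained cell of the dot plot (hypothetical layout).
struct DotEntry {
    std::size_t row;  // window start in sequence a
    std::size_t col;  // window start in sequence b
    int score;        // matching characters inside the window
};

// Scores every window of length `window` for every pair of start positions
// and keeps only cells whose score reaches `threshold`. The result is a
// sparse matrix held in a std::vector, in row-major order.
std::vector<DotEntry> dot_plot(const std::string& a, const std::string& b,
                               std::size_t window, int threshold) {
    assert(window > 0);
    std::vector<DotEntry> entries;
    if (a.size() < window || b.size() < window) return entries;

    for (std::size_t i = 0; i + window <= a.size(); ++i) {
        for (std::size_t j = 0; j + window <= b.size(); ++j) {
            int score = 0;
            for (std::size_t k = 0; k < window; ++k) {
                if (a[i + k] == b[j + k]) ++score;
            }
            if (score >= threshold) entries.push_back({i, j, score});
        }
    }
    return entries;
}
```

Comparing "ACGT" with itself using a window of 2 and a threshold of 2 keeps only the three cells on the main diagonal. The innermost loop is the part a SIMD version would process several positions at a time.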

                      Ideal Speedup   Real Speedup   Ideal/Real Throughput
SIMD                  8.3x            9.7x           75%
Thread                15x             18.1x          77%
Thread (large data)   13.3            21.2           85%