

Page 1

This work partially funded by NSF Grants IIS-9732897, IRIS-9729878 and IIS-0119276

Matthew O. Ward, Elke A. Rundensteiner, Jing Yang, Punit Doshi, Geraldine Rosario, Allen R. Martin, Ying-Huey Fua, Daniel Stroe

http://davis.wpi.edu/~xmdv

XmdvTool: Interactive Visual Data Exploration System for High-dimensional Data Sets

Worcester Polytechnic Institute

Page 2

XmdvTool Features

• Hierarchical visualization and interaction tools for exploring very large high-dimensional data sets to discover patterns, trends and outliers

• Applications:
– Bioterrorism Detection
– Bioinformatics and Drug Discovery
– Space Science
– Geology and Geochemistry
– Systems Monitoring and Performance Evaluation
– Economics and Business
– Simulation Design and Analysis

• Multi-platform support (Unix, Linux, Windows)

• Public domain software: http://davis.wpi.edu/~xmdv

Page 3

Xmdv: Main Features

• Scale-up to High Dimensions: Visual Hierarchical Dimension Reduction

• Scale-up to Large Data Sets: Interactive Hierarchical Displays, Database Backend with MinMax Encoding, Semantic Caching and Adaptive Prefetching

• Interlinked Multi-Displays: Parallel Coordinates, Glyphs, Scatterplot Matrices, Dimensional Stacking

• Visual Interaction Tools: N-Dimensional Brushes, Structure-Based Brushing, InterRing

Page 4

Scale-Up for Large Number of Dimensions

Solution to High Dimensional Datasets:
• Group Similar Dimensions into Dimension Hierarchy
• Navigate Dimension Hierarchy by InterRing
• Form Lower Dimensional Spaces by Dimension Clusters
• Convey Dimension Cluster Information by Dissimilarity Display
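
The first step in the list above, grouping similar dimensions into a hierarchy, could be sketched as a small bottom-up clustering. This is only an illustration: the slides do not say which dissimilarity measure or clustering algorithm XmdvTool uses, so `normalize`, `dissimilarity` (mean absolute difference of min-max scaled values) and `build_dimension_hierarchy` (average-linkage merging) are all assumptions made for the example.

```python
from itertools import combinations


def normalize(column):
    """Scale a column to [0, 1] so dimensions with different units are comparable."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [(v - lo) / span for v in column]


def dissimilarity(a, b):
    """Assumed measure: mean absolute difference between two scaled columns."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def build_dimension_hierarchy(columns):
    """columns: dict mapping dimension name -> list of values.

    Returns the hierarchy as nested tuples of dimension names.
    """
    scaled = {name: normalize(col) for name, col in columns.items()}
    # Each cluster is (subtree, list of member dimension names).
    clusters = [(name, [name]) for name in columns]

    def cluster_distance(c1, c2):
        # Average linkage: mean dissimilarity over all cross-cluster pairs.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dissimilarity(scaled[a], scaled[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters into one internal node.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]
```

Each internal node of the returned tree is a dimension cluster; picking one node per branch would give a lower-dimensional space in the sense of the third bullet.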

Page 5

Visual Hierarchical Dimension Reduction Process

Page 6

Visual Hierarchical Dimension Reduction Process

[Figure panels: A 42-dimensional Data Set; Dimension Hierarchy Interaction Tool: InterRing; A 4-Dimensional Subspace]

Page 7

InterRing - Dimension Hierarchy Navigation and Manipulation

• Roll-up/Drill-down
• Rotate
• Zoom in/out
• Distort
• Modify

Page 8

Dissimilarity Display

Three Axes Method

Mean-Band Method

Diagonal Plot Method

Axis Width Method

Page 9

Scale-up for Large Number of Records

Solution to Large Scale Datasets:
• Group Similar Records into Data Hierarchy
• Navigate Data Hierarchy by Structure-Based Brushing
• Represent Data Clusters by Mean-Band Method
• Provide Database Backend Support using MinMax Tree, Caching, Prefetching
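
As a rough illustration of representing a data cluster by a mean and a band, the sketch below summarizes a cluster of records per dimension. It assumes the band spans the minimum and maximum value of the cluster in each dimension; the slides do not define the band, and `mean_band` is a hypothetical helper name.

```python
def mean_band(cluster):
    """Summarize a cluster of equal-length records.

    Returns one (low, mean, high) triple per dimension, where low/high
    are the assumed band edges (per-dimension min and max).
    """
    dimensions = list(zip(*cluster))  # transpose records into columns
    return [(min(d), sum(d) / len(d), max(d)) for d in dimensions]
```

In a parallel-coordinates view, the mean values would form one polyline per cluster and the low/high values the band drawn around it, so a single cluster replaces many individual record lines.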

Page 10

Interactive Hierarchical Display

[Figure: 2D example. Panels: Hierarchical Clustering; Structure-Based Brushing]

Page 11

Interactive Hierarchical Display

Mean-Band Method in Parallel Coordinates

[Figure panels: Flat Display; Hierarchical Display]

Page 12

Interactive Hierarchical Display

Mean-Band Method in Parallel Coordinates

[Figure panels: Flat Display; Hierarchical Display]

Page 13

Scalability of Data Access

• Approach
– Attach database system to visualization front-end

• MinMax hierarchy encoding
– Key idea: avoid recursive processing
– Pre-computed

• Caching
– Key idea: reduce response time and network traffic

• Prefetching
– Key idea: use application hints and predict user patterns
– Performed during idle time

Page 14

Scalability of Data Access: MinMax Hierarchy Encoding

• Pre-compute object positions
– level-of-detail (L)
– extent values (x, y)
– preserve tree structure

• New query semantics
– objects are now rectangles
– select objects that touch L
– select objects that touch (x, y)
– structure-based brush = intersection of two selections

[Figure: axes labeled "level of detail" (L) and "extent values" (x, y); query = (x, y, L)]
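
One way to picture the encoding is the sketch below: an off-line pass labels every node of the hierarchy with an extent interval and a level-of-detail interval, after which a structure-based brush becomes two flat range tests with no tree traversal at query time. The details are assumptions for illustration, not the XmdvTool schema: leaves get unit-wide extents in left-to-right order, the level-of-detail is the node depth, and `Node`, `Rect`, `label` and `brush` are hypothetical names.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


@dataclass(frozen=True)
class Rect:
    name: str
    x_min: float  # extent: left edge
    x_max: float  # extent: right edge
    l_min: float  # level-of-detail at which the node becomes visible
    l_max: float  # level-of-detail at which its children replace it


def label(root):
    """Off-line pass: turn the tree into a flat list of rectangles."""
    rows = []
    next_leaf = 0

    def visit(node, depth):
        nonlocal next_leaf
        if not node.children:
            x_min, x_max = next_leaf, next_leaf + 1
            next_leaf += 1
            l_max = math.inf  # a leaf stays visible at every finer level
        else:
            extents = [visit(child, depth + 1) for child in node.children]
            # A parent's extent encloses its children's extents,
            # which preserves the tree structure in the labels.
            x_min, x_max = extents[0][0], extents[-1][1]
            l_max = depth + 1
        rows.append(Rect(node.name, x_min, x_max, depth, l_max))
        return x_min, x_max

    visit(root, 0)
    return rows


def brush(rows, x, y, L):
    """On-line query (x, y, L): intersection of two flat selections."""
    touches_level = {r.name for r in rows if r.l_min <= L < r.l_max}
    touches_extent = {r.name for r in rows if r.x_min < y and x < r.x_max}
    return touches_level & touches_extent
```

Because both tests compare only pre-computed columns, the same selection could be phrased as a range query against a database table holding the rectangles.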

Page 15

Scalability of Data Access: Caching

• Purpose
– reduce response time and network traffic

• Issues
– visual query cannot directly translate into object IDs
– high-level cache specification to avoid complete scans

• Semantic caching
– queries are cached rather than objects
– minimize cost of cache lookup
– dynamically adapt cached queries to patterns of queries
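
A minimal sketch of the "queries are cached rather than objects" idea follows. It is not the XmdvTool cache: it assumes one-dimensional range queries `[lo, hi)`, a backend callable `fetch(lo, hi)` that returns `(key, value)` rows, and a hypothetical class name `SemanticCache`. It also omits replacement and the dynamic adaptation of cached queries mentioned above.

```python
class SemanticCache:
    """Remembers which query regions were answered, not just which rows."""

    def __init__(self, fetch):
        self.fetch = fetch   # backend: fetch(lo, hi) -> list of (key, value)
        self.regions = []    # disjoint cached regions: (lo, hi, rows)

    def query(self, lo, hi):
        rows = []
        gaps = [(lo, hi)]    # parts of the query not yet covered by the cache
        for c_lo, c_hi, c_rows in self.regions:
            if c_lo < hi and lo < c_hi:
                # Reuse the rows of the overlapping cached region.
                rows += [r for r in c_rows if lo <= r[0] < hi]
            # Subtract the cached region from every remaining gap.
            remaining = []
            for g_lo, g_hi in gaps:
                if c_hi <= g_lo or g_hi <= c_lo:
                    remaining.append((g_lo, g_hi))
                    continue
                if g_lo < c_lo:
                    remaining.append((g_lo, c_lo))
                if c_hi < g_hi:
                    remaining.append((c_hi, g_hi))
            gaps = remaining
        for g_lo, g_hi in gaps:
            # Only the uncovered remainder goes to the backend.
            fetched = self.fetch(g_lo, g_hi)
            self.regions.append((g_lo, g_hi, fetched))
            rows += fetched
        return sorted(rows)
```

Since only gap regions are ever stored, cached regions never overlap, so a row cannot be returned twice. A brush that slides slightly therefore sends the backend only the newly exposed strip.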

Page 16

Scalability of Data Access: Prefetching

• Strategy
– Speculative (no specific hints)
  – navigation remains local
  – both user and data set influence exploration
– Adaptive (strategy changes over time)
  – evolves as more knowledge becomes available
– Non-pure (interruptible prefetching)
  – leave buffer in consistent state

• Requirements
– non-pure prefetching + large transactions & small object size + semantic caching → small granularity (object level)
– speculative, non-pure prefetcher → cache replacement policy + guessing method
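
The "non-pure" requirement (prefetching that can be interrupted while leaving the buffer consistent) can be sketched at object granularity as below. The function `prefetch`, the `fetch_one` callable and the use of a `threading.Event` as the interrupt signal are assumptions for illustration only.

```python
import threading


def prefetch(candidates, fetch_one, buffer, stop: threading.Event):
    """Load predicted objects one at a time during idle time.

    The interrupt flag is checked between objects, so an interruption
    never leaves a half-loaded object in the buffer.
    Returns the number of objects loaded.
    """
    loaded = 0
    for key in candidates:
        if stop.is_set():   # a real user request arrived: yield at once
            break
        if key in buffer:   # already cached, nothing to do
            continue
        buffer[key] = fetch_one(key)
        loaded += 1
    return loaded
```

Working one object at a time is what makes the small (object-level) granularity useful here: the cost of an interruption is bounded by a single fetch.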

Page 17

Scalability of Data Access: Experimental Evaluation

[Figure: Effectiveness of Caching. Response time (seconds, axis 0 to 200) for four caching configurations: Client OFF / Server OFF, Client OFF / Server ON, Client ON / Server OFF, Client ON / Server ON]

[Figure: Effectiveness of Prefetcher. % improvement in response time (axis 0 to 30) versus delay between user operations (seconds, axis 0 to 8)]

Conclusions:
• Caching reduces response time by 80%
• Prefetching further reduces response time by 30%
• Designing better prefetching strategies might help further reduce response time

Page 18

Scalability of Data Access: Prefetching

[Figure: prefetching strategy diagrams]
• Random Strategy
• Direction Strategy (diagram labels: (m-1), m, (m+1))
• Focus Strategy (diagram labels: Hot Regions, Current Navigation Window)
• Mean Strategy (diagram labels: m(n-2), m(n-1), m(n), m(n+1))
• Exponential Weight Average Strategy (diagram labels: m(n-2), m(n-1), m(n), m(n+1))
• Data Set Driven Strategy

Group labels in the figure: Localized Speculative Strategies; Vector Strategies
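
The slide gives only the strategy names and diagram labels, so the following is a guess at what three of them could compute, not the authors' definitions. It assumes m(n) denotes the n-th navigation movement expressed as a signed step size, and that each strategy predicts m(n+1) from the history; the function names, the window size `k` and the weight `alpha` are hypothetical.

```python
def predict_direction(moves):
    """Assume the user keeps moving as in the most recent step."""
    return moves[-1]


def predict_mean(moves, k=3):
    """Predict the next step as the plain average of the last k steps."""
    recent = moves[-k:]
    return sum(recent) / len(recent)


def predict_ewa(moves, alpha=0.5):
    """Exponentially weighted average: newer steps count more."""
    estimate = moves[0]
    for step in moves[1:]:
        estimate = alpha * step + (1 - alpha) * estimate
    return estimate
```

The predicted step would then be added to the current brush position to decide which region to prefetch next.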

Page 19

Xmdv System Implementation

• Tools
– C/C++
– TCL/TK
– OpenGL
– Oracle 8i
– Pro*C

[Figure: system architecture with OFF-LINE PROCESS and ON-LINE PROCESS parts. Component labels: User, GUI, Queries, Buffer, Rewriter, Translator, Loader, Estimator, Exploration Variables, MinMax Labeling, Schema Info, Flat Data, Hierarchical Data, MEMORY, DB, Prefetcher, Library (Random, Direction, Focus, EWA, Mean)]

Page 20

Publications (available at http://davis.wpi.edu/~xmdv)

• Jing Yang, Matthew O. Ward and Elke A. Rundensteiner, “InterRing: An Interactive Tool for Visually Navigating and Manipulating Hierarchical Structures”, InfoVis 2002, to appear

• Punit R. Doshi, Elke A. Rundensteiner, Matthew O. Ward and Daniel Stroe, “Prefetching For Visual Data Exploration”, Technical Report WPI-CS-TR-02-07, 2002

• Jing Yang, Matthew O. Ward and Elke A. Rundensteiner, “Interactive Hierarchical Displays: A General Framework for Visualization and Exploration of Large Multivariate Data Sets”, Computers and Graphics Journal, 2002, to appear

• Daniel Stroe, Elke A. Rundensteiner and Matthew O. Ward, “Scalable Visual Hierarchy Exploration”, Database and Expert Systems Applications, pages 784-793, Sept. 2000

• Ying-Huey Fua, Matthew O. Ward and Elke A. Rundensteiner, “Hierarchical Parallel Coordinates for Exploration of Large Datasets”, IEEE Proc. of Visualization, pages 43-50, Oct. 1999

• Ying-Huey Fua, Matthew O. Ward and Elke A. Rundensteiner, “Navigating Hierarchies with Structure-Based Brushes”, IEEE Proceedings of Visualization, pages 43-50, Oct. 1999