Next-Generation Visualization Technologies: Enabling Discoveries at Extreme Scale

ULTRA-SCALE VISUALIZATION

The SciDAC Institute for Ultra-Scale Visualization aims to enable extreme-scale knowledge discovery by advancing the state of visualization technologies, fostering awareness of and communication about new visualization technologies, and placing these technologies into the hands of application scientists.

The development of supercomputers transformed the pursuit of science. With supercomputing's rapid growth in capability and capacity, scientists expect to study problems of unprecedented complexity with exceptional fidelity and to address many new problems for the first time. However, the size and complexity of the data output of such large-scale simulations present tremendous challenges to the subsequent data visualization and analysis tasks; they create a growing gap between scientists' ability to simulate complex physics at high resolution and their ability to extract knowledge from the resulting massive datasets. The Institute for Ultra-Scale Visualization (Ultravis Institute), funded by the U.S. Department of Energy's (DOE) SciDAC program, will close this gap by leading community efforts to develop advanced visualization technologies that enable knowledge discovery at extreme scale.

Visualization—the creation of vivid pictures from simulation output in the form of arrays of numbers—has become an indispensable tool for scientists. But visualization is not without its challenges, especially considering the time-varying, multivariate aspect of simulation output. We have identified and pursued several high-priority visualization research topics that will provide the greatest benefit to the SciDAC community as well as the broader area of scientific supercomputing and discovery. These topics include application-driven visualization, multivariate and multidimensional exploration, complex adaptive-grid data visualization, parallel visualization, in situ visualization, and next-generation visualization tools.

Researchers at North Carolina State University have conducted the highest-resolution supernova simulations to date. They used the DOE Leadership Computing Facility to study the development of a rotational instability in a supernova shockwave when it was only a fraction of a second old, deep in the core of the dying star. The hundred terabytes of data output by each run of the simulation contained the greatest details and dynamic features ever modeled. The advanced visualization techniques developed by the Ultravis Institute made it possible to completely and

SciDAC Review, Spring 2009 (www.scidacreview.org)




Figure 1. Top, simultaneous visualization of pathlines and the angular momentum field obtained from a supernova simulation performed by Dr. John Blondin. Bottom, volume rendering of the entropy (left) and angular momentum (right) fields. New parallel and graphics processing unit (GPU)-accelerated rendering techniques enable interactive visualization at levels of detail and clarity previously unattainable.

vividly reveal the details and actions hidden in the arrays of data, as shown by the images in figure 1. The initially spherical supernova shock develops a wave pattern in the counter-clockwise direction. This spiraling wave gathers more angular momentum as it grows. The flow tracers in the top image illustrate the deflection of the radially infalling gas by the distorted accretion shock, feeding a rotational flow pattern around the neutron star at the center of the supernova. This shock instability helps drive the supernova shock out of the core and leads to the spin-up of the neutron star left behind.

Application-Driven Visualization

A common approach to visualizing a large amount of data is to reduce the quantity sent to the graphics pipeline by using a multiresolution data representation coupled with a compression method such as quantization or transform-based compression. This approach, however, does not explicitly take domain knowledge or visualization tasks into account for data representation and reduction. In particular, compressing high-precision floating-point data based solely on values can only result in limited savings. Further reduction is possible by exploiting the fact that typically only a small subset of the


Image credit: H. Yu and K. L. Ma, Ultra-Scale Visualization Institute/UC–Davis

data is of interest in analysis. We advocate an application-driven approach to compressing and rendering large-scale, time-varying scientific simulation data. Most scientists have certain domain knowledge and specific visualization tasks in their minds. In the context of time-varying, multivariate volume data visualization, such knowledge could be the salient isosurface of interest for some variable, and the visualization task could be observing spatio-temporal relationships among several other variables in the neighborhood of that isosurface. We have attempted to directly incorporate such domain knowledge and tasks into the whole data reduction, compression, and rendering process.
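The idea of letting domain knowledge drive data reduction can be sketched in a few lines. The routine below is an illustrative reconstruction, not the Institute's code: the function name, isovalue, bit budgets, and the use of |reference − iso| as a cheap stand-in for a true distance field are all our own assumptions.

```python
import numpy as np

def distance_based_quantize(field, reference, iso, near=0.05,
                            bits_near=12, bits_far=4):
    """Quantize `field` coarsely wherever `reference` is far from `iso`."""
    dist = np.abs(reference - iso)            # cheap proxy for a distance field
    lo, hi = field.min(), field.max()
    norm = (field - lo) / (hi - lo + 1e-12)   # map values into [0, 1]
    levels = np.where(dist < near, 2 ** bits_near, 2 ** bits_far) - 1
    codes = np.round(norm * levels)           # per-voxel quantization
    return codes / levels * (hi - lo) + lo    # dequantized approximation

rng = np.random.default_rng(0)
mixture = rng.random((16, 16, 16))            # stand-in for mixture fraction
ho2 = rng.random((16, 16, 16))                # stand-in for the HO2 field
approx = distance_based_quantize(ho2, mixture, iso=0.42)
err = np.abs(approx - ho2)
near_err = err[np.abs(mixture - 0.42) < 0.05].max()
far_err = err[np.abs(mixture - 0.42) >= 0.05].max()
# Error stays tiny near the feature isosurface, larger far from it.
```

The point of the sketch is the asymmetry: precision is spent where the task needs it, which is why the regions far from the feature surface can be compressed so aggressively.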

Scientists at Sandia National Laboratories are able to perform three-dimensional (3D) fully-resolved direct numerical simulation of turbulent combustion. With full access to the spatially and temporally resolved fields, direct numerical simulation plays a major role in establishing a fundamental understanding of the microphysics of turbulence–chemistry interactions. For example, to understand the dynamic mechanisms of extinction and re-ignition in turbulent flames, scientists need to validate known relationships and reveal hidden ones among multiple variables. In the study of a lifted turbulent jet, the scientific interest is twofold: the overall structure of the jet flame and the lifted flame base region, to understand what is stabilizing the lifted flame. More specifically, the visualization task is to show a particular stoichiometric mixture fraction isosurface together with another field, such as HO2 (hydroperoxy radical) or OH (hydroxyl radical),


Figure 2. Top, simultaneous visualization of mixture fraction and HO2 data obtained from a turbulent combustion simulation performed by Dr. Jackie Chen. Bottom, visualization of the same data using a distance field representation of HO2. The left image depicts the HO2 distribution near the feature isosurface (in white). The right image shows a wider coverage of HO2 around the feature surface. This application-driven approach not only allows scientists to see previously hidden features but also enables more aggressive encoding of the data according to the distance field, leading to greater storage savings.


Image credit: H. Yu (top), C. Wang (bottom), and K. L. Ma (both), Ultra-Scale Visualization Institute/UC–Davis

because researchers want to see how each field behaves near the isosurface. Given this knowledge, we encoded selected scalar fields based on each grid point's distance from the mixture fraction isosurface. For example, we can more aggressively compress the regions farther away from the features of interest. Figure 2 shows visualizations made using two different distance values for the HO2 field. Scientists can choose to visualize the HO2 distribution around the isosurface within a selected distance from the surface. The region farther away from the surface, which would otherwise obscure the more interesting features near the mixture fraction isosurface, is not displayed or is de-emphasized. Such a fine level of control on revealing spatial features locally was not previously available to scientists. This new capability will suggest novel ways to visualize and explore data, allowing scientists to study their data to a much greater extent. Finally, the encoding resulted in a 20-fold saving in storage space, which was particularly critical when graphics processing unit (GPU)-accelerated rendering

was used. In this way, we can better cope with the limited video memory space of commodity graphics cards to achieve interactive visualization. The images generated from data with and without the distance-based compression are almost visually indistinguishable.

Adaptive-Grid Data Visualization

Unstructured grids and adaptive mesh refinement (AMR) methods are increasingly used in large-scale scientific computations for the modeling of problems involving complex geometries and phenomena at varying scales. By applying finer meshes only to regions requiring high accuracy, both computing time and storage space can be reduced. On the other hand, the use of adaptive, unstructured discretizations complicates the visualization task, because the resulting datasets are irregular both geometrically and topologically. The need to store and access additional information about the grid's structure can lead to visualization algorithms that incur considerable memory and computational

Figure 3. Visualization of density data (top) and gravitational potential data (bottom) output from a cosmology simulation performed by Dr. Brian O'Shea. The Ultravis Institute has created a new rendering technology for visualizing data on 3D adaptive grids at previously unavailable quality and interactivity.


Image credit: R. Armstrong and K. L. Ma, Ultra-Scale Visualization Institute/UC–Davis

overheads. Hardware-accelerated volume rendering can offer real-time rates but is limited by the size of the video memory and the complexity of the rendering algorithm. While algorithms and tools have been introduced for visualizing simple unstructured AMR grids, several areas of study are beginning to use more sophisticated mesh structures, consisting of nested, overlapped, or higher-order elements. We are developing new approaches to render these data. Produced from a cosmology simulation, figure 3 shows visualization results using our new algorithm to render AMR data with nested elements without re-meshing. Because we do not resample the mesh into a uniform one, nor do we break the nested elements, our renderer can handle data orders of magnitude larger than conventional algorithms can. Both quality and rendering performance are superior compared to existing systems.
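The core idea of sampling an AMR hierarchy directly, without re-meshing, can be illustrated with a toy one-dimensional hierarchy: to shade a sample point, walk the levels from finest to coarsest and use the first patch that covers the point. The Patch class and layout below are invented for illustration; the actual renderer operates on 3D nested elements on the GPU.

```python
# A minimal sketch, assuming axis-aligned patches with uniform spacing;
# levels[0] is the coarsest level, later levels refine parts of the domain.
class Patch:
    def __init__(self, origin, spacing, data):
        self.origin, self.spacing, self.data = origin, spacing, data

    def covers(self, x):
        return self.origin <= x < self.origin + self.spacing * len(self.data)

    def sample(self, x):
        return self.data[int((x - self.origin) / self.spacing)]

def sample_amr(levels, x):
    """Return the value at x from the finest patch covering it."""
    for patches in reversed(levels):          # finest level first
        for p in patches:
            if p.covers(x):
                return p.sample(x)
    raise ValueError("point outside domain")

coarse = [Patch(0.0, 1.0, [1, 1, 1, 1])]      # covers [0, 4)
fine = [Patch(1.0, 0.25, [5, 6, 7, 8])]       # refines [1, 2)
sample_amr([coarse, fine], 1.3)               # fine patch wins: 6
sample_amr([coarse, fine], 3.5)               # coarse fallback: 1
```

Because the sampler consults the hierarchy in place, no uniform resampling and no splitting of nested elements is ever materialized, which is what keeps memory use proportional to the adaptive data itself.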

Interfaces for Time-Varying, Multivariate Data Visualization

A visualization tool's usability is largely determined by its user interface. Past visualization research largely focused on designing new visual representations and improving the performance of visualization calculations. The design and deployment of appropriate user interfaces for advanced visualization techniques began to receive more attention only a few years ago. Interface design has therefore played a major role in several Institute projects, in particular enhancing scientists' ability to visualize and analyze time-varying, multivariate data.

One visualization interface designed for exploring time-varying, multivariate volume data consists of three components that abstract the complex exploration of different spaces of data and visualization parameters. An important concept realized here is that the interface is also the visualization itself. In figure 4, the rightmost panel displays the time histograms of the data. A time histogram shows how the distribution of data values changes over the whole time sequence and can thus help the user identify time steps of interest and specify time-varying features. The middle panel attempts to display the correlation between each pair of variables in parallel coordinates for the selected time step. By examining different pairs of variables, the user can often identify features of interest based on the correlations observed. The leftmost panel displays a hardware-accelerated volume rendering enhanced with the capability to render multiple variables into a single visualization in a user-controllable fashion. Such simultaneous visualizations of multiple scalar quantities allow the user to more closely explore and validate simulations from the temporal data space, the parallel-coordinate space, and the 3D physical space. These three components are tightly cross-linked to facilitate what we call tri-space data exploration, offering scientists new power to study their time-varying, multivariate volume data.

A similar interface design effectively facilitates visualization of multidimensional particle data output from a gyrokinetic simulation performed by scientists at the Princeton Plasma Physics Laboratory. Depicting the complex phenomena associated with particle data presents a challenge because of the large quantity of particles, variables, and time steps. By using two modes of interaction—physical space and variable space—our system allows scientists to explore collections of densely packed particles and discover interesting features within the data. Although single variables can be easily explored through a one-dimensional transfer


Figure 4. Interface for tri-space visual exploration of time-varying, multivariate volume data. From left to right, the spatial view, variable view, and temporal view of the data are given.

Image credit: H. Akiba and K. L. Ma, Ultra-Scale Visualization Institute/UC–Davis

function, we again turn to a parallel coordinates interface for interactively selecting particles in multivariate space. In this manner, particles with deeper connections can be separated and then rendered using sphere glyphs and pathlines. With this interface, scientists are able to highlight the location and motion of particles that become trapped in turbulent plasma flow (figure 5, top). The bottom panel in figure 5 shows a similar design for comparative visualization of cosmological simulations. Using

this interface, scientists at the Los Alamos National Laboratory are able to more intuitively and conveniently compare different simulation codes or different approximations using the same code. Our approach to multivariate data visualization has also been adopted by general-purpose visualization tools such as VisIt.
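Variable-space selection of particles amounts to parallel-coordinates brushing: a brush is a per-variable range, and a particle survives only if it falls inside every brushed range. The sketch below is a minimal model of that operation; the variable names and thresholds are illustrative, not the system's actual selection criteria.

```python
import numpy as np

def brush(particles, ranges):
    """Keep particles whose values fall inside every brushed range."""
    mask = np.ones(next(iter(particles.values())).shape, dtype=bool)
    for var, (lo, hi) in ranges.items():
        mask &= (particles[var] >= lo) & (particles[var] <= hi)
    return mask

rng = np.random.default_rng(1)
n = 10_000
particles = {"parallel_velocity": rng.normal(0, 1, n),
             "statistical_weight": rng.random(n)}
# Brush two axes at once; the surviving subset would be rendered as
# sphere glyphs and pathlines.
selected = brush(particles, {"parallel_velocity": (-0.2, 0.2),
                             "statistical_weight": (0.5, 1.0)})
```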

Orthogonal to the above designs, we have also developed an interface allowing scientists to visualize specific multivariate temporal features using


Figure 5. Top, multidimensional visualization of a gyrokinetic particle simulation with a parallel coordinates interface using six axes, from top to bottom: toroidal coordinate, trapped particle condition, parallel velocity, statistical weight, perpendicular velocity, and distance from the center. The visualization highlights trapped particles, those changing direction frequently. Bottom, a similar user interface for comparative analysis of different cosmological simulation codes and approximations.


Image credit: C. Jones (top), S. Haroz (bottom), and K. L. Ma (both), Ultra-Scale Visualization Institute/UC–Davis

powerful constructs resembling regular expressions for textual search. Through a succinct visualization language interface, which we call seeReg, a user can specify partial hypotheses for visualization. The power of seeReg stems from its capability to automatically expand on partially specified features and reveal all possible matches in the dataset in a succinct view. Figure 6 shows a visualization of DOE C-LAMP (Carbon-Land Model Intercomparison Project) climate modeling data using seeReg to reveal variation in time of the first "major" snowfall in the northern hemisphere. The years shown from top to bottom are 2050, 2051, and 2052. "First snow" is detected by month of event, with an expression specifying that the user is not interested in any variation of ground cover of snow below 0.7 units. As soon as the snow ground cover increases by more than 0.7 units, our domain expert regards the event as the first major snowfall. The value of 0.7 was determined empirically by the user in a highly interactive visualization exploration. This visualization is novel in that it allows fuzzy user knowledge to be used directly in creating visualizations. This perspective of uncertainty visualization has not been presented before in the field. In particular, it nicely complements traditional multivariate visualizations created with a parallel-coordinate type of user interface.
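Stripped of the language interface, the first-snow query reduces to a scan over the monthly snow-cover sequence for the first jump above the user's threshold. The function below is our own illustrative reconstruction in the spirit of seeReg, not its implementation; the sample data is invented.

```python
def first_major_snowfall(monthly_cover, threshold=0.7):
    """Return the 0-based month index of the first jump above `threshold`."""
    for month in range(1, len(monthly_cover)):
        if monthly_cover[month] - monthly_cover[month - 1] > threshold:
            return month
    return None                 # no qualifying event this year

# Hypothetical snow-cover trace for one grid cell, January..December.
cover_2050 = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
              0.0, 0.0, 0.1, 0.2, 1.0, 1.0]
first_major_snowfall(cover_2050)    # index 10, i.e. November
```

In seeReg the 0.7 is not hard-wired: it is the empirically tuned part of a partial hypothesis, and the system expands the partial specification to find all matching events across the dataset.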

Parallel Visualization

Scientists generally do not look at all of the terabytes of data generated by their simulations, because they do not have a tool capable of displaying the full extent of the data at the original precision. The usual practice is to down-sample the data in either the temporal or the spatial domain so it is possible to look at the reduced data on a desktop computer, which defeats the purpose of running the high-resolution simulations. A feasible solution to this problem is to use parallel visualization, which capitalizes on the power of the same supercomputers used to run the simulations. While many parallel visualization algorithms have been developed over the years, very few are usable due to their limited scalability and flexibility. It is the primary goal of the Ultravis Institute to make parallel visualization technology a commodity for SciDAC scientists and the broader community. We have been experimenting with the whole process of parallel visualization, including input/output (I/O) and coordination with the simulations rather than rendering algorithms in isolation, on the new generation of supercomputers such as the IBM Blue Gene/P and the Cray XT4, for assessing the feasibility of post-processing visualization as well as in situ visualization.

A distinct effort of ours is the development of a scalable parallel visualization method for understanding time-varying 3D vector-field data. Vector-field visualization is more challenging than scalar-field visualization, because it generally requires more computing for conveying the directional information and more space for storing the vector field. We have introduced the first scalable parallel pathline visualization algorithm. Particle tracing is fundamental to portraying the structure and direction of a vector flow field, because we can construct paths and surfaces from the traced particles to effectively characterize the flow field. However, visualizing a large time-varying vector field on a parallel computer using particle tracing presents some unique challenges. Even though the tracing of each particle is independent of the others, a particle may drift anywhere in the spatial domain over time, demanding interprocessor communication. Furthermore, as particles move around, the number of particles each processor must handle varies, leading to uneven workloads. Our new algorithm supports levels of detail, has very low communication requirements, and effectively balances load. This new capability enables scientists to see their vector-field data in unprecedented detail, at varying abstraction levels, with higher interactivity. An example is the upper image in figure 1, showing a pathline visualization superimposed with volume rendering of the angular momentum field.
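The serial kernel at the heart of pathline tracing can be sketched in a few lines: advance one particle through a time-varying velocity field with small integration steps. The analytic rotating field below is a stand-in for simulation data, and forward Euler stands in for the higher-order integrators a production tracer would use; the parallel algorithm's hard parts (particle migration, load balance) sit on top of this kernel.

```python
import math

def trace_pathline(pos, t0, t1, velocity, dt=0.01):
    """Integrate one particle from t0 to t1; return the visited positions."""
    x, y = pos
    path = [(x, y)]
    t = t0
    while t < t1:
        vx, vy = velocity(x, y, t)
        x, y = x + vx * dt, y + vy * dt     # forward Euler step
        t += dt
        path.append((x, y))
    return path

def rotating(x, y, t):
    return (-y, x)                          # solid-body rotation about origin

path = trace_pathline((1.0, 0.0), 0.0, math.pi, rotating)
# After time pi the particle has made half a revolution, ending near (-1, 0).
```

In the parallel setting, each step of this loop may carry the particle into another processor's subdomain, which is exactly what makes the communication pattern and workload so irregular.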

For massively parallel rendering, parallel efficiency is often determined by the image compositing step, which combines the partial images generated by individual processors into a complete image. Because this step requires interprocessor communication, it can easily become the primary bottleneck for scalable rendering if not carefully done. We have experimentally studied existing



Figure 6. An uncertainty-tolerant visualization created using seeReg, a visualization language interface, shows variation in time of the first "major" snowfall in the northern hemisphere in a C-LAMP simulation.


Image credit: M. Glatter and J. Huang, Ultra-Scale Visualization Institute/U. Tennessee–Knoxville; F. Hoffman, ORNL

parallel image compositing algorithms and found binary swap the most scalable one. The binary swap algorithm, nevertheless, has an inherent limitation in processor utilization: consistently high efficiency is obtained only when the number of processors is a power of two. We have developed a variation of the binary swap algorithm, which we call 2–3 swap. Our experimental study on both the IBM Blue Gene/P and Cray XT4 shows this new algorithm is scalable to any number of processors. We are making our binary swap compositing implementation available to others who develop parallel renderers.
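The structure of binary swap can be modeled serially: in each round, every process keeps half of its current image span, sends the other half to a partner, and composites what it receives with the front-to-back "over" operator; after log2(n) rounds each process owns a fully composited 1/n of the image. The sketch below simulates that data movement in one Python process (a real renderer exchanges the halves over MPI) and assumes processes are sorted front to back by rank.

```python
def over(front, back):
    """Front-to-back 'over' operator for one (color, alpha) pixel."""
    fc, fa = front
    bc, ba = back
    return (fc + (1 - fa) * bc, fa + (1 - fa) * ba)

def binary_swap(images):
    """Composite per-rank images (lists of (color, alpha) pixels)."""
    n, width = len(images), len(images[0])      # n must be a power of two
    spans = [(0, width)] * n                    # pixel span each rank owns
    step = 1
    while step < n:
        next_spans = []
        for rank in range(n):
            partner = rank ^ step               # partner differs in one bit
            lo, hi = spans[rank]
            mid = (lo + hi) // 2
            keep = (lo, mid) if rank < partner else (mid, hi)
            front, back = min(rank, partner), max(rank, partner)
            for px in range(*keep):             # lower ranks are in front
                images[rank][px] = over(images[front][px], images[back][px])
            next_spans.append(keep)
        spans = next_spans
        step *= 2
    full = [None] * width                       # final gather of owned spans
    for rank in range(n):
        lo, hi = spans[rank]
        full[lo:hi] = images[rank][lo:hi]
    return full

imgs = [[(0.1 * r + 0.02 * px, 0.3) for px in range(8)] for r in range(4)]
composited = binary_swap(imgs)
```

The power-of-two restriction is visible in the `rank ^ step` pairing, which is precisely what the 2–3 swap variant relaxes by allowing groups of two or three processes per round.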

In Situ Visualization

Due to the size of data typically produced by a large-scale simulation, visualization is almost exclusively a post-processing step. Even though it is desirable to monitor and validate some simulation stages, the cost of moving the simulation output to a visualization machine could be too high to make interactive visualization feasible. A better approach is either not to move the data or to keep to a minimum the portions of the data that must be moved. This approach can be achieved if both simulation and visualization calculations are run on the same parallel supercomputer, so the data can be shared, as shown in figure 7. Such in situ processing can render images directly or extract features, which are much smaller than the full raw data, to store for on-the-fly or later examination. As a result, reducing both the data transfer and storage costs early in the data analysis pipeline can optimize the overall scientific discovery process.

In practice, however, this approach has seldom been adopted. First, most scientists are reluctant to use their supercomputer time for visualization calculations. Second, it could take a significant effort to couple a legacy parallel simulation code with an in situ visualization code. In particular, the domain decomposition optimized for the simulation is often unsuitable for parallel visualization, resulting in the need to replicate data for accelerating visualization calculations. Hence, scientists commonly store only a small fraction of the data, or they study the stored data at a coarser resolution, which defeats the original purpose of performing the high-resolution simulations. To enable scientists to study the full extent of the data generated by their simulations, and for us to possibly realize the concept of steering simulations at extreme scale, we ought to begin investigating the option of in situ processing and visualization. Many scientists are becoming convinced that simulation-time feature extraction, in particular, is a feasible solution to their large-data problem. During the simulation, all relevant data about the simulated field are readily available for the extraction calculations.

It may also be desirable and feasible to render the data in situ to monitor and steer a simulation. Even in the case that runtime monitoring is not practical due to the length of the simulation run or the nature of the calculations, it may still be desirable to generate an animation characterizing selected parts of the simulation. This in situ visualization capability is especially helpful when a significant amount of the data is to be discarded. Along with restart files, the animations could capture the integrity of the simulation with respect to a particularly important aspect of the modeled phenomenon.
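The coupling pattern itself is simple to sketch: the simulation advances its own arrays and, every few steps, hands the same buffer (no copy) to a cheap in situ hook that stores a reduced product instead of the full field. Everything below is illustrative; the toy diffusion solver, the thresholding "feature extractor", and the dump interval are our own stand-ins for a real simulation and its analysis code.

```python
import numpy as np

def step_heat(u, alpha=0.1):
    """One explicit diffusion step on a 1D field (stand-in simulation)."""
    u[1:-1] += alpha * (u[2:] - 2 * u[1:-1] + u[:-2])

def extract_features(u, threshold=0.5):
    """In situ reduction: store only the indices of 'hot' cells."""
    return np.flatnonzero(u > threshold)

u = np.zeros(64)
u[32] = 10.0                      # initial hot spot
dumps = []
for step in range(100):
    step_heat(u)
    if step % 25 == 24:           # in situ hook every 25 steps
        dumps.append(extract_features(u))
# Each dump is a handful of indices, far smaller than the full field.
```

The essential point is that `extract_features` reads the simulation's own array in place, so nothing is replicated and only the small reduced product would ever travel to storage.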

We have been studying in situ processing and visualization for selected applications to understand the impact of this new approach on ultra-scale simulations, subsequent visualization tasks, and how scientists do their work. Compared with a traditional visualization task that is performed in a post-processing fashion, in situ visualization brings some unique challenges. First of all, the visualization code must interact directly with the simulation code, which requires both the scientist and the visualization specialist to commit to this integration effort. To optimize memory usage, we must share the same data structures between the simulation and visualization codes to avoid replicating data. Second, visualization workload balancing is more difficult to achieve because the visualization must comply with the simulation architecture and be tightly coupled with it. Unlike parallelizing visualization algorithms for standalone processing, where we can partition and distribute data as best suited for the visualization calculations, for in situ visualization,


[Figure 7 diagram: a simulation machine running visualization calculations in situ, with the full-data transfer path to storage and to a separate visualization machine marked as the bottleneck.]

Figure 7. Transferring all of the data generated by a petascale simulation to a storage device or a visualization machine could become a serious bottleneck, because I/O would take most of the supercomputing time. A more feasible approach is to reduce and prepare the data in situ for subsequent visualization and data analysis tasks. The reduced data, which typically are several orders of magnitude smaller than the raw data, can greatly lower the subsequent data transfer and storage costs.


Source: K. L. Ma, Ultra-Scale Visualization Institute/UC–Davis; illustration: A. Tovey

the simulation code dictates data partitioning and distribution. Moving data frequently among processors is not an option for visualization processing. We need to rethink how to balance the visualization workload so the visualization is at least as scalable as the simulation. Finally, visualization calculations must be low cost, with decoupled I/O for delivering the rendering results while the simulation is running. Because visualization calculations on a supercomputer cannot be hardware accelerated, we must find other ways to simplify the calculations such that adding visualization would take only a very small fraction of the supercomputer time allocated to the scientist.

We have realized in situ visualization for a terascale earthquake simulation. This work also won the HPC Analytics Challenge competition of the Supercomputing Conference because of the scalability and interactive volume visualization we were able to demonstrate over a wide-area network using 2,048 processors of a supercomputer at the Pittsburgh Supercomputing Center. We were able to achieve high parallel efficiency precisely because we made the visualization calculations, that is, direct volume rendering, use the data structures of the simulation code, which removes the need to reorganize the simulation output and replicate data. Rendering is done in situ using the same data partitioning made by the simulation, so no data movement is needed among processors. We are beginning work on a new earthquake simulation that will be petascale and include the structures of buildings above ground, which also demands more work on the visualization side.

To realize in situ data reduction and visualization, we need to essentially create a new visualization and data-understanding infrastructure. We are also developing in situ data encoding algorithms, indexing methods, incremental 4D feature extraction and tracking algorithms, and data-quality measuring methods to better support post-processing visualization. We must emphasize that visualization researchers in isolation cannot achieve this important goal. Through SciDAC, we have obtained commitments from science collaborators to pursue this research together.
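As a simple illustration of the in situ reduction idea in figure 7, the sketch below (hypothetical, not the Institute's actual encoding algorithms) summarizes a raw field block by block before it leaves the simulation, so that only the summaries need to be transferred and stored; with realistic block sizes the reduction reaches the several orders of magnitude the figure describes.

```python
# Hypothetical sketch of in situ data reduction: instead of writing the
# raw field, each block is summarized (min, max, mean) inside the
# simulation, shrinking the data that must be moved and stored.

def reduce_blocks(field, block_size):
    """Summarize a flat field array block by block."""
    summaries = []
    for start in range(0, len(field), block_size):
        block = field[start:start + block_size]
        summaries.append((min(block), max(block), sum(block) / len(block)))
    return summaries

raw = [float(i % 10) for i in range(1000)]   # stand-in for raw output
reduced = reduce_blocks(raw, 100)            # 1,000 values -> 10 summaries
```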

Visualization Tools
New visualization technologies cannot reach a wide audience without being incorporated into visualization tools that can be leveraged by the scientists who need to understand the data. Building visualization tools, in particular general-purpose tools that can be used in a wide variety of domains, includes the daunting task of combining advanced visualization techniques, which often have conflicting requirements on the application framework, data types, and interaction patterns. For maximum applicability, such tools may also need to perform well on a wide variety of architectures: from single-user laptops, to remote interactive access to specialized parallel visualization hardware, to massively parallel supercomputers.

To deploy next-generation visualization technologies, we primarily use the ParaView application and framework. To use ParaView, or any other visualization framework, to deploy the latest visualization research, the basic underlying framework must have fundamental support. For ParaView, this includes the necessary adaptive data structures, temporal extent controls, and parallel scalability. Figure 8 shows two demanding applications using ParaView. ParaView's modular architecture and multiple customization paths, including plug-ins and scripting, allow for quick integration of multiple visualization technologies. Figure 9


Figure 8. Some core functionalities for next-generation visualization technologies implemented in the ParaView framework. Left, fragmentation analysis using large-scale AMR grids. Right, optimization of hydroelectric turbines.


Source: K. Moreland, Sandia National Laboratories.



demonstrates how a customized reader and rendering module used with ParaView allows a variety of visualization techniques to be applied to the data that scientists at the Stanford Linear Accelerator Center need to study.

In addition to adopting the latest visualization technologies as they are developed, visualization tools must also adapt to the changing landscape of requirements and supporting hardware. For example, petascale visualization tools may soon need to exploit new parallel paradigms in hardware such as multiple cores, multiple GPUs, and cell processors. Another critical consequence of petascale computing is the larger datasets generated by simulations; I/O, particularly file I/O, is a more critical concern than ever before. To maintain scalability in file reads and writes, parallel coordinated disk access must be an integrated part of our visualization tools.
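The essence of coordinated disk access is that every process writes its fixed-size record at a precomputed offset in one shared file, so processes never contend for the same bytes and the file layout is deterministic. The sketch below is a hypothetical single-process stand-in for that pattern; a real tool would use MPI-IO collective writes, but the offset arithmetic is the same.

```python
# Minimal sketch of coordinated access to one shared file: each "rank"
# owns a disjoint byte range computed from its rank id, so writes from
# different ranks could proceed concurrently without coordination.

import os
import struct
import tempfile

RECORD = struct.Struct("<id")       # one record: rank id + one scalar

def write_record(path, rank, value):
    # The offset is a pure function of the rank: no two ranks ever
    # touch the same bytes of the shared file.
    with open(path, "r+b") as f:
        f.seek(rank * RECORD.size)
        f.write(RECORD.pack(rank, value))

def read_record(path, rank):
    with open(path, "rb") as f:
        f.seek(rank * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))

nranks = 4
path = os.path.join(tempfile.mkdtemp(), "shared.dat")
with open(path, "wb") as f:          # preallocate the shared file
    f.write(b"\0" * (nranks * RECORD.size))
for r in range(nranks):              # stands in for concurrent ranks
    write_record(path, r, r * 1.5)
```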

Petascale computing also provides the facility for several new modes of analysis. With higher-fidelity simulations comes the ability to quantify uncertainty. Comparative analysis is also important, both for ensuring the validity of a simulation and for verifying that the results correspond to the physical phenomenon. Ensembles of runs can have higher fidelity or larger dimensions than ever before on petascale supercomputers. All three of these analyses are underrepresented in our current visualization tools, and this is a problem we must correct for effective analysis to continue.
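Two of the analyses named above can be sketched concretely. In this hypothetical example, uncertainty at each cell is quantified as the standard deviation of that cell's value across the ensemble members, and comparative analysis is the per-cell difference between two runs; a visualization tool would then color a field by these derived quantities.

```python
# Hypothetical sketch of ensemble uncertainty quantification and
# comparative analysis over runs of the same field.

from statistics import mean, pstdev

def ensemble_uncertainty(runs):
    """Per-cell (mean, standard deviation) across ensemble members."""
    cells = zip(*runs)                  # regroup values by cell
    return [(mean(c), pstdev(c)) for c in cells]

def compare_runs(a, b):
    """Per-cell signed difference between two runs of the same field."""
    return [x - y for x, y in zip(a, b)]

runs = [[1.0, 2.0, 4.0],                # three ensemble members,
        [1.0, 2.0, 2.0],                # three cells each
        [1.0, 2.0, 3.0]]
stats = ensemble_uncertainty(runs)      # cells 0 and 1: zero uncertainty
diff = compare_runs(runs[0], runs[1])   # runs agree except at cell 2
```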

Conclusion
Petascale and exascale computing are right around the corner. Will we have adequate tools for extracting meaning from the data generated by extreme-scale simulations? The timely investment made by the DOE SciDAC program in ultra-scale visualization ensures these challenges will be addressed. In this article, we point out some of the grand challenges

facing extreme-scale data analysis and visualization and present several key visualization technologies developed by the Ultravis Institute to enable knowledge discovery in the petascale and exascale regimes. The Institute combines the pursuit of basic research in visualization algorithms, interaction techniques, and interface designs with a unique approach of working one-on-one with SciDAC application scientists to accelerate the adoption of new technologies by their science teams. This allows us to demonstrate the success of newly developed technologies in a particular area of study, providing other science teams with an opportunity to see how each technology is applied. Matured technologies are generalized and deployed using ParaView to benefit the larger scientific community. The Institute has also launched a series of outreach and education activities that foster interaction both within the community and between visualization experts and application scientists. The impact of these basic research efforts, together with the outreach and education activities, not only accelerates the development of ultra-scale visualization tools but also helps strengthen the presence of SciDAC in the academic community. ●

Contributors
This article was written by Kwan-Liu Ma and Chaoli Wang of the University of California–Davis, Hongfeng Yu and Kenneth Moreland of Sandia National Laboratories, Jian Huang of the University of Tennessee–Knoxville, and Rob Ross of Argonne National Laboratory. The authors acknowledge Hiroshi Akiba, Ryan Armstrong, Steve Haroz, and Chad Jones for creating images. The authors would also like to thank their science collaborators, including John Blondin, Jackie Chen, Stephane Ethier, Katrin Heitmann, Forrest Hoffman, Greg Schussman, and Brian O'Shea.

Further Reading

http://ultravis.org


Figure 9. Visualization of a simulation of a low-loss accelerating cavity using ParaView by researchers at SLAC. Top, electric field magnitude on the cavity surface. Bottom left, electric field magnitude within the cavity's volume. Bottom right, direction of real and imaginary electric fields.


Source: K. Moreland, Sandia National Laboratories.
