
Eurographics / IEEE-VGTC Symposium on Visualization 2012
S. Bruckner, S. Miksch, and H. Pfister (Guest Editors)

Volume 31 (2012), Number 3

Interactive Exploration & Analysis of 2.5D Ensemble Data

Thomas Höllt 1    Guoning Chen 2    Charles D. Hansen 2,3    Markus Hadwiger 1

1 GMSV Center, King Abdullah University of Science and Technology, Saudi Arabia
2 SCI Institute, University of Utah, USA
3 School of Computing, University of Utah, USA

[Figure 1 color bar: variance, 0–300; histogram axes: z (depth) and # (surface count).]

Figure 1: Visualization of a seismic horizon ensemble (left) and an ensemble from ocean simulation (right). We use a representative surface, for example a maximum likelihood surface, and apply color coding on this surface in order to visualize scalar statistical properties. For more detailed inspection of the entire distribution of ensemble surfaces at any selected point on the representative surface, we provide a second linked view that shows a histogram of depth positions of the surfaces at the selected point (center). Here, the y-axis corresponds to the depth positions of surfaces, and the x-axis represents the number of surfaces in the ensemble passing through each depth.

Abstract
This paper presents a novel interactive framework for analyzing ensembles or time series of 2.5D surface data, produced in several domains such as weather simulation, oceanography or dynamic topography. The development of the framework was driven by a new approach to seismic interpretation, presented in this paper, which introduces ensemble computation and visualization to seismic horizon extraction. We propose the use of a representative surface, which can be a surface in the ensemble, rendered in 3D and textured with the results of a statistical analysis. If available, the surfaces can be rendered in combination with 3D volume data to provide spatial context. Finally, we present the use of this framework in three application scenarios.

Categories and Subject Descriptors (according to ACM CCS): I.3.8 [Computer Graphics]: Applications—

1. Introduction

Ensemble data is ubiquitous in many sciences. Most of these data are the result of simulation, for example for weather prediction or ocean simulation. Computing ensembles instead of a single representation, however, might be useful in other areas, such as image segmentation, too. In Section 4.1 of this work we introduce the use of ensemble computation for seismic interpretation.

Often it is assumed that the members of an ensemble form a normal distribution, which is reasonable for probabilistic simulations, for example in meteorology [SAT97], but might fail in other domains. In fact, the application for seismic interpretation presented in this work will usually produce multi-modal distributions. The advantage of assuming a normal distribution is that one can safely reduce the data to mean and standard deviation. For other distributions, even unimodal ones, such a reduction will, however, most likely lead to false conclusions about the data.


In this work we present a framework for interactive exploration and analysis of 2.5D ensemble data of arbitrary distribution. While we also reduce the ensemble data to a single representative plus a statistical property for visualization, the original data is available throughout the whole pipeline. The representative can be the mean, when applicable, but can be chosen from a set of other representatives, including a maximum likelihood surface introduced in this work, in other cases. In addition to the representative surface we provide the results of a complete statistical analysis to the user.

We provide the user with tools to explore the parameter space of the ensemble, define the range for the parameters, or pick specific surfaces of the ensemble for detailed investigation. To allow on-the-fly recomputation of the surface representatives and statistical analysis during these operations, we implemented the complete analysis and visualization pipeline on the GPU using a combination of CUDA and OpenGL + GLSL.

The main contributions of this paper are:

• A novel framework for interactive exploration and analysis of 2.5D ensemble data without any constraints on the distribution of the ensemble members, based on two novel concepts:

  – The maximum likelihood surface, as a valid representative for ensembles with arbitrary surface distributions.

  – A 3D spatial distribution histogram, for efficient statistical analysis of such ensembles.

• The introduction of uncertainty quantification via ensemble computation to seismic interpretation.

2. Related Work

Visualization of ensemble data was introduced by Luo et al. [LKP03]. The authors adapt standard visualization techniques for what they call spatial distribution data, which they define as a collection of n values for a single variable in m dimensions, essentially ensemble data. Frameworks for visualization of ensemble data gained from weather simulations include Ensemble-Vis by Potter et al. [PWB∗09], and Noodles by Sanyal et al. [SZD∗10]. These papers describe fully featured applications focused on the specific needs for analyzing weather simulation data. They implement multiple linked views to visualize a complete set of multidimensional, multivariate and multivalued ensemble members. The main difference from our work is that these frameworks provide tools for visualizing complete simulation ensembles including multiple dimensions, while we focus on 2.5D data only in this work. Additionally, with the focus on meteorology, the authors can safely assume a Gaussian distribution of the ensemble members, while we make no such assumptions. Healey and Snoeyink [HS06] present a similar approach for visualizing error in terrain representation. Here, the error, which can be introduced by sensors, data processing or data representation, is modeled as the difference between the active model and a given ground truth.

A good introduction to uncertainty visualization is provided by Pang et al. [PWL97], who present a detailed classification of uncertainty, as well as numerous visualization techniques, including several concepts applicable to (iso-)surface data, like fat surfaces. Johnson and Sanderson [JS03] give a good overview of uncertainty visualization techniques for 2D and 3D scientific visualization, including uncertainty in surfaces. For a definition of the basic concepts of uncertainty and another overview of visualization techniques for uncertain data, we refer to Griethe and Schumann [GS06]. Riveiro [Riv07] provides an evaluation of different uncertainty visualization techniques for information fusion.

Rhodes et al. [RLBS03] present the use of color and texture to visualize uncertainty in iso-surfaces. Brown [Bro04] employs animation for the same task. Grigoryan and Rheingans [GR04] present a combination of surface and point based rendering to visualize uncertainty in tumor growth. Here, uncertainty information is provided by rendering point clouds in areas of large uncertainty, compared to crisp surfaces in certain areas.

Recently, Pöthkow et al. [PH10, PWH11] as well as Pfaffelmoser et al. [PRW11] presented techniques to extract and visualize uncertainty in probabilistic iso-surfaces. In these approaches for visualizing uncertainty in iso-surfaces, a mean surface is rendered as the main representative surface, while the positional uncertainty is represented by a 'cloud' around this surface. While this technique works very well for the presented probabilistic iso-surfaces (forming a Gaussian distribution), it might not be the best option for multi-modal distributions.

Lundström et al. [LLPY07] propose the use of animation to show the effects of probabilistic transfer functions for volume rendering. A system which models and visualizes uncertainty in segmentation data based on a priori shape and appearance knowledge has been presented by Saad et al. [SHM10].

The density of curves in 1D function plots can be computed and visualized effectively using kernel density estimation [LH11]. Our histogram view that shows the distribution of 2D surfaces passing through each (x,y) position (Figure 1, center) is similar in spirit to such approaches, but for primitives of one dimension higher.

The main motivation for our work was an application in seismic interpretation. Seismic data is inherently noisy and often ambiguous, resulting in a large amount of uncertainty when extracting features. Höllt et al. [HBG∗11] present a workflow for extracting seismic horizons using global energy minimization. We extended their work by parameterizing the employed cost function and automatically sampling


the parameter space to compute ensembles of surfaces instead of single surfaces, and integrated the ensemble visualization into their proposed workflow.

For an introduction to simulation in oceanography we refer to [SC00]. The particular ensemble used in this work was computed using advanced ensemble Kalman filtering [Eve06].

3. Exploration & Analysis of 2.5D Ensemble Data

Similar to the approaches presented by Pöthkow et al. [PH10, PWH11] as well as Pfaffelmoser et al. [PRW11], we render a single surface representing the ensemble in 3D. The representative can be a mean surface in case of a Gaussian distribution, but for other cases we introduce other representatives, including a maximum likelihood surface. We perform an extensive statistical analysis of the data, which results in several possible representative surfaces as well as a set of indicators for the positional uncertainty, for example the standard deviation. Instead of indicating this data by the thickness of a cloud around the representative surface, we decided to texture the surface with this data. In addition, we provide a view showing the histogram and probability density distribution for a selected position. The view provides the user with detailed information on the distribution of surfaces at any desired position, which is especially important for multi-modal distributions.

The input to our system is a set of heightfields or discrete 2D functions f : ℕ×ℕ → ℝ representing the same feature. These can be part of a simulation ensemble, e.g., from weather forecasts or oceanography, a time series of some sort, or, as shown in the seismic application scenario presented in Section 4.1, the results of a parameterized segmentation. Even though we focus on heightfields or surfaces in 2.5D in this work, the concepts can also be applied to surfaces in n dimensions, as long as the correspondences between all surfaces in the dataset are known for every nD data point. In our case we assume the 2D spatial coordinate to be the correspondence between the surfaces.

3.1. Statistical Analysis

The basis for the statistical analysis of the input data is a 3D spatial distribution histogram. We define the axes of the histogram such that the x and y axes correspond to the domain of the input function and the z-axis to the range. This results in a volume with the same x and y extents as the input surface data, and a z extent depending on the range and sampling of the image of the input function. At each position in the histogram the number of surfaces passing through this position is counted. As we consider only heightfields, we know that every surface passes through every (x,y)-position at most once. And since we consider ensemble data of the same object, we also know that the domain is the same for all surfaces, meaning that every surface passes through every (x,y)-position exactly once.

This results in two important properties:

First, we can interpret the 3D histogram as a set of 1D histograms, one for each (x,y)-position. This means that the statistical analysis can be carried out for each (x,y)-position separately. We use this to parallelize the computation as shown in Section 3.2.

Second, each of these 1D histograms can directly be interpreted as a probability distribution of the surfaces at the corresponding (x,y)-position by dividing the value of each bin by the total number of surfaces. In addition to this simple probability measure, we also compute the kernel density estimate to approximate the probability density function for each (x,y)-position.
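To make this concrete, here is a sketch of the per-position density estimation as one CUDA kernel with one thread per (x,y)-position. The layout (numBins bins per position, a Gaussian kernel whose bandwidth is given in bins) and all names are our own assumptions; the paper does not include code.

```cuda
__global__ void pdfKernel(const int* histogram, float* pdf,
                          int dimX, int dimY, int numBins,
                          int numSurfaces, float bandwidth)
{
    int pos = blockIdx.x * blockDim.x + threadIdx.x;
    if (pos >= dimX * dimY) return;

    const int* h = histogram + pos * numBins;   // 1D histogram of this (x,y) column
    float*     p = pdf       + pos * numBins;

    // Dividing a bin count by numSurfaces would give the simple probability;
    // smoothing the counts with a Gaussian kernel gives the density estimate.
    for (int b = 0; b < numBins; ++b) {
        float density = 0.0f;
        for (int k = 0; k < numBins; ++k) {
            float u = (b - k) / bandwidth;
            density += h[k] * expf(-0.5f * u * u);
        }
        // normalize: numSurfaces samples, bandwidth, Gaussian constant sqrt(2*pi)
        p[b] = density / (numSurfaces * bandwidth * 2.5066283f);
    }
}
```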

From these 1D histograms a number of statistical properties, including Range, Mean, Median, Maximum mode, Standard deviation, Variance, Skewness and Kurtosis, are computed for each (x,y)-position.

We already established that the mean might not be a good representative for ensembles with multi-modal distributions. In addition, sometimes it might be desired to use an actual surface from the ensemble as the representative, instead of a synthetically created one; for example, when one expects one of the ensemble members to be the correct solution. For this kind of data we introduce an additional representative: the maximum likelihood surface. This surface is an actual surface from the ensemble and is determined by a likelihood value assigned to each of the surfaces in the ensemble. This likelihood value is computed by taking the height- or function-value f(x,y) at each (x,y)-position of the surface f and summing the corresponding probabilities, looked up in the probability density function (pdf) at this position:

likelihood(f) = ∑_x ∑_y pdf(x, y, f(x, y)).    (1)

The surface from the ensemble with the highest likelihood value is called the maximum likelihood surface and is used as the representative for the ensemble.
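A minimal sketch of evaluating Equation (1) on the GPU, under the same hypothetical layout as above. For brevity this version combines the per-(x,y) partial sums with atomicAdd, whereas Section 3.2 describes a two-step reduction instead.

```cuda
__global__ void likelihoodKernel(const float* heights,    // numSurfaces slices of dimX*dimY
                                 const float* pdf,        // numBins entries per (x,y)-position
                                 float* likelihood,       // one value per surface, zero-initialized
                                 int dimX, int dimY, int numBins,
                                 int numSurfaces, float zMin, float zMax)
{
    int pos = blockIdx.x * blockDim.x + threadIdx.x;
    if (pos >= dimX * dimY) return;
    int sliceSize = dimX * dimY;

    for (int s = 0; s < numSurfaces; ++s) {
        // bin of this surface's height value f(x,y) at the current position
        float z = heights[s * sliceSize + pos];
        int bin = (int)((z - zMin) / (zMax - zMin) * numBins);
        bin = min(max(bin, 0), numBins - 1);
        // accumulate pdf(x, y, f(x,y)) into the surface's likelihood
        atomicAdd(&likelihood[s], pdf[pos * numBins + bin]);
    }
}
// The maximum likelihood surface is the argmax over likelihood[0..numSurfaces).
```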

To explore the parameter space, for example to look at the influence of a certain parameter, or to remove outliers, the statistical analysis can be carried out for the complete ensemble or for any user-defined subset of the ensemble. Therefore, we provide sliders in the graphical user interface to set bounds for each parameter, or to pick a specific parameter value to visualize the corresponding surface.

Depending on the type of the parameter, picking a single value can have different results. In the general case it will only have influence on the surface geometry, allowing the user to browse through all surfaces of the selected parameter range, while the input for the statistical analysis is still the selected parameter range. The user can also define a


[Figure 2 diagram: the input, a set of 2D heightfields forming the ensemble (with the mean surface added), is uploaded to shared GPU data; a user-defined subset drives the 3D histogram computation and the statistical analysis in CUDA, producing the 3D histogram, the active property (range, variance, etc.) and the maximum likelihood id; the visualization in OpenGL+GLSL reads these through a fixed vertex buffer, a vertex shader (surface displacement) and a fragment shader (surface texturing via a color map), with optional volume rendering.]

Figure 2: Pipeline overview.

parameter to be treated as time. While computing the statistical analysis over all time steps might be useful to track the areas of large variation over time, when looking at the uncertainty it only makes sense to consider a single time step at a time. Thus, when treating a parameter as time, the statistical analysis is also only carried out for the selected value of this parameter.

We designed the computation pipeline such that only statistical properties which are currently needed by the visualization part of the pipeline are computed. The system caches all results and sets an invalid flag when a parameter is changed. Statistical properties are only recomputed when requested for the visualization and invalidated before. If a property depends on another property, the invalid flag of the dependency is evaluated and the dependency is recomputed if needed. For example, computing the standard deviation and the kurtosis both requires the variance, meaning that when switching from visualizing the kurtosis to the standard deviation for the same parameter range, the variance does not need to be recomputed.
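A minimal host-side sketch of this lazy recomputation scheme, assuming a structure of our own invention (the paper gives no code): each property stores a valid flag and the ids of its dependencies, and a request walks the dependencies before launching the property's own kernels.

```cuda
#include <functional>
#include <vector>

// Each statistical property caches its result on the GPU and keeps a valid
// flag plus the ids of the properties it depends on.
struct Property {
    bool valid = false;
    std::vector<int> deps;           // prerequisite property ids
    std::function<void()> compute;   // launches the property's CUDA kernel(s)
};

// Called when the user changes the parameter range.
void invalidateAll(std::vector<Property>& props) {
    for (auto& p : props) p.valid = false;
}

// Called when the visualization requests a property; dependencies are
// recomputed first, but only if they were invalidated.
void request(std::vector<Property>& props, int id) {
    Property& p = props[id];
    if (p.valid) return;
    for (int d : p.deps) request(props, d);
    p.compute();                     // e.g. variance before standard deviation
    p.valid = true;
}
```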

To allow interactive exploration of the parameter space, updates of the statistical analysis must be carried out in real time, or at least at interactive rates. Therefore, we implemented the GPU-based pipeline presented in the following section.

3.2. GPU-Based Analysis and Visualization Pipeline

The GPU-based analysis and visualization pipeline is illustrated in Figure 2. The pipeline is divided into two main parts: the statistical analysis is carried out using CUDA, while the visualization is based on OpenGL and GLSL shaders. The data is shared between the two parts of the pipeline, so that after the initial upload of the ensemble onto the GPU no expensive bus transfer is necessary.
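Such sharing is typically set up through the CUDA/OpenGL interoperability API. The following sketch shows one plausible way to do it; ensembleTex is a hypothetical GL_TEXTURE_3D holding the heightfield slices, and the paper does not state which exact mechanism it uses.

```cuda
#include <cuda_gl_interop.h>

// Register the OpenGL ensemble texture with CUDA once, after the initial
// upload; afterwards analysis kernels and renderer share the same memory.
cudaGraphicsResource* registerEnsembleTexture(GLuint ensembleTex)
{
    cudaGraphicsResource* res = nullptr;
    cudaGraphicsGLRegisterImage(&res, ensembleTex, GL_TEXTURE_3D,
                                cudaGraphicsRegisterFlagsNone);
    return res;
}

// Map the resource around each analysis pass to obtain a cudaArray that
// kernels can then read through a texture object.
cudaArray* mapEnsemble(cudaGraphicsResource* res)
{
    cudaGraphicsMapResources(1, &res);
    cudaArray* arr = nullptr;
    cudaGraphicsSubResourceGetMappedArray(&arr, res, 0, 0);
    return arr;
}
```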

Before the actual computation, the ensemble is converted into a 3D texture and loaded onto the GPU. Every heightfield of the ensemble is represented by one slice in the texture. Additional space for the mean, median and maximum mode heightfields is reserved in this texture. The surfaces are indexed using the original parametrization, i.e., if there is only a single parameter, for example the time steps in a time series, the surface ID corresponds to the texture index. For a higher dimensional parameter space, e.g., ensemble ID plus time, the linear texture index is computed from the original parameters. This allows the user to define subranges for each parameter separately, for example to examine the complete ensemble at a single time step.
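For a two-dimensional parameter space this linearization is simply a row-major index, as in this small illustrative sketch (the names are ours):

```cuda
// Map a multi-dimensional parameter index (e.g. ensemble member m at time
// step t) to the linear slice index of the 3D ensemble texture.
__host__ __device__ int surfaceSliceIndex(int member, int timeStep, int numMembers)
{
    return timeStep * numMembers + member;
}
```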

Changes in the parameter range trigger an update of the 3D histogram and subsequently of the representative surface and property texture. Since each surface provides exactly one entry to the histogram per (x,y)-position, rather than setting up a thread for each surface, we set up one thread per (x,y)-position. Each thread then loops over all selected surfaces and inserts the height values into the histogram. This way write conflicts are avoided and no critical sections or atomic operations are needed. The kernels for the statistical analysis are set up in a similar fashion. The active property is computed with one thread per (x,y)-position. With the exception of the maximum likelihood ID, the main difference to the histogram computation is that this results in a single value per thread; these values are then assembled in a 2D texture. While mean, median and maximum mode are attached to the 3D heightfield texture to be used as representative surfaces, the other properties are copied into a 2D texture available to the visualization pipeline for texturing the surface. The result of the maximum likelihood ID computation is a single value, therefore the resulting values from all threads have to be summed up in a reduction step for each surface, and the maximum value has to be found in a second reduction over all surfaces.
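A sketch of this one-thread-per-(x,y) accumulation, using the same hypothetical flat-array layout as the earlier sketches rather than the actual 3D texture:

```cuda
__global__ void histogramKernel(const float* heights,  // selected slices, dimX*dimY each
                                int* histogram,        // numBins per (x,y), zero-initialized
                                int dimX, int dimY, int numBins,
                                int numSelected, float zMin, float zMax)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= dimX || y >= dimY) return;

    int pos = y * dimX + x;
    int sliceSize = dimX * dimY;

    // This thread owns all bins of its (x,y) column, so the increments below
    // never conflict with other threads: no atomics or critical sections.
    for (int s = 0; s < numSelected; ++s) {
        float z = heights[s * sliceSize + pos];
        int bin = (int)((z - zMin) / (zMax - zMin) * numBins);
        bin = min(max(bin, 0), numBins - 1);  // clamp to the valid bin range
        histogram[pos * numBins + bin] += 1;
    }
}
```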


By exploiting the parallelism of the GPU and eliminating costly bus transfers between CPU and GPU, this allows interactive modification of the parameter range even for large ensembles. For a detailed performance analysis refer to Section 5.

With the ensemble data already on the GPU, we implemented a rendering pipeline that allows efficient surface rendering. Instead of creating new surface geometry every time a different surface of the ensemble is rendered, a single general vertex buffer of fixed size is created, which covers the complete (x,y) domain, but does not contain any height information. The z-value of each vertex is initialized to zero and set later using a vertex shader. Before transforming the vertex coordinates into view space, the object space (x,y)-coordinates of the vertex, in combination with the ID of the representative surface (e.g., the maximum likelihood surface or the mean surface), are used to look up the z-value of the current vertex in the ensemble texture. At this point, we have the correct geometry for the surface. In order to be able to visualize the results of the statistical analysis, the object space coordinates are attached to every vertex as texture coordinates (x and y are sufficient). In the fragment shader, this information can then be used to look up the active statistical property in the 2D texture. This texture contains the raw information from the statistical analysis, which is then converted to the fragment color by a lookup in a 1D color map. We provide a couple of continuous, diverging cool-to-warm color maps as presented by Moreland [Mor09], but also allow the creation of customized color maps. These color maps minimally interfere with shading, which is very important in this case, as shading is an important feature to judge the shape of a surface. During testing we realized that using the continuous version made it very hard to relate an actual value to a color in the rendering, so we decided to optionally provide a discrete version with ten steps. After the surface geometry has been rendered, optionally an accompanying volume can be rendered in a second pass to achieve correct visibility [SHNB06].

With the described pipeline in place, a number of features can be implemented very easily and efficiently. If desired, the user can choose to render any surface from the ensemble. This requires no data transfer to or from the GPU, except for the ID of the surface in the ensemble to render. We provide the possibility to fix multiple parameters to a single value instead of a range. The user can then use a slider to browse through all surfaces in the ensemble, while getting live updates. In addition, it is possible to automatically animate all surfaces in a predefined range. As shown by Brown [Bro04], as well as Lundström et al. [LLPY07], animation is a powerful tool for visualizing uncertainty. In our system, animating the ensemble gives a good impression of the parameters that result in similar surfaces, as well as of which areas in the dataset react more or less to changes in the parametrization of the cost function. Similar surfaces or surface parts in the ensemble will result in little variation in the animation, whereas areas of large variance will show more movement and thus automatically draw the user's attention. This is also very useful for time dependent data. If desired, fixing the time dimension completely removes it from the statistical analysis, and every time step can be viewed separately or in an animation.

The described visualization techniques can give a very good impression of the quantitative variation in the data. Detailed information on the surface distribution can be gained by animating through, or manually selecting, single surfaces from the ensemble. But it is hard to compare more than two surfaces this way. Therefore, we provide an additional view showing the histogram and probability density distribution for a selected position. The position to investigate can be picked directly in the 3D view. All information that is required for picking is already available in the proposed rendering pipeline: we use the same vertex shader as described before for rendering the surface into an off-screen buffer of the same size as the frame buffer. Instead of using the object space coordinates to look up the scalar values in the fragment shader, we use the coordinates directly as the vertex color. This way, we can look up the current mouse position directly in the downloaded off-screen buffer. With the (x,y)-part of the resulting volume position, we can then directly look up the histogram and probability density distribution for this position. For easy comparability, we color the bin corresponding to the active representative surface differently from the remaining bins.

4. Application Scenarios

4.1. Seismic Interpretation - Horizon Extraction

Seismic interpretation describes the process of extracting subsurface structures like horizons or faults from seismic surveys. A horizon describes the boundary between two subsurface layers; a fault describes discontinuities in these horizons. A seismic survey usually consists of one or more seismic recordings in the form of scalar images or volumes, plus additional data from drillings, so-called well logs. Seismic data is inherently noisy and often ambiguous, resulting in a large amount of uncertainty in the process of interpretation.

We extend the work of Höllt et al. [HBG∗11] to incorporate our ensemble visualization framework to quantify the uncertainty of extracted horizons. In the original work the authors present a framework for interactive seismic horizon extraction based on global optimization. Using the positions of the well logs, the authors propose to subdivide the seismic volume into several connected sub-volumes. Using minimum cost path computations, they first find the surface boundary for each sub-volume starting from a single seed point, and then compute the minimum cost surface, constrained by this boundary, on the inside of each sub-volume. The authors chose to define the energy term by a simple, gray-value based cost function containing two parameters, which were,


(a) Maximum likelihood surface (b) Mean surface (c) Cut section showing ensemble distribution

Figure 3: (a) and (b) show a comparison of the maximum likelihood surface with the synthetic mean. The color coding indicates the difference between the amplitude at the volume position passed by the surface and the targeted amplitude. Blue means a small difference (better), red a bigger difference (worse). It is clearly visible why the mean surface is not suitable for our problem. Whereas the maximum likelihood surface is a good fit for the ridge line for large parts of the surface (also visible in the slice view insets), the mean surface is just below the desired ridge for large parts of the surface. (c) shows cut sections of all surfaces of one ensemble rendered into a single slice view with reduced opacity. The multi-modal distribution causing the bad results for the mean surface shown in (b) can clearly be seen.

however, not exposed to the user for the sake of simplicity. The cost is defined by the similarity of the current sample to a global target value, plus the similarity of the samples to be connected to a path or surface. The first parameter is the weight between those two terms. The second parameter defines a coefficient for larger surfaces that is used to penalize or enhance the surface. The total cost or energy E of a surface is defined by

E_surface(S) = ∫_S φ(x) dS(x),  subject to δS    (2)

with δS defining a surface boundary and φ(x) the cost function. The integral results in an implicit penalty for larger surfaces; hence the second parameter is used to allow larger or less 'stiff' surfaces. Instead of hiding these parameters from the user, we propose to sample the parameter space automatically, which will result in a set or ensemble of surfaces instead of a single surface for each horizon. The standard deviation resulting from the statistical analysis can directly be used to quantify the uncertainty and point the user to areas which need manual intervention. The user can then either set constraints and recompute the surface, or skim through the surfaces in the ensemble to find the best fit.
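To make the role of the two parameters concrete, the following is our own discrete reading of Equation (2) as a sum of per-sample costs; the exact form of the weighting and of the size coefficient is an assumption based on the description above, not the authors' actual cost function.

```cuda
#include <cmath>

// Hypothetical per-sample cost phi: a weighted combination of the similarity
// to a global target value and the similarity of connected samples
// (w is the first parameter described above).
float phi(float sample, float target, float neighborDiff, float w)
{
    return w * std::fabs(sample - target) + (1.0f - w) * neighborDiff;
}

// Discrete surface energy: summing phi over all samples of the surface
// implicitly penalizes larger surfaces; the coefficient c (the second
// parameter) scales this penalty to allow stiffer or less stiff surfaces.
float surfaceEnergy(const float* samples, const float* neighborDiffs,
                    int numSamples, float target, float w, float c)
{
    float e = 0.0f;
    for (int i = 0; i < numSamples; ++i)
        e += phi(samples[i], target, neighborDiffs[i], w);
    return c * e;
}
```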

4.1.1. Ensemble Computation

To start the ensemble computation, only a single seed point is required, but an arbitrary number of points can be defined as additional constraints (for example derived from the additional data at the well positions) to force the resulting surfaces through user-specified positions. The same seed point and set of constraints are used to compute all surfaces in the ensemble. Once the seed point and constraints are defined, the user can define a range and sampling rate for each parameter to compute the ensemble. For each parameter setting, the seed point and constraints, as well as the parameterized cost function, are given to the surface extraction algorithm. As the surfaces for each parameter setting are computed completely independently of all others, the ensemble computation can easily be parallelized, letting each node of a cluster and/or processor core compute one surface at a time. The result of each run is a single horizon surface represented as a height field.

4.1.2. Visualization

The resulting ensemble has fundamentally different properties compared to ensembles resulting from probabilistic computations. The most important is that the distribution of the surfaces is not a normal distribution. The reason for that can be found in the goals for segmentation, or, in this case, ridge/valley tracking. Consider a ridge or valley surface clearly defined by the underlying scalar field. A proper segmentation would be expected to produce very similar (if not identical) results independent of the parameterization. In general, even in uncertain regions, one would desire a segmentation approach which produces stable results independent of the parameterization. If this is not possible over the whole parameter domain, the segmentation should at least be as stable as possible over ranges of the parameter domain. For the ensemble this means that the distribution of the surfaces would be a clustering of very similar surfaces or surface patches, rather than a normal distribution. Figure 3c shows an example of that behavior. The traces are seeded on


the left side of the image, where the ridge line is very clear. The result is a single cluster containing all of the segmentations. The center region is very noisy and hardly provides any visible clues on the ridge progression, causing the traces to fan out and form multiple clusters tagging different ridge lines behind the noisy region. Using the mean of these surfaces as the main representative for such an ensemble would hardly provide any meaningful information, or in the worst case might confuse the user. Let us take another look at Figure 3c. On its own, each cluster on the right side of the image tags a valid ridge line (even though only one is the desired result with the initial seed point in mind). The mean of these clusters, however, would be somewhere in between the ridge lines, hiding the fact that the segmentations actually detected proper ridge lines. Instead, we use the maximum likelihood surface, as described in Section 3.1, as the main representative for this kind of ensemble. Even though the maximum likelihood surface might not be the correct result, it will reflect the desired properties defined by the cost function. Figures 3a and 3b show renderings of the maximum likelihood and the mean surfaces. The color coding shows the difference between the volume's gray value at each surface vertex and the target value for the ridge line (i.e. white). While the maximum likelihood surface is dominated by blue color, indicating small differences, the red appearance of the mean surface indicates large differences over the complete domain.

Figure 4 shows an example where the original parameter setting used in [HBG∗11] produces a wrong result due to large uncertainty in the data. Using our framework, we were able to quickly identify the uncertain region in this surface, and by skimming through the ensemble surfaces find the parameter setting corresponding to a surface which represents the underlying data better. The authors of [HBG∗11] propose to use the cost of the surface as an indicator for the quality of the surface. We color coded the original result in Figure 4a with the cost, and used the standard deviation for color coding the ensemble visualization in Figure 4b. It can be seen that the area where the surface extraction technique struggled is clearly highlighted in our framework, while hardly noticeable in the original result. Figure 4c shows cut sections of the original surface, the maximum likelihood surface, and the surface resulting from narrowing the parameter range.

4.2. Oceanography - Sea Level Simulation

The goal of physical oceanography is understanding the physical properties of the ocean. One of these properties is the sea level. Besides slow changes caused by global effects (according to the Permanent Service for Mean Sea Level, the sea level rose by 18.5cm between 1900 and 2000 [PSM11]), there are more rapid changes in the sea level caused by tides, surface waves or tsunamis. These phenomena cause coastal erosion, but also pose danger to man-made structures on the coast or off-shore. In collaboration with an expert from our geoscience department, we integrated our proposed exploration and analysis framework into an ocean simulation workflow.

The presented dataset shows a simulation of the water level in the Gulf of Mexico in eleven time steps of one week each. Every time step is represented by an ensemble of 50 heightfields, resulting in a total of 550 surfaces. The spatial resolution is roughly 10km, resulting in 256×225 data points per heightfield. The range of the heightfield values is between −1m and 1m around a predefined reference height. For the histogram computation we sampled this range with a resolution of 2cm, resulting in 100 bins per (x,y)-position. Uncertainty is introduced into the model by different starting conditions for each ensemble member. After each time step, actual measured data was used to verify the result and, if necessary, modify the starting condition for the next time step. The goal of this is to reduce uncertainty in each consecutive time step. Figure 5 shows single frames of an animation over all time steps.

For exploration we map the time to the first parameter and the starting condition to the second. It is important to separate time from the starting condition, as the result of the statistical analysis needs to be interpreted differently depending on whether it was carried out over all time steps or over a single time step only. While in the former the variation in the surfaces indicates the movement of the sea over time, in the latter the variation actually indicates uncertainty in the simulation.

The simulation is modeled with the assumption of a Gaussian distribution of the natural phenomena, which is, however, mostly owed to the limited computing resources for these large scale simulations. This means that right now the distribution of the resulting ensembles is unimodal, allowing us to use the mean surface as the main representative for the ensemble. With our exploration framework, the domain experts can easily identify outliers and interactively restrict the parameter range to remove them. With the GPU based statistical analysis pipeline, the new mean surface is computed on the fly and can be viewed immediately to decide whether the modified parameter range yields the desired result.

5. Results & Conclusion

The performance of the statistical analysis is crucial for interactive exploration of the parameter space. In the presented applications, the analysis was typically carried out on around 50 to 100 surfaces. Even though the oceanography simulation consists of a total of 550 surfaces, for analyzing the uncertainty only one time step needs to be considered at a time, resulting in only 50 surfaces. For datasets in this range our proposed pipeline achieves real time computation, but even for the complete oceanography dataset we maintain interactive speeds using commodity graphics hardware.


                              50 Surfaces CUDA    550 Surfaces CUDA    50 Surfaces CPU     CPU/GPU Speedup
Id  Property   Depends        w/o dep    w dep    w/o dep    w dep     w/o dep    w dep    w/o dep   w dep
 1  Histogram  -                 3.23     3.23      38.56    38.56       19.24    19.24      6.0x     6.0x
 2  PDF        1                12.93    16.16      12.78    51.34       45.70    64.94      3.5x     4.0x
 3  Range      -                 0.71     0.71      11.09    11.09        3.45     3.45      4.9x     4.9x
 4  Mean       -                 0.71     0.71      10.89    10.89        3.48     3.48      4.9x     4.9x
 5  Median     1                 0.70     3.93       0.70    39.26        8.78    28.02     12.5x     7.1x
 6  Mode       1                 1.40     4.63       1.41    39.97        4.65    23.89      3.3x     5.2x
 7  Variance   4                 0.72     1.43      10.87    21.76        3.85     7.33      5.3x     5.1x
 8  Std Dev    4, 7              0.02     1.45       0.02    32.78        0.14     7.47      7.0x     5.2x
 9  Skewness   1, 4, 6, 7, 8     0.05     6.13       0.05    72.80        0.16    31.42      3.2x     5.1x
10  Kurtosis   4, 7              0.74     2.17      10.76    32.52        4.05    11.38      5.5x     5.2x

Table 1: The table shows the computation times for all properties. The first columns show the id and name of each property; the Depends column lists the ids of all properties which are needed to compute it. The columns titled w/o dep show the computation time for just this property, assuming all dependencies are already cached, and the columns titled w dep show the summed computation time including all dependencies, basically the worst case scenario. Keep in mind that the latter is really only needed when changing the parameter range; for example, the histogram and probability density function (the two most time consuming properties) are always needed for the histogram view and as such are recomputed with every parameter modification, so that recomputing them is basically never needed after that. The last two columns show the speedup from CPU to GPU, without and including the computation time for the dependencies.

Table 1 shows computation times for a single time step (50 surfaces) of the oceanographic dataset and for the whole dataset containing 550 surfaces. The computations were performed using an NVIDIA GeForce GTX 580 with 1.5GB of graphics memory. The timings were averaged over 1000 kernel executions. As all data stays on the GPU, no bus transfer has to be considered. For comparison we also show computation times of a single time step on the CPU. These computations were carried out on a workstation with two six-core Xeons (12 physical cores plus hyper-threading) clocked at 3.33GHz and 48GB of main memory. The CPU computations were parallelized using OpenMP, utilizing 24 threads.

In general, it can be seen in Table 1 that on the GPU, even for 550 surfaces, the slowest update, including skewness with all its dependencies plus the probability density function (which needs to be computed for the histogram view), still allows for more than ten updates per second. Compared to the CPU version, we achieved a speedup of roughly 5x for all tasks when considering the dependencies.

Histogram, range, mean, variance and kurtosis computation do not use the histogram, and as such rely solely on the number of surfaces and valid data points per surface. We would expect the computation time for these values to scale linearly with the number of surfaces and valid data points, which seems to be in line with the measured numbers. For even larger datasets, however, it would make sense to compute range, mean, variance and kurtosis using the histogram, which would result in constant time, depending only on the size of the histogram. For the datasets used here, however, the histogram computation is the limiting factor. The probability density function, median and mode are looked up using the histogram, and as such their cost is of the same magnitude for the small and large datasets. Standard deviation and skewness are implemented as linear combinations of other surface properties, and thus their computation times are also independent of the number of surfaces. With the dependencies precomputed, the computation of both properties is trivial, resulting in the very fast computation times.

To conclude, we presented an interactive framework for exploration and analysis of 2.5D ensemble data. We implemented a fully GPU-based framework for the statistical analysis of 2.5D ensemble data without any prior knowledge of or assumptions on the distribution of the data. The GPU-based implementation enables computation at interactive speeds even for large datasets, and thereby interactive exploration of the parameter space. We presented the application of our framework in a seismic interpretation scenario, for which we also introduced the use of ensemble computation, as well as in ocean simulation.

References

[Bro04] BROWN R. A.: Animated visual vibrations as an uncertainty visualisation technique. In International Conference on Computer Graphics and Interactive Techniques in Australasia and South East Asia (2004), pp. 84–89.

[Eve06] EVENSEN G.: Data Assimilation: The Ensemble Kalman Filter. Springer, 2006.

[GR04] GRIGORYAN G., RHEINGANS P.: Point-based probabilistic surfaces to show surface uncertainty. IEEE Transactions on Visualization and Computer Graphics 10, 5 (2004), 564–573.

[GS06] GRIETHE H., SCHUMANN H.: The visualization of uncertain data: Methods and problems. In Proceedings of SimVis '06 (2006).


[HBG∗11] HÖLLT T., BEYER J., GSCHWANTNER F., MUIGG P., DOLEISCH H., HEINEMANN G., HADWIGER M.: Interactive seismic interpretation with piecewise global energy minimization. In Proceedings of the IEEE Pacific Visualization Symposium 2011 (2011), pp. 59–66.

[HS06] HEALEY C. G., SNOEYINK J.: VisTRE: A visualization tool to evaluate errors in terrain representation. In 3D Data Processing, Visualization, and Transmission, Third International Symposium on (2006), pp. 1056–1063.

[JS03] JOHNSON C. R., SANDERSON A. R.: A next step: Visualizing errors and uncertainty. IEEE Computer Graphics and Applications 23, 5 (2003), 6–10.

[LH11] LAMPE O. D., HAUSER H.: Curve density estimates. Computer Graphics Forum 30, 3 (2011), 633–642.

[LKP03] LUO A., KAO D., PANG A.: Visualizing spatial distribution data sets. In VISSYM '03: Proceedings of the Symposium on Data Visualisation 2003 (2003), pp. 29–38.

[LLPY07] LUNDSTRÖM C., LJUNG P., PERSSON A., YNNERMAN A.: Uncertainty visualization in medical volume rendering using probabilistic animation. IEEE Transactions on Visualization and Computer Graphics 13, 6 (2007), 1648–1655.

[Mor09] MORELAND K.: Diverging color maps for scientific visualization. In Proceedings of the 5th International Symposium on Visual Computing (2009), pp. 92–103.

[PH10] PÖTHKOW K., HEGE H.-C.: Positional uncertainty of isocontours: Condition analysis and probabilistic measures. IEEE Transactions on Visualization and Computer Graphics PP, 99 (2010), 1–15.

[PRW11] PFAFFELMOSER T., REITINGER M., WESTERMANN R.: Visualizing the positional and geometrical variability of isosurfaces in uncertain scalar fields. Computer Graphics Forum 30, 3 (2011), 951–960.

[PSM11] Permanent Service for Mean Sea Level (PSMSL). http://www.psmsl.org/, accessed December 2011.

[PWB∗09] POTTER K., WILSON A., BREMER P.-T., WILLIAMS D., DOUTRIAUX C., PASCUCCI V., JOHNSON C. R.: Ensemble-Vis: A framework for the statistical visualization of ensemble data. In IEEE Workshop on Knowledge Discovery from Climate Data: Prediction, Extremes (2009), pp. 233–240.

[PWH11] PÖTHKOW K., WEBER B., HEGE H.-C.: Probabilistic marching cubes. Computer Graphics Forum 30, 3 (2011), 931–940.

[PWL97] PANG A. T., WITTENBRINK C. M., LODHA S. K.: Approaches to uncertainty visualization. The Visual Computer 13 (1997), 370–390.

[Riv07] RIVEIRO M.: Evaluation of uncertainty visualization techniques for information fusion. In Information Fusion, 2007 10th International Conference on (2007), pp. 1–8.

[RLBS03] RHODES P. J., LARAMEE R. S., BERGERON R. D., SPARR T. M.: Uncertainty visualization methods in isosurface rendering. In EUROGRAPHICS 2003 Short Papers (2003), pp. 83–88.

[SAT97] SIVILLO J. K., AHLQUIST J. E., TOTH Z.: An ensemble forecasting primer. Weather and Forecasting 12, 4 (1997), 809–818.

[SC00] STAMMER D., CHASSIGNET E.: Ocean state estimation and prediction in support of oceanographic research. Oceanography 13 (2000), 51–56.

[SHM10] SAAD A., HAMARNEH G., MÖLLER T.: Exploration and visualization of segmentation uncertainty using shape and appearance prior information. IEEE Transactions on Visualization and Computer Graphics 16, 6 (2010), 1366–1375.

[SHNB06] SCHARSACH H., HADWIGER M., NEUBAUER A., BÜHLER K.: Perspective isosurface and direct volume rendering for virtual endoscopy applications. In EuroVis 2006 (2006), pp. 315–322.

[SZD∗10] SANYAL J., ZHANG S., DYER J., MERCER A., AMBURN P., MOORHEAD R. J.: Noodles: A tool for visualization of numerical weather model ensemble uncertainty. IEEE Transactions on Visualization and Computer Graphics 16, 6 (2010), 1421–1430.


Figure 4: (a) Surface extracted using the original parameter settings. (b) Maximum likelihood surface from sampling the complete parameter space. (c) Cut sections through the original surface, the maximum likelihood surface, and a new optimal fit resulting from interactively narrowing the parameter range.

(a) t0 (b) t2 (c) t4 (d) t6 (e) t8 (f) t10

Figure 5: The mean surfaces for six different time steps of the water level simulation. The color coding shows the standard deviation, scaled to the maximum value over all time steps. It is clearly visible how the standard deviation, indicating uncertainty, decreases in the later time steps due to the injection of measured data into the simulation after each time step.
