Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center


Upload: dell

Post on 23-Jun-2015


DESCRIPTION

Visualizing Science at Scale with Longhorn

Kelly Gaither, Ph.D., Director of Data and Information Analysis, Texas Advanced Computing Center

Visualization is one of the most important and commonly used methods of analyzing and interpreting digital assets. For many types of computational research, it is the only viable means of extracting information and developing understanding from data. However, non-visual data analysis techniques (statistical analysis, data mining, data reduction, etc.) also play integral roles in many areas of knowledge discovery. In partnership with Dell, TACC has deployed Longhorn, the largest remote, interactive visualization and data analysis system in the world, providing a comprehensive suite of large-scale visualization and data analysis services to the national open science community. This talk will provide an overview of Longhorn, including architectural considerations and the science currently being enabled by this resource and its associated services.

TRANSCRIPT

Page 1: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center
Page 2: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Visualizing Science at Scale with Longhorn

Kelly Gaither

Director of Data & Information Analysis/Research Scientist

Texas Advanced Computing Center

November 16, 2010

Page 3: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Visualization Group at TACC

• 13 Full Time Staff
  – 6 Ph.D.
  – 4 Masters
  – 3 Bachelors

• 7 Students
  – 2 Undergraduate Students
  – 4 Masters Students
  – 1 PhD Student

Page 4: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Visualization Group at TACC

• Areas of Expertise
  – Remote & Collaborative Visualization
  – Large Data Visualization
  – Large Scale GPU Clusters
  – Large Scale Tiled Displays
  – Data Mining & Feature Detection

Page 5: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Motivation for Discussing Visualization Issues at Large-Scale

Page 6: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Full-Scale Hurricane Prediction/Recovery

• Where will the next hurricane hit?
• What is the projected loss?

Page 7: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Cosmology

• How did the earliest galaxies form?
• What was the first star and when did it form?

Page 8: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Biological Systems

• What are the processes at the sub-cellular level?
• How can we understand chains of reactions within living cells?

Page 9: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Not Just Simulation Any More…

Vastly more powerful instruments and computers have led to an explosion of new data.

Modern science and engineering is therefore as much about managing and analyzing this data as it is about modeling and simulation.

Page 10: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

What Role Does Visualization Play in Large-Scale Science?

• Why Are Pictures So Powerful?
• How Critical is Visualization to Science?

Page 11: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Longhorn

First NSF XD Visualization Resource

• 256 Dell Dual Socket, Quad Core Intel Nehalem Nodes
  – 240 with 48 GB shared memory/node (6 GB/core)
  – 16 with 144 GB shared memory/node (18 GB/core)
  – 73 GB Local Disk
  – 2 Nvidia GPUs/Node (FX 5800 – 4GB RAM)
• ~14.5 TB aggregate memory
• QDR InfiniBand Interconnect
• Direct Connection to Ranger’s Lustre Parallel File System
• 10G Connection to 210 TB Local Lustre Parallel File System
• Jobs launched through SGE

256 Nodes, 2048 Cores, 512 GPUs, 14.5 TB Memory

Kelly Gaither (PI), Valerio Pascucci, Chuck Hansen, David Ebert, John Clyne (Co-PI), Hank Childs
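
The summary line can be checked against the per-node figures with a little shell arithmetic. This is only a sketch using the numbers printed on the slide; the variable names are made up for illustration. Note that summing the per-node system memory alone gives 13,824 GB, somewhat below the quoted ~14.5 TB, and the slide does not say how the aggregate was computed.

```shell
#!/bin/sh
# Tally Longhorn's aggregate resources from the per-node figures on the slide.
nodes=256
sockets_per_node=2
cores_per_socket=4
gpus_per_node=2

std_nodes=240; std_mem_gb=48    # 240 nodes with 48 GB each
big_nodes=16;  big_mem_gb=144   # 16 nodes with 144 GB each

total_cores=$((nodes * sockets_per_node * cores_per_socket))
total_gpus=$((nodes * gpus_per_node))
total_mem_gb=$((std_nodes * std_mem_gb + big_nodes * big_mem_gb))

echo "cores: $total_cores"
echo "GPUs: $total_gpus"
echo "node memory: $total_mem_gb GB"
```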

Page 12: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Longhorn Usage Modalities:

• Remote/Interactive Visualization
  – Highest priority jobs
  – Remote/Interactive capabilities facilitated through VNC
  – Run on 3 hour queue limit boundary

• GPGPU jobs
  – Run on a lower priority than the remote/interactive jobs
  – Run on a 12 hour queue limit boundary

• CPU jobs with higher memory requirements
  – Run on lowest priority when neither remote/interactive nor GPGPU jobs are waiting in the queue
  – Run on a 12 hour queue limit boundary

Page 13: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Longhorn Queue Structure

qsub -q normal -P vis
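
The same options can also be placed inside a job script as SGE directives. The sketch below only writes such a script and prints the submission command; it submits nothing. The job name, output file, and payload are hypothetical, the 3 hour limit is taken from the remote/interactive modality described earlier, and any site-required options the slide does not mention (parallel environment, allocation) are left out.

```shell
#!/bin/sh
# Write a hypothetical SGE job script; nothing is submitted here.
job_script="vis_job.sge"

cat > "$job_script" <<'EOF'
#!/bin/bash
# Job name (hypothetical)
#$ -N vis_example
# Queue and project type, as shown on the slide
#$ -q normal
#$ -P vis
# Wall-clock limit; 3 hours assumed from the remote/interactive limit
#$ -l h_rt=03:00:00
# Run from the submission directory and merge stderr into stdout
#$ -cwd
#$ -j y
# Output file (hypothetical)
#$ -o vis_example.out

echo "running on $(hostname)"
EOF

# Command-line form from the slide, pointed at the script written above.
submit_cmd="qsub -q normal -P vis $job_script"
echo "$submit_cmd"
```

Options given on the qsub command line take precedence over the embedded directives, so the script's `-q` and `-P` lines are redundant here and shown only for illustration.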

Page 14: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Longhorn’s Lustre File System ($SCRATCH)

• OSS’s on Longhorn are built on Dell Nehalem Servers Connected to MD10000 Storage Vaults
• 15 Drives Total Configured into 2 RAID5 pairs with a Wandering Spare
• Peak Throughput Speed of the File System is 5.86 GB/sec
• Peak Aggregate Speed of the File System is 5.43 GB/sec

Page 15: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Software Available on Longhorn

• Programming APIs: OpenGL, VTK, IDL
  – OpenGL – low level primitives, useful for programming at a relatively low level with respect to graphics
  – VTK (Visualization Toolkit) – open source software system for 3D computer graphics, image processing, and visualization
  – IDL

• Visualization Turnkey Systems
  – VisIt – free parallel visualization and graphical analysis tool
  – ParaView (Parallel Visualization Application) – open source general purpose visualization system
  – EnSight – commercial turnkey visualization package targeted at CFD visualization
  – Amira – commercial turnkey visualization package targeted at visualizing scanned medical data (CAT scan, MRI, etc.)

Page 16: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Connecting to Longhorn/Spur Using VNC

• On the laptop or workstation, log in to Longhorn and submit the VNC job:

    ssh <user>@longhorn.tacc.utexas.edu
    qsub /share/sge/default/pe_scripts/job.vnc
    touch ~/vncserver.out
    tail -f ~/vncserver.out

  – ~/vncserver.out contains VNC port info after the job launches
  – The VNC server runs on a vis node (ivis[1-7|big])

• From the laptop or workstation, open an ssh tunnel:

    ssh -L <port>:longhorn.tacc.utexas.edu:<port> <user>@longhorn.tacc.utexas.edu

  – Establishes a secure tunnel to the Longhorn VNC port
  – Automatic port forwarding from Longhorn to the vis node

• Connect the VNC viewer:

    vncviewer localhost::<port>

  – localhost connection is forwarded to Longhorn via the ssh tunnel
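
As a dry run, the connection steps can be assembled with the placeholders filled in. This sketch only prints the commands; the user name and port are hypothetical values, and the real port is whatever ~/vncserver.out reports once the VNC job has started.

```shell
#!/bin/sh
# Dry run: print the connection commands from the slide with placeholders
# filled in. Nothing is executed against Longhorn.
host="longhorn.tacc.utexas.edu"
user="jdoe"     # hypothetical user name
port=5902       # hypothetical; read the real value from ~/vncserver.out

login_cmd="ssh ${user}@${host}"
submit_cmd="qsub /share/sge/default/pe_scripts/job.vnc"   # run on Longhorn
tunnel_cmd="ssh -L ${port}:${host}:${port} ${user}@${host}"
viewer_cmd="vncviewer localhost::${port}"

printf '%s\n' "$login_cmd" "$submit_cmd" "$tunnel_cmd" "$viewer_cmd"
```

The same port number appears on both sides of the `-L` option, so the viewer's local port matches the one the VNC job reports.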

Page 17: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Longhorn Visualization Portal

portal.longhorn.tacc.utexas.edu

>3000 jobs submitted through the portal

Page 18: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

EnVision

Greg Johnson, Brandt Westing

• Web-based visualization software that allows researchers to develop interactive visualizations intuitively.

• Currently integrated into the Longhorn Visualization Portal but can run independently.

• Began collaborations with ParaView team.

Page 19: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Usage on Longhorn as of October 31, 2010

• 498 active projects
• 41,629 jobs run on the system
• 3,980,297 SUs expended on the system

Page 20: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Visualizing Oil Spill

Adam Kubach, Karla Vega, Clint Dawson

• Visualization focused on the overlay of particle movement and satellite or aerial imaging data.

• The particles in the visualization represent the oil spill, and their positions are either hypothetical or reflect the position of the oil on the surface.

• The data has been visualized using Longhorn and MINERVA, an open source geospatial software package. The data is generated daily and is approximately 100 GB in size.

Page 21: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

H1N1 Flu Outbreak Simulation

Greg Johnson, Brandt Westing, TACC; Ned Dimitrov, Lauren Meyers, UT Comp. Bio

• Visualization of a swine flu epidemic spreading throughout North America.

• Epidemic begins in Mexico City.

• Visualization classifies individuals into three groups: susceptible (blue), infected (red), and recovered (green). Available antivirals are shown in purple.

• Cities and transportation links are highlighted in red to indicate large numbers of infected individuals and infectious travelers.

Page 22: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Visualization of Hurricane Ike

Greg Johnson, Romy Schneider, John Cazes, Karl Schulz, Bill Barth, TACC; Frank Marks, NOAA; Fuqing Zheng, University of Pennsylvania; Yonghui Weng, Texas A&M.

• Throughout the 2008 hurricane season, TACC was an active participant in a NOAA research effort to develop next-generation hurricane models.

• Using up to 40,000 processing cores at once, researchers simulated both global and regional weather models and received on-demand access to Ranger.

• Visualization of Hurricane Ike shows the storm developing in the Gulf and making landfall on the Texas coast.

Page 23: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Visualization of Large Scale Turbulent Flow

Kelly Gaither, Hank Childs, Greg Johnson, Karl Schulz, Cyrus Harrison, Diego Donzis, Texas A&M; P.K. Yeung, Georgia Tech

• Remote interactive visualization of 17 time-steps of the largest turbulent flow simulation computed to date.

• First time this had been visualized interactively at this scale (4096³).
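
To give a sense of what a 4096-cubed grid means in data volume, the arithmetic can be sketched as follows. The 4 bytes per value (single precision) and the single scalar field per time-step are assumptions for illustration; the slide does not state the storage format or the number of variables.

```shell
#!/bin/sh
# Size of one scalar field on a 4096^3 grid, and of 17 time-steps of it.
# Assumption: 4 bytes (single precision) per value; the slide does not say.
n=4096
bytes_per_value=4
time_steps=17

points=$((n * n * n))
bytes=$((points * bytes_per_value))
gib=$((bytes / 1024 / 1024 / 1024))
total_gib=$((gib * time_steps))

echo "grid points: $points"
echo "one scalar field: $gib GiB per time-step"
echo "all $time_steps time-steps: $total_gib GiB"
```

Under these assumptions a single scalar field already exceeds the memory of any one Longhorn node, which is why the data has to be distributed across many nodes to be explored interactively.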

Page 24: Visualizing Science at Scale with Longhorn - Kelly Gaither, Texas Advanced Computing Center

Thank You

Kelly Gaither
[email protected]