National Cancer Institute: HPC and Life Sciences. Jack Collins, Advanced Biomedical Computing Center, Advanced Technology Program, SAIC-Frederick, Inc., National Cancer Institute at Frederick. April 15, 2008

Post on 21-Dec-2015


Page 1: HPC and Life Sciences

National Cancer Institute

Jack Collins
Advanced Biomedical Computing Center
Advanced Technology Program
SAIC-Frederick, Inc.
National Cancer Institute at Frederick

April 15, 2008

Page 2: Science Driven Computation

• Next-Gen Sequencing

• Metabolomics

• Structural Biology

• Epigenomics

• Regulatory Networks

• Nanotechnology

• Micro-array

• Protein Pathways

• Drug Design (traditional)

• Comparative Genomics

• GWAS

• Systems Biology

• Data Analytics

• Pattern Recognition

• Proteomics

• Image Analysis / Visualization

• Clinical Outcome

Page 3: Computational (HPC) Issues (All of them)

• Storage (More, Faster, Ubiquitous)

• Interconnect (Faster, low latency)

• Compute Elements (CPU, GPU, FPGA)

• Memory (Large datasets)

• Visualization (Computer -> Human Bandwidth)

• Backup/Data Archive (Keep it Forever)

• Software Development

• $$$$ (Cost)

Page 4: Computing (High Performance) Requirements

• Compute Elements (High Performance)

– CPU (Multi-core, sockets, blades)

– Special (GPU, FPGA, ?)

• Programming Model (High Performance)

– Efficient, Open, Scalable, Accessible

Page 5: Power to the People!

• People solve problems

– Scientists, Engineers, Doctors, etc.

• People write Software

– Ready access to personal computers drove Linux Development and Open Source Software (Paradigm Shift)

• Ready Access to HPC will drive HPC Development

• People will use HPC when they are exposed to HPC and have access early in their career/life

Page 6: GPGPU (Why am I optimistic?)

• Everyone has one

• Becoming more powerful

– Not all problems map well but some do!

• Programming Models

– CUDA (downloadable)

Page 7: CUDA (at the price of a download)

• The CUDA™ Toolkit is a C language development environment for CUDA-enabled GPUs

• The CUDA development environment includes:

– nvcc C compiler

– CUDA FFT and BLAS libraries for the GPU

– Profiler

– gdb debugger for the GPU (alpha available in March, 2008)

– CUDA runtime driver (now also available in the standard NVIDIA GPU driver)

– CUDA programming manual

Page 8: CUDA Examples

• Smith-Waterman. S. Manavski, G. Valle, CRIBI Genomics. March 2008

• A Neural Network on GPU. Billconan, Kavinguy. March 2008

• MDGPU: Molecular Dynamics simulation. J.A. van Meel, A. Arnold. October 2007

• Interactive Visualization of Volumetric White Matter Connectivity in DT-MRI. Won-Ki Jeong, P. Thomas Fletcher, Ran Tao, and Ross T. Whitaker. October 2007

• Astrophysical simulations based on smoothed particle hydrodynamics: Fourier Volume Rendering. Andrew Corrigan and John Wallin, Computational and Data Sciences, George Mason University. July 2007

• Computational Astrophysics Lab, RIKEN: Astrophysical N-body simulation: The Chamomile Scheme. Tsuyoshi Hamada and Toshiaki Iitaka. July 2007

• Computational biology string matching: CMATCH. Michael C. Schatz and Cole Trapnell, Center for Bioinformatics & Computational Biology, University of Maryland. May 2007

• Simulation Open Framework Architecture (SOFA) for real-time simulation with an emphasis on medical simulation. INRIA and CIMIT. February 2007

• Visual Molecular Dynamics: VMD. Beckman Institute, NIH, NSF, University of Illinois at Urbana-Champaign. 2007

• Scalable Molecular Dynamics: NAMD. Beckman Institute, NIH, NSF, University of Illinois at Urbana-Champaign. 2007

• NVIDIA Texture Tools 2 Alpha, Source Code. NVIDIA. 2007

• PyStream: Python interface to CUDA, CUBLAS and CUFFT. Tech-X Corporation. 2007

• Highly Optimized Object-oriented Molecular Dynamics: HOOMD. Joshua A. Anderson, Chris D. Lorenz, and Alex Travesset, Iowa State University. 2007

• The Schroedinger project: portable libraries for the high quality Dirac video codec created by BBC R&D. Wladimir J. van der Laan, BBC R&D, Fluendo. 2007

Page 9: Autodock (Drug Design)

• Molecular Docking for Small Molecules

– Open Source from the Scripps Research Institute (Art Olson, Garrett Morris)

– Typical of many codes in biology

– Not Designed for HPC: Single Threaded, Genetic Algorithm in iterative steps

• Partnered with Silicon Informatics to enable Autodock on GPU and modern multi-core

– Smart guys with experience
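The slide describes Autodock's search only as a single-threaded genetic algorithm run in iterative steps. The toy sketch below is not Autodock's algorithm or scoring function. It is a generic genetic algorithm, with a made-up `score` function standing in for a docking energy, meant only to show where the work sits in such a loop: each generation scores every candidate independently, and that per-candidate scoring is the kind of step a multi-core or GPU port can spread across many compute elements. All names and parameters here are hypothetical.

```python
import random


def score(candidate):
    # Hypothetical stand-in for a docking energy: lower is better.
    return sum(x * x for x in candidate)


def next_generation(population, rng, mutation_scale=0.1):
    # Each candidate is scored independently of the others; in a parallel
    # port this is the step that could be evaluated concurrently.
    ranked = sorted(population, key=score)
    survivors = ranked[: len(ranked) // 2]  # keep the better half unchanged

    children = []
    while len(survivors) + len(children) < len(population):
        a, b = rng.sample(survivors, 2)          # pick two distinct parents
        cut = rng.randrange(1, len(a))           # one-point crossover
        child = a[:cut] + b[cut:]
        child = [x + rng.gauss(0.0, mutation_scale) for x in child]  # mutate
        children.append(child)
    return survivors + children


def run(generations=50, pop_size=20, dims=4, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.uniform(-5.0, 5.0) for _ in range(dims)] for _ in range(pop_size)
    ]
    best_history = []
    for _ in range(generations):  # the "iterative steps"
        population = next_generation(population, rng)
        best_history.append(min(score(c) for c in population))
    return best_history


if __name__ == "__main__":
    history = run()
    print(f"best score after first generation: {history[0]:.4f}")
    print(f"best score after last generation:  {history[-1]:.4f}")
```

Because the better half of each population is carried over unchanged, the best score in this sketch can never get worse from one generation to the next. The generations themselves still run one after another, which is why the iterative structure limits how much of the run can be parallelized.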

Page 10: The “HPC” System (could buy for home)

Page 11: Autodock Results (not NAMD 100X+ but …)

(Only using 1 GPU - Tesla)

PDB ID   Method           Ligand (# of rotatable bonds)   Real time (decimal minutes)   si speedup
1hvr     ADK Rigid        XK263 (10)                      152.22
1hvr     ADK Flex         XK263 (10)                      238.22
1hvr     siADK Rigid      XK263 (10)                      12.27                         12.40
1hvr     siADK Flex       XK263 (10)                      26.47                         9.00
1hvr     Autogrid Rigid   XK263 (10)                      0.39
1hvr     Autogrid Flex    XK263 (10)                      0.39
1stp     ADK Rigid        Biotin (5)                      38.53
1stp     ADK Flex         Biotin (5)                      80.78
1stp     siADK Rigid      Biotin (5)                      3.21                          12.01
1stp     siADK Flex       Biotin (5)                      8.47                          9.54
1stp     Autogrid Rigid   Biotin (5)                      0.27
1stp     Autogrid Flex    Biotin (5)                      0.26
3ptb     ADK Rigid        Benzamidine (0)                 19.94
3ptb     ADK Flex         Benzamidine (0)                 49.65
3ptb     siADK Rigid      Benzamidine (0)                 2.22                          8.97
3ptb     siADK Flex       Benzamidine (0)                 5.90                          8.41
3ptb     Autogrid Rigid   Benzamidine (0)                 0.42
3ptb     Autogrid Flex    Benzamidine (0)                 0.43
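The "si speedup" figures on this slide appear to be the ratio of the original ADK time to the siADK time for the same target and docking mode. That reading is an assumption, since the slide does not define the column. The short sketch below recomputes the ratios from the listed times; the dictionary layout and function names are illustrative only.

```python
# Times in minutes, copied from the ADK and siADK rows on this slide.
TIMES = {
    ("1hvr", "Rigid"): {"ADK": 152.22, "siADK": 12.27},
    ("1hvr", "Flex"): {"ADK": 238.22, "siADK": 26.47},
    ("1stp", "Rigid"): {"ADK": 38.53, "siADK": 3.21},
    ("1stp", "Flex"): {"ADK": 80.78, "siADK": 8.47},
    ("3ptb", "Rigid"): {"ADK": 19.94, "siADK": 2.22},
    ("3ptb", "Flex"): {"ADK": 49.65, "siADK": 5.90},
}


def speedup(baseline_minutes, accelerated_minutes):
    # Assumed definition: baseline time divided by accelerated time.
    return baseline_minutes / accelerated_minutes


def all_speedups(times=TIMES):
    return {key: speedup(t["ADK"], t["siADK"]) for key, t in times.items()}


if __name__ == "__main__":
    for (pdb_id, mode), ratio in all_speedups().items():
        print(f"{pdb_id} {mode:5s} speedup = {ratio:.2f}x")
```

The recomputed ratios agree with the slide's column to within about 0.02, which is consistent with the listed times having been rounded to two decimals before publication.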

Page 12: Motivation / Business Case

• Little or no significant cost in the desktop workstation

• Everyone has a desktop with a GPU

• Can dramatically change workflow and thinking

– A 10X speedup can change an overnight run (12 hours, 1 job/day, 1 molecule) into ~4+ runs/workday, increasing science productivity and interactivity

– A small group of 10 staff (with desktops) could now generate 100+ runs during off hours with no additional hardware cost

– Success inspires bigger aspirations, so the HPC guys at the computing center could help us do 1,000 or 10,000 molecules a day with their big machines (so went the Walter Reed request).
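The throughput figures on this slide follow from simple arithmetic. The sketch below works through it under two assumptions that are not stated on the slide: an 8-hour workday and 16 off hours per desktop per day. The function name and defaults are illustrative.

```python
import math


def runs_in_window(window_hours, baseline_run_hours=12.0, speedup=10.0):
    # Whole runs that fit back to back in the window, ignoring setup time.
    run_hours = baseline_run_hours / speedup
    return math.floor(window_hours / run_hours)


if __name__ == "__main__":
    workday_hours = 8     # assumption, not from the slide
    off_hours = 16        # assumption, not from the slide
    staff = 10            # from the slide

    print("runs per workday on one desktop:", runs_in_window(workday_hours))
    print("off-hours runs across the group:", staff * runs_in_window(off_hours))
```

With these assumptions a 12-hour run drops to 1.2 hours, so up to 6 runs fit in a workday and 10 desktops yield up to 130 runs off hours. The slide's "~4+" and "100+" are more conservative than these idealized counts, which would be expected once setup time and other overhead are included.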

Page 13: Acknowledgements

• Bob Keller, Silicon Informatics

• Hemant Trivedi, Silicon Informatics

• Sarangan “Ravi” Ravichandran, ABCC