High Performance Computational Biology and Drug Design on Tianhe Supercomputers

TianHe (天河)
School of Computer Science, National University of Defense Technology (NUDT)
Presenter: Shaoliang Peng
Email: [email protected]


Post on 24-May-2018



About us

• School of Computer Science of NUDT
  – The largest School of Computer Science: 10 institutes, 400+ faculty members, and 3,000+ students
• Hometown of supercomputers in China: the Tianhe-1 and Tianhe-2 supercomputers
  – No. 1 in the TOP500 (2010.10, 2013.6, 2013.11, 2014.6, 2014.11, 2015.7, 2015.11)
  – TH-2: 33.86 PFLOPS, 32,000 CPUs + 48,000 MICs

About me: Dr. Shaoliang Peng

• National University of Defense Technology (NUDT, Changsha, China) and an adjunct professor at BGI.
• High performance computing, bioinformatics, virtual screening, and biology simulation.
• We won the Gold Award twice at PAC (Parallel Application Challenge), in 2014 and 2015, and the IEEE SCALE Challenge Finalist Award.
  – Human Whole Genome Re-sequencing Analysis Software Pipeline
  – mD3DOCKxb: largest high-throughput molecular docking platform

3 National Supercomputer Centers Using TH

List of Top 500 Supercomputers @ 2013

Outline

• Overview of TianHe supercomputers (1, 2 & 3)
• Applications on TianHe
• Bio-medical applications on TianHe
• Summary

Overview of TH-1

• Hybrid architecture: CPU & GPU
• Custom system software stack

Items               Configuration
Processors          14,336 Intel CPUs + 7,168 NVIDIA GPUs + 2,048 FT CPUs
Peak performance    4.7 PF (Linpack: 2.57 PF)
Interconnect        Proprietary high-speed interconnection network TH-net
Memory              262 TB in total
Storage             Global shared parallel storage system, 2 PB
Cabinets            140 compute/communication/storage cabinets
Power consumption   4.04 MW (635.15 MF/W)
Cooling             Water cooling system
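The efficiency figure in the power row can be reproduced from the Linpack result and the power draw. The sketch below is only a worked check; the 2.566 PF input is an assumption (an unrounded Linpack value that rounds to the 2.57 PF quoted on the slide), and the function name is made up for illustration.

```python
def mflops_per_watt(linpack_pflops: float, power_mw: float) -> float:
    """Energy efficiency in MFLOPS per watt from Linpack PFLOPS and power in MW."""
    flops = linpack_pflops * 1e15      # PFLOPS -> FLOPS
    watts = power_mw * 1e6             # MW -> W
    return flops / watts / 1e6         # FLOPS/W -> MFLOPS/W


# Rounded Linpack value as printed in the table.
print(round(mflops_per_watt(2.57, 4.04), 2))
# Assumed unrounded Linpack value (2.566 PF); this lands on the quoted 635.15 MF/W.
print(round(mflops_per_watt(2.566, 4.04), 2))
```

The small gap between the two results comes only from rounding the Linpack figure to two decimals.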

Overview of TH-2

• Neo-heterogeneous architecture: Xeon CPU & Xeon Phi

Items               Configuration
Processors          32,000 Intel Xeon CPUs + 48,000 Xeon Phis + 4,096 FT CPUs
Peak performance    54.9 PFlops
Interconnect        Proprietary high-speed interconnection network TH Express-2
Memory              2 PB in total
Storage             Global shared parallel storage system, 12.4 PB
Cabinets            125 + 13 + 24 = 162 compute/communication/storage cabinets
Power               17.8 MW
Cooling             Closed air cooling system

TH-2A and TH-3

Roadmap of Tianhe System

System                  Tianhe-1A                  Tianhe-2                  Tianhe-2A
System peak (PF)        4.7                        54.9                      ~100
Peak power (MW)         4.04                       17.6                      ~18
Total system memory     262 TB                     1.4 PB                    ~3 PB
Node performance (TF)   0.655                      3.431                     ~6
Node processors         Xeon X5670, Nvidia M2050   Xeon E5 2692, Xeon Phi    China CPU + GPDSP
System size (nodes)     7,168                      16,000                    ~18,000
System interconnect     TH Express-1               TH Express-2              TH Express-2+
File system             2 PB Lustre                12.4 PB H2FS + Lustre     ~30 PB H2FS + TDM
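The roadmap rows are internally consistent: system peak is node performance times node count. A short check, using only numbers from the table (the dictionary layout and function name are illustrative):

```python
# Node performance (TF), node count, and quoted system peak (PF) from the roadmap table.
systems = {
    "Tianhe-1A": {"node_tf": 0.655, "nodes": 7168, "quoted_peak_pf": 4.7},
    "Tianhe-2": {"node_tf": 3.431, "nodes": 16000, "quoted_peak_pf": 54.9},
}


def system_peak_pf(node_tf: float, nodes: int) -> float:
    """System peak in PFLOPS from per-node TFLOPS and node count."""
    return node_tf * nodes / 1000.0


for name, row in systems.items():
    derived = system_peak_pf(row["node_tf"], row["nodes"])
    print(f"{name}: derived {derived:.3f} PF, quoted {row['quoted_peak_pf']} PF")
```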

Application scale in next 5 years

Seismic exploration
  Current scale in China: 2,600 km², 5 km depth; 217,900 shots; 2.2 TB data
  Scale in next 5 years: millions of shots

Genomics research
  Current scale in China: 2 PB bioinformatics data
  Scale in next 5 years: 100 PB bio data

New energy (magnetic confinement fusion)
  Current scale in China: 2 billion ions, 0.83 billion electrons
  Scale in next 5 years: 100 billion atoms

Drug design
  Current scale in China: 200-300 ns molecular dynamics simulations
  Scale in next 5 years: 10 million molecules; 1,000 ns/day

CFD (aircraft design)
  Current scale in China: 3.5 billion mesh points
  Scale in next 5 years: 100 billion mesh points

Universe evolution (neutrinos)
  Current scale in China: 110 billion particles
  Scale in next 5 years: a trillion particles

Smart city (urban electromagnetic spectrum monitoring system)
  Current scale in China: area (Guangzhou city) 200 km²; grid size 1.0 km × 1.0 km
  Scale in next 5 years: grid size 100 m × 100 m

Co-design Eco-system

[Figure: co-design diagram of the system architecture. Application, software, and hardware layers are linked by a bridge of solutions, requirements, constraints, and tradeoffs.
 Application: domain models, proxy apps, algorithms, benchmarks.
 Software: domain framework, data management, tools; programming interfaces (MPI, OpenMP, GA, CUDA/OpenACC, Spark, newly emerged); hybrid runtime; data analysis; OS, compiler, library, file system.
 Hardware: CPU/accelerator hybrid node, memory, interconnection, storage device.]

Application areas

• National University of Defense Technology (NUDT), Changsha
• National Supercomputer Centers: NSCC-CS (Changsha), NSCC-TJ (Tianjin), NSCC-GZ (Guangzhou)

Resources and Users on TH Supercomputer

• NSCC-TJ TH-1 (Nov. 2010 – May 2011)
• NSCC-GZ TH-2: bio-medical users > 30%

Bio-medical Big Data Needs Big Computer

Extremely powerful computers are needed to help biologists handle big-data traffic jams.
Nature 498, 255–260 (13 June 2013), Biology: The big challenges of big data

[Figure: decreasing trend of the cost of DNA sequencing (http://www.genome.gov/sequencingcosts/)]
Biological big data is growing far faster than the Moore's Law growth of compute power.

Solving 3 Bio-medical Big Data Problems using TH

3 kinds of bio-medical big data problems:
• Computation-intensive
  – Large-scale sequence alignment/assembly
  – Virtual drug screening
• Data-intensive
  – Large memory (2nd- and 3rd-generation de novo genome assembly)
  – Intensive I/O (NGS genome data and text mining)
• Communication-intensive
  – Bio-networks (gene networks, protein interactions, …)

Design characteristics of TH-2:
  – 32,000 CPUs + 48,000 MICs (neo-heterogeneous architecture)
  – 1.4 PB memory + > 12.4 PB storage (big and fast)
  – Proprietary high-speed interconnection network

Bio-Software developed on Tianhe

• SOAPdenovo2 (TH-1 and TH-2)
• SOAP3-dp & MICA (TH-1 and TH-2)
• mBWA (TH-2)
• mSOAPsnp (TH-2)
• SOAPfuse (TH-2)
• GAMA (TH-1)
• SGA (TH-1)
• …

Bio-applications on TianHe (TH-1)

• MPI-SGA: string-graph-based de novo assembly
• GAMA: high-precision population genotype analysis software
• ABySS: a de novo, parallel, paired-end sequence assembler designed for short reads
• ParMETIS: an MPI-based parallel library that implements a variety of algorithms for graph partitioning, mesh partitioning, and matrix reordering
• …

16/11/7

Deep Parallelized Optimization of Genome Big Data Analysis Software Pipeline

• Applications: clinical studies (cancer, Ebola, SARS), population genetics, evolutionary analysis, etc.
• Challenge: 30X deep sequencing data from 2,000 humans (300 TB)
• Aim: find personalised genomic variations (SNP, CNV, indel, etc.) as soon as possible.

Genome Big Data Analysis Software Pipeline on TH-2

[Figure: pipeline diagram; image source: http://en.wikipedia.org/wiki/Single-nucleotide_polymorphism]

MICA: Parallel short sequence alignment (large-scale approximate string matching)

• MICA is an optimized version of SOAP3-dp, implemented to be accelerated by MIC on TH-2
• Optimization efforts:
  – Three-channel I/O latency concealing
  – Introduction of 512-bit SIMD code
  – Parallelized construction of the Smith-Waterman matrix
  – Prefetching of index data
  – Use of inline function calls
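The slide does not say how the Smith-Waterman matrix construction is parallelized in MICA. One common approach, shown here only as an assumed illustration, is the anti-diagonal (wavefront) order: every cell with the same i + j depends only on the two previous anti-diagonals, so a whole anti-diagonal can be filled at once (here with NumPy vector operations standing in for SIMD lanes). The scoring values and function names are illustrative, and a linear gap penalty is assumed.

```python
import numpy as np


def sw_score_naive(a, b, match=2, mismatch=-1, gap=-1):
    """Reference Smith-Waterman local alignment score, filled cell by cell."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best


def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Same score, but each anti-diagonal (i + j = d) is computed in one vector step."""
    n, m = len(a), len(b)
    A = np.frombuffer(a.encode(), dtype=np.uint8)
    B = np.frombuffer(b.encode(), dtype=np.uint8)
    H = np.zeros((n + 1, m + 1), dtype=np.int64)
    for d in range(2, n + m + 1):
        i = np.arange(max(1, d - m), min(n, d - 1) + 1)  # rows on this anti-diagonal
        j = d - i                                        # matching columns
        s = np.where(A[i - 1] == B[j - 1], match, mismatch)
        diag = H[i - 1, j - 1] + s      # from anti-diagonal d - 2
        up = H[i - 1, j] + gap          # from anti-diagonal d - 1
        left = H[i, j - 1] + gap        # from anti-diagonal d - 1
        H[i, j] = np.maximum(0, np.maximum(diag, np.maximum(up, left)))
    return int(H.max())


print(sw_score_wavefront("ACACACTA", "AGCACACA") == sw_score_naive("ACACACTA", "AGCACACA"))
```

The wavefront order changes only the traversal, not the recurrence, so both functions return the same score.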

mSOAPsnp: Massively parallel SNP detection

Core: Bayesian probability computation

• Algorithm improvements
  – Compression of the 4-D sparse matrix
  – Elimination of redundant computation by building a fast lookup table
  – Consistency sorting of the gradient
• MIC-specific optimizations:
  – Loop unrolling, space padding to improve spatial data locality, SIMD code
• CPU + MIC collaborative computation
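To make "Bayesian probability computation" concrete, here is a textbook-style sketch of a per-site diploid genotype posterior. It is not the mSOAPsnp model: the uniform prior, the error model (a miscalled base is equally likely to be any of the other three), and all names are assumptions made for illustration.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"
GENOTYPES = list(combinations_with_replacement(BASES, 2))  # 10 unordered diploid genotypes


def base_likelihood(observed, allele, error):
    """P(observed base | true allele) under a simple symmetric error model (assumed)."""
    return 1.0 - error if observed == allele else error / 3.0


def genotype_posteriors(bases, errors, prior=None):
    """Posterior over genotypes at one site given read bases and per-base error rates."""
    if prior is None:
        prior = {g: 1.0 / len(GENOTYPES) for g in GENOTYPES}  # uniform prior (assumed)
    log_post = {}
    for g in GENOTYPES:
        lp = math.log(prior[g])
        for b, e in zip(bases, errors):
            # Each read samples one of the two alleles with probability 1/2.
            p = 0.5 * base_likelihood(b, g[0], e) + 0.5 * base_likelihood(b, g[1], e)
            lp += math.log(p)
        log_post[g] = lp
    top = max(log_post.values())                 # subtract the max for numerical stability
    unnorm = {g: math.exp(lp - top) for g, lp in log_post.items()}
    total = sum(unnorm.values())
    return {g: v / total for g, v in unnorm.items()}


post = genotype_posteriors("AAAAGGGG", [0.01] * 8)
print(max(post, key=post.get))
```

Because the per-base likelihood depends only on (observed base, genotype, error rate), its logarithm can be precomputed once per distinct quality value, which is the kind of redundancy a lookup table removes.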

Large scale deployment on TH-2

• Data: 30X deep sequencing data from 2,000 humans, 300 TB in total
• Scale: 8,192 nodes (each with 2 CPUs + 3 MICs)
• Processing time: reduced from 8 months to 8.37 hours (about 700X speedup)
• mSOAPsnp scales up to 8,192 nodes (196,608 CPU cores and 1,376,256 MIC cores), with parallel efficiency > 60.7% (published at ISC 2015)

• MICA: http://sourceforge.net/projects/mica-aligner
• mSOAPsnp: http://sourceforge.net/projects/msnp
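The core counts and the speedup on this slide can be cross-checked with a few lines of arithmetic. Two inputs are assumptions rather than slide content: 12 cores per CPU (the Xeon E5-2692 named in the roadmap) and 30-day months for the 8-month baseline.

```python
NODES = 8192
CPUS_PER_NODE = 2
MICS_PER_NODE = 3
CORES_PER_CPU = 12            # assumption: 12-core Xeon E5-2692

# CPU cores: nodes x CPUs per node x cores per CPU.
cpu_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU

# MIC cores per coprocessor implied by the quoted total.
mic_cores_quoted = 1_376_256
cores_per_mic = mic_cores_quoted / (NODES * MICS_PER_NODE)

# Speedup: 8 months (assumed 30-day months) versus 8.37 hours.
baseline_hours = 8 * 30 * 24
speedup = baseline_hours / 8.37

print(cpu_cores, cores_per_mic, round(speedup))
```

With these assumptions the CPU core count matches the quoted 196,608, the MIC total implies 56 cores used per coprocessor, and the speedup comes out near 690, in line with the "about 700X" on the slide.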

Drug Design on Tianhe supercomputers

The 3 most-used software packages:
1. A CPU/MIC Collaborated Parallel Framework for GROMACS on TH-2 (GIW 2016)
2. mAMBER: A CPU/MIC Collaborated Parallel Framework for AMBER on TH-2 (BIBM 2016)
3. mD3DOCKxb: An Ultra-Scalable CPU-MIC Coordinated Virtual Screening Framework

A CPU/MIC Collaborated Parallel Framework for GROMACS on Tianhe-2 Supercomputer

Wenhe Su, Shaoliang Peng, Shunyun Yang, Xiaoyu Zhang, Tenglilang Zhang, Weiguo Liu, Xingming Zhao

Supported by: NSFC Grants 61272056, U1435222, and 1133005

School of Computer Science, National University of Defense Technology, Changsha, China

The 27th International Conference on Genome Informatics 2016, Shanghai, China

mAMBER: A CPU/MIC Collaborated Parallel Framework for AMBER on Tianhe-2 Supercomputer

Shaoliang Peng, Xiaoyu Zhang, Yutong Lu, Xiangke Liao, Weiliang Zhu, Dongqing Wei

School of Computer Science, National University of Defense Technology, Changsha, China

IEEE BIBM 2016, Shenzhen, China

High Throughput Virtual Screening

• Applications: computer-aided drug design, molecular docking, and virtual screening
• Challenges: when a sudden illness or an unknown virus appears, as many molecules as possible must be screened to find the effective ones, but there are more than 35 million drug molecules on earth.
• Aim: finish docking all the drug molecules on earth within one day.
• Workflow: find the best 100 molecules → do experiments → clinical
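The "find the best 100 molecules" step is a top-k selection over tens of millions of docking results. A minimal sketch using a bounded heap, so that results can be streamed without holding all scores in memory; it assumes lower docking scores (more negative binding energies) are better, and the molecule IDs and scores below are synthetic.

```python
import heapq
import random


def best_molecules(scored, k=100):
    """Return the k (molecule_id, score) pairs with the lowest docking scores.

    `scored` can be any iterable, including a generator over streamed results.
    """
    return heapq.nsmallest(k, scored, key=lambda item: item[1])


# Synthetic stand-in for streamed docking output (hypothetical IDs and scores).
rng = random.Random(0)
stream = ((f"mol{i}", rng.uniform(-12.0, 0.0)) for i in range(100_000))

top = best_molecules(stream, k=100)
print(len(top), top[0][1] <= top[-1][1])
```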

mD3DOCKxb

Multi-level parallel, high-throughput molecular docking software with MIC + CPU collaboration

• Algorithm: Lamarckian genetic algorithm
• Data scale: 40 million molecules, 800 TB, 20 × 40 million small files
• Parallel mode: MPI + OpenMP
• Bottlenecks: I/O bandwidth, communication bandwidth
• Accelerator: MIC (offload mode) and CPU
• Components:
  – Communication engine:
      multi-layer control: task partitioning, load balancing
      tasks are divided into two batches to prevent repeated calculation
      sleep by groups to avoid too many I/O operations and messages in the same instant
  – Execution engine: decides whether a job runs on the CPU or on the MIC
  – Collaborative accelerator: multithreading is implemented on both CPU and MIC, which handle tasks independently and jointly accelerate the software
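The idea of CPU and MIC workers handling tasks independently from a shared pool can be sketched with a thread-safe queue: whichever worker is free takes the next task, which balances load between fast and slow devices. This is a single-process illustration under stated assumptions, not the mD3DOCKxb implementation (which uses MPI + OpenMP with MIC offload); `dock` is a hypothetical stand-in for the docking kernel and the worker names are made up.

```python
import queue
import threading


def run_task_pool(tasks, workers, dock):
    """Dispatch tasks dynamically to named workers; returns {task: (worker, result)}."""
    pool = queue.Queue()
    for task in tasks:
        pool.put(task)

    results = {}
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                task = pool.get_nowait()   # each task is handed out exactly once
            except queue.Empty:
                return                     # pool drained: this worker is done
            value = dock(task, name)
            with lock:
                results[task] = (name, value)

    threads = [threading.Thread(target=worker, args=(w,)) for w in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# One node in the slides has 2 CPUs + 3 MICs; the names are illustrative.
node_workers = ["cpu0", "cpu1", "mic0", "mic1", "mic2"]
out = run_task_pool(range(1000), node_workers, dock=lambda task, name: task * 2)
print(len(out))
```

Pulling from a shared pool, rather than assigning a fixed share to each device up front, means a slower device simply ends up processing fewer tasks.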

mD3DOCKxb

• 2 CPUs + 3 MICs, offload mode
• Massive small-file access (40 million)
  – Congestion control
  – Multi-stage task scheduling
  – Task pool management
  – Asynchronous I/O and communication
• I/O performance improved 10x (> 800 thousand hybrid cores)
• MPI latency reduced from one hour to several seconds

42 million real drug molecules docked against the Ebola virus in one day on TH-2

• We finished 42 million dockings, scaling from 500 to 8,000 nodes on TH-2. The parallel efficiency of mD3DOCKxb is over 84%.
• These 42 million drug molecules are all the known drug molecules on earth, so finishing docking against an unknown virus within one day is possible!
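Two quantities behind this slide are easy to work out: the sustained docking rate needed to finish in one day, and parallel efficiency relative to a smaller baseline run. The formula below is the usual strong-scaling definition; the slide does not state which definition was used, and the timings in the example are hypothetical values chosen only to illustrate an efficiency near 84%.

```python
def required_rate(dockings: float, seconds: float = 86_400) -> float:
    """Dockings per second needed to finish within the given time (default: one day)."""
    return dockings / seconds


def parallel_efficiency(t_base: float, n_base: int, t_scaled: float, n_scaled: int) -> float:
    """Strong-scaling efficiency of a run on n_scaled nodes relative to n_base nodes."""
    return (t_base * n_base) / (t_scaled * n_scaled)


rate = required_rate(42e6)
print(f"{rate:.1f} dockings/s overall, {8000 / rate:.1f} node-seconds per docking on 8,000 nodes")

# Hypothetical timings (not from the slides): 16 h on 500 nodes, 1.19 h on 8,000 nodes.
print(f"{parallel_efficiency(16.0, 500, 1.19, 8000):.3f}")
```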

Data compression using GPU

• Goal: improve the compression efficiency for mainstream biological data storage formats.
• Test results show that the column-major block compression method can improve compression efficiency.

[Figure: bar chart of compression speed (MB/s) on FASTQ-format data for gzip, bzip2, and column-major block compression; axis from 0 to 120.]

GPU Accelerated Adaptive Compression Framework for Genomics Data, Guixin Guo, Shuang Qiu, Bingqiang Wang, Mian Lu, Simon See, IEEE BigData'13
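The column-major idea can be illustrated on the CPU: instead of compressing a FASTQ file as one interleaved stream, the read IDs, sequences, and quality strings are grouped into separate columns and each column is compressed on its own, since similar data sits together. This sketch uses zlib rather than a GPU and is not the cited framework; it assumes four-line records whose third line is a bare "+", and the function names and synthetic reads are made up.

```python
import random
import zlib


def split_columns(fastq_text):
    """Split four-line FASTQ records into ID, sequence, and quality columns."""
    lines = fastq_text.strip().split("\n")
    return lines[0::4], lines[1::4], lines[3::4]


def compress_columns(fastq_text):
    """Compress each column as its own zlib stream."""
    return [zlib.compress("\n".join(col).encode()) for col in split_columns(fastq_text)]


def decompress_columns(blobs):
    """Rebuild the FASTQ text from the three compressed columns."""
    ids, seqs, quals = (zlib.decompress(b).decode().split("\n") for b in blobs)
    records = []
    for read_id, seq, qual in zip(ids, seqs, quals):
        records.extend([read_id, seq, "+", qual])
    return "\n".join(records) + "\n"


# Synthetic reads, for demonstration only.
rng = random.Random(0)
fastq = "".join(
    f"@read{i}\n{''.join(rng.choice('ACGT') for _ in range(50))}\n+\n"
    f"{''.join(rng.choice('FGHI') for _ in range(50))}\n"
    for i in range(200)
)

blobs = compress_columns(fastq)
print(len(zlib.compress(fastq.encode())), sum(len(b) for b in blobs))
print(decompress_columns(blobs) == fastq)
```

The two printed sizes allow a direct comparison on a given input; how much the columnar layout helps depends on the data.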

Biology Sequence big data analysis I

• Massive data: with the rapid development of genome sequencing technology, data is accumulating at an exponential rate
• Hyper-scale computational requirements: computations are growing in scale and complexity, which challenges the architecture
• Bioinformatics and computational biology will be the main application domains for supercomputing
• Analysis of differences and relationships in population genomics
  – TB or even PB of data
  – Complicated computational models
  – Data-intensive computing, which requires high performance from the storage, communication, and other subsystems

Biology Sequence big data analysis II

• The "TH-2" platform can provide higher-efficiency and higher-precision solutions for bioinformatics.
  – The high-speed storage system solves the massive data input/output problem of bioinformatics analysis
  – Heterogeneous computation can handle the complicated computational models
  – The high-speed communication network removes the scalability problem of parallel computation
• Application achievements based on "TH"
  – Designed and completed the high-resolution analysis software for population genomics
  – Established the software environment for GPU-accelerated bioinformatics
• The "BT+IT" application model will lead the development of bioinformatics in the future

Drug Design on Tianhe supercomputers

Drug Design on TH

• Virtual drug screening
  – The use of computational resources to find compounds that may act as drugs more effectively and efficiently.
  – A computational technique used in drug discovery to search libraries of small molecules in order to identify those structures which are most likely to bind to a drug target, typically a protein receptor or enzyme.

Application of this software in drug discovery

• Prediction
  – Docked more than 300,000 drugs/natural products/commercial compounds against 1,100 drug targets.
• Experimental validation
  – Evaluated more than 600 drugs/natural products/commercial compounds in vitro and in vivo; yielded 513 active compounds in vitro and 7 active compounds in vivo.
• Significance
  – Provides lead compounds for cancer, HBV, diabetes, etc.

Paper List

1. Fang X, ..., Xiangke Liao, Xiaoqian Zhu, Shaoliang Peng, et al. Genome-wide adaptive complexes to underground stresses in blind mole rats Spalax. Nature Communications.
2. Luo R, Heng Wang, ..., Xiaoqian Zhu, Shaoliang Peng, et al. MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC). BMC Bioinformatics.
3. Jia W, ..., Xiangke Liao, Shaoliang Peng, et al. SOAPfuse: an algorithm for identifying fusion transcripts from paired-end RNA-Seq data. Genome Biology, 2013, 14(2): R12.
4. Luo R, ..., Xiaoqian Zhu, Shaoliang Peng, et al. SOAP3-dp: Fast, Accurate and Sensitive GPU-Based Short Read Aligner. PLoS ONE, 2013, 8(5): e65632.
5. Wang J, Peng S, Cossins B P, Xiaoqian Zhu, et al. Mapping Central α-Helix Linker Mediated Conformational Transition Pathway of Calmodulin via Simple Computational Approach. The Journal of Physical Chemistry B, 2014, 118(32): 9677-9685.
6. Luo R, ..., Xiangke Liao, Xiaoqian Zhu, Shaoliang Peng, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience, 2012, 1(1): 18.
7. Feng Zhang, Xiangke Liao, Shaoliang Peng, Bingqiang Wang, Xiaoqian Zhu. MPISGA: A Program for Speeding up String Graph Based Assembly on Tianhe Supercomputer. ICG-7 & BioIT 2012, Hong Kong, 2012.
8. Yingbo Cui, Xiangke Liao, Shaoliang Peng. mBWA: a Massively Parallel Sequence Reads Aligner. PACBB 2014, Spain.

Patents, Software Copyright, and Awards

Patents
• A three-stage pipeline based parallel alignment algorithm by CPU cooperating with MIC
• Task model building method based on biological gene sequencing log
• Strategy-based deployment method of computing tasks on virtual machine

Software copyrights
• Gene sequence assembly software based on string graph theory
• High-throughput computing system for bioinformatics analysis V1.0

Awards (The Scaling Genome Big Data Analysis Software on TH-2 Supercomputer)
• The Eighth IEEE International Scalable Computing Challenge (SCALE 2015): Finalist Award
• Parallel Application Challenge 2014: Best Application Golden Award (1/85)

Summary

Which bio-applications should move to the Tianhe supercomputers? (Those whose running time is too long to tolerate.)
• Computation-intensive problems
• Data-intensive problems (big data, …)
• Network-intensive problems

The Tianhe-2 supercomputer has:
• 32,000 CPUs + 48,000 MICs
• 2 PB of memory in total + 40 PB storage
• Proprietary high-speed interconnection network

Summary: Bio-applications on TH

• SOAPdenovo2, SOAP3-dp, SOAPfuse, PacBio, GWAS: PERMORY-MPI, GAMA-GPU, MICA-BWA, TH-Cloud Computing, …

Friends' friends are good friends …
• TH series supercomputers are open to all friends, not only in life and bio research
• Big computer + big bio-medical data = big science

Thanks!

Welcome to visit us and use the TianHe supercomputer!
Email: [email protected]