Molecular Shape Searching on GPUs: A Brave New World

FastROCS: What does it mean to be “fast”? OpenEye Scientific Software Brian Cole March 26, 2013 © 2013 OpenEye Scientific Software

Upload: can-ozdoruk

Post on 18-Jan-2015

DESCRIPTION

Shape is a fundamental three-dimensional molecular property and a powerful descriptor for molecular comparison and similarity assessment; similarity in shape has proven to be a very effective predictor of similarity in biology. As such, shape-based virtual screening has become an integral part of computational drug discovery, due to both its speed and efficacy. OpenEye’s recent port of its shape similarity application, ROCS, to the GPU has resulted in a virtual screening tool of unprecedented power: FastROCS. FastROCS’ speed allows it to perform large-scale calculations of a kind inaccessible in the past, and it has accelerated more routine shape searching to the point that it has become competitive with more traditional, but less effective, two-dimensional methods. Go through the slides to learn more. Try GPUs for free here: www.Nvidia.com/GPUTestDrive

TRANSCRIPT

Page 1: Molecular Shape Searching on GPUs: A Brave New World

FastROCS: What does it mean to be “fast”?

OpenEye Scientific Software Brian Cole

March 26, 2013 © 2013 OpenEye Scientific Software

Page 2: Molecular Shape Searching on GPUs: A Brave New World

FastROCS and the “Chasm”

Page 3: Molecular Shape Searching on GPUs: A Brave New World

ROCS: Rapid Overlay of Chemical Structures

Page 4: Molecular Shape Searching on GPUs: A Brave New World

LeadHopper

Page 5: Molecular Shape Searching on GPUs: A Brave New World

And then you wait…

Page 6: Molecular Shape Searching on GPUs: A Brave New World

What is FastROCS?

[Bar chart: shape overlays per second, CPU vs GPU; higher is better]

Page 7: Molecular Shape Searching on GPUs: A Brave New World

What is FastROCS?

[Bar chart, axis ticks from 1 to 1,000,000 in powers of ten: shape overlays per second, CPU vs GPU; higher is better]

Page 8: Molecular Shape Searching on GPUs: A Brave New World

What is FastROCS?

[Bar chart, linear axis from 0 to 600,000: shape overlays per second, CPU vs GPU; higher is better]

Page 9: Molecular Shape Searching on GPUs: A Brave New World

But I want it now!

[Log-log plot: elapsed time in seconds (1 to 100,000) versus number of cores or GPUs (1 to 100), with series for ROCS and FastROCS; lower is better]

Page 10: Molecular Shape Searching on GPUs: A Brave New World

Riding Moore’s Law

[Bar chart: shape overlays per second (axis 0 to 2,000,000) for the C1060, C2050, C2075, C2090, K10, and K20; higher is better]

Page 11: Molecular Shape Searching on GPUs: A Brave New World

ROCS user base

•  Every Pharma R&D
•  Many BioTechs
•  Many Universities
•  National Labs and Research Centers
•  Other software companies

Page 12: Molecular Shape Searching on GPUs: A Brave New World

Licenses by Year

[Chart: ROCS and FastROCS licenses by year, 2009 to 2012; higher is better]

Page 13: Molecular Shape Searching on GPUs: A Brave New World

Licenses by Year (Linear Scale)

[Chart, linear scale: ROCS and FastROCS licenses by year, 2009 to 2012; annotations: “15%” and “Pharmageddon”]

Page 14: Molecular Shape Searching on GPUs: A Brave New World

All ROCS users (linear scale)

[Chart, linear scale: all ROCS users by year, 2009 to 2012, with series for Academics, ROCS, and FastROCS; annotation: “3%”]

Page 15: Molecular Shape Searching on GPUs: A Brave New World

Technology Adoption Lifecycle

[Technology adoption lifecycle curve with segments of 2.5%, 13.5%, 34%, 34%, and 16%; FastROCS is marked on the curve]

Page 16: Molecular Shape Searching on GPUs: A Brave New World

What’s in the “chasm”?

•  “ROCS is already fast enough”

•  “The results aren’t bitwise comparable”

•  “There’s nothing else to run on the GPU”

•  “GPUs are different”

GTC!

Some other time…

Page 17: Molecular Shape Searching on GPUs: A Brave New World

FastROCS Quick Start

•  Ctrl-Alt-F1 (to switch to a non-X-server terminal)
•  login as root
•  /sbin/init 3 (to turn off the X server)
•  ./NVIDIA-Linux-x86_64-285.05.09.run
•  reboot
•  ./cuda.sh to give /dev/nvidia* correct permissions

•  tar -xzf fastrocs-1.3.1-RHEL5-x64-OpenCL-1.1-CUDA-4.1.tar.gz
•  openeye/bin/ShapeDatabaseServer.py database.oeb.gz
•  openeye/bin/ShapeDatabaseClient.py localhost:8080 query.sdf out.sdf

Page 18: Molecular Shape Searching on GPUs: A Brave New World

ROCS Quick Start

•  tar -xzf ROCS-3.1.1-RHEL5-x64.tar.gz

•  openeye/bin/rocs query.sdf database.oeb.gz

Still a barrier to entry to work around!

Page 19: Molecular Shape Searching on GPUs: A Brave New World

This is even worse!

fastrocs-1.3.1-RHEL5-x64-OpenCL-1.1-CUDA-4.1.tar.gz

NVidia OpenCL binaries are tightly locked to a particular driver version

Page 20: Molecular Shape Searching on GPUs: A Brave New World

Worthwhile to upgrade

[Bar chart: conformers per second (axis 0 to 800,000) for a C2050 with the 260 driver versus the 295 driver; annotation: “11%”; higher is better]

Page 21: Molecular Shape Searching on GPUs: A Brave New World

Needed for new hardware

[Bar chart: conformers per second (axis 0 to 1,200,000) for a C2050 versus an M2090, both with the 295 driver; higher is better]

Page 22: Molecular Shape Searching on GPUs: A Brave New World

Scalability between drivers (4x C2050)

[Line plot: speedup (single-GPU time / multi-GPU time) versus number of GPUs (1 to 4), with series for ideal scaling, the 260 driver, and the 295 driver; higher is better]
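The speedup plotted on this slide is the single-GPU time divided by the multi-GPU time. A minimal Python sketch of that ratio, plus the matching fraction of ideal scaling, could look like the following. The function names are hypothetical and the timings are made-up placeholders, not measurements from the talk.

```python
def speedup(single_gpu_seconds, multi_gpu_seconds):
    """Speedup as defined on the slide: single-GPU time / multi-GPU time."""
    if single_gpu_seconds <= 0 or multi_gpu_seconds <= 0:
        raise ValueError("timings must be positive")
    return single_gpu_seconds / multi_gpu_seconds


def efficiency(single_gpu_seconds, multi_gpu_seconds, num_gpus):
    """Fraction of ideal (linear) scaling achieved; 1.0 means ideal."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return speedup(single_gpu_seconds, multi_gpu_seconds) / num_gpus


if __name__ == "__main__":
    # Placeholder timings in seconds, keyed by GPU count (illustrative only).
    timings = {1: 100.0, 2: 52.0, 4: 30.0}
    for n, t in sorted(timings.items()):
        s = speedup(timings[1], t)
        e = efficiency(timings[1], t, n)
        print(f"{n} GPU(s): speedup {s:.2f}, efficiency {e:.2f}")
```

A curve that follows the "Ideal" line has an efficiency of 1.0 at every GPU count; the further a driver's curve falls below it, the lower this number gets.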

Page 23: Molecular Shape Searching on GPUs: A Brave New World

Really bad for 8x M2090

[Line plot: speedup (single-GPU time / multi-GPU time) versus number of GPUs (1 to 8); higher is better]

Page 24: Molecular Shape Searching on GPUs: A Brave New World

Ways to transfer to device

•  CL_MEM_USE_HOST_PTR
   –  kernelBuf = clCreateBuffer(CL_MEM_USE_HOST_PTR)

•  CL_MEM_ALLOC_HOST_PTR|CL_MEM_COPY_HOST_PTR
   –  kernelBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR|CL_MEM_COPY_HOST_PTR)

•  CL_MEM_ALLOC_HOST_PTR
   –  kernelBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR) - cacheable
   –  ptr = clEnqueueMapBuffer(kernelBuf, CL_MAP_WRITE)
   –  memcpy(ptr, data)
   –  clEnqueueUnmapMemObject(ptr)

•  clEnqueueMapBuffer
   –  kernelBuf = clCreateBuffer() - cacheable
   –  ptr = clEnqueueMapBuffer(kernelBuf, CL_MAP_WRITE)
   –  memcpy(ptr, data)
   –  clEnqueueUnmapMemObject(ptr)

•  clEnqueueWriteBuffer
   –  kernelBuf = clCreateBuffer() - cacheable
   –  clEnqueueWriteBuffer(kernelBuf, data)

•  oclCopyCompute
   –  pinnedBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR|CL_MEM_READ_WRITE) - cacheable
   –  pinnedPtr = clEnqueueMapBuffer(pinnedBuf, CL_MAP_WRITE) - cacheable
   –  memcpy(pinnedPtr, data)
   –  kernelBuf = clCreateBuffer() - cacheable
   –  clEnqueueWriteBuffer(kernelBuf, pinnedPtr)

Page 25: Molecular Shape Searching on GPUs: A Brave New World

Ways to transfer from device

•  CL_MEM_ALLOC_HOST_PTR
   –  kernelBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR) - cacheable
   –  ptr = clEnqueueMapBuffer(kernelBuf, CL_MAP_READ)
   –  memcpy(data, ptr)
   –  clEnqueueUnmapMemObject(ptr)

•  clEnqueueMapBuffer
   –  kernelBuf = clCreateBuffer() - cacheable
   –  ptr = clEnqueueMapBuffer(kernelBuf, CL_MAP_READ)
   –  memcpy(data, ptr)
   –  clEnqueueUnmapMemObject(ptr)

•  clEnqueueReadBuffer
   –  kernelBuf = clCreateBuffer() - cacheable
   –  clEnqueueReadBuffer(kernelBuf, data)

•  oclCopyCompute
   –  pinnedBuf = clCreateBuffer(CL_MEM_ALLOC_HOST_PTR|CL_MEM_READ_WRITE) - cacheable
   –  pinnedPtr = clEnqueueMapBuffer(pinnedBuf, CL_MAP_READ) - cacheable
   –  kernelBuf = clCreateBuffer() - cacheable
   –  clEnqueueReadBuffer(kernelBuf, pinnedPtr)
   –  memcpy(data, pinnedPtr)

Page 26: Molecular Shape Searching on GPUs: A Brave New World

[Chart: FastROCS scalability across 8x M2070; speedup (sequential time / parallel time, axis 0 to 9) versus number of GPUs utilized (1 to 8), with five entries per GPU count]

Page 27: Molecular Shape Searching on GPUs: A Brave New World

Lessons from the mess

•  clEnqueueWriteBuffer > clEnqueueMapBuffer

•  clEnqueueMapBuffer >> clEnqueueReadBuffer

•  CL_MEM_* constants aren’t worth the effort

Page 28: Molecular Shape Searching on GPUs: A Brave New World

CUDA?

•  Serious customers will only use NVidia cards

•  Pinned memory

•  Better support for binaries and compatibility

•  CUDA support >> OpenCL support

Page 29: Molecular Shape Searching on GPUs: A Brave New World

FastROCS CUDA port

[Bar chart: conformers per second (axis 0 to 3,000,000) for OpenCL, CUDA, and CUDA-pinned on 2x C2075, 2x C2090, and 2x K20; higher is better]

Page 30: Molecular Shape Searching on GPUs: A Brave New World

CUDA Scaling?

[Line plot: conformers per second (axis 0 to 8,000,000) versus number of individual K10 GPUs (1 to 8; note that each K10 has 2 physical GPUs on the board), with series for CUDA, OpenCL, and ideal scaling; higher is better]

Page 31: Molecular Shape Searching on GPUs: A Brave New World

CUDA vs OpenCL: Ding Ding!

•  Portability vs Innovation

•  NVidia vs Intel and AMD

•  Open vs Proprietary

•  Customers don’t care…

Page 32: Molecular Shape Searching on GPUs: A Brave New World

ROCS Implementations

•  We only care a little…

•  Fortran code (1995)
•  C code (1999)
•  C++ wrapper code (2003)
•  OpenCL code (2009)
•  CUDA code (2012)
•  C++ thread-safe code (2013)

Page 33: Molecular Shape Searching on GPUs: A Brave New World

OpenEye Software

•  Lots of Software
   –  14 products
   –  13 software libraries

•  C++ (no SIMD)
   –  2.5 million lines

•  Python
   –  416 thousand lines

•  Java
   –  63 thousand lines

•  C#
   –  38 thousand lines

Page 34: Molecular Shape Searching on GPUs: A Brave New World

The People

[Chart: head count by role; values 20, 12, and 10, with labels Programmers, Hardcore Scripter, and Other stuff]

•  GPGPU = ½ of a developer
   –  Only 2.5% of development effort
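The 2.5% figure is consistent with half a developer out of 20 programmers, assuming the 20 on this slide's chart is the programmer head count (the transcript does not state that mapping explicitly). A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def effort_fraction(developers_on_task, total_developers):
    """Share of total development effort spent on one task."""
    if total_developers <= 0:
        raise ValueError("total_developers must be positive")
    return developers_on_task / total_developers


if __name__ == "__main__":
    # Assumption: 20 programmers in total, half of one working on GPGPU.
    print(f"{effort_fraction(0.5, 20):.1%}")
```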

Page 35: Molecular Shape Searching on GPUs: A Brave New World

Technology Adoption Lifecycle

[Technology adoption lifecycle curve with segments of 2.5%, 13.5%, 34%, 34%, and 16%; OpenEye GPGPU development is marked on the curve]

Page 36: Molecular Shape Searching on GPUs: A Brave New World

LinkedIn skills

[Figure: LinkedIn skills; annotation: “2.2%”]

Page 37: Molecular Shape Searching on GPUs: A Brave New World

Technology Adoption Lifecycle

[Technology adoption lifecycle curve with segments of 2.5%, 13.5%, 34%, 34%, and 16%; GPGPU development is marked on the curve]

Page 38: Molecular Shape Searching on GPUs: A Brave New World

I Believe…

•  GPGPU computing can become ubiquitous…

•  By expressing parallelism everywhere…

•  We can make it easy for our customers…
   –  Pre-installed in every operating system
   –  Integrated seamlessly into every language
   –  Then eventually becoming the CPU

Page 39: Molecular Shape Searching on GPUs: A Brave New World

Acknowledgements

•  Nikolai Sakharnykh (NVidia)
•  Dave Mullaly (HP)
•  Exxact Computing

Page 40: Molecular Shape Searching on GPUs: A Brave New World

Father of “ROCS”

Andrew Grant April 28th 1963 - December 29th 2012

Page 41: Molecular Shape Searching on GPUs: A Brave New World

Page 42: Molecular Shape Searching on GPUs: A Brave New World

Dude, where’s my color?

[Bar chart: DUD average AUC (axis 0 to 0.9) for ROCS and FastROCS, shape only versus with color]
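The metric on this slide is an AUC averaged over DUD targets. As a rough sketch of how such a number can be computed, the ROC AUC for one target equals the probability that a randomly chosen active scores higher than a randomly chosen decoy (ties counted as half). The function names and the scores below are illustrative assumptions, not OpenEye code or data from the talk.

```python
def roc_auc(active_scores, decoy_scores):
    """ROC AUC as the fraction of (active, decoy) pairs ranked correctly."""
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(active_scores) * len(decoy_scores))


def average_auc(per_target_scores):
    """Mean AUC over a list of (active_scores, decoy_scores) pairs."""
    aucs = [roc_auc(actives, decoys) for actives, decoys in per_target_scores]
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    # Made-up similarity scores for two hypothetical targets.
    targets = [
        ([0.9, 0.8], [0.1, 0.2]),
        ([0.9, 0.3], [0.5, 0.1]),
    ]
    print(average_auc(targets))
```

The pairwise loop is quadratic in the number of molecules; it is kept here for clarity, and a rank-based formulation would be the usual choice for large decoy sets.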

Page 43: Molecular Shape Searching on GPUs: A Brave New World

ROCS vs FastROCS Histogram

[Histogram: number of targets (axis 0 to 12) by Kendall tau correlation coefficient, binned from 0.10 to 1.00]
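The histogram bins targets by a Kendall tau correlation coefficient, presumably between the ROCS and FastROCS scores for each target. A minimal sketch of the simplest variant (tau-a, which makes no correction for ties) is shown below; the slide does not say which variant was used, and the function name and scores are assumptions for illustration.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("need at least two observations")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both methods order this pair the same way
        elif sign < 0:
            discordant += 1   # the two methods disagree on this pair
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


if __name__ == "__main__":
    # Made-up scores for the same four molecules from two methods.
    method_a = [0.91, 0.75, 0.60, 0.42]
    method_b = [0.88, 0.79, 0.55, 0.58]
    print(kendall_tau(method_a, method_b))
```

A value of 1.0 means the two methods rank every pair of molecules identically, which is the relevant question when results are not bitwise comparable.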