amazon resource for bioinformatics

Upload: brad-chapman

Post on 11-May-2015

DESCRIPTION

Walk through using CloudBioLinux, CloudMan, BioCloudCentral to do custom biological analyses on Amazon EC2 hardware.

TRANSCRIPT

Page 1: Amazon resource for bioinformatics

Amazon resources for bioinformatics

Brad Chapman

Bioinformatics Interest Group, 18 Oct 2012

Page 2

Goals

Automate:

Reduce steps

Remove activation energy

Increase abstraction

Improve:

Sharing

Reproducibility

Teaching

Page 3

Installation

Page 4

Easier installation

Page 5

No installation

Page 6

Challenge

Biology computing platform

Widely accessible

Customizable

Community driven

Page 7

General cloud frameworks

http://aws.amazon.com/

Page 9

CloudBioLinux

Amazon image with bioinformatics software and libraries

Automated build framework

Community effort to maintain and extend

http://cloudbiolinux.org

Page 10

CloudMan

SGE cluster plus automation

Web interface and monitoring

Persistence and sharing

Powers the Galaxy Cloud offering

http://usecloudman.org/
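
As a minimal sketch of what handing work to the SGE cluster could look like from Python, the helper below only assembles a `qsub` command line and does not submit anything. The function name, job name, and script name are hypothetical. `-N`, `-cwd`, and `-b y` are standard Grid Engine options, but whether a given CloudMan cluster needs further options (queues, parallel environments) is not covered by the slides.

```python
import shlex


def build_qsub_command(job_name, command):
    """Return the argv list for submitting `command` to Grid Engine.

    Hypothetical helper: it only assembles arguments, it never runs qsub.
    """
    if not job_name or any(ch.isspace() for ch in job_name):
        raise ValueError("job name must be non-empty and contain no whitespace")
    return [
        "qsub",
        "-N", job_name,  # job name shown in qstat
        "-cwd",          # run the job in the current working directory
        "-b", "y",       # treat the command as a binary, not a job script
    ] + shlex.split(command)


if __name__ == "__main__":
    # Placeholder job and script names, for illustration only.
    argv = build_qsub_command("align_sample1", "bash run_alignment.sh sample1.fastq")
    print(" ".join(shlex.quote(part) for part in argv))
```

Returning a list rather than a single string means the result can be passed to `subprocess.run` on the cluster head node without shell quoting problems.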

Page 11

BioCloudCentral

Automate setup of Amazon instance

Launch CloudBioLinux and CloudMan

Provide easy ssh access, no key pairs

http://biocloudcentral.org

Page 12

Galaxy

http://usegalaxy.org

Page 13

Acknowledgments

CloudBioLinux: Ntino Krampis, Tim Booth, Dawn Field, Pjotr Prins, John Chilton and the CloudBioLinux community.

CloudMan: Enis Afgan, James Taylor

BioCloudCentral: Enis Afgan, John Chilton, Dannon Baker

Page 14

Documentation

http://cda.currentprotocols.com/WileyCDA/CPUnit/refId-bi1109.html

Page 15

What we'll do

1 Sign up for Amazon

2 Start a CloudBioLinux/CloudMan instance

3 Add nodes to create a compute cluster

4 Run variant calling pipeline

Everything done through the web

Page 16

Getting started

Sign up for Amazon Web Services

http://aws.amazon.com

Get security credentials: Access Key and Secret Key

http://portal.aws.amazon.com/gp/aws/securityCredentials
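
One way to keep the Access Key and Secret Key out of scripts is to read them from the environment. The sketch below assumes the keys have been exported as `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`, the variable names conventionally used by AWS tooling. The function names are hypothetical, and nothing here contacts Amazon.

```python
import os


def load_aws_credentials(environ=None):
    """Return (access_key, secret_key) read from environment variables.

    `environ` defaults to os.environ; passing a dict makes testing easy.
    """
    env = os.environ if environ is None else environ
    access_key = env.get("AWS_ACCESS_KEY_ID", "").strip()
    secret_key = env.get("AWS_SECRET_ACCESS_KEY", "").strip()
    missing = [
        name
        for name, value in (
            ("AWS_ACCESS_KEY_ID", access_key),
            ("AWS_SECRET_ACCESS_KEY", secret_key),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("missing credentials: " + ", ".join(missing))
    return access_key, secret_key


def mask(secret, visible=4):
    """Hide all but the last `visible` characters, for safe logging."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


if __name__ == "__main__":
    try:
        access, secret = load_aws_credentials()
        print("Access Key:", mask(access))
        print("Secret Key:", mask(secret))
    except RuntimeError as err:
        print(err)
```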

Page 17

Launch: http://biocloudcentral.org

Page 18

Ready two minutes later

Page 19

Log in to CloudMan

Page 20

Shared CloudMan images

Package a complete analysis environment

Data

Customizations

Sharable with other users

Share string with NGS analysis platform:

cm-b53c6f1223f966914df347687f6fc818/shared/2012-07-23--19-23/
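
The share string above appears to have three slash-separated parts: a bucket-like identifier, a `shared` folder, and a timestamp. The sketch below splits it on that assumption, which is read off this single example rather than taken from CloudMan documentation. The class and function names are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple


class ShareString(NamedTuple):
    bucket: str        # e.g. "cm-b53c6f12..."
    folder: str        # e.g. "shared"
    created: datetime  # parsed from "YYYY-MM-DD--HH-MM"


def parse_share_string(share):
    """Split a share string into its three assumed components."""
    parts = share.strip().strip("/").split("/")
    if len(parts) != 3:
        raise ValueError("expected <bucket>/<folder>/<timestamp>/, got: " + share)
    bucket, folder, stamp = parts
    # Assumed timestamp layout, matching the example on the slide.
    created = datetime.strptime(stamp, "%Y-%m-%d--%H-%M")
    return ShareString(bucket, folder, created)


if __name__ == "__main__":
    parsed = parse_share_string(
        "cm-b53c6f1223f966914df347687f6fc818/shared/2012-07-23--19-23/"
    )
    print(parsed.bucket)
    print(parsed.folder)
    print(parsed.created.isoformat())
```

Checking the string before pasting it into the launch form catches truncated copies, which would otherwise only fail once the instance starts.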

Page 21

Start CloudMan

Page 22

CloudMan console

Page 23

CloudMan admin page

Page 24

CloudMan: managing a cluster

Page 25

Associated Galaxy instance

Page 26

Analysis data on shared instance

Page 27

Graphical variant-calling pipeline

Page 28

Analysis data linked to pipeline

Page 29

Configure pipeline

Page 30

Run pipeline

Page 31

Shut everything down

Page 32

What happened

1 Sign up for Amazon

2 Start a CloudBioLinux/CloudMan instance

3 Add nodes to create a compute cluster

4 Run variant calling pipeline

Everything done through the web

Page 33

ssh to the machine

$ ssh [email protected]

[email protected]'s password:

Welcome to Ubuntu 12.04 LTS

(GNU/Linux 3.2.0-23-virtual x86_64)

ubuntu@ip-10-72-197-11:~$
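
The prompt shows the hostname `ip-10-72-197-11`. EC2 internal hostnames of this form conventionally encode the instance's private IPv4 address, so a small helper can recover it. This is a sketch under that assumption, with a hypothetical function name. It does not look anything up on the network.

```python
import ipaddress
import re


def private_ip_from_hostname(hostname):
    """Convert an 'ip-A-B-C-D' style hostname into an IPv4Address."""
    short_name = hostname.split(".")[0]  # drop any domain suffix
    match = re.fullmatch(r"ip-(\d+)-(\d+)-(\d+)-(\d+)", short_name)
    if match is None:
        raise ValueError("not an ip-A-B-C-D hostname: " + hostname)
    # ip_address() rejects octets outside 0-255 with a ValueError.
    return ipaddress.ip_address(".".join(match.groups()))


if __name__ == "__main__":
    address = private_ip_from_hostname("ip-10-72-197-11")
    print(address, "private" if address.is_private else "public")
```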

Page 34

NX graphical client: login

http://www.nomachine.com/download.php

Page 35

NX graphical client: desktop

Page 36

Summary

Use cloud resources to build:

Machines with standard software

Cluster management

Analysis pipelines

Reproducible, sharable instances

Web-based interfaces