Using BioGrids for RNA-Seq on AWS and Your Laptop TMEC 304 July 25 2019 James Vincent BioGrids Consortium Harvard Medical School

Page 1

Using BioGrids for RNA-Seq on AWS and Your Laptop

TMEC 304 July 25 2019

James Vincent

BioGrids Consortium Harvard Medical School

Page 2

[email protected]

Today we will...

Install software with BioGrids

Run an RNA-Seq workflow

Replicate the above on AWS and a laptop

Page 3

BioGrids on AWS

Figure labels: faithful laptop; AWS EC2 Instance

Page 4

RNA-Seq Workflow

open a terminal
open a browser: biogrids.org/wiki/workshops

Page 5

Subtle Things

•  Capsule Environment
•  .bashrc/.profile not changed
•  binary installs

Page 6

Avoid Time Sinks

https://www.biostars.org/p/189261/: This seems to be a bug when installing fastqc using apt-get install fastqc

STARmanual.pdf: …. which creates problems for STAR compilation. One option to avoid this problem is to install gcc …….

http://github.gersteinlab.org/exceRpt/ Manual Installation: …. generally not recommended … <snip> … instructions on how to install exceRpt and its various dependencies will [one day] be listed toward the bottom of this page.

Page 7

Reproducible Research

$ STAR --sbapp:d

Capsule: STAR using star version 2.5.3a
Version information for: /programs/i386-mac/star

Default version: 2.5.3a
In-use version: 2.5.3a
Other available versions: none
Overrides use this shell variable: STAR_M

Self Documenting

Include in workflow:

STAR --sbapp:d
samtools --sbapp:d
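
The version report on this slide can be turned into a record that a workflow writes to its log. What follows is a minimal Python sketch, assuming the report keeps the "Label: value" layout shown above. The helper `parse_version_report` is hypothetical and not part of BioGrids, and the sample text is copied from the slide instead of being produced by running STAR.

```python
import re

# Sample report text copied from the slide. A real workflow would capture
# this from the command's output instead (an assumption, not shown here).
SAMPLE_REPORT = """\
Capsule: STAR using star version 2.5.3a
Version information for: /programs/i386-mac/star

Default version: 2.5.3a
In-use version: 2.5.3a
Other available versions: none
Overrides use this shell variable: STAR_M
"""


def parse_version_report(text):
    """Return the 'Label: value' lines of a version report as a dict."""
    fields = {}
    for line in text.splitlines():
        # Split each non-empty line at its first colon.
        match = re.match(r"^([^:]+):\s*(.*)$", line.strip())
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


report = parse_version_report(SAMPLE_REPORT)
print("STAR in-use version:", report["In-use version"])
```

Storing the parsed fields next to the results is one way to make a run self-documenting, since the exact version used travels with the output.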

Page 8

Config File

[installer]
site = biogrid-production
key = 70rYFBTDnmCr93VUklfbf1s3M4jdyC9bFVYHew==
user = jvincent1

[packages]
[email protected] = i386-mac
[email protected] = i386-mac
[email protected] = i386-mac
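
A file with this layout can be read with standard tooling. The sketch below uses Python's `configparser` and assumes the config file is INI-style, as it appears on the slide. The key and user are placeholders, and the package names `star` and `samtools` are hypothetical stand-ins, because the real package entries are not legible in the transcript.

```python
import configparser

# Placeholder values: the key is a dummy and the package names are
# hypothetical stand-ins. Only the section and option layout follows the slide.
EXAMPLE_CONFIG = """\
[installer]
site = biogrid-production
key = REPLACE_WITH_YOUR_KEY
user = example_user

[packages]
star = i386-mac
samtools = i386-mac
"""


def load_setup(text):
    """Parse an INI-style setup file into installer and package dicts."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    installer = dict(parser["installer"])
    packages = dict(parser["packages"])
    return installer, packages


installer, packages = load_setup(EXAMPLE_CONFIG)
for name, platform in sorted(packages.items()):
    print(f"{name} -> {platform}")
```

Reading the file this way makes it easy to check which packages and platform a saved setup expects before moving it to another machine.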

Page 9

This Is Handy

biogrids save mysetup.txt    biogrids reactivate mysetup.txt

faithful laptop    new workstation

Page 10

BioGrids is Portable

faithful laptop

laboratory workstation

HMS O2 compute cluster

BCH compute cluster

Page 11

BioGrids Benefits

save time - reduce headaches
scale and share workflows
part of reproducible research

Page 12

BioGrids Consortium

Compute Infrastructure

Personnel: SBGrid, BioGrids

Funding: HMS Tools and Technologies Committee

Page 13

Why BioGrids?

You    Cover of Nature

compile software
compile libraries
manage dependencies
manage versions
manage paths
change versions ....

learn to use software
optimize workflow
get science done

Page 14

RNA-Seq Overview

Harvard Chan Bioinformatics Core (HBC)

http://bioinformatics.sph.harvard.edu/training

Page 15

RNA-Seq Overview

hbctraining.github.io/Intro-to-rnaseq-hpc-O2

Biological samples / Library prep
sequence reads
quality check
adapter/quality trimming
splice-aware mapping to genome
count reads associated with genes
statistical analysis: identify differentially expressed genes

Page 16

RNA Prep

Page 17

Sequencing

Page 18

RNA-Seq Overview

hbctraining.github.io/Intro-to-rnaseq-hpc-O2

Biological samples / Library prep
sequence reads
quality check
adapter/quality trimming
splice-aware mapping to genome
count reads associated with genes
statistical analysis: identify differentially expressed genes

BioGrids Apps

FastQC
(trimmomatic)
STAR
subRead
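
The stages and applications on this slide can be written down as a small ordered table, which is handy for checking that every tool is installed before a run. This is a rough Python sketch: the pairing of each stage with a tool is inferred from the order of the two lists on the slide and is an assumption, and the function names are hypothetical.

```python
# Stage-to-tool pairing inferred from the slide's two lists; the pairing
# itself is an assumption. Tool names are spelled as on the slide, and
# trimmomatic appears there in parentheses. None marks a stage for which
# the slide names no tool.
WORKFLOW = [
    ("quality check", "FastQC"),
    ("adapter/quality trimming", "trimmomatic"),
    ("splice-aware mapping to genome", "STAR"),
    ("count reads associated with genes", "subRead"),
    ("statistical analysis", None),
]


def tools_needed(workflow):
    """List the tools named for the workflow, in stage order."""
    return [tool for _, tool in workflow if tool is not None]


def checklist(workflow):
    """Build one numbered line per stage for a run log."""
    lines = []
    for number, (stage, tool) in enumerate(workflow, start=1):
        label = tool if tool is not None else "no tool named on the slide"
        lines.append(f"{number}. {stage}: {label}")
    return lines


print("\n".join(checklist(WORKFLOW)))
```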

Page 19

Mapped Reads

Page 20

Check Results

IGV
1: Genomes / Load Genome from File... (chr1_MOV10.fa)
2: File / Load from file... (.gtf file)
3: File / Load from file... (.bam file)

Page 21

DevOps with BioGrids

workflow: bioinformatics
software stack: BioGrids
compute resources: laptop, HMS O2, AWS

Page 22

AWS Hands On

https://sbgrid.signin.aws.amazon.com/console
username: workshop21
password: Biogrids_Workshop1

Page 23

AWS - Amazon Web Services

Page 24

AWS ParallelCluster

aws-parallelcluster.readthedocs.io

scalable HPC cluster

Page 25

[email protected]

BioGrids is funded by the Harvard Medical School Tools and Technologies Committee

Page 26

Additional Resources

ENCODE data files can be found here for CalTech RNA-Seq: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/

Use this bam file: wgEncodeCaltechRnaSeqK562R1x75dAlignsRep1V2

Region of MOV10 gene: chr1:113,214,934-113,243,900

How to download whole genome:
- UCSC ftp site: hgdownload.cse.ucsc.edu
- UCSC web site: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/
- UCSC recommends using an ftp client for large file downloads
- chr1 is only 70M
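
The MOV10 region above is written with thousands separators, which have to be stripped before the coordinates can be used as numbers. The following is a small Python sketch of that step. `parse_region` is a hypothetical helper, and the length calculation treats both endpoints as inclusive, which is an assumption about how the coordinates are meant.

```python
import re


def parse_region(region):
    """Split 'chrom:start-end' (commas allowed in numbers) into its parts."""
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", region.strip())
    if match is None:
        raise ValueError(f"not a region string: {region!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start is greater than end")
    return chrom, start, end


chrom, start, end = parse_region("chr1:113,214,934-113,243,900")
length = end - start + 1  # both endpoints counted (an assumption)
print(f"{chrom}:{start}-{end} spans {length} bases")
```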

Page 27

References

TRAINING
hbctraining.github.io/Intro-to-rnaseq-hpc-O2

AWS
https://aws.amazon.com/ec2/getting-started

ENCODE
https://www.encodeproject.org

IMAGES
https://www.diagenode.com/en/categories/Library-preparation-for-RNA-seq
https://rnaseq.uoregon.edu