Teaching Bioinformatics data analysis using Medicago truncatula as a model
Vivek Krishnakumar
Session: Teaching Genetics, Genomics, Bioinformatics and Biotechnology
Plant & Animal Genome XXIV
Saturday, Jan 9th, 2016



Outline

• Background
  ◦ Medicago genome project
  ◦ Outreach mandate
  ◦ Our Vision
• JCVI Plant Bioinformatics Workshop
• Community access to workshop resources
• Related Initiatives
• Summary

Medicago genome project

• Medicago truncatula, a close relative of alfalfa, is the preeminent model for legume genomics
• Sequencing initiated in 2003, renewed in 2006, moved to curation phase in 2009
• Funded by NSF Plant Genome awards #0321460, #0604966 and #0821966, respectively

Medicago genome project activities

• Sequencing
  ◦ Sanger-based BAC sequencing
  ◦ Sequence finishing/gap closure
  ◦ NextGen sequencing (NGS) using Illumina/454
• Assembly
  ◦ Tiling-path & genetic map based genome assembly
  ◦ Whole Genome Shotgun (WGS) assembly
  ◦ Optical Map based genome assembly improvement
• Annotation
  ◦ de novo gene finding, transposon classification
  ◦ Transcriptome based gene structural annotation
  ◦ Transcriptome based Alternative Splicing (AS) detection
  ◦ Gene functional annotation
• Online Databases
  ◦ Medicago truncatula Genome Database
  ◦ Medicago Community Annotation Portal

Outreach Mandate

NSF Award #0821966: At the educational level, participating institutions will host visiting students in their laboratories for summer internships. In addition, annual workshops will be held to provide education in genome annotation and analysis to graduate students, postdoctoral fellows and interested faculty in the legume community.

http://www.nsf.gov/awardsearch/showAward?AWD_ID=0821966

Our Vision

• Genome and transcriptome sequencing is now commonplace, sequencing tech constantly evolving
• New methodologies and tools to analyze/visualize data continue to be developed and released
• Pressing need for researchers to keep abreast of new bioinformatics analysis techniques
• Goal:
  ◦ Develop a comprehensive curriculum capable of covering theoretical and practical nuances of genomic data analysis, targeted towards researchers looking to hone their bioinformatics skills

JCVI Plant Bioinformatics Workshop: Background

• Annual week-long workshop
• Started in 2010 and concluded in 2014
• Open to participants within/outside the USA
• Open to university and industry participants
• Open to remotely located participants
• Fully paid for by the NSF Award (except for international travel)
• Focused on various aspects of Genomics and Bioinformatics data analysis

JCVI Plant Bioinformatics Workshop: Presentations

• Internal instructors (from the Plant Genomics groups) present talks on topics deriving from their domain knowledge
  ◦ Linux:
    1. Getting familiar with command line interface (CLI), learning to use command line toolkits
    2. Understanding common file formats (GFF3, BED, SAM)
  ◦ Assembly:
    1. Genome sequencing technologies (454, Illumina, PacBio)
    2. Genome assembly methods and tools (SOAPdenovo, Velvet)
    3. Assembly comparison tools (nucmer)
  ◦ Annotation:
    1. Gene finding methodologies
    2. Functional annotation tools
    3. Transcriptome assembly and analysis
    4. Differential expression analysis
  ◦ Variation:
    1. Single Nucleotide Variations (SNV) and their effects
    2. Variant analysis tools
• Guest instructors present domain specific talks: small RNA analysis (Blake Meyers, DBI), Repeat analysis (Heidrun Gundlach, MIPS), Comparative genomics (Eric Lyons, UofA/iPlant), Quantifying transcript abundance (Andrew Farmer, NCGR), Synthetic Biology (Other JCVI Researchers)
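
To give a flavor of the "common file formats" topic listed above, here is a minimal Python sketch (not taken from the workshop material) that parses one GFF3 feature line and converts its coordinates to a BED-style interval. GFF3 uses nine tab-separated columns with 1-based inclusive coordinates, while BED uses 0-based half-open coordinates. The names `Gff3Feature`, `parse_gff3_line` and `to_bed_interval`, and the example record itself, are hypothetical and chosen only for illustration.

```python
from typing import Dict, NamedTuple, Optional, Tuple
from urllib.parse import unquote


class Gff3Feature(NamedTuple):
    """One GFF3 feature line (hypothetical helper type for this sketch)."""
    seqid: str
    source: str
    type: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Optional[Gff3Feature]:
    """Parse a single GFF3 line; return None for blank lines and comments."""
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None
    columns = line.split("\t")
    if len(columns) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(columns)}")
    seqid, source, ftype, start, end, score, strand, phase, raw_attrs = columns

    # Column 9 holds semicolon-separated tag=value pairs, percent-encoded.
    attributes: Dict[str, str] = {}
    if raw_attrs != ".":
        for pair in raw_attrs.split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)

    return Gff3Feature(seqid, source, ftype, int(start), int(end),
                       score, strand, phase, attributes)


def to_bed_interval(feature: Gff3Feature) -> Tuple[str, int, int]:
    """Convert 1-based inclusive GFF3 coordinates to 0-based half-open BED."""
    return (feature.seqid, feature.start - 1, feature.end)


# Made-up example record, not real Medicago annotation.
example = "chr1\texample\tgene\t1000\t2500\t.\t+\t.\tID=gene0001;Name=hypothetical%20gene"
feature = parse_gff3_line(example)
print(feature.type, feature.attributes["ID"], to_bed_interval(feature))
```

The coordinate shift in `to_bed_interval` is the detail that most often trips up newcomers moving between the two formats: only the start changes, because the BED end is exclusive.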

JCVI Plant Bioinformatics Workshop: Hands-on Sessions

• Hands-on data analysis sessions are interspersed between presentations
• Exercises are designed against real data, either generated by the Medicago project, or other published datasets
• Attendees perform all the data analysis on the command-line interface, directly on JCVI hosted computational resources
• Computational needs for remote attendees managed via cloud compute technology powered by Amazon web services

JCVI Plant Bioinformatics Workshop: Cloud-based collaboration technologies

• Cloud-based document sharing
  ◦ Google Drive platform
  ◦ Presentation and hands-on material hosted as live documents
  ◦ Content organized into logical folders
  ◦ Content accessible after workshop completion
• Cloud-based teleconferencing
  ◦ Cisco WebEx platform
  ◦ Facilitates instantaneous voice and video calling
  ◦ Share content with remote participants
  ◦ Selective recording of talks

JCVI Plant Bioinformatics Workshop: Cloud-based compute technologies

• Setting up and testing compute, data and analysis tools within JCVI enabled estimation of resource requirements in terms of CPU, RAM and storage
• Resources replicated onto the Amazon Elastic Compute Cloud (EC2) infrastructure to build Virtual Machine (VM) image
• VM image used to spawn on-demand instances as per requirements of remote attendees

Resource            Allocation (per machine)
Processing cores    20 CPU
Memory (RAM)        40 GB
Storage             150 GB

For a total of 20 users, 4 machines were allocated.
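
As a rough check on the allocation above, the average share per user can be worked out with a few lines of Python. This is only a sketch: it assumes the 20 users were spread evenly across the 4 machines, which the slides do not state, and the names `MachineSpec` and `per_user_share` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MachineSpec:
    """Resources of one machine (values taken from the table above)."""
    cores: int
    ram_gb: int
    storage_gb: int


def per_user_share(spec: MachineSpec, machines: int, users: int) -> Dict[str, float]:
    """Return the average resources available to each user.

    Assumes users are divided evenly across machines, which is an
    illustrative simplification and not something the slides state.
    """
    if machines <= 0 or users <= 0:
        raise ValueError("machines and users must be positive")
    users_per_machine = users / machines
    return {
        "users_per_machine": users_per_machine,
        "cores": spec.cores / users_per_machine,
        "ram_gb": spec.ram_gb / users_per_machine,
        "storage_gb": spec.storage_gb / users_per_machine,
    }


workshop_machine = MachineSpec(cores=20, ram_gb=40, storage_gb=150)
share = per_user_share(workshop_machine, machines=4, users=20)
for name, value in share.items():
    print(f"{name}: {value:g}")
```

Under that even-split assumption, each machine serves 5 users, and each user gets on average 4 cores, 8 GB of RAM and 30 GB of storage.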

JCVI Plant Bioinformatics Workshop: Participation

[Images labeled 2012, 2013 and 2014]

Category                         Workshop 2014   Workshop 2013   Totals
Undergrad & Graduate Students    7               8               15
Postdocs/Scientist               11              5               16
Faculty                          4               4               8
Women                            10              7               17
Universities                     14              15              29
Intl. Universities               2               2               4
Industries                       2               3               5
Govt. Agencies                   2               1               3

Community access to workshop resources

• For posterity, the complete set of workshop resources has been posted as a free-to-use Virtual Machine (VM) image available on the open-access cloud computing infrastructure, Atmosphere, developed and made available by CyVerse (formerly iPlant Collaborative)
• VM image: https://atmo.iplantcollaborative.org/application/images/899
• Presentations & Hands-on exercise material: http://j.mp/jcvi-bioinfo-workshop

Community access to workshop resources

Requirements to access these resources:
• Create an iPlant account: https://user.iplantcollaborative.org
• Request access to Atmosphere: https://pods.iplantcollaborative.org/wiki/x/mIly
• Create new instance from Workshop VM image: https://pods.iplantcollaborative.org/wiki/x/Blm
• Once instance is running, follow the SSH instructions from “Connecting to iPlant Instance” document in the Google Docs repository: http://j.mp/jcvi-bioinfo-workshop

[Figure: Layout of data and tools]

[Figure: Component-specific layout]

Similar Initiatives: OSU Summer Bioinformatics Workshop

• Annual summer workshop started in 2012
• Targeted toward students and faculty with limited background in bioinformatics
• Similar in scope to the JCVI workshop: Instructors present background information, attendees form groups and work together to analyze data and present their findings
• Part of OSU Bioinformatics Graduate Certification program
• Participants learn to use High Performance Computing systems (via OSU HPCC)
• Exposes researchers to iPlant community resources: Atmosphere (cloud), Discovery Environment (workflows)

Peter Hoyt, Dana Brunson

Similar Initiatives: OSU Summer Bioinformatics Workshop

Category                            2015   2014   Total
Undergrads                          1      0      1
Graduate Students                   12     20     32
Postdocs                            2      6      8
Faculty/staff                       6      7      13
Women or Underrepresented groups    13     27     40
Colleges Represented                4      4      8
Universities represented            4      2      6
International Universities          1      0      1
Industries                          1      1      2
Govt. Agencies                      -      2      2

Conclusion

• Developed curriculum consisting of diverse topics, maintaining relevance to current advances
• Implemented curriculum as part of training workshops over 4 year period
• Cloud computing technology utilized to expand the reach of the workshop
• Workshop materials made available to the broader community via iPlant
• Teaching material adapted and utilized by similar initiatives

Acknowledgements

JCVI Instructors
• Haibao Tang
• Shelby Bidwell
• Benjamin Rosen
• Maria Kim
• Yongwook Choi
• Agnes Chan
• Christopher Town

JCVI Guest Instructors
• Suman Pakala
• Barbara Methé
• Chuck Merryman

Guest Instructors (US)
• Eric Lyons (Arizona/iPlant)
• Nevin Young (UMN)
• Kevin Silverstein (UMN)
• Andrew Farmer (NCGR)
• Patrick Zhao (Noble Foundation)
• Steven Cannon (USDA-ARS)
• Blake Meyers (DBI)

Guest Instructors (Intl.)
• Heidrun Gundlach (MIPS)
• Jerome Gouzy (INRA)

THANK YOU!