![Page 1: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/1.jpg)
Bioinformatics|Software|Services
NOVOALIGN BASESPACE APP
Zayed Albertyn
Bioinformatics Director, Novocraft Technologies Sdn Bhd
Illumina® BaseSpace Developer Conference, San Francisco
9th December 2013
![Page 2: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/2.jpg)
Novocraft Technologies Sdn Bhd
• Incorporated in 2008, BioNexus Status Company
• Small team of Mathematicians, Biologists & Software Engineers
• Develops innovative, world-class products
• High-performance computing for the growing genomics era
• International Market & User Base
![Page 3: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/3.jpg)
Products
• Novoalign
– Illumina, 454
• NovoalignCS
– SOLiD
• Novosort
• Cluster Solutions
– NovoalignMPI, NovoalignCSMPI
• NGS WorkBench (web)
• All running on standard commodity hardware
– No special GPU/supercomputer required
– Mac OS & Linux versions available
– Open source operating system (Linux)
• NGS Cloud computing HPC workflows
– Amazon EC2/S3/EBS
![Page 4: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/4.jpg)
NGS Services
Consultation on NextGen projects
• Exome
• Whole genome
• SNV, Indel, Structural Variations (SVs)
• RNA-Seq
• ChIP-Seq
• Methylome
• Small RNA
• De novo assembly
Automated pipelines
In-house/custom and open source software
Illumina and other platforms
Cloud solutions: packaged AMIs, containers
![Page 5: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/5.jpg)
Collaborations
• Academic/research institutes
• Industry
– HPC providers
– Pharma
– Cloud solutions
• Resellers
– US and Global
![Page 6: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/6.jpg)
A few of our NOVOALIGN users
![Page 7: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/7.jpg)
User Examples
![Page 8: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/8.jpg)
NOVOALIGN
• Hash-based aligner
• Peer reviewed publications: 2009-present
• Accuracy
– SNPs and short Indels
• Read length > 250 bp as of V3.X.X
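The talk does not describe Novoalign's internal index or scoring, so the following is only a generic toy sketch of what "hash-based" seeding means in an aligner: index every k-mer of the reference in a hash table, then let the k-mers of a read vote for candidate start positions. The function names, the k-mer size and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_positions(read, index, k):
    """Return candidate read start positions, most seed votes first."""
    votes = defaultdict(int)
    for offset in range(len(read) - k + 1):
        # .get() avoids inserting missing k-mers into the defaultdict.
        for hit in index.get(read[offset:offset + k], ()):
            votes[hit - offset] += 1
    return sorted(votes, key=lambda start: (-votes[start], start))


# Toy data (hypothetical): the read is an exact copy of reference[9:17].
reference = "ACGTTGCAAGGCTTACGATCGGA"
index = build_index(reference, k=4)
best = candidate_positions("GGCTTACG", index, k=4)[0]
```

A real aligner would follow this lookup with a scored, gapped alignment at each candidate position; the sketch stops at candidate selection.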
![Page 9: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/9.jpg)
ROC Curves
• True positive rate vs false positive rate
• Higher Y value: better at finding the “true” result
• Lower X value: better at excluding “false” results
http://lh3lh3.users.sourceforge.net/alnROC.shtml
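As a rough illustration of how such a curve can be built for an aligner, the sketch below sweeps a mapping-quality threshold over simulated reads whose true origin is known and records one point per threshold. The input layout `(mapq, is_correct)`, the function name and the choice to normalise by the total number of reads are assumptions for this example, not the method used on the linked page.

```python
def roc_points(alignments, total_reads):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples.

    alignments: list of (mapq, is_correct) pairs for the reads that mapped.
    total_reads: number of simulated reads, used to normalise both axes.
    """
    points = []
    thresholds = sorted({mapq for mapq, _ in alignments}, reverse=True)
    for threshold in thresholds:
        # Keep only alignments at or above the current quality threshold.
        kept = [ok for mapq, ok in alignments if mapq >= threshold]
        true_pos = sum(kept)
        false_pos = len(kept) - true_pos
        points.append((threshold, false_pos / total_reads, true_pos / total_reads))
    return points


# Toy data (hypothetical): five simulated reads, three mapped correctly.
toy = [(60, True), (60, True), (30, True), (30, False), (0, False)]
points = roc_points(toy, total_reads=5)
```

Plotting the second value on X and the third on Y gives one curve per aligner; a curve that rises quickly while staying close to the Y axis is the better one.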
![Page 10: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/10.jpg)
The performance of various methods for mapping reads to reference repeats.
Highnam G et al. Nucl. Acids Res. 2013;41:e32
![Page 11: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/11.jpg)
![Page 12: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/12.jpg)
http://www.bioplanet.com/gcat
http://www.bioplanet.com/gcat/reports/112/variant-calls/ion-torrent-225bp-se-exome-30x/novoalign-gatk-ug/compare-183-119/group-read-depth
Genome-in-a-bottle Consortium dataset
![Page 13: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/13.jpg)
http://bcbio.wordpress.com
Courtesy Brad Chapman & Oliver Hoffman, HSPH
“Our standard workflow uses novoalign based on its stringency in resolving large insertions and deletions. These results suggest equally good results using bwa mem, along with improved processing times”
![Page 14: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/14.jpg)
Graphical representation of the total number of downstream false positives expressed as a percentage...
Oliver GR. 2012 [http://f1000r.es/NMpsFc] F1000Research 2012, 1:2 (doi: 10.12688/f1000research.1-2.v2)
![Page 15: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/15.jpg)
Novosort comparison on Illumina reads
![Page 16: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/16.jpg)
Developing on BaseSpace
![Page 17: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/17.jpg)
Motivation
• Reach out to more users
• Enable seamless integration with the cloud
• Establish BaseSpace Novoalign community
![Page 18: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/18.jpg)
What is the App?
Alignment
• Alignment Quality Calibration
• Multithreaded
• Adaptor stripping
Sorting
• Novosort
• Multithreaded
Variant Calling
• Freebayes
• SNPs & Indels
![Page 19: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/19.jpg)
What is the App?
• Novoalign
– Paired-end
– Human genome only, others later
– Caveat: requires a machine with at least 8 GB RAM
• Alignment coordinate-sorting
– Novosort
• Variant Calling
– Freebayes (Erik Garrison & Gabor Marth)
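The three stages above form a linear pipeline in which each step consumes the file produced by the previous one. A minimal sketch of that chaining follows; the stage functions are stand-ins that only rename files, and `run_pipeline`, `swap_suffix` and the file names are hypothetical. In a real app each stage would invoke Novoalign, Novosort or Freebayes (for example through `subprocess.run`) with options taken from each tool's own documentation.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[str], str]]


def run_pipeline(stages: List[Stage], first_input: str) -> List[Tuple[str, str, str]]:
    """Run stages in order, passing each stage's output path to the next.

    Returns one (stage name, input path, output path) record per stage.
    """
    log = []
    current = first_input
    for name, stage in stages:
        produced = stage(current)
        log.append((name, current, produced))
        current = produced
    return log


def swap_suffix(new_suffix: str) -> Callable[[str], str]:
    """Stand-in stage: pretend to process a file by replacing its last suffix."""
    def stage(path: str) -> str:
        return path.rsplit(".", 1)[0] + new_suffix
    return stage


# Hypothetical stand-ins for alignment, coordinate sorting and variant calling.
stages = [
    ("align", swap_suffix(".sam")),
    ("sort", swap_suffix(".sorted.bam")),
    ("call", swap_suffix(".vcf")),
]
log = run_pipeline(stages, "sample.fastq")
```

Keeping the stages as plain callables makes it easy to swap a stand-in for a real tool wrapper, or to test the orchestration without any genomics software installed.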
![Page 20: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/20.jpg)
![Page 21: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/21.jpg)
![Page 22: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/22.jpg)
New-developer Challenges
• The “Docker” way of doing things
– Image vs Container
• Front-end: JavaScript/CSS
• Back-end: Algorithms/scripting
![Page 23: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/23.jpg)
Back-end process
Front-end process
Perl/C++/R/Python
![Page 24: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/24.jpg)
Back-end Development Process
Start the Native VM
• VMware
• Linux environment
Start your own Docker repository
• Create new image on Docker.io
• Done automatically on your first push
Attach to your image
• docker run …
Make small test dataset
• Illumina cancer panel reads
• Subset chr22 alignments
Develop the app back-end process
• Automated script runs pipeline
• Alignment -> sorting -> variant calling
Postprocess
• Charting with R
• ggplot2
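The "small test dataset" step can be sketched as a filter over SAM text: keep the header lines and only those alignment records whose reference name (the third tab-separated column in the SAM format) is the chosen chromosome. The function name and the toy records below are hypothetical, and whether the reference is named `chr22` or `22` depends on the genome build in use.

```python
def subset_sam(lines, chromosome):
    """Keep SAM header lines plus alignments mapped to one chromosome."""
    kept = []
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            kept.append(line)
            continue
        fields = line.split("\t")
        # Column 3 (index 2) of a SAM record is the reference sequence name.
        if len(fields) > 2 and fields[2] == chromosome:
            kept.append(line)
    return kept


# Toy records (hypothetical), truncated to the first few SAM columns.
sam_lines = [
    "@SQ\tSN:chr21\tLN:48129895",
    "@SQ\tSN:chr22\tLN:51304566",
    "read1\t99\tchr21\t1000\t60\t8M",
    "read2\t99\tchr22\t2000\t60\t8M",
    "read3\t4\t*\t0\t0\t*",
]
small = subset_sam(sam_lines, "chr22")
```

Because the header still lists every reference sequence, the subset remains a valid SAM file; a production workflow would more likely work on an indexed BAM with a dedicated tool instead of text filtering.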
![Page 25: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/25.jpg)
Front-end Development Process
BaseSpace Developer tools
• Code editor
• Preview form inputs
Initiate test runs
• Send data to your back-end Native app
Build Report form
• Write Liquid/JS/HTML5
![Page 26: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/26.jpg)
App Screenshots
![Page 27: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/27.jpg)
![Page 28: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/28.jpg)
![Page 29: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/29.jpg)
![Page 30: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/30.jpg)
![Page 31: Talk at BaseSpace Developer conference SF 2013](https://reader036.vdocuments.site/reader036/viewer/2022062900/58ed8bbc1a28abfb068b471f/html5/thumbnails/31.jpg)
Acknowledgements
Novocraft
• Leadership: Colin Hercus, Haniza Hashim
• Bioinformatics: Akzam Saidin, Kaamesh Kaamahalaran, Abdul Malik Ahmad
• Software Development: Deepa Murugan, Sharon Chin, Laura Hamit
Illumina
• Raymond Teckotzky, Mayank Tyagi
VT/GeneByGene
• David Mittelman, Gareth Highnam, Nir Liebovich, Jason Wang
HSPH Bioinformatics Core
• Oliver Hoffman, Brad Chapman