2011-04-26_various-assemblers-presentation
EBI is an Outstation of the European Molecular Biology Laboratory.
Assembly tools and Visualisation
Matthias Haimel
Assemblers, 25.04.11
Overview
• Assemblers
  • ABySS
  • SOAPdenovo
• Visualisation
  • Tablet
  • ABySS-Explorer
• Read mapping
  • SAM / BAM
• Visualisation
  • Artemis
  • IGV - Integrative Genomics Viewer
ABySS - Assembly By Short Sequences
• Genome Sciences Centre, Vancouver
  • http://www.bcgsc.ca/platform/bioinfo/software/abyss
  • Open source, BCCA Licence
• de Bruijn graph
  • Trimming (tip clipping), bubble popping
  • Use paired-end information: resolve ambiguities between contigs
  • Parallel (use cluster)
• Files
  • Fasta / Fastq
  • Sam/Bam
  • colour-space
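The de Bruijn graph idea above can be sketched in a few lines. This is a minimal illustration, not ABySS's implementation: the reads, the value of k and the function names are made up for the example, and "tips" are simplified to plain dead-end nodes (real tip clipping also takes branch length into account).

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in a read."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def find_dead_ends(graph):
    """Return nodes that are reached by an edge but have no outgoing edge."""
    targets = {node for successors in graph.values() for node in successors}
    return sorted(node for node in targets if node not in graph)


# Hypothetical toy reads; the last one ends in a base that disagrees
# with the other two and creates a short dead-end branch.
reads = ["ACGTAC", "CGTACG", "CGTT"]
graph = de_bruijn_graph(reads, k=4)
dead_ends = find_dead_ends(graph)
```

In this toy graph the node CGT has two distinct successors (GTA and GTT); the GTT branch ends immediately, which is the kind of structure that trimming is meant to remove.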
ABySS
• ABYSS (single end)
  • e.g. ABYSS -k27 single.fastq -o contigs.fa
• abyss-pe (paired end)
  • e.g. abyss-pe k=27 n=10 in='read_1.fastq read_2.fastq' name=ecli
• Multiple libraries
  • ... lib='read1 read2' read1='read1_1.fa read1_2.fa' read2='read2_1.fa read2_2.fa'
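The multiple-library syntax is easy to get wrong by hand, so here is a small sketch that assembles the abyss-pe arguments from a mapping of library names to file pairs. The helper name, the assembly name "asm" and the file names are hypothetical; only the k, n, name and lib parameters come from the slide.

```python
import shlex


def abyss_pe_args(name, k, libraries, n=10):
    """Build the abyss-pe argument list for one or more paired-end libraries.

    `libraries` maps a library name to its (reads_1, reads_2) file pair.
    """
    args = ["abyss-pe", f"k={k}", f"n={n}", f"name={name}"]
    # lib= lists the library names; each name then gets its own parameter.
    args.append("lib=" + " ".join(libraries))
    for lib_name, (reads_1, reads_2) in libraries.items():
        args.append(f"{lib_name}={reads_1} {reads_2}")
    return args


args = abyss_pe_args(
    name="asm",  # hypothetical assembly name
    k=27,
    libraries={
        "read1": ("read1_1.fa", "read1_2.fa"),
        "read2": ("read2_1.fa", "read2_2.fa"),
    },
)
# shlex.join quotes the arguments that contain spaces for use in a shell.
command_line = shlex.join(args)
```

The list form can be handed to subprocess.run(args, check=True) directly, which avoids shell quoting altogether.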
SOAPdenovo
• Beijing Genomics Institute (BGI), China
  • http://soap.genomics.org.cn/soapdenovo.html
  • Panda genome
  • Source available
• de Bruijn graph
  • pre-set Kmer frequency threshold
  • Bubble removing
• Build scaffold
  • mapping reads to contigs
  • gap filling
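The k-mer frequency threshold mentioned above can be illustrated as follows. This is a generic sketch of the idea (drop k-mers seen fewer than a given number of times, since they are likely sequencing errors), not SOAPdenovo's code; the reads, k and min_count are made-up values.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer that occurs in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def frequent_kmers(counts, min_count):
    """Keep only k-mers seen at least `min_count` times."""
    return {kmer for kmer, count in counts.items() if count >= min_count}


# Hypothetical reads: the third differs from the others by one base.
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
counts = count_kmers(reads, k=4)
kept = frequent_kmers(counts, min_count=2)
```

Only the k-mers supported by at least two reads survive, so the two k-mers introduced by the single deviating base never enter the graph.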
SOAPdenovo
• Full run
  • e.g. SOAPdenovo all -s read.config -K 27 -o contigs.fa
• Run sub steps
  • pregraph (comparable to velveth)
  • contig (comparable to velvetg)
  • map: map reads to contigs
  • scaff: scaffolding
• Configuration
  • A config file is given as input instead of the read files
  • Specifies rank, usage (assembly/scaffolding) and insert size for each library
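A sketch of what such a config file might look like, generated from Python. The key names (max_rd_len, [LIB], avg_ins, reverse_seq, asm_flags, rank, q1, q2) are written as I recall them from the SOAPdenovo documentation and should be checked against the manual for the installed version; the file paths and insert sizes are hypothetical.

```python
def soap_config(max_rd_len, libraries):
    """Render a SOAPdenovo-style config with one [LIB] section per library."""
    lines = [f"max_rd_len={max_rd_len}"]
    for lib in libraries:
        lines += [
            "[LIB]",
            f"avg_ins={lib['avg_ins']}",      # average insert size
            "reverse_seq=0",                  # assumption: forward-reverse pairs
            f"asm_flags={lib['asm_flags']}",  # usage: contigs, scaffolds or both
            f"rank={lib['rank']}",            # order of use during scaffolding
            f"q1={lib['q1']}",                # FASTQ file, first read of each pair
            f"q2={lib['q2']}",                # FASTQ file, second read of each pair
        ]
    return "\n".join(lines) + "\n"


# Hypothetical libraries: a short-insert one for contigs and scaffolds,
# and a long-insert one used for scaffolding only.
config_text = soap_config(
    max_rd_len=100,
    libraries=[
        {"avg_ins": 200, "asm_flags": 3, "rank": 1,
         "q1": "short_1.fq", "q2": "short_2.fq"},
        {"avg_ins": 2000, "asm_flags": 2, "rank": 2,
         "q1": "long_1.fq", "q2": "long_2.fq"},
    ],
)
```

Writing config_text to read.config would give a file that can be passed with -s as in the full-run example.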
Visualisation
• Tablet
  • Lightweight
  • Easy to use
• Formats
  • ACE
  • AFG
  • BAM
  • BANK (AMOS)
http://bioinf.scri.ac.uk/tablet/
Visualisation - Velvet
• Tablet
  • velvetg ... -amos_file yes
• GraphViz
  • Transform velvet graph into GraphViz format
  • Contributed by Paul Harrison
  • <velvet>/contrib/layout/
  • Velvet -> .dot file (Python script)
  • .dot -> png (graphviz)
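To show what the intermediate .dot file contains, here is a minimal sketch that writes a small directed graph in GraphViz DOT syntax. It is not the contributed Velvet script; the node names and the function are hypothetical and only illustrate the format.

```python
def to_dot(graph, name="assembly"):
    """Serialise an adjacency mapping as a GraphViz digraph."""
    lines = [f"digraph {name} {{"]
    for source in sorted(graph):
        # Emit each distinct edge once, in a stable order.
        for target in sorted(set(graph[source])):
            lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical contig graph with a small bubble between nodes 1 and 4.
graph = {"1": ["2", "3"], "2": ["4"], "3": ["4"]}
dot_text = to_dot(graph)
```

Saved as graph.dot, the text can be rendered with the GraphViz command dot -Tpng graph.dot -o graph.png.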
Visualisation
• ABySS-Explorer
  • Visualizes ABySS assemblies
  • Interactive graph structure
  • Filter contigs
http://www.bcgsc.ca/platform/bioinfo/software/abyss-explorer
Assembler - Practical
• Assemblers
  • ABySS
  • SOAPdenovo
• Visualisation
  • Tablet
  • ABySS-Explorer
Read mapping
• SAM / BAM
  • Sequence Alignment / Map format (SAM)
  • Binary form of SAM (BAM)
  • Generic format
  • Flexible and simple
  • Compact (BAM)
  • Allow indexing
  • Load regions
  • Support streaming
http://samtools.sourceforge.net/SAM1.pdf
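Why sorting and indexing make it possible to load single regions can be shown with a toy example. This is only a conceptual sketch using a sorted list and binary search; the real BAM index is more elaborate, and a real region query also returns reads that overlap the region rather than only those that start in it. All records are made up.

```python
from bisect import bisect_left, bisect_right


def sort_by_coordinate(records):
    """Sort (reference, position, name) records by reference, then position."""
    return sorted(records, key=lambda record: (record[0], record[1]))


def starts_in_region(sorted_records, ref, start, end):
    """Return records on `ref` whose start position lies in [start, end]."""
    keys = [(record[0], record[1]) for record in sorted_records]
    first = bisect_left(keys, (ref, start))
    last = bisect_right(keys, (ref, end))
    return sorted_records[first:last]


# Hypothetical alignments as (reference, 1-based start, read name).
records = [
    ("chr20", 28834, "read_b"),
    ("chr1", 500, "read_c"),
    ("chr20", 28833, "read_a"),
    ("chr20", 40000, "read_d"),
]
ordered = sort_by_coordinate(records)
hits = starts_in_region(ordered, "chr20", 28000, 29000)
```

Once the records are in coordinate order, a region lookup touches only a contiguous slice instead of the whole file.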
SAM
• Header
  • File format version information
  • Sequence dictionary (name/length/..)
  • Read group (platform/library/...)
  • Program info
• Body
  • Alignment information
SAM Header
• '@' followed by record type (two characters)
@HD VN:1.0
@SQ SN:chr20 LN:62435964
@RG ID:L1 PU:SC_1_10 LB:SC_1 SM:NA12891
@RG ID:L2 PU:SC_2_12 LB:SC_2 SM:NA12891
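A short sketch of reading these header lines into a dictionary. It assumes the fields are separated by tabs, as they are in a real SAM file (the slide shows spaces); the function name is made up for the example.

```python
def parse_sam_header(lines):
    """Group SAM header records by their two-character record type."""
    header = {}
    for line in lines:
        if not line.startswith("@"):
            break  # header lines come before the alignment lines
        record_type, *fields = line.rstrip("\n").split("\t")
        if record_type == "@CO":
            entry = "\t".join(fields)  # comment lines hold free text
        else:
            # Every other field has the form TAG:VALUE.
            entry = dict(field.split(":", 1) for field in fields)
        header.setdefault(record_type[1:], []).append(entry)
    return header


sam_lines = [
    "@HD\tVN:1.0",
    "@SQ\tSN:chr20\tLN:62435964",
    "@RG\tID:L1\tPU:SC_1_10\tLB:SC_1\tSM:NA12891",
    "@RG\tID:L2\tPU:SC_2_12\tLB:SC_2\tSM:NA12891",
]
header = parse_sam_header(sam_lines)
read_group_ids = [group["ID"] for group in header["RG"]]
```

Record types that may occur several times, such as @SQ and @RG, end up as lists with one dictionary per line.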
SAM Alignment
• Tab delimited lines
Read_28833_29006_6945 99 chr20 28833 20 10M1D25M = 28993 195 \
    AGCT... <<<<... NM:i:1 RG:Z:L1
read_28701_28881_323b 147 chr20 28834 30 35M = 28701 -168 \
    ACCT... <<7;:... MF:i:18 RG:Z:L2
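The columns of the two example lines can be unpacked as sketched below. The field names follow the current SAM specification (the 2011 version used different names for the mate columns), the helper names are made up, and the sequence and quality strings are left truncated exactly as on the slide.

```python
import re
from collections import namedtuple

Alignment = namedtuple(
    "Alignment",
    "qname flag rname pos mapq cigar rnext pnext tlen seq qual tags",
)

# Meaning of the lower FLAG bits, as defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}


def parse_alignment(line):
    """Split one tab-delimited alignment line into its 11 mandatory fields."""
    f = line.rstrip("\n").split("\t")
    return Alignment(f[0], int(f[1]), f[2], int(f[3]), int(f[4]), f[5],
                     f[6], int(f[7]), int(f[8]), f[9], f[10], f[11:])


def decode_flag(flag):
    """List the names of all bits set in a FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def reference_span(cigar):
    """Count reference bases covered; M, D, N, = and X consume the reference."""
    operations = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(length) for length, op in operations if op in "MDN=X")


line = ("read_28833_29006_6945\t99\tchr20\t28833\t20\t10M1D25M\t=\t28993\t195"
        "\tAGCT...\t<<<<...\tNM:i:1\tRG:Z:L1")
aln = parse_alignment(line)
flags = decode_flag(aln.flag)
span = reference_span(aln.cigar)
```

Flag 99 marks the first read of a properly paired fragment whose mate is on the reverse strand; flag 147 on the second line is the matching mate.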
Tools
• Mapping Reads
  • BWA
  • Bowtie
  • SSAHA2
• Manipulate SAM/BAM
  • SAM Tools package
  • Picard
BWA
• Burrows-Wheeler Alignment Tool
  • Map (single/paired-end/long) reads to a sequence
• Index database
  • bwa index -a bwtsw database.fasta
• Align reads
  • bwa aln database.fasta short_read.fastq > aln_sa.sai
• Generate alignments
  • bwa sampe database.fasta aln_sa1.sai aln_sa2.sai read1.fq read2.fq > aln.sam
• Long reads
  • bwa bwasw database.fasta long_read.fastq > aln.sam
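For paired-end data the aln step has to run once per read file before sampe can combine the two .sai files. The sketch below spells out that order using only the commands shown above; the helper names are hypothetical, and run() assumes bwa is installed and on the PATH.

```python
import subprocess


def bwa_paired_end_steps(database, read1, read2, out_sam="aln.sam"):
    """Return (argv, output_file) pairs; output_file receives stdout or is None."""
    sai1, sai2 = "aln_sa1.sai", "aln_sa2.sai"
    return [
        (["bwa", "index", "-a", "bwtsw", database], None),
        (["bwa", "aln", database, read1], sai1),  # one aln run per read file
        (["bwa", "aln", database, read2], sai2),
        (["bwa", "sampe", database, sai1, sai2, read1, read2], out_sam),
    ]


def run(steps):
    """Execute the steps in order, redirecting stdout where a file is given."""
    for argv, output_file in steps:
        if output_file is None:
            subprocess.run(argv, check=True)
        else:
            with open(output_file, "wb") as handle:
                subprocess.run(argv, check=True, stdout=handle)


steps = bwa_paired_end_steps("database.fasta", "read1.fq", "read2.fq")
# run(steps)  # uncomment to execute; requires bwa and the input files
```

Keeping the steps as data makes it easy to print them for inspection before anything is executed.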
SAM tools
• Utilities for SAM format
  • samtools <command> ...
• Commands:
  • view: SAM <-> BAM
  • sort: sort BAM file
  • index: build BAM file index
  • merge: merges x BAM files
  • pileup: alignment in the pileup format
  • tview: integrated Text alignment viewer
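A typical use of view, sort and index is to turn a SAM file into a sorted, indexed BAM file for a viewer. The sketch below builds those three commands; the helper name and file names are hypothetical. The sort syntax changed between samtools releases (older 0.1.x versions take an output prefix, newer ones take -o with a file name), so the legacy_sort switch is an assumption to adapt to the installed version.

```python
def samtools_steps(sam, prefix, legacy_sort=False):
    """Build the argv lists that convert SAM to a sorted, indexed BAM."""
    bam = f"{prefix}.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    if legacy_sort:
        # samtools 0.1.x style: the second argument is an output prefix.
        sort = ["samtools", "sort", bam, f"{prefix}.sorted"]
    else:
        sort = ["samtools", "sort", "-o", sorted_bam, bam]
    return [
        ["samtools", "view", "-b", "-S", "-o", bam, sam],  # SAM in, BAM out
        sort,
        ["samtools", "index", sorted_bam],
    ]


steps = samtools_steps("aln.sam", "aln")
legacy_steps = samtools_steps("aln.sam", "aln", legacy_sort=True)
```

Each list can be passed to subprocess.run(argv, check=True); both sort variants leave aln.sorted.bam for the index step.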
Visualisation - Integrative Genomics Viewer
• IGV
  • Good integration
• Formats
  • DAS
  • BAM
  • GFF
  • ...
• Tools
  • Run scripts
  • Export region
  • ...
http://www.broadinstitute.org/igv/
Visualisation
• Artemis
  • Sequence Viewer
  • Annotation tool
• Formats
  • EMBL
  • GENBANK
  • GFF
  • FASTA
  • BAM
http://www.sanger.ac.uk/resources/software/artemis/
Mapping - Practical
• Mapping reads + prepare for visualisation
  • BWA
  • samtools
• Visualisation
  • IGV