advanced practical course in genome...

Practical course in genome bioinformatics Day 6 – lecture Manual annotation (Web Apollo & automatic tools) 24 Feb 2017 http://ekhidna.biocenter.helsinki.fi/downloads/teaching/spring2017 26.2.2016 [email protected] http://padlet.com/juhana_kammonen/bioinfo


Page 1: Advanced practical course in genome bioinformaticsekhidna.biocenter.helsinki.fi/downloads/teaching/spring2017/Practic… · •Web Apollo –collaborative online manual annotation

Practical course in genome bioinformatics

Day 6 – lecture: Manual annotation (Web Apollo & automatic tools)

24 Feb 2017
http://ekhidna.biocenter.helsinki.fi/downloads/teaching/spring2017

26.2.2016 [email protected] http://padlet.com/juhana_kammonen/bioinfo


Genome project “roadmap”

• After experimental design and preparations a genome project can be roughly split into the following steps:

1. Sequencing
2. (de novo) assembly, scaffolding
3. RNA-sequencing and mapping
4. Gene prediction
5. Manual & functional annotation (you are here!)
6. Submission and publication of the genome in a biodatabase
7. Further downstream analysis


Manual annotation - outline

• Repetitive element prediction (ab initio) finds and masks repeats in the genome

• Automated gene prediction reveals potential gene content from the genome
    • A challenging task, especially in eukaryotic genomes

• After gene prediction the genome must be manually curated
    • Apply a set of trained human eyes to evaluate different tracks of evidence and find potential errors in gene predictions
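
To make the repeat-masking step in the outline concrete, here is a minimal Python sketch of soft masking, where repeat regions are written in lower case so that downstream gene predictors can treat them differently. The function name, the interval convention (0-based, half-open) and the example coordinates are assumptions made for illustration; they do not describe the output of any particular repeat finder.

```python
def soft_mask(sequence, repeats):
    """Lower-case the bases inside each (start, end) repeat interval.

    Intervals are 0-based and half-open, which is an assumption of this
    sketch rather than the convention of a specific tool.
    """
    masked = list(sequence.upper())
    for start, end in repeats:
        # Clamp the interval to the sequence so bad coordinates cannot crash.
        for i in range(max(start, 0), min(end, len(masked))):
            masked[i] = masked[i].lower()
    return "".join(masked)


# Toy sequence with a hypothetical repeat call covering the poly-T stretch.
genome = "ACGTACGT" + "T" * 8 + "ACGT"
repeat_intervals = [(8, 16)]
print(soft_mask(genome, repeat_intervals))
```

Soft masking keeps the original bases recoverable, which is why it is often preferred over replacing repeats with N when the masked genome is passed on to later steps.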


The “rocky path” from de novo assembly to manual annotation


Evidence tracks not covered well by gene prediction

• RNA-sequencing coverage
    • Correlates only weakly with actual gene location, but does indicate expression level
    • Should be included as an evidence track in Web Apollo

• Splicing in eukaryotes
    • Exons may be incorrectly linked / separated in the gene models
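
One simple check a curator can run when exons look incorrectly linked or separated is whether each intron implied by the gene model has the common GT...AG boundary dinucleotides. The Python sketch below assumes a plus-strand model, 0-based half-open exon coordinates and a toy sequence; all names and coordinates are hypothetical. A non-canonical intron is not necessarily wrong (GC...AG introns exist, for example), so the result is a flag for closer inspection, not a verdict.

```python
def intron_boundaries(genome, exons):
    """Return (donor, acceptor) dinucleotides for each intron between exons.

    exons: list of (start, end) tuples, 0-based half-open, plus strand.
    """
    exons = sorted(exons)
    boundaries = []
    # Each intron is the gap between one exon's end and the next exon's start.
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        intron = genome[left_end:right_start].upper()
        boundaries.append((intron[:2], intron[-2:]))
    return boundaries


def noncanonical_introns(genome, exons):
    """Indices of introns whose boundaries are not GT...AG."""
    return [
        index
        for index, pair in enumerate(intron_boundaries(genome, exons))
        if pair != ("GT", "AG")
    ]


# Toy gene model: three exons separated by two 8 bp introns.
genome = "ATGGCC" + "GTAAAAAG" + "CCTTGA" + "GCAAAAAG" + "TTTTAA"
exons = [(0, 6), (14, 20), (28, 34)]
print(intron_boundaries(genome, exons))
print(noncanonical_introns(genome, exons))
```

In the toy model the second intron starts with GC, so it is the one reported for a closer look against the RNA-seq evidence.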


Gene prediction and splicing


Cantarel BL, Korf I, Robb SMC, Parra G, Ross E, Moore B, Holt C, Sanchez Alvarado A, Yandell M (2008). MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research, 18(1), 188–196.


Gene prediction accuracy revisited


Web Apollo manual annotation tool


Lee E, Helt GA, Reese JT, Munoz-Torres MC, Childers CP, Buels RM, Stein L, Holmes IH, Elsik CG, Lewis SE (2013). Web Apollo: a web-based genomic annotation editing platform. Genome Biology, 14(8), R93.


Web Apollo annotation view


Basic recipe for manually annotating a single gene

• Use as many relevant tracks of evidence as possible to verify a predicted gene

• Align the predicted sequence against various databases
    • BLAST
    • Possible databases of related species

• Add comments on your findings to the annotation

• If the predicted annotation was modified, specify the reason

• Web Apollo allows annotations to be marked e.g. as “needs revision”
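
The database alignment step in the recipe can be summarised programmatically once the search has been run. The Python sketch below assumes the predicted sequence was searched with BLAST+ and the results saved in tabular form (`-outfmt 6`, default 12 columns). The sequence identifiers, the example rows and the E-value and coverage thresholds are made up for illustration and should be tuned per project.

```python
from collections import namedtuple

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
Hit = namedtuple("Hit", FIELDS)


def parse_outfmt6(text):
    """Parse tab-separated BLAST hits into Hit tuples with numeric fields."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        c = line.split("\t")
        hits.append(Hit(
            c[0], c[1], float(c[2]),
            int(c[3]), int(c[4]), int(c[5]),
            int(c[6]), int(c[7]), int(c[8]), int(c[9]),
            float(c[10]), float(c[11]),
        ))
    return hits


def query_coverage(hit, query_length):
    """Fraction of the query covered by this single alignment."""
    return (abs(hit.qend - hit.qstart) + 1) / query_length


def supporting_hits(hits, query_length, max_evalue=1e-5, min_coverage=0.8):
    """Keep hits that are significant and span most of the predicted gene."""
    return [
        h for h in hits
        if h.evalue <= max_evalue
        and query_coverage(h, query_length) >= min_coverage
    ]


# Made-up example rows for a hypothetical 220-residue gene model.
example = "\n".join([
    "model_0001\trelatedA_prot1\t85.0\t200\t30\t0\t1\t200\t5\t204\t1e-100\t350",
    "model_0001\trelatedB_prot7\t40.0\t50\t30\t0\t10\t59\t1\t50\t0.002\t35.5",
])
for hit in supporting_hits(parse_outfmt6(example), query_length=220):
    print(hit.sseqid, hit.pident, round(query_coverage(hit, 220), 2))
```

A hit that covers only part of the query can still be informative: it may point to a gene model that wrongly merges two genes or is missing exons, which is worth noting in the annotation comment.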


Today’s features

• Web Apollo – collaborative online manual annotation tool
    • Lee E, Helt GA, Reese JT, Munoz-Torres MC, Childers CP, Buels RM, Stein L, Holmes IH, Elsik CG, Lewis SE (2013). Web Apollo: a web-based genomic annotation editing platform. Genome Biology, 14(8), R93.

• BLAST – Basic Local Alignment Search Tool
    • “Swiss army knife” of a bioinformatician
    • Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990). Basic local alignment search tool. Journal of Molecular Biology, 215(3), 403–410.

• MAFFT – Multiple Alignment Using Fast Fourier Transform
    • Katoh K, Misawa K, Kuma K, Miyata T (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 30(14), 3059–3066.
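
As a small illustration of how a multiple alignment can back up an annotation, the Python sketch below reads an aligned FASTA file (the kind of output a run such as `mafft --auto` writes) and computes pairwise identity between the predicted protein and a homolog. The sequence names and the toy alignment are hypothetical, and identity is computed here over columns where neither sequence has a gap, which is only one of several possible definitions.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def pairwise_identity(a, b):
    """Identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


# Toy alignment with hypothetical sequence names.
aligned = """>predicted_model
MKT-LLVA
>related_species_homolog
MKTALLIA
"""
seqs = read_fasta(aligned)
identity = pairwise_identity(
    seqs["predicted_model"], seqs["related_species_homolog"]
)
print(f"{identity:.2%}")
```

Long gaps or a sharp drop in identity over part of the alignment are often more telling than the overall number, since they can mark a missing or extra exon in the predicted model.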


Manual annotation efforts at Viikki campus

• Betula pendula (silver birch) genome annotation in spring of 2014
    • 1000 genes curated and annotated during 3 weeks

• Taphrina betulina genome annotation in 2015
    • 800 genes curated and annotated during 2 weeks

• P. hispida saimensis (Saimaa ringed seal) genome annotation coming up later this year


Next: Computer exercises

• Getting familiar with Web Apollo http://apollo.berkeleybop.org

• Example annotation of a gene in Apis mellifera (honeybee) genome with Web Apollo

• Confirmation of proper annotation using BLAST and MAFFT

• Download exercise sheet from: http://ekhidna.biocenter.helsinki.fi/downloads/teaching/spring2017/Exercises_day6.pdf
