Viewing & Getting GO COST Functional Modeling Workshop 22-24 April, Helsinki

Upload: rudolph-kory-melton

Post on 27-Dec-2015

TRANSCRIPT

  • Slide 1
  • Viewing & Getting GO COST Functional Modeling Workshop 22-24 April, Helsinki
  • Slide 2
  • Summary. 1. Ontology browsers: QuickGO and AmiGO (searching for GO terms, getting GO); Plant Ontology Browser. 2. Finding/adding GO for functional modeling: GOProfiler (summary of GO for your species); GORetriever (gets existing GO); GOanna (adds GO using BLAST); adding GO for large datasets. 3. Array annotation. 4. Using added GO in GO enrichment analysis.
  • Slide 3
  • GO Browsers. QuickGO Browser (EBI GOA Project; http://www.ebi.ac.uk/ego/): protein annotations; search by GO Term or by UniProt ID. AmiGO Browser (GO Consortium Project; http://amigo.geneontology.org/cgi-bin/amigo/go.cgi): search by GO Term or by accession.
  • Slide 4
  • Example: QuickGO Browser, http://www.ebi.ac.uk/QuickGO/
  • Slide 5
  • QuickGO Features. Searching for gene products: can use gene/protein names, but it is better to use accessions; works off UniProt accessions/IDs; can enter multiple accessions (separated by a space). Searching for GO Terms: has autocomplete; provides a ranked list of matches; matches are also grouped by BP, CC and MF. GO Term pages show definitions, annotations, and parent and child terms. Advanced filtering options. Download as protein lists or in gene association file format.
  • Slide 6
  • Record information; annotation data; number of annotations.
  • Slide 7
  • Add taxon ID for horse. (Use to find taxon ID for your species.)
  • Slide 8
  • Slide 9
  • Slide 10
  • Example: AmiGO Browser, http://amigo.geneontology.org
  • Slide 11
  • AmiGO Features. Need to select either a gene product or a GO Term search. Searching for gene products: can use gene/protein names, but it is better to use accessions; works off multiple types of accessions/IDs, but only accepts a single accession, not a list. View information about the gene product and about annotations for that gene product. Can search for GO Terms. Large numbers of GO annotations are truncated. Some filtering options: filter by ontology or evidence code; filter by database or species. Download as sequences or as a gene association file.
  • Slide 12
  • Plant Ontology (PO) Browser. Describes plant anatomy and morphology and stages of development for all plants. Plant Anatomy: e.g., plant structures (PO:0009011) such as plant organ (PO:0009008), plant cell (PO:0009002), whole plant (PO:0000003), portion of plant tissue (PO:0009007), and vascular system (PO:0000034), etc. Plant Structure Development Stage: e.g., plant tissue development stage (PO:0025423), leaf development stage (PO:0001050), whole plant development stage (PO:0007033), seed development stage (PO:0001170), and sporophyte development stage (PO:0028002), etc.
  • Slide 13
  • http://www.plantontology.org/
  • Slide 14
  • Ontology Browsers. Use them to identify specific ontology terms of interest. Use them to download annotation files for specific gene lists or species, to use as input for GO or PO expression analysis.
  • Slide 15
  • Tutorial 1. Familiarizing yourself with ontology browsers, OR use the browsers to look for GO/PO for accessions from your own data set.
  • Slide 16
  • 2. Finding/Adding GO for Functional Modeling. How much GO is available for your species? (GOProfiler) How much GO is available for your data set? (GORetriever) How much of this is in the tool(s) you want to use? (Last update? Source?) Do you need to add GO? (GOanna, Blast2GO, etc.)
  • Slide 17
  • GOProfiler. GOProfiler gives you an overview of what GO annotation exists for the species you are interested in.
  • Slide 18
  • Slide 19
  • The number of proteins is based upon GO Consortium records for these species. Species with only IEA annotations do not have an active GO annotation project; their GO is provided automatically by the EBI GOA Project.
  • Slide 20
  • GORetriever. Allows you to get existing GO annotations for a specific set of gene products. Accepts a text file of accessions or IDs. Returns GO annotations, a list of accessions that have no GO, and a GO Summary file.
  • Slide 21
  • Input file: a text file of return-separated accessions.
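A return-separated accession file like the one described above can be prepared with a few lines of Python. This is only a minimal sketch: the function names, the example accessions and the output file name are hypothetical, and the idea that blank lines and duplicates should be dropped before upload is an assumption, not a documented GORetriever requirement.

```python
def clean_accessions(raw_ids):
    """Strip whitespace, drop blanks and duplicates, keep first-seen order."""
    seen = set()
    cleaned = []
    for raw in raw_ids:
        acc = raw.strip()
        if acc and acc not in seen:
            seen.add(acc)
            cleaned.append(acc)
    return cleaned


def to_input_text(accessions):
    """Return the file body: one accession per line (return-separated)."""
    return "".join(acc + "\n" for acc in accessions)


# Hypothetical accessions, for illustration only.
example_ids = ["P12345", " Q9XYZ1 ", "", "P12345"]
cleaned_ids = clean_accessions(example_ids)
input_text = to_input_text(cleaned_ids)

if __name__ == "__main__":
    # "accessions.txt" is a placeholder name for the file you would upload.
    with open("accessions.txt", "w", newline="\n") as handle:
        handle.write(input_text)
```

Keeping the first-seen order makes it easier to compare the returned annotations with the original list.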
  • Slide 22
  • GORetriever Results
  • Slide 23
  • Slide 24
  • Add GO to this list using GOanna or Blast2GO.
  • Slide 25
  • GORetriever Results: do functional grouping using GOSlimViewer.
  • Slide 26
  • GORetriever: only returns existing GO; only accepts limited accession types. GOanna: does a BLAST search against existing GO-annotated products; allows you to quickly transfer GO to gene products that have similar sequences; accepts FASTA files.
  • Slide 27
  • If you enter an incorrect email address you will not receive your results! Contact AgBase if you have not received results after 24-48h.
  • Slide 28
  • GOanna Results If you enter an incorrect email address you will not receive your results! Contact AgBase if you have not received results after 24-48h.
  • Slide 29
  • Query IDs are hyperlinked to BLAST data (files must be in the same directory).
  • Slide 30
  • 1. Manually inspect alignments and delete any lines where there is not a good alignment*. 2. Add this additional annotation to the annotations from GORetriever. *What is a good alignment?
  • Slide 31
  • GOanna2ga. New to AgBase: an online script that converts your GOanna file to gene association file format. Use it to add manually checked GOanna annotations to a GORetriever file.
  • Slide 32
  • Tutorial 2. Getting GO. GOProfiler: check what is available. GORetriever: get existing GO. GOanna: add GO annotations. Note: you will use Blast2GO to add additional GO annotations to your data sets tomorrow. OR get existing GO and add additional GO to your own data set.
  • Slide 33
  • Some limitations of GOanna: BLAST analysis is slow; results are emailed; limited to 5,000 inputs or an overall file size of 6Mb; limited to 3 jobs submitted per user at one time. How do I get GO for my 50,000 RNA-Seq dataset? 50 x GOanna submissions + manual interpretation of results: impractical and slow!! ALTERNATIVELY: contact AgBase. We use internal GO annotation pipelines/queuing. We can help customize databases. GO can be kept private and released after publication.
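For data sets that exceed the per-submission limits quoted above, one option is to split the FASTA file into batches before uploading. The following is a rough sketch, not an AgBase tool: the function names are hypothetical, "6Mb" is read here as 6 MiB (an assumption), and the demo sequences are invented.

```python
MAX_RECORDS = 5000            # input limit quoted on the slide
MAX_BYTES = 6 * 1024 * 1024   # "6Mb" interpreted as 6 MiB (assumption)


def read_fasta_records(lines):
    """Yield each FASTA record (header plus sequence lines) as one string."""
    record = []
    for line in lines:
        if line.startswith(">") and record:
            yield "".join(record)
            record = []
        if line.strip():
            record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield "".join(record)


def batch_records(records, max_records=MAX_RECORDS, max_bytes=MAX_BYTES):
    """Group records into batches that respect both the count and size limits."""
    batch, size = [], 0
    for rec in records:
        rec_size = len(rec.encode("utf-8"))
        if batch and (len(batch) >= max_records or size + rec_size > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(rec)
        size += rec_size
    if batch:
        yield batch


# Tiny invented example with a batch limit of 2 records.
demo = [">seq1\n", "MKV\n", ">seq2\n", "MAA\n", ">seq3\n", "MTT\n"]
demo_batches = list(batch_records(read_fasta_records(demo), max_records=2))
```

Each batch can then be written to its own file with `"".join(batch)`. Records are never split across batches, so every query sequence stays intact.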
  • Slide 34
  • How do I get GO for my 50,000 RNA-Seq dataset? GOanna is being deployed on the iPlant Discovery Environment: increased computing capacity; faster BLAST searches; no limitations on file number or size. http://www.iplantcollaborative.org/
  • Slide 35
  • GO annotation of RNA-Seq data. 1. Retrieve any existing GO annotation for gene products. Genome2Seq: rapidly retrieves a FASTA file of sequences and GO based on genome co-ordinates generated from RNA-Seq data. 2. InterProScan identifies functional motifs and domains. These can be mapped to GO terms (IEA). Very compute-intensive: do this on HPC resources (being implemented on iPlant). Results improve if transcripts are translated (e.g. with EMBOSS). 3. BLAST-based similarity transfer (ISA), e.g. Blast2GO, GOanna. Should only transfer GO annotations based upon direct experimental evidence codes. Need to test a sample set to determine good matches/E-values. 4. Combine GO annotations into a single file. Remove duplicates.
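Step 4 above (combine and remove duplicates) and the evidence-code check in step 3 can be sketched in Python for tab-separated gene association lines. This is an illustrative sketch only: the function names are hypothetical, the demo rows and their reference ID are invented placeholders, and treating two lines as duplicates when they share object ID, qualifier, GO ID and evidence code is an assumption you may want to change.

```python
# Experimental GO evidence codes.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def merge_annotations(*sources):
    """Merge gene association lines from several sources, dropping duplicates.

    Columns used (0-based): 1 = DB object ID, 3 = qualifier,
    4 = GO ID, 6 = evidence code. The duplicate key is an assumption.
    """
    seen = set()
    merged = []
    for lines in sources:
        for line in lines:
            if not line.strip() or line.startswith("!"):
                continue  # skip blank lines and "!" comment/header lines
            text = line.rstrip("\r\n")
            cols = text.split("\t")
            if len(cols) < 7:
                continue  # too short to be an annotation line
            key = (cols[1], cols[3], cols[4], cols[6])
            if key not in seen:
                seen.add(key)
                merged.append(text)
    return merged


def has_experimental_evidence(gaf_line):
    """True if the line's evidence code is a direct experimental code."""
    cols = gaf_line.split("\t")
    return len(cols) > 6 and cols[6] in EXPERIMENTAL_CODES


def make_row(go_id, evidence):
    """Build an invented 15-column demo row (placeholder values throughout)."""
    return "\t".join(["UniProtKB", "P12345", "GENE1", "", go_id, "PMID:0000000",
                      evidence, "", "F", "", "", "protein", "taxon:9796",
                      "20150101", "AgBase"])


row_a = make_row("GO:0005515", "IPI")
row_b = make_row("GO:0003677", "ISA")
merged = merge_annotations([row_a, row_b], [row_a + "\n"])
transferable = [row for row in merged if has_experimental_evidence(row)]
```

In practice the first source might be the GORetriever output and the second the manually checked GOanna annotations.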
  • Slide 36
  • Slide 37
  • Adding GO Annotation. GO annotations are usually added as gene association files. Check the number of columns. You can check the file format against the GO guide: http://www.geneontology.org/GO.format.annotation.shtml Check your analysis tool: does it accept additional GO annotations, and what format is required?
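The column check mentioned above is easy to automate. A minimal sketch, assuming tab-separated lines and the standard column counts (15 for GAF 1.0, 17 for GAF 2.x); the function name is hypothetical and the demo lines are dummy data.

```python
GAF_COLUMN_COUNTS = {15, 17}  # GAF 1.0 has 15 columns, GAF 2.x has 17


def find_bad_lines(lines, allowed=GAF_COLUMN_COUNTS):
    """Return (line_number, column_count) for lines with an unexpected width."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and "!" comment/header lines
        count = len(line.rstrip("\r\n").split("\t"))
        if count not in allowed:
            problems.append((number, count))
    return problems


# Dummy data: a header, one 17-column line, one 12-column line.
demo_lines = [
    "!gaf-version: 2.0\n",
    "\t".join(["x"] * 17) + "\n",
    "\t".join(["x"] * 12) + "\n",
]
bad = find_bad_lines(demo_lines)
```

Here `bad` lists the third line, which has only 12 columns. A check like this only confirms the shape of the file; whether your analysis tool accepts it still has to be verified against that tool's documentation.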
  • Slide 38
  • GO Enrichment tools that support agricultural species.