CACAO Training
![Page 1: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/1.jpg)
CACAO Training
ASM-JGI 2012
![Page 2: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/2.jpg)
Transferring information to new genomes
Database
Lists of genes
Known functions of homologs or subsets
New knowledge
![Page 3: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/3.jpg)
Curation is rate limiting
Literature
Datasets
Biocurators (rate limiting)
Database
![Page 4: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/4.jpg)
CACAO is growing
[Bar chart: number of schools, students, and annotations per semester, Spring 2010 through Spring 2012; y-axis 0 to 2000]
![Page 5: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/5.jpg)
CACAO biodiversity
[Bar chart, Spring 2012: annotations per organism; y-axis 0 to 250. Organisms shown: E. coli, Human, Mouse, Pseudomonas, Bacillus, Arabidopsis, Streptococcus, Saccharomyces, Rat, Xanthomonas, Lactobacillus, Clostridium, Vibrio, Drosophila, Borrelia, Corynebacterium, Staphylococcus, Campylobacter, Citrobacter, Leishmania]
![Page 6: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/6.jpg)
CACAO 2
• CACAO changes the job of the professionals from primary curation to assessment
• Growth in CACAO makes assessment rate limiting
• Solution: Promote CACAO veterans to help with assessment
![Page 8: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/8.jpg)
The biocurator training …
![Page 9: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/9.jpg)
What’s in it for you?
We hope you will:
• learn how we think about protein function
• gain skills that will help your future career
• enjoy contributing to a resource used by people all over the world
• have fun!
![Page 10: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/10.jpg)
Annotation
Annotation: a note that is made while reading any form of text
For scientists:
1. Nucleotide level: where the genes are in the genome
2. Protein level: what their functions are
From Wikipedia
![Page 12: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/12.jpg)
Functional Annotation
Annotation: a note that is made while reading any form of text
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein
![Page 13: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/13.jpg)
Functional Annotation
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein
• Specific format = GO (Gene Ontology) Annotation
![Page 14: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/14.jpg)
GO (Gene Ontology) Annotations
• 3 aspects (ontologies) for describing protein attributes:
1. Biological Process
2. Molecular Function
3. Cellular Component
• Controlled vocabulary
– Everyone uses the same terms
– Terms have 7-digit IDs that computers can understand
• Relationships between terms
GO:0005886
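To make the "7-digit IDs that computers can understand" point concrete, here is a minimal Python sketch that checks whether a string has the shape of a GO term ID. The function names are hypothetical, and the check is only about format: it assumes the `GO:` prefix followed by exactly seven digits and does not confirm that the term exists in the ontology.

```python
import re

# Assumed shape of a GO term ID: the prefix "GO:" plus exactly seven digits.
GO_ID_PATTERN = re.compile(r"GO:[0-9]{7}")


def is_go_id(text: str) -> bool:
    """Return True if the whole string is shaped like a GO term ID."""
    return GO_ID_PATTERN.fullmatch(text.strip()) is not None


def find_go_ids(text: str) -> list[str]:
    """Return every GO-ID-shaped token found in free text, in order."""
    return re.findall(r"\bGO:[0-9]{7}\b", text)


if __name__ == "__main__":
    print(is_go_id("GO:0005886"))
    print(find_go_ids("GO:0016301 kinase activity; GO:0005840 ribosome"))
```

A format check like this catches dropped digits before an annotation is entered; looking the ID up in GONUTS, QuickGO, or AmiGO is still needed to confirm it is the intended term.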
![Page 15: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/15.jpg)
Molecular Function
• activities or “jobs” of a gene product
GO:0004347 hexokinase activity
GO:0016301 kinase activity
From PMID:9341134, rndsystems.com
![Page 16: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/16.jpg)
Biological Process
• a commonly recognized series of events
GO:0051301 cell division
GO:0006351 transcription, DNA-dependent
GO:0009405 pathogenesis
From ridge.icu.ac.jp, edtech.clas.pdx.edu, scielosp.org
![Page 17: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/17.jpg)
Cellular Component
• where a gene product acts
GO:0005739 mitochondrion
GO:0009274 peptidoglycan-based cell wall
GO:0005840 ribosome
From visualphotos.com, epmm.group.shef.ac.uk, http://www.cellsignal.com/products/2415.html
![Page 18: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/18.jpg)
Where can you search for GO terms?
- GONUTS: http://gowiki.tamu.edu
- QuickGO: http://www.ebi.ac.uk/QuickGO
- AmiGO: http://amigo.geneontology.org
![Page 19: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/19.jpg)
![Page 20: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/20.jpg)
![Page 21: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/21.jpg)
![Page 22: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/22.jpg)
![Page 23: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/23.jpg)
What do you actually need once you have found the correct term?
GO:0004713
![Page 24: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/24.jpg)
Functional Annotation
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein
• Specific format = GO (Gene Ontology) Annotation
• Peer-reviewed paper
![Page 25: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/25.jpg)
Finding a scientific paper
• It has to be a scientific paper with experimental data in it. (Anything else is a valid reason to challenge!!)
• No review articles, no books, no textbooks, no Wikipedia articles, no class notes…
• You will need the paper’s PMID (PubMed ID), for example 22110029
![Page 26: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/26.jpg)
Functional Annotation
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein
• Specific format = GO (Gene Ontology) Annotation
• Peer-reviewed paper
• Protein
![Page 27: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/27.jpg)
What can you annotate? Proteins.
• Search PubMed for papers on a specific topic, protein, or GO term
• Search UniProt for something interesting (e.g. allergen) or a protein of interest (e.g. PcnB)
• Check the references in the paper you are currently reading
No matter what, you will need to find the protein’s accession on UniProt (http://uniprot.org)
Use that accession to make a page for that protein on GONUTS (http://gowiki.tamu.edu)
Add your GO annotations to the protein’s page on GONUTS
![Page 28: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/28.jpg)
Why do you need an accession from UniProt (http://www.uniprot.org)?
1. UniProt is not editable by the community, but GONUTS is.
2. GONUTS can make a page that has the annotations from UniProt for any protein using its UniProt accession.
3. Correct and complete annotations at the end of the competition will be submitted back to UniProt.
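Because every protein page starts from a UniProt accession, a quick format check can catch the common mistake of pasting a gene or protein name instead. The sketch below is illustrative only: the function name is hypothetical, the pattern follows the accession format described in the UniProt help pages, and a match only means the string is shaped like an accession, not that the entry exists.

```python
import re

# Accession shape as described in the UniProt help pages:
# 6 characters (e.g. P12345) or 10 characters (e.g. A0A022YWF9).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def looks_like_uniprot_accession(text: str) -> bool:
    """Return True if the whole string is shaped like a UniProt accession."""
    return UNIPROT_ACCESSION.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    for candidate in ("P12345", "A0A022YWF9", "PcnB"):
        print(candidate, looks_like_uniprot_accession(candidate))
```

A name such as PcnB fails the check, which is the cue to search UniProt for the protein and copy the accession from its entry.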
![Page 29: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/29.jpg)
How do you make a new protein page in GONUTS?
• GoPageMaker will:
– Check if the page exists in GONUTS and take you there if it does.
– Make a page if it does not already exist in GONUTS and pull all of the annotations from UniProt into a table that you can edit.
• Make as many protein pages as you would like!
![Page 30: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/30.jpg)
Annotations
edit table
![Page 31: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/31.jpg)
Functional Annotation
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein
• Specific format = GO (Gene Ontology) Annotation
• Peer-reviewed paper
• Protein
![Page 32: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/32.jpg)
Annotations
edit table
![Page 33: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/33.jpg)
Form for your annotation (when you edit the table)
![Page 34: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/34.jpg)
4 REQUIRED parts of EVERY GO annotation
• GO
• Evidence code
• Reference
• Notes (about evidence)
![Page 35: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/35.jpg)
Summary of Evidence Codes for CACAO
Evidence codes describe the type of work or analysis done by the authors
• IDA: Inferred from Direct Assay
• IMP: Inferred from Mutant Phenotype
• IGI: Inferred from Genetic Interaction
• ISO: Inferred from Sequence Orthology
• ISA: Inferred from Sequence Alignment
• ISM: Inferred from Sequence Model
• IGC: Inferred from Genomic Context
If it’s not one of these 7, your annotation is incorrect!!!
http://gowiki.tamu.edu/wiki/index.php/evidence_codes
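The "one of these 7" rule can be expressed as a simple lookup. This is a minimal sketch, not part of GONUTS: the dictionary holds the seven codes listed on this slide, and the function name and error message are hypothetical.

```python
# The seven evidence codes accepted in CACAO, as listed on this slide.
CACAO_EVIDENCE_CODES = {
    "IDA": "Inferred from Direct Assay",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "ISO": "Inferred from Sequence Orthology",
    "ISA": "Inferred from Sequence Alignment",
    "ISM": "Inferred from Sequence Model",
    "IGC": "Inferred from Genomic Context",
}


def describe_evidence_code(code: str) -> str:
    """Return the full name of a CACAO evidence code, or raise ValueError."""
    normalized = code.strip().upper()
    if normalized not in CACAO_EVIDENCE_CODES:
        allowed = ", ".join(sorted(CACAO_EVIDENCE_CODES))
        raise ValueError(f"{code!r} is not a CACAO evidence code; use one of: {allowed}")
    return CACAO_EVIDENCE_CODES[normalized]


if __name__ == "__main__":
    print(describe_evidence_code("IDA"))
```

Any code outside this set, including other codes that exist in GO, is rejected, which mirrors the rule that such an annotation counts as incorrect in CACAO.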
![Page 36: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/36.jpg)
Functional Annotation
Functional Annotation: a note in a specific format that is made based on evidence in a peer-reviewed paper about the attributes of a protein
• Specific format = GO (Gene Ontology) Annotation
• Peer-reviewed paper
• Protein
• Evidence code
![Page 37: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/37.jpg)
4 REQUIRED parts of EVERY GO annotation
• GO
• Evidence code
• Reference
• Notes (about evidence)
![Page 38: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/38.jpg)
2 other parts that may rarely be required…
With/From
Qualifier
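One way to picture the four required parts plus the two rarely required ones is as a record with a completeness check. The sketch below is an illustration, not the GONUTS form: the class, field, and function names are hypothetical, and the example values are placeholders borrowed from IDs shown on earlier slides rather than a real annotation.

```python
from dataclasses import dataclass


@dataclass
class GoAnnotation:
    # The four parts required in every annotation.
    go_id: str = ""
    evidence_code: str = ""
    reference: str = ""
    notes: str = ""
    # The two parts that are only rarely required.
    with_from: str = ""
    qualifier: str = ""


REQUIRED_FIELDS = ("go_id", "evidence_code", "reference", "notes")


def missing_required(annotation: GoAnnotation) -> list[str]:
    """Return the names of required fields that are empty or whitespace."""
    return [name for name in REQUIRED_FIELDS if not getattr(annotation, name).strip()]


if __name__ == "__main__":
    # Placeholder values only; this does not describe a real annotation.
    draft = GoAnnotation(
        go_id="GO:0004713",
        evidence_code="IDA",
        reference="PMID:22110029",
    )
    print(missing_required(draft))
```

Running the check on a draft before submitting shows which required part is still empty; here the notes about the evidence have not been filled in yet.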
![Page 39: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/39.jpg)
How is CACAO scored? Rounds
• Points for a complete AND correct annotation (normally 1 week/round, today = 25 mins)
– 4 necessary parts
– May be additional parts
– NOTE: We will take away points if the annotation is not correct when assessed by an experienced CACAO biocurator
• Challenges are used to steal points for incorrect and/or incomplete annotations (normally 1 week/round, today = 20 mins)
– Identify a problem
– Suggest correct alternative
• Refinements can be entered by any team (during any challenge week)
![Page 40: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/40.jpg)
Scoreboard & Challenges
http://gowiki.tamu.edu/wiki/index.php/Category:ASM_JGI_challenge
![Page 41: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/41.jpg)
Team & Individual Pages
challenge
![Page 42: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/42.jpg)
Challenges
1. Enter the reason for your challenge here. - (i.e. What’s wrong)
2. Provide the fix(es) for it.
![Page 43: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/43.jpg)
Annotation discussion (aka argument)
![Page 44: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/44.jpg)
• UniProt – http://uniprot.org
– Find your protein(s) here (UniProt accession required)
• PubMed – http://pubmed.org
– Find your papers about the protein’s attributes (molecular function, biological process, cellular component)
• GONUTS – http://gowiki.tamu.edu
– Search for GO terms
– Make a page for your protein on GONUTS (using its UniProt accession)
– Add your annotation to the protein’s Annotation table during the first (annotation) week of any round
– Review and challenge competitors’ annotations during the second (challenge) week of any round
![Page 45: CACAO Training](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f68550346895dce682f/html5/thumbnails/45.jpg)
ASM-JGI Competition!
• You now have 25 mins to:
– Use the assigned paper for your group and …
– Find the correct UniProt accession
– Make the page for the protein on GONUTS
– Make at least one annotation
• You will have 20 mins to challenge other teams’ annotations
– What fields are wrong & why?!