Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

Ramy K. Aziz, San Diego State University & Cairo University. Rocky 2009, Dec 10 2009. Nature’s most successful genes?

Upload: ramykaram

Post on 13-Jan-2015


DESCRIPTION

Oral presentation at the 7th Annual Rocky Mountain Bioinformatics Meeting (http://www.iscb.org/rocky09). Most results were published in http://nar.oxfordjournals.org/cgi/content/full/gkq140v1

TRANSCRIPT

Page 1: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

Ramy K. Aziz, San Diego State University & Cairo University

Rocky 2009, Dec 10 2009

Nature’s most successful genes?

Page 2: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)
Page 3: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)
Page 4: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

• What is prevalence? For an object x,

– Ubiquity (number of sets to which x belongs)

– Abundance (“average” frequency of x in a set)

@sets = (genomes, metagenomes, biomes)

• What to count (PEG / EGT / function / family)?

• How to count, and where (genomes / metagenomes)?

– Gene length matters: normalize frequency by gene length (frequency / gene length)

– Metagenome size matters: use relative abundance

Spelling out the question: half the way to the answer
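
The definitions on this slide can be sketched in code. The following is a minimal illustration, not the method used in the study. It assumes each set (genome or metagenome) is given as a list of function labels, one per sequence, and that a typical gene length per function is known. The function name `prevalence`, the input layout, and the choice to average over all sets (counting a function as zero where it is absent) are assumptions made for this example.

```python
from collections import Counter, defaultdict


def prevalence(sets, gene_lengths):
    """Return {function: (ubiquity, abundance)}.

    sets         -- hypothetical layout: {set_name: [function label per sequence]}
    gene_lengths -- hypothetical layout: {function: typical gene length in bp}

    ubiquity  = number of sets in which the function occurs at least once
    abundance = mean over all sets of (count / set size) / gene length
    """
    ubiquity = Counter()
    normalized_sum = defaultdict(float)

    for hits in sets.values():
        set_size = len(hits)
        for function, count in Counter(hits).items():
            ubiquity[function] += 1
            # Dividing by set size gives relative abundance, so large
            # metagenomes do not dominate; dividing by gene length corrects
            # for long genes attracting more sequence hits.
            relative = count / set_size
            normalized_sum[function] += relative / gene_lengths[function]

    n_sets = len(sets)
    return {
        function: (ubiquity[function], normalized_sum[function] / n_sets)
        for function in ubiquity
    }


# Toy data, invented purely for illustration.
example_sets = {
    "genome_A": ["transposase", "transposase", "rubisco", "abc_transporter"],
    "metagenome_B": ["transposase", "abc_transporter"],
    "metagenome_C": ["abc_transporter", "abc_transporter"],
}
example_lengths = {"transposase": 1000, "rubisco": 1500, "abc_transporter": 750}

for function, (ubi, abun) in sorted(prevalence(example_sets, example_lengths).items()):
    print(f"{function}: ubiquity={ubi}, abundance={abun:.6f}")
```

Under these assumptions a function can be ubiquitous without being abundant, and the reverse, which is why the slide treats the two measures separately.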

Page 5: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

• Current knowledge: RuBisCO (ribulose-1,5-bisphosphate carboxylase) is the enzyme with the highest copy number (mass?) in ecosystems. However, its gene is neither the most ubiquitous nor the most abundant.

• Any guesses? (An enzyme? A transcription factor? A transporter? DNA metabolism? Carbohydrate metabolism?)

– Guess 1:

– Guess 2:

– Guess 3:

Spelling out the question: half the way to the answer

Page 6: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

And the winner is …

Page 7: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

And the winner is …

Page 8: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

Metagenomes: 187 sets; 6 million sequences

Pearson Corr. 0.524

Figure labels: eco-essentiality; life essentials; fertility; habitat-specific

Page 9: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

Gene ubiquity in genomes (2,137)

Pearson Corr. 0.645

Transposase

ABC transporter ATP-binding

Glycosyltransferase

ABC transporter permease

Two-component sensor/regulator

tRNA synthetases

Page 10: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

(How/Why) Does it matter?

• Current annotations suck! Improvement needed.

• Transposases no longer ‘junk hypothetical proteins’; their quorum dictates attention!

• The ‘selfish’ transposase genes must be offering their hosts some advantage.

• If rRNA is used to track genomes’ vertical history, transposases can track ‘horizontal’ history.

• Cheaters (always?) win…

• Transposases shall inherit the earth?

Page 11: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

• This study would not have been possible without…

Rob Edwards & Mya Breitbart

And:

• Forest Rohwer, Liz Dinsdale, Anca Segall, Peter Salamon, & the Math group

• NSF funding (PhAnToMe grant)

Acknowledgment