the genome

The WordSeeker Functional Genomics Toolkit. Lonnie Welch, Stuckey Professor, Bioinformatics Laboratory, Electrical Engineering and Computer Science, Biomedical Engineering Program, Molecular and Cellular Biology Program, Ohio University. [email protected]


Post on 30-Dec-2015


TRANSCRIPT

Page 1: The genome

The WordSeeker Functional Genomics Toolkit

Lonnie Welch, Stuckey Professor
Bioinformatics Laboratory
Electrical Engineering and Computer Science
Biomedical Engineering Program
Molecular and Cellular Biology Program
Ohio University
[email protected]

Page 2: The genome

The genome

genes

junk

Genes: 3% Junk: 97%

"DNA differs from written language in that islands of sense are separated by a sea of nonsense, never transcribed." (Richard Dawkins, 2004)

"So much junk DNA in our genome." (S. Ohno, 1972)

Page 3: The genome

“The aim of the ENCODE (encyclopaedia of DNA elements) project is to identify every sequence with functional properties in the human genome.”

Some highlights of the pilot phase of this project:

• involved an analysis of 1% (30 megabases) of the human genome
• remarkably:
  • much functional information is not “conserved” across organisms
• up to 93% of bases in the ENCODE regions are transcribed
• not good news for genes, which will no longer be able to hog the limelight
• the genome is much more than a mere vehicle for genes

[1] John M. Greally, Genomics: Encyclopaedia of humble DNA, Nature 447, 782-783 (14 June 2007).

Page 4: The genome

The genome

genes

Functional elements?

Functional Elements: 90%?? Junk: 10%??

“Perhaps it is time to bid farewell to the term ‘junk’ DNA – we knew not your true nature.” (Regulatory RNAs and the demise of ‘junk’ DNA. Genome Biology 2006, 7:328)

"...a certain amount of hubris was required for anyone to call any part of the genome 'junk,' given our level of ignorance." (Francis Collins, 2006)

Page 5: The genome

WordSeeker

Page 6: The genome

WordSeeker Users

OU
• Sarah Wyatt (NSF, NASA) – plant gravitropism; regulatory genomics
• Allan Showalter (NSF, USDA) – cell wall genes; functional genomics
• Susan Evans (NIH) – regulatory aspects of cancer

OARDC
• Eric Stockinger (NSF) – cold tolerance in crops

OSU
• Erich Grotewold (NSF, USDA, DOE) – genome-wide regulatory genomics
• Rebecca Lamb (NSF) – cell development

BGSU
• Paul Morris (NSF, DOE) – homology in Oomycete promoters

National Human Genome Research Institute (NIH)
• Laura Elnitski – regulatory aspects of cancer

Centers for Disease Control
• Henry Wan – avian flu

Page 7: The genome

Genome Database

• organized in six major organism groups: Archaea, Bacteria, Eukaryotae, Viruses, Viroids, and Plasmids

• provides views for a variety of genomes, complete chromosomes, sequence maps with contigs, and integrated genetic and physical maps

source: National Center for Biotechnology Information, April 2008.

Page 8: The genome

Additional Suggestions (Prof. Frank Drews, OU EECS)

Desirable hardware features:
• Memory-intensive applications utilizing many cores will saturate the front-side buses
• Large number of cores, high front-side bus bandwidth
• Large last-level caches
• Equip cluster nodes with the new Graphics Processing Units (GPUs), such as NVIDIA's GeForce 8800 series GPUs, for memory-intensive algorithms
  • Can off-load some processing to these GPUs
  • A number of recent bioinformatics algorithms run on these GPUs and show impressive speed-ups
    • E.g., M. Schatz, C. Trapnell, A. Delcher, A. Varshney, High-throughput sequence alignment using Graphics Processing Units, BMC Bioinformatics, Vol. 8, No. 1 (2007)

Page 9: The genome
Page 10: The genome

Ohio Bioinformatics Consortium

Statewide Bioinformatics Curriculum
• Comprehensive curriculum
• Shared courses
• Managed by Ralph Regula School

Bioinformatics Research Infrastructure
• State-of-the-art
• Biological researchers define requirements
• Bioinformatics researchers design algorithms to meet requirements
• Ohio Supercomputer Center integrates, hosts and supports bioinformatics software

$9M will be invested over the next 5 years via Choose Ohio First.