Manifestations of a Code Genes, genomes, bioinformatics and cyberspace – and the promise they hold for biology education

Upload: felix-cook

Post on 28-Dec-2015


Page 1: Manifestations of a Code Genes, genomes, bioinformatics and cyberspace – and the promise they hold for biology education

Manifestations of a Code

Genes, genomes, bioinformatics and cyberspace – and the promise they hold for biology education

Page 2:

The iPlant Collaborative Vision

www.iPlantCollaborative.org

Enable life science researchers and educators to use and extend cyberinfrastructure

Page 3:

A GENOME is all of a living thing’s genetic material.

The genetic material is DNA (DeoxyriboNucleic Acid)

DNA, a double helical molecule, is made up of four nucleotide “letters”: A, G, T, and C.

What is a genome?

Slide: JGI, 2009

Page 4:

Just as computer software is rendered in long strings of 0s and 1s, the GENOME or “software” of life is represented by a string of the four nucleotides, A, G, C, and T.

To understand the software of either - a computer or a living organism - we must know the order, or sequence, of these informative bits.

What is sequencing?

Slide: JGI, 2009
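The comparison between binary software and the four-letter genome can be made concrete. The sketch below is an illustration only and is not part of the original slides: it packs each nucleotide into two bits (the assignment A=00, C=01, G=10, T=11 is an arbitrary assumption) and shows that the sequence can be recovered as long as the order of the letters is preserved.

```python
# Illustrative 2-bit encoding of DNA.
# The bit assignment below is an arbitrary assumption, not a standard.
ENCODE = {"A": "00", "C": "01", "G": "10", "T": "11"}
DECODE = {bits: base for base, bits in ENCODE.items()}


def dna_to_bits(seq: str) -> str:
    """Translate a DNA string into a string of 0s and 1s, two bits per base."""
    return "".join(ENCODE[base] for base in seq.upper())


def bits_to_dna(bits: str) -> str:
    """Reverse the encoding by reading the bit string two characters at a time."""
    return "".join(DECODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


if __name__ == "__main__":
    fragment = "GAAATAATCA"  # a short example fragment
    encoded = dna_to_bits(fragment)
    print(encoded)
    print(bits_to_dna(encoded) == fragment)
```

Because decoding reads the bits strictly in order, shuffling either representation changes the message, which is why sequencing has to establish the order of the bases and not just their counts.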

Page 5:

[Figure: Economics of Scale. Sequence production (billions of bases/month) and cost (cents per base) by year, 1989–2007, with the points "Human Genome launched" and "Human Genome completed" marked.]

Slide: JGI, 2009

Page 6:

• 1986 DOE announces Human Genome Initiative: $5.3 million to develop technology.

• 1990 DOE & NIH present their HGP plan to Congress.

• 1997 Escherichia coli genome published.

• 1997 Yeast genome published.

• 2000 Fruit fly (Drosophila) genome published.

• 2000 Working draft of the human genome announced.

• 2000 Thale cress (Arabidopsis) genome published (2x).

• 2002 Rice genome published (2x).

• 2003 Human genome published.

• 2006 First tree genome published in Science.

• 2007 First metagenomics study published.

Important Dates in Genomics

Page 7:

Another angle

Slide: Stein, 2010

Page 8:

Coming into the Genome Age

For the first time in the history of science, students can work with the same data and tools that are used by researchers.

Learning by posing and answering questions.

Students generate new knowledge.

Page 9:

Workshop Objectives

• Illustrate the evolving concept of “gene.”
• Conceptualize a “big picture” of complex, dynamic genomes.
• Guide students to address real problems through modern genome science.
• Use educational and research interfaces for bioinformatics.
• Work with “real” genome sequences gathered by students, in the lab or online.

Page 10:

Exciting?

>mouse_ear_cress_1080 GAAATAATCAATGGAATATGTAGAGGTCTCCTGTACCTTCACAGAGATTCTAGGCTGAGAGCAGTGCATATAGATATCTTTCGTACTCATCTGCTTTTTCTGGTCTCCATCACAAAAGCCAACTAGGTAATCATATCAATCTCTCTTTACCGTTTACTCGACCTTTTCCAATCAGGTGCT TCTGGTGTGTCTACTACTATCAGTTTTAGGTCTTTGTATACCTGATCTTATCTGCTACTG AGGCTTGTAAAAGTGATTAAAACTGTGACATTTACTCTAAGAGAAGTAACCTGTTTGATGCATTTCCCTAATATACCGGTGTGGAAAAGTGTAGGTATCTGTACTCAGCTGAAATGGTGGACGATTTTGAAGAAGATGAACTCTCATTGACTGAAAGCGGGTTGAAGAGTGAAGATGGCGTTATTATCGAGATGAATGTCTCCTGGATGCTTTTATTATCATGTTTGGGAATTTACCAAGGGAGAGGTATCAGAATCTATCTTAGAAGGTTACATTTAGCTCAAGCTTGCATCAACATCTTTACTTAGAGCTCTACGGGTTTTAGTGTGTTTGAAGTTTCTTAACTCCTAGTATAATTAGAATCTTCTGCAGCAGACTTTAGAGTTTTGGGATGTAGAGCTAACCAGAGTCGGTTTGTTTAAACTAGAATCTTTTTATGTAGCAGACTTGTTCAGTACCTGAATACCAGTTTTAAATTACCGTCAGATGTTGATCTTGTTGGTAATAATGGAGAAACGGAAGAATAATTAGACGAAACAAACTCTTTAAGAACGTATCTTTCAGTTTTCCATCACAAATTTTCTTACAAGCTACAAAAATCGAACTATATATAACTGAACCGAATTTAAACCGGAGGGAGGGTTTGACTTTGGTCAATCACATTTCCAATGATACCGTCGTTTGGTTTGGGGAAGCCTCGTCGTACAAATACGACGTCGTTTAAGGAAAGCCCTCCTTAACCCCAGTTATAAGCTCAAAGTTGTACTTGACCTTTTTAAAGAAGCACGAAACGAAAAACCCTAAAATTCCCAAGCAGAGAAAGAGAGACAGAGCAAGTACAGATTTCAACTAGCTCAAGATGATCATCCCTGTTCGTTGCTTTACTTGTGGAAAGGTTGATATTTTCCCCTTCGCTTTGGTCTTATTTAGGGTTTTACTCCGTCTTTATAGGGTTTTAGTTACTCCAAATTTGGCTAAGAAGAGATCTTTACTCTCTGTATTTGACACGAATGTTTTTAATCGGTTGGATACATGTTGGGTCGATTAGAGAAATAAAGTATTGAGCTTTACTAAGCTTTCACCTTGTGATTGGTTTAGGTGATTGGAAACAAATGGGATCAGTATCTTGATCTTCTCCAGCTCGACTACACTGAAGGGTAAGCTTACAATGATTCTCACTTCTTGCTGCTCTAATCATCATACTTTGTGTCAAAAAGAGAGTAATTGCTTTGCGTTTTAGAGAAATTAGCCCAGATTTCGTATTGGGTCTGTGAAGTTTCATATTAGCTAACACACTTCTCTAATTGATAACAGAAGCTATAAAATAGATTTGCTGATGAAGGAGTTAGCTTTTTATAATCTTCTGTGTTTGTGTTTTACTGTCTGTGTCATTGGAAGAGACTATGTCCTGCCTATATAATCTCTATGTGCCTATCTAGATTTTCTATACAATTGATATTTGATAGAAGTAGAAAGTAAGACTTAAGGTCTTTTGATTAGACTTGTGCCCATCTACATGATTCTTATTGGACTAATCATTCTTTGTGTGAAAATAGAATACTTTGTCTGAACATGAGAGAATGGTTCATAATACGTGTGAAGTATGGGATTAGTTCAACAATTTCGCTATTGGAGAAGCAAACCAAGGGTTAATCGTTTATAGGGTTAAGCTAATGCTCTGCTCTTTATATGTTATTGGAACAGACTATTGTTGTGCCTATCTTGTTTAGTTGTAG
ATTCTATCTCGACTGTTATAAGTATGACTGAAGGCTTGATGACTTATGATTCTCTTTACACCTGTAGAAGGATTTAAGCTTGGTGTCTAGATATTCAATCTGTGTTGGTTTTGTCTTTCTTTTGGCTCTTAGTGTTGTTCAATCTCCTCAATAGGTATGAAGTTACAATATCCTTATTATTTTGCAGGGACGCACTTGATGCACTCCAGCTAGTCAGATACTGCTGCAGGCGTATGCTAATGACCTTGCATCAACATCTTTACTTAGAGCTCTACGGGTTTTAGTGTGT

Page 11:

This better?

Page 12:

Annotation workflow

• Find gene families
• Generate mathematical evidence
• Analyze large data amounts
• Browse in context
• Build gene models
• Gather biological evidence
• Get DNA sequence

Page 13:

Walk or…

Page 14:

Early concept (2009)

Page 15:

DNA Subway 2014

Page 16:

Molecular biology and bioinformatics concepts

RepeatMasker
• Eukaryotic genomes contain large amounts of repetitive DNA.
• Transposons can be located anywhere.
• Transposons can mutate like any other DNA sequence.
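The idea of flagging repetitive DNA before predicting genes can be shown with a toy script. This is a minimal sketch under stated assumptions, not how RepeatMasker itself works: the function name `soft_mask` and the one-entry repeat library are hypothetical, and only exact matches are found, whereas real repeats mutate and need approximate matching. Matching stretches are written in lowercase ("soft masking") so that later steps can skip them.

```python
from typing import Sequence


def soft_mask(genome: str, repeat_library: Sequence[str]) -> str:
    """Lowercase every exact occurrence of a library repeat in the genome.

    Hypothetical helper for illustration; exact matching only.
    """
    upper = genome.upper()
    masked = list(upper)
    for repeat in repeat_library:
        repeat = repeat.upper()
        if not repeat:
            continue  # an empty pattern would match everywhere
        pos = upper.find(repeat)
        while pos != -1:
            for i in range(pos, pos + len(repeat)):
                masked[i] = masked[i].lower()
            pos = upper.find(repeat, pos + 1)  # also catches overlapping copies
    return "".join(masked)


if __name__ == "__main__":
    # Made-up sequence and repeat, chosen only to show the effect.
    print(soft_mask("AAGATTACAGG", ["GATTACA"]))
```

Since transposons can sit anywhere and accumulate mutations, an exact-match search like this one would miss most real copies; it only illustrates what "masking" means.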

FGenesH Gene Predictor
• Protein-coding information begins with a start codon, continues through a series of codons, and ends in a stop codon.
• Codons in mRNA (AUG, UAA,…) have sequence equivalents in DNA (ATG, TAA,…).
• Most eukaryotic introns have “canonical splice sites,” GT---AG (mRNA: GU---AG).
• Gene prediction programs search for patterns to predict genes and their structure.
• Different gene prediction programs may predict different genes and/or structures.
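The patterns listed for gene prediction can be illustrated with a toy script. This is a minimal sketch, not how FGenesH works (real gene predictors rely on statistical models); the function names and example sequences are hypothetical. It converts DNA codons to their mRNA equivalents, scans one strand for an open reading frame running from ATG to the first in-frame stop codon, and checks whether a candidate intron has the canonical GT...AG boundaries.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def dna_to_mrna(coding_strand: str) -> str:
    """mRNA carries the coding strand's sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def find_first_orf(dna: str) -> Optional[str]:
    """Return the first ATG...stop stretch read in frame, or None if absent."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step through the sequence codon by codon from this start.
        for i in range(start, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)  # try the next possible start
    return None


def has_canonical_splice_sites(intron: str) -> bool:
    """True if the intron begins with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


if __name__ == "__main__":
    orf = find_first_orf("CCATGGCTTAAGG")  # made-up example sequence
    print(orf, dna_to_mrna(orf) if orf else None)
    print(has_canonical_splice_sites("GTAAGTTTTCAG"))
```

A simple scan like this finds many open reading frames that are not genes, which is one reason different prediction programs can disagree and why biological evidence is needed to confirm a model.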

Multiple Gene Predictors
• The protein-coding sequence of an mRNA is flanked by untranslated regions (UTRs).
• UTRs hold regulatory information.

BLAST Searches
• Gene or protein homologs share similarities due to common ancestry.
• Biological evidence is needed to curate gene models predicted by computers.
• mRNA transcripts and protein sequence data provide “hard” evidence for genes.
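Similarity between homologs is usually summarized as a percent identity. The sketch below is an illustration only and does not reproduce BLAST, which finds and scores local alignments itself: here the two sequences are assumed to be already aligned, with "-" marking gaps, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned positions at which the two sequences agree.

    Assumes the inputs are pre-aligned and of equal length; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # positions with a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Made-up aligned fragments for illustration.
    print(percent_identity("ATGGCT-AA", "ATGGCTTAA"))
    print(percent_identity("ATGC", "ATGA"))
```

Ignoring gapped positions is one design choice among several; other conventions divide by the full alignment length, so identity values are only comparable when computed the same way.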

Page 17:

What is a gene?

• Can we define a gene?
• Has the definition of a gene changed?
• How can we find genes?

Page 18:

Views

• Genes as “independent hereditary units” (1866), Mendel
• Genes as “beads on strings” (1926), Morgan
• One gene, one enzyme (1941), Beadle & Tatum
• DNA is the molecule of heredity (1944), Avery
• DNA > RNA > Protein (1953), Crick, Watson, Wilkins

Page 19:

More Views

• Transposons (1940s-50s), McClintock
• Reverse transcription (1970), Temin & Baltimore
• Split genes (1977), Roberts & Sharp
• RNA interference (1998), Fire & Mello

Page 20:

Sequence & course material repository

http://gfx.dnalc.org/files/evidence

Don’t open items; save them to your computer!

• Annotation (sequences & evidence)
• Manuals (DNA Subway, Apollo, JalView)
• Presentations (.ppt files)
• Prospecting (sequences)
• Readings (bioinformatics tools, splicing, etc.)
• Worksheets (Word docs, handouts, etc.)
• BCR-ABL (temporary; not course-related)

Page 21: