Page 1:

The Medicago truncatula genome: a progress report

Dr. Bruce A. Roe
Advanced Center for Genome Technology
Department of Chemistry and Biochemistry
University of Oklahoma
[email protected]
www.genome.ou.edu

Plant and Animal Genome
San Diego, January 11, 2004

Photos by Steve Hughes, Genetic Resource Centre (PIRSA-SARDI), Adelaide, Australia.
http://www.fao.org/ag/AGP/AGPC/doc/gallery/pictures/meditrunc/meditrunc.htm

Page 2:

Why sequence the Medicago genome?

• An important forage crop
• A genetically tractable model legume
• A relatively small (~500 Mbp) diploid genome
• Active legume research community
• Medicago Research Consortium
• Large collection of ESTs
• Excellent BAC library
• Integrated physical and genetic map
• Large number of BAC-end sequences

Page 3:

Sequence Pipeline at the University of Oklahoma Genome Center, OU-ACGT

DNA → GenBank

• DNA shearing (Hydroshear™)
• Colony picking (QPixII™)
• Growing subclones (HiGro™)
• Subclone isolation I (Mini-Staccato™)
• Subclone isolation II (VPrep™)
• Thermocycling (ABI 9700)
• Sequencing (ABI 3700)
• Data assembly and analysis
• Primer synthesis
• Miscellaneous liquid handling
• Closure

Page 4:

Subclone Isolation (Mini-Staccato™)

• This Zymark robot has a 384-cannula array, four built-in shakers, three attached storage racks, built-in barcoding and a Twister II robotic arm.

• This automation has allowed us to perform the DNA isolation completely unattended from as many as eighty 384-well plates of bacterial cells per day.

Page 5:

Subclone Isolation (Mini-Staccato™)

• Once all three solutions have been added, the plates are transferred from the SciClone workspace deck to a storage rack by the Twister II robotic arm.

Page 6:

Subclone Isolation and Sequencing Reaction Pipetting (Velocity 11 VPrep)

• Liquid handling station with 384-channel pipettor head
• Four movable shelves on either side of the pipettor head
• Used for subclone isolation, sequencing reaction set-up and clean-up.

Page 7:

Data assembly and Analysis

• Sun V880 server: 32 GB RAM running Solaris 8 OS and 3 TB of data stored on RAID-5 arrays with autoloader tape backup
• Also: 12 workstations each with 1 GB RAM

Phred/Phrap/Consed

Exgap

Page 8:

Initial WGS Skimming for the ~500 Mb Medicago truncatula genome

• Collected ~25,000 end-sequences from ~12,500 plasmid-based WGS clones.

• Of these ~25,000 sequences, ~1,000 have homology with Medicago truncatula ESTs.

• URL: http://www.genome.ou.edu/medicago.html

Page 9:

Phrap assembly of our Medicago truncatula whole genome shotgun survey sequencing data at 0.005-fold genomic sequence coverage

Page 10:

DotPlot of a Phrap assembled whole genome shotgun contig showing multiple repeated regions

[Figure: dot plot; both axes in bases, 0 to 700]

Page 11:

DotPlot of a Phrap assembled whole genome shotgun contig showing 4 repeated blocks of ~600 bases

[Figure: dot plot; both axes in bases, 0 to 1000]

Page 12:

Yet another genomic contig showing extensive repeated regions

Contig 1931

[Figure: dot plot; both axes in bases, 0 to 600]

Page 13:

>Contig1931 TTTACGTCCCCGTAGTGAACTATTTCCTAAGTTGACTAGTCAATTAGGTGATAGTTCGTCCGGATGACGTACCGCCGTGAACCCGATATGAGAATTTCATGTGGTGCATCCTTCTATGTTTGATAAGGTCATTTTGAACGGTCGGATTGAACGTGGCTGGTGTCGTTCACGATAGAGGCACGTTTAGGTCCCTACGGTGAACTAGTTCCTAAGTTGACTAGTCAATTAGGTGATAGTTTGTCCGGATGACGTACCTCCGTGAACCCGATCTGAGAAATTCAAGTTTCTGCATCCTTCTATGTTTGATAAGGTCATTTTGAACGGTCGGATTGAAGGTGGCTGGTGTTCTTCACATTCTAGGCACGTTTAGGTTCCCGCGGTGAACTAGTTCCTAAGTTGACTAGTCAATTAGGTGATAGTTCGTCCGGATGACCTACCTCCGTGAACCCGATATTAGAAATTCAAGTTTCTGCATCCTTCTATGTTTGATAAGGTCATTTTGAACGGTCAGATTGAACGTGGCTGGTGTCGTTCACGATCTAGGCACGTTTAGGTCCCCGCAGTGAACTAGTTCCTAAGTTGACTAGTCAATTAGGTGATAGTTTGTCCGGATGACGTGACTCCGTAAAGCCAGTATGAGAACTTCTAGTTTCTGCATCCTTTTATGTTTGATAAGGTCATTTTGAACGGTGGGATTGAACGTTGTTGGTGTCGTTCACGATCTAGGCACGTTTAGGTCCCCGCAGTGAACTAGTTCCTTAGTTGACTAGTCAATTAGGTGATAGTTCGTCCGGATGACGTATCTCCGTCAGCCCGATCTGAGAAATTCAAATTTCTGCATCCTTCTATGTTTGATAAGGTCATTTTGAACGGTCGGATTGAACGTGGCTGGTGTCGTGCACGATCAAGGCACGTTTAGGTCCCCGCAGCGAACTAGTTCCTAAGTTGACTAGTCAATTAGGTGATACCTTGTCCGGATGACGTACCTCCGTGAACCCGATCTGAGAAATTCAAGTTTCTGCATCCTTCTATGTTTGATAAGGTCATTTTGAACGGTTGGATTGAACATGGCTGGTGTCGTTCACGATCTAGGCACGTTTAGGTCCCCGCAGTGAACTAGTTCCTAAGTTGACTAGTCAATTAGGTGATAGTTCGTCTGGATGACGTACCTCCTTGAACCCAATATGAGAAATTCAATTTTCTTCATCCTTCTATGTTTGATAAGGTCATTTTGAACGGTCGGATTGAACGTGCCTGGTGTCGTTCACGATCGAGGCACGTTTAGGTCCCCGCAGTGAAC. . .
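The repeat structure that the dot plots show visually can also be estimated numerically. The following is a minimal sketch, not the method used for the slides (which were Phrap assemblies viewed as dot plots): it records where each k-mer occurs and reports the most common distance between consecutive occurrences, which approximates the period of a tandem repeat. The function names, the default k, and the toy sequence are all hypothetical.

```python
from collections import Counter, defaultdict


def kmer_spacings(seq: str, k: int) -> Counter:
    """Count the distances between consecutive occurrences of each k-mer."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    spacings = Counter()
    for hits in positions.values():
        # Each pair of neighbouring hits of the same k-mer is one dot
        # on an off-diagonal line of a self dot plot.
        for first, second in zip(hits, hits[1:]):
            spacings[second - first] += 1
    return spacings


def repeat_period(seq: str, k: int = 12):
    """Return the most common k-mer spacing, or None if no k-mer repeats."""
    spacings = kmer_spacings(seq, k)
    if not spacings:
        return None
    return spacings.most_common(1)[0][0]


# Toy data (an assumption, not from the slides): a 10-base unit copied 5 times.
toy = "ACGGTCATTG" * 5
print(repeat_period(toy, k=6))  # prints 10
```

Pasting the Contig1931 sequence into `repeat_period` would give a rough period for its repeat unit. Diverged copies reduce the number of exact k-mer matches, so a smaller k may be needed.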

Page 14:

Summary of our Medicago truncatula WGS Sequencing Assembly with only 0.005-fold Genomic Sequence Coverage

• The largest contig (21,157 bp) contained the 26S rRNA genes

• 19 smaller contigs (105,455 bp total) were from the chloroplast genome

• The remaining ~500 contigs, ranging in size from 2,000 to 12,000 bp, contain highly repetitive DNA that appears to be unique to Medicago, as it had no significant homology in the GenBank database

• We concluded that a more directed strategy was needed

Page 15:

Mapped BAC approach in collaboration with Doug Cook and DJ Kim at U.C. Davis with funding from the Noble Foundation, Ardmore, OK

Page 16:

The first ~1000 Medicago truncatula BACs

• Initially concentrated on BACs with known biological markers and in regions of biological interest that were supplied to us by the UC Davis group.

• Requests for sequencing specific BACs were directed to Doug Cook and DJ Kim at UC Davis, and they supplied us with the BACs once these BACs had been characterized.

• Once the BACs were received, we created the shotgun libraries, isolated the sequencing templates and obtained the working draft sequence, followed by closure and finishing.

• All data was made publicly available in GenBank within 24 hours of sequence assembly.

Page 17:

UC Davis

--------

Oklahoma University

Page 18:

Medicago BAC Sequencing

[Figure: number of bases (0 to 100,000,000) by date (4/15/02 to 12/15/03); series: Phase 1, Phase 2, Phase 3, Total]

Page 19:

The next ~750 Medicago truncatula BACs

• With recent NSF funding, we will be sequencing BACs from chromosomes 1, 4, 6, and 8 with the goal of completing the sequence of the euchromatic regions of these chromosomes over the next 3 years.

• Chromosomes 2 and 7 will be sequenced at TIGR, chromosome 3 at The Sanger Institute and chromosome 5 at Genoscope.

• All data will be released immediately as before.

Page 20:

www.genome.ou.edu/medicago.html

Page 21:

www.genome.ou.edu/medicago_totals.html

Page 22:

Medicago-specific gene with ESTs but no known homology

Gene density of this BAC is ~1 gene per 10 kb

Page 23:

Medicago-specific gene with ESTs but no known homology

Page 24:

myosin-like protein

Gene density ~1 gene per 10 kb

Page 25:

myosin-like protein

Page 26:

Page 27:

Gene Size Distribution (All Sequence Data) (FgenesH vs. Genscan)

[Figure: bar chart of number of genes (0 to 4,500) by gene size range (bins 1-1000, 1001-2000, and so on up to 19001-20000, then 20001-above); series: FgeneSH, Genscan]

13,396 FgeneSH predicted genes
11,488 Genscan predicted genes

Page 28:

Exon Size Distribution (All Sequence Data) (FgenesH vs. Genscan)

[Figure: bar chart of number of exons (0 to 20,000) by exon size range (bins 1-50, 51-100, 101-200, and so on up to 901-1000, then 1001-1500 up to 3501-4000); series: FgeneSH, Genscan]

59,808 FgeneSH predicted exons
55,792 Genscan predicted exons

Page 29:

Intron Size Distribution (All Sequence Data) (FgenesH vs. Genscan)

[Figure: bar chart of number of introns (0 to 12,000) by intron size range (bins 1-50, 51-100, 101-200, and so on up to 901-1000, then 1001-1500 up to 3501-4000); series: FgeneSH, Genscan]

46,412 FgeneSH predicted introns
44,305 Genscan predicted introns
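The gene, exon and intron totals quoted on the three distribution slides can be cross-checked against each other. This is a small sketch under one stated assumption: every predicted gene has at least one exon and exactly one fewer intron than exons, so total introns should equal total exons minus total genes. The function name and dictionary layout are hypothetical; the counts are the ones quoted on the slides.

```python
def expected_introns(n_genes: int, n_exons: int) -> int:
    """Introns implied by the exon count if each gene has (exons - 1) introns."""
    return n_exons - n_genes


# (genes, exons, introns) as quoted on the distribution slides.
predictions = {
    "FgeneSH": (13_396, 59_808, 46_412),
    "Genscan": (11_488, 55_792, 44_305),
}

for program, (genes, exons, introns) in predictions.items():
    exons_per_gene = exons / genes
    # Difference between the implied and the reported intron count.
    gap = expected_introns(genes, exons) - introns
    print(f"{program}: {exons_per_gene:.2f} exons/gene, implied - reported introns = {gap}")
```

Under that assumption the FgeneSH totals agree exactly, and the Genscan totals differ by a single intron.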

Page 30:

Gene Density of the ~450 Mb Medicago truncatula genome

                                        FgeneSH         Genscan
Total number of genes                   13,397          11,488
Total length of genes                   30,793,326      51,687,528
Total exon length                       15,794,243      14,400,445
Total number of exons                   59,808          55,792
Total intron length                     14,999,083      37,287,083
Total number of introns                 46,412          44,305

Base Pairs Sequenced                    87,423,457      87,423,457

Gene Space (Gene Length/BP Sequenced)   35%             59%

Gene Density (Genes/200 Mb)             30,649          26,281
                                        1 gene/6.5 kb   1 gene/7.6 kb

Arabidopsis: 25,498 protein coding genes
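The derived rows of the gene density table follow from the raw totals by simple division. The sketch below reproduces that arithmetic. The function name and its `scale_bp` parameter are hypothetical, and the 200 Mb figure is taken from the table's own row label without assuming what it represents.

```python
def gene_stats(n_genes: int, gene_length_bp: int, sequenced_bp: int,
               scale_bp: int = 200_000_000):
    """Return (gene space fraction, bp per gene, genes expected in scale_bp)."""
    gene_space = gene_length_bp / sequenced_bp   # fraction of sequence inside genes
    bp_per_gene = sequenced_bp / n_genes         # average spacing of genes
    genes_at_scale = scale_bp / bp_per_gene      # extrapolation to scale_bp
    return gene_space, bp_per_gene, genes_at_scale


SEQUENCED_BP = 87_423_457  # "Base Pairs Sequenced" row of the table

for program, n_genes, gene_len in [
    ("FgeneSH", 13_397, 30_793_326),
    ("Genscan", 11_488, 51_687_528),
]:
    space, spacing, extrapolated = gene_stats(n_genes, gene_len, SEQUENCED_BP)
    print(f"{program}: gene space {space:.0%}, "
          f"1 gene/{spacing / 1000:.1f} kb, "
          f"{extrapolated:,.0f} genes per 200 Mb")
```

The large gap in gene space (35% versus 59%) comes almost entirely from intron length: the two programs predict similar total exon length, but Genscan's total intron length is more than twice FgeneSH's.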

Page 31:

Medicago GC Content for ~90 Mb of Genomic BAC Clones Sequenced (mainly from gene rich regions)
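For readers who want to compute a comparable statistic on their own sequence, GC content is the fraction of G and C among the unambiguous bases. This is a minimal sketch, not the tool used for the slide. The function names, the window and step sizes, and the toy sequences are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G+C among the A, C, G and T bases of seq (N etc. ignored)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq: str, window: int, step: int) -> list:
    """GC content of each full-length window, moving along seq by step."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequences (assumptions for illustration only).
print(gc_content("AACCGGTT"))         # 0.5
print(windowed_gc("AAAAGGGG", 4, 4))  # [0.0, 1.0]
```

A windowed profile of this kind is one way to see how GC content varies along a BAC, rather than reporting a single average.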

Page 32:

Metabolic Overview of Medicago
13,396 FgeneSH predicted genes using the COG Database

• Metabolism 24%
• DNA Metabolism 23%
• Cellular Processes 23%
• Poorly Characterized 17%
• Multiple COG Hits 8%
• No Hits 5%

Page 33:

Metabolic Overview (detailed view) of Medicago
13,396 FgeneSH predicted genes using the COG Database

• No Hits 5%
• Translation, ribosomal structure & biogenesis 7%
• Transcription 5%
• DNA replication, recombination & repair 11%
• Multiple COG Hits 8%
• Poorly Characterized 17%
• Cell division & chromosome partitioning 2%
• Posttranslational modification, protein turnover, chaperones 5%
• Cell envelope biogenesis, outer membrane 4%
• Cell motility & secretion 3%
• Inorganic ion transport & metabolism 3%
• Signal transduction mechanisms 5%
• Energy production & conversion 5%
• Carbohydrate transport & metabolism 4%
• Amino acid transport & metabolism 5%
• Nucleotide transport & metabolism 2%
• Coenzyme metabolism 2%
• Lipid metabolism 2%
• Secondary metabolites biosynthesis, transport & catabolism 3%

Page 34:

Gene Duplication: Three copies of the phosphoglycerate kinase gene in one BAC

Page 35:

Gene Duplication: Three copies of phosphoglycerate kinase in one BAC

AC138448.fg.10 MATKRSVGTLKEAELKGKRVFVRVDLNVPLDDNLNITDDTRIRAAVPTIKYLTGYGAKVILSSHL-----
AC138448.fg.11 MA-KKSVGDLSGAELKGKKVFVRADLNVPLDDNQNITDDTRIRAAIPTIKYLIQNGAKVILSSHL-----
AC138448.fg.8  MATKRSVGTLKEGELKGKRVFVRVDLNVPLDDNLNITDDTRIRAAVPTIKYLTGYGAKVILSSHLEIYKT

AC138448.fg.10 ------------------------------------------GRPKGVTPKYSLKPLVPRLSELLGTQVK
AC138448.fg.11 ------------------------------------------GRPKGVTPKYSLAPLVPRLSELIGIEVI
AC138448.fg.8  EVSVSEYNLAVSEYKLAISDTYRYRIRVRHDSSPFLEYRGSQGRPKGVTPKYSLKPLVPRLSELLETQVK

AC138448.fg.10 IADDSIGEEVEKLVAQIPEGGVLLLENVRFHKEEEKNDPEFAKKLASLADLYVNDAFGTAHRAHASTEGV
AC138448.fg.11 KAEDSIGPEVEKLVASLPDGGVLLLENVRFYKEEEKNDPEHAKKLAALADLYVNDAFGTAHRAHASTEGV
AC138448.fg.8  ISDDCIGEEVEKLVAQIPEGGVLLLENVRFHKEEEKNEPEFAKKLASLADLYVNDAFGTAHRAHASTEGV

AC138448.fg.10 AKYLKPSVAGFLMQKELDYLVGAVSNPKKPFAAIVGGSKVSSKIGVIESLLEKVDILLLGGGMIFTFYKA
AC138448.fg.11 TKYLKPSVAGFLLQKELDYLVGAVSSPKRPFAAIVGGSKVSSKIGVIESLLEKVDILLLGGGMIFTFYKA
AC138448.fg.8  AKYLKPSVAGFLMQKELDYLVGAVSNPKKPFAAIVGGSKVSSKIGVIESLLEKVDILLLGGGMIYTFYKA

AC138448.fg.10 QGYAVGSSLVEEDKLDLATTLIEKAKAKGVSLLLPTDVVIADKFAADANDKIVPASSIPDGWMGLDIGPD
AC138448.fg.11 QGLAVGSSLVEEDKLELATTLIAKAKAKGVSLLLPSDVVIADKFAPDANSQIVPASAIPDGWMGLDIGPD
AC138448.fg.8  QGYSIGSSLVEEDKLDLATSLMEKAKAKGVSLLLPTDVVIADKFSADANDKIVPASSIPDGWMGLDIGPD

AC138448.fg.10 SIKTFNEALDKSQTIIWNGPMGVFEFDKFAAGTEAIAKKLAEVSGKGVTTIIGGGDSVAAVEKVGLADKM
AC138448.fg.11 SIKTFNEALDTTQTIIWNGPMGVFEFDKFAVGTESIAKKLADLSGKGVTTIIGGGDSVAAVEKVGVADVM
AC138448.fg.8  SIKTFNEALDKSQTIIWNGPMGVFEFDKFAAGTEAIAKKLAEVSGKGVTTIIGGGDSVAAVEKVGLADKM

AC138448.fg.10 SHISTGGGASLELLEGKPLPGVLALDDA*      401 amino acids
AC138448.fg.11 SHISTGGGASLELLEGKELPGVLALDEATPVAV* 405 amino acids, differs at 42 positions
AC138448.fg.8  SHISTGGGASLELLEGKPLPGVLALDDA*      448 amino acids, differs at 6 positions
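The "differs at N positions" annotations on the alignment can be recomputed by comparing aligned rows column by column. The sketch below is an illustration, not the procedure used for the slide. The function name and the `count_gaps` flag are hypothetical, and whether the slide's counts treat gap columns as differences is not stated, so both conventions are supported. The example strings are the first 18 columns of the alignment.

```python
def count_differences(a: str, b: str, count_gaps: bool = False) -> int:
    """Count columns where two aligned sequences differ.

    Columns in which either row has a gap ('-') are skipped unless
    count_gaps is True.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    differences = 0
    for x, y in zip(a, b):
        if x == y:
            continue
        if not count_gaps and "-" in (x, y):
            continue
        differences += 1
    return differences


# First 18 alignment columns of the three predicted proteins.
fg10 = "MATKRSVGTLKEAELKGK"
fg11 = "MA-KKSVGDLSGAELKGK"
fg8 = "MATKRSVGTLKEGELKGK"

print(count_differences(fg10, fg8))                    # 1
print(count_differences(fg10, fg11))                   # 4
print(count_differences(fg10, fg11, count_gaps=True))  # 5
```

Applied to the full concatenated rows, the same function gives whole-protein counts under either gap convention.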

Page 36:

Printrepeat Analysis of M. truncatula BAC AC121240 vs. A. thaliana Chr.2

Expansion, Duplication, Repeat Elements

~5 kb region

~25 kb region

Page 37:

PIP of M. truncatula BAC AC121240 vs. A. thaliana Chr.2

Page 38:

Medicago truncatula Summary and Conclusions

• Average predicted gene density of 1 gene per 6.5 to 7.6 kb by FgeneSH and Genscan, respectively.

• Genome characteristics such as %GC, intron/exon size and conserved unique 5’ splice sites reveal Medicago-specific features.

• The sequence of the Medicago truncatula genome shows homology to the sequenced Arabidopsis thaliana genome, but expansion, rearrangements and duplications are evident.

Page 39:

Data Release and Preliminary Annotation

• All our sequence data is available through links on our web site to GenBank and on our ftp site at URL: ftp.genome.ou.edu/medicago

• Keyword and BLAST searches can be done on our web site at URL: http://www.genome.ou.edu/medicago.html

• Additional annotation via the Genome Browser database is available on our web site at URL: http://www.genome.ou.edu/medicago_table.html

• E-mail suggestions for additional annotation to Bruce Roe at: [email protected]

Page 40:

Three Year Plan

• Obtain the contiguous sequence of the gene-rich regions of four of the 8 Medicago truncatula chromosomes at OU, with the remaining four being completed by our international partners at TIGR, Sanger, and Genoscope.

• This information will serve as a solid foundation for anticipated comparative and functional legume genomics.

Page 41:

Laboratory Organization
Bruce Roe, PI

Informatics, Support Teams, Production, Administration

Jim White, Steve Kenton, Hongshing Lai, Sean Qian

Rose Morales-Diaz*, Mounir Elharam*, Yonas Tesfai, Steve Shaull**, Doug White, Work-study Undergraduates**

Kay Lynn Hale, Dixie Wishnuck, Tami Womack, Mary Catherine Williams

DNA Synthesis
Phoebe Loh*, Sulan Qi, Bart Ford*

Reagents & Equip. Maint.
Mounir Elharam*, Doug White

Axin Hua, Weihong Xu

Jami Milam, Sara Downard**

Limei Yang, Angie Prescott*, Audra Wendt**, Mandi Aycock**

Ziyun Yao, Steve Shaull*, Youngju Yoon

Trang Do, Anh Do, Lily Fu, Yang Ye, James Yu, Tessa Manning**

Fu Ying, Liping Zhou, Ruihua Shi, Junjie Wu

Stephan Deschamps, Shelly Oommen, Christopher Lau, Yanhong Li

Research Teams
Doris Kupfer, Julia Kim*, Sun So, Graham Wiley**, Lauren Ritterhouse**

Lin Song, Ying Ni, Huarong Jiang

ShaoPing Lin, Honggui Jia, Hongming Wu, Baifang Qin, Peng Zhang

Fares Najar, Chunmei Qu, Keqin Wang, Carson Qu, Shuling Li

Funding from the Noble Foundation, DOE, and NSF
Collaborators at Univ. Minnesota, UC Davis, TIGR, Sanger, Genoscope, and the Noble Foundation

* Previous undergraduate research student

** Present undergraduate research student

Page 42:

The AACCGGTT Team

Page 43:

Page 44:

Conserved Intron/Exon Boundary Features by a FELINES** Analysis of 181,444 Medicago truncatula ESTs in GenBank vs Genomic Sequence

           Size Range       Mean Length
Exons      6 - 5,789 nt     268 nt
Introns    20 - 3,921 nt    429 nt

Intron Conserved Splice Site Sequence Elements    Percent
Introns w/ 5’ GU                                  99.21%
Introns w/ 5’ GC                                  0.36%*
Introns w/ 5’ AU                                  0.31%
Introns w/ U12 branch sites instead of A12        0.13%

*Compared to 0.5 - 2.5% in fungi, and 0.5% in mammals with an EST minimum identity of 90%

** S. Drabenstot, D. Kupfer, J. White, D. Dyer, B. Roe, K. Buchanan and J. Murphy. FELINES: A Utility for Extracting and Examining EST-Defined Introns and Exons. Nucleic Acids Research 31(22), e141 (2003).
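The splice-site percentages above come from classifying each EST-defined intron by its first two bases. The sketch below shows that tally in its simplest form. It is not the FELINES implementation: the function name and the toy intron sequences are hypothetical, and it assumes the introns have already been extracted as DNA strings on the coding strand.

```python
from collections import Counter


def five_prime_classes(introns) -> dict:
    """Percentage of introns in each 5' dinucleotide class, in RNA letters."""
    counts = Counter()
    for intron in introns:
        rna = intron.upper().replace("T", "U")  # report GU/GC/AU as on the slide
        if len(rna) < 2:
            continue  # too short to classify
        counts[rna[:2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dinuc: 100 * n / total for dinuc, n in counts.items()}


# Toy introns (assumptions, not Medicago data): two GT, one GC, one AT start.
toy_introns = ["GTAAGTTTCAG", "GTATGATTTAG", "GCAAGTTTTAG", "ATATCCTTTAC"]
print(five_prime_classes(toy_introns))  # {'GU': 50.0, 'GC': 25.0, 'AU': 25.0}
```

Because rare classes such as 5’ GC and 5’ AU make up well under 1% of introns, counts of this kind are sensitive to alignment errors, which is presumably why the slide qualifies its figures with a minimum EST identity.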

Page 45:

Consensus Logogram of the 5’GU vs the 5’AU Class of Introns in Medicago truncatula determined by FELINES

AU intron consensus

GU intron consensus