Dating the Monocot–Dicot Divergence and the Origin of Core Eudicots
Using Whole Chloroplast Genomes
Shu-Miaw Chaw,1 Chien-Chang Chang,1 Hsin-Liang Chen,1 Wen-Hsiung Li2
1 Institute of Botany, Academia Sinica, 128 Sec. 2, Academy Road, Taipei 115, Taiwan
2 Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
Received: 31 July 2003 / Accepted: 23 October 2003
Abstract. We estimated the dates of the monocot–dicot split and the origin of core eudicots
using a large chloroplast (cp) genomic dataset. Sixty-one protein-coding genes common to the
12 completely sequenced cp genomes of land plants were concatenated and analyzed. Three
reliable split events were used as calibration points and for cross references. Both the
method based on the assumption of a constant rate and the Li–Tanimura unequal-rate method
were used to estimate divergence times. The phylogenetic analyses indicated that
nonsynonymous substitution rates of cp genomes are unequal among tracheophyte lineages. For
this reason, the constant-rate method gave overestimates of the monocot–dicot divergence and
the age of core eudicots, especially when fast-evolving monocots were included in the
analysis. In contrast, the Li–Tanimura method gave estimates consistent with the known
evolutionary sequence of seed plant lineages and with known fossil records. Combining
estimates calibrated by two known fossil nodes and the Li–Tanimura method, we propose that
monocots branched off from dicots 140–150 Myr ago (late Jurassic–early Cretaceous), at least
50 Myr younger than previous estimates based on the molecular clock hypothesis, and that the
core eudicots diverged 100–115 Myr ago (Albian–Aptian of the Cretaceous). These estimates
indicate that both the monocot–dicot divergence and the age of core eudicots are older than
their respective fossil records.
Key words: Chloroplast genome; Divergence of monocot and dicot; Angiosperm phylogeny; Age of core eudicots; Molecular clock; Unequal rate
Introduction
Fossil evidence suggests that flowering plants (angiosperms) first appeared 140 million
years (Myr) ago in the early Cretaceous (Willis and McElwain 2002). They soon diversified
and expanded globally in the mid-Cretaceous (90–100 Myr ago) (Nicholas et al. 1983).
Although the angiosperm phylogeny has now been largely established (Mathews and Donoghue
1999; PS Soltis et al. 1999; Qiu et al. 1999; Parkinson et al. 1999; DE Soltis et al. 2000;
Chase et al. 2000), the question of why the oldest unequivocal fossil for angiosperms is
nearly 300 and 170 Myr later than the first vascular plants (ca. 440 Myr ago [Taylor and
Taylor 1993]) and their extant sister group, the gymnosperms (late Carboniferous, ca. 310
Myr ago [Doyle 1998]), respectively, remains an "abominable mystery" (Darwin et al. 1903).
A number of hypotheses have been proposed to explain the late arrival of angiosperms in the
fossil record. These include (1) the escape from fossilization in the initial stage of
angiosperm evolution (Thomas and Spicer 1987), (2) bias in the fossil record (i.e., angiosperms
J Mol Evol (2004) 58:424–441 DOI: 10.1007/s00239-003-2564-9
Correspondence to: Shu-Miaw Chaw; email: smchaw@sinica.edu.tw
evolved much earlier but went undetected), and (3) the suggestion that the evolution of
angiosperms was triggered by a particular set of environmental conditions, and/or biotic
interactions (such as co-evolution with faunal groups) (Willis and McElwain 2002).
Is the origin of angiosperms actually much older than the known fossil record? Since
Ramshaw et al.'s (1972) first application of molecular data to address this question, three
decades have passed. In the interim, molecular phylogenetic studies and critical fossils of
derived angiosperms from older geological deposits (Magallón et al. 1999; Wikström et al.
2001) have opened up an opportunity to readdress the age and evolution of angiosperms.
Although all previous estimates of the monocot–dicot divergence (Table 1) predate the
angiosperm fossil record, they are highly variable, ranging from 140–190 Myr (Goremykin
et al. 1997; Sanderson 1997; Wikström et al. 2001; Sanderson and Doyle 2001) to 200 Myr
(Ramshaw et al. 1972; Wolfe et al. 1989; Laroche et al. 1995; Yang et al. 1999) or even
300–320 Myr (Martin et al. 1989, 1993; Brandl et al. 1992).
Traditionally, the angiosperms were subdivided into two classes, Liliopsida (the monocots)
and Magnoliopsida (the dicots) (Cronquist 1988). However, this subdivision was first refuted
by rbcL and 18S rRNA gene phylogenies (Chase et al. 1993; Chaw et al. 1997) and later by
analyses of multiple genes from the three plant genomes (Mathews and Donoghue 1999;
Parkinson et al. 1999; Qiu et al. 1999; PS Soltis et al. 1999; DE Soltis et al. 2000; Chaw
et al. 2000). These phylogenetic analyses have led to the conclusion that the dicots were
split into the basal dicots (or the magnoliids) and the eudicots and that the monocot
lineage was derived from one of the basal magnoliids (Fig. 1A). Parallel to the molecular
data has been the accumulation of pollen fossils of eudicots, which began in the late
Barremian (of the Cretaceous, ca. 120 Myr ago) and spread globally in the Albian (ca. 110
Myr ago) (Doyle 1992; Hughes 1994). In addition, many new megafossils of basal eudicots
have appeared, such as Tetracentraceae from the Barremian (110–118 Myr ago) (Magallón
et al. 1999), as well as core eudicots, such as a possible Rhamnaceae/Rosaceae (rosids)
from the early Cenomanian (94–97 Myr ago [Basinger and Dilcher 1984]). It has also been
suggested that the date of diversification of core eudicots was underestimated. Wikström
et al. (2001) have examined this issue (Table 1) with nuclear 18S rDNA and two cp (rbcL
and atpB) genes. We now provide additional evidence by analyzing whole chloroplast (cp)
genomic DNA sequences.
Cp DNA sequences are useful for studying the
plant phylogeny at deep levels of evolution because of
their lower rates of silent nucleotide substitution
Table 1. Comparison of previous estimates of divergence between monocots and dicots
Reference | Gene used | Reference point (time^a of divergence; Myr) | Estimated time^a (Myr)
Ramshaw et al. (1972) | Cytochrome c^b | Mammals–birds (280) | 220–240
Martin et al. (1989) | nr^c GapC, CHS | Animal–yeast (1000) | 319 ± 35
 | | Drosophila–vertebrates (600) |
 | | Mammals–chicken (270) |
 | | Human–rat (85) |
Wolfe et al. (1989) | 12 cp^c genes | Bryophyte–angiosperm (350–450) | 170–230
 | | Maize–wheat (50–70) | 150–260
 | nr 26S, 18S rRNA | Plant–animal (1000) | 200–250, 200–210
Brandl et al. (1992) | cp tRNA | Maize–wheat (50–70) |
 | | Tracheophyte–bryophyte (350–450) | 230–350
 | | Plant–animal (1000) |
Martin et al. (1993) | nr GapC, cp rbcL | Bryophyte–spermatophyte (450) | 300
 | | Conifer–angiosperm (330) |
Laroche et al. (1995) | 12 mt^c genes | Maize–wheat (50–70) | 170–238
 | | Vicieae–Phaseoleae (45–65) | 157–226
Goremykin et al. (1997) | 58 cp genes^b | Bryophyte–spermatophyte (450) | 160 ± 16
Sanderson (1997) | rbcL | Marchantia (450) | [160–215]^d
Yang et al. (1999) | mt 1st intron of nad | Maize–wheat (50–70) | 170–235
Sanderson & Doyle (2001) | rbcL, 18S rRNA | Land plant (450) | 140–190
Wikström et al. (2001) | cp rbcL, atpB | Fagales–Cucurbitales (84) | [158–179]
 | nr 18S rDNA | | [131–147]^e
a The unit of time is million years ago (or before present).
b The translated amino acid sequence was used.
c cp, chloroplast; mt, mitochondrial; nr, nuclear.
d The age of extant angiosperms.
e The origin date of eudicots.
(Palmer 1985a, b; Wolfe et al. 1989; Clegg et al. 1994). Moreover, concatenating sequences
from many genes may overcome the problem of multiple substitutions that cause the loss of
phylogenetic information between cp lineages (Lockhart et al. 1999) and can reduce sampling
errors due to substitutional noise and the finite number of characters within a gene
(Sanderson and Doyle 2001). In this study we analyzed 39,507 sites of cp DNA genomic
sequences from 61 protein-coding genes common to the 12 complete cp genomes of land plants
(Table 2). Our dataset is larger than those used in previous studies, including that of
Goremykin et al. (1997; see also Table 1), who analyzed 40 proteins of cp genomes from
fewer taxa (five land plants, including only one dicot and two monocots).
Molecular dating often assumes rate constancy, but this assumption is frequently violated
(PS Soltis et al. 2002 and references therein). For example, substitution rates of cp genes
vary greatly among and within tracheophyte (or vascular plant) lineages (Bousquet et al.
1992; Gaut et al. 1992, 1993; Clegg et al. 1994; Sanderson and Doyle 2001; PS Soltis et al.
2002), between protein-coding loci (Muse and Gaut 1997; Matsuoka et al. 2002), and between
nonsynonymous and synonymous sites (Gaut et al. 1997; Matsuoka et al. 2002). Sanderson and
Doyle (2001) believed that much of the conflict in estimating divergence times was due to
rate variation across lineages. In order to mitigate this problem we used mean branch
lengths of the sampled monocots and dicots.
The focus of this study is to estimate the dates of the monocot–dicot split and the origin
of core eudicots using a large cp genomic dataset. The date of the monocot–dicot divergence
can be calculated by extrapolation from the reliable dates of other speciation events by
means of phylogeny based on DNA sequence distances (Wolfe et al. 1989). Three divergence
events with well-supported fossil dates were used as calibration points and cross
references. Both the method based on the assumption of a constant rate and Li and
Tanimura's (1987) unequal-rate method (hereafter the Li–Tanimura method) were used to
estimate divergence times, and the estimates were compared with known fossil dates.
Although several other methods without the rate constancy assumption, such as the
nonparametric rate smoothing method (NPRS), have been proposed (Sanderson 1997 and
references cited therein), we chose the Li–Tanimura method for its simplicity. The method
uses lineages in which the molecular clock holds better than the others to estimate the
divergence time at a particular node. We also discuss possible reasons for discrepancies
among estimates of divergence dates obtained in this study and previous studies.
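The extrapolation idea under the constant-rate assumption can be sketched in a few lines of Python. This is only an illustration: the function names and the numerical inputs are hypothetical, and it assumes the usual relation between pairwise distance K, divergence time T, and per-lineage rate r, namely r = K/(2T). It does not implement the Li–Tanimura unequal-rate method.

```python
def substitution_rate(k_calib: float, t_calib_myr: float) -> float:
    """Per-lineage rate (substitutions/site/Myr) from a calibration node: r = K / (2T)."""
    if t_calib_myr <= 0:
        raise ValueError("calibration time must be positive")
    return k_calib / (2.0 * t_calib_myr)


def extrapolate_divergence(k_target: float, k_calib: float, t_calib_myr: float) -> float:
    """Date a target node by assuming the calibration rate holds on all lineages."""
    r = substitution_rate(k_calib, t_calib_myr)
    return k_target / (2.0 * r)


if __name__ == "__main__":
    # Hypothetical distances and calibration age, chosen only to show the arithmetic.
    t = extrapolate_divergence(k_target=0.05, k_calib=0.10, t_calib_myr=300.0)
    print(f"estimated divergence: {t:.1f} Myr")
```

Because the estimate scales linearly with the ratio of the two distances, any lineage that evolves faster than the calibration lineages inflates the inferred date, which is the concern that motivates an unequal-rate method.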
Fig. 1. Rooted phylogenetic tree for the 12 sampled species. A Phylogeny of angiosperms
based on Qiu et al. (1999) and P. S. Soltis et al.'s (1999) phylogenetic trees. Solid lines
lead to taxa sampled in this study. B Rooted NJ tree using the Pamilo–Bianchi–Li distances
based on the Ka values concatenated from 61 cp protein-coding genes. The branches leading
to nodes C2 and A are not drawn to scale. Lengths are indicated. The calibration points
(nodes C1, C2, C3) were used to estimate the divergence between monocots and dicots (node
A) and the origin of core eudicots (node B). Gene loss (open bar), loss but with known
transfer to nucleus (hatched bar), retention (gray bar), and likely gain with no similarity
to prokaryotic genes (filled bar) are plotted on the branches leading to each lineage. The
upper numbers at each node denote the bootstrap percentages (where applicable, values of
the interior branch test indicated after the slash). Total gene number in the cp genome is
given after each species, in parentheses. Branch lengths and the scale bar are Ka values
per 100 sites.
Data and Methods
Database Search for Cp Genome Sequences
Individual genes of the 12 published cp genome sequences (Table 2) were downloaded from
GenBank, National Center for Biotechnology Information (NCBI). Nomenclatures of the cp
protein-coding genes compiled by Hallick and Bairoch (1994), Stoebe et al. (1998), Martin
et al. (2002), and Swiss-Prot Protein Knowledgebase (2003) were used as guides. When
synonyms were encountered, their sequence homologies with the typified names were carefully
verified. Two homology criteria were considered: (1) the alignable length between two
proteins is larger than 80% of the longer sequence, and (2) the sequence identity in the
aligned region is at least 40% if L > 150, or at least
$0.06 + 4.8\,L^{-0.32\,(1 + e^{-L/1000})}$ otherwise, where L is the length of the aligned
region (Rost 1999; Gu et al. 2002). Note that we raised the identity threshold to 40%
instead of 30% because the taxa we sampled are comparatively recent and cp genes are highly
conserved (Wolfe et al. 1989).
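The two homology criteria can be expressed as a small filter. The sketch below is an assumption-laden illustration, not the authors' script: the function names are hypothetical, identity is expressed as a fraction, and the short-alignment threshold uses the constants 4.8 and 0.32 from the published form of Rost's (1999) curve as adopted by Gu et al. (2002).

```python
import math


def identity_threshold(aligned_len: int, long_cutoff: int = 150,
                       long_min_identity: float = 0.40) -> float:
    """Minimum fractional identity required for an alignment of the given length."""
    if aligned_len <= 0:
        raise ValueError("aligned length must be positive")
    if aligned_len > long_cutoff:
        # Long alignments: flat 40% cutoff, as stated in the text.
        return long_min_identity
    # Short alignments: length-dependent curve (assumed constants, see lead-in).
    exponent = -0.32 * (1.0 + math.exp(-aligned_len / 1000.0))
    return 0.06 + 4.8 * aligned_len ** exponent


def is_homolog(aligned_len: int, longer_seq_len: int, identity: float) -> bool:
    """Apply criterion (1), coverage > 80% of the longer protein, and criterion (2)."""
    covers_enough = aligned_len > 0.80 * longer_seq_len
    return covers_enough and identity >= identity_threshold(aligned_len)


if __name__ == "__main__":
    print(is_homolog(aligned_len=300, longer_seq_len=320, identity=0.45))
    print(round(identity_threshold(150), 3))
```

At L = 150 the curve evaluates to roughly 0.30, which is why a flat cutoff above that length is a natural continuation of it; the text raises that flat cutoff from 30% to 40%.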
Since the sequence of Medicago was not annotated, its protein-coding genes were annotated
using the Nucleotide query–Protein database (BLASTX) algorithm at NCBI with each known gene
from Lotus as query. If a particular gene was missing from Lotus, that gene from the
remaining 10 taxa was used instead. Open reading frames annotated by us were also verified
using the BLAST 2 sequences algorithm and the Nucleotide query–Translated db algorithm at
NCBI against the corresponding gene and the whole genome of Arabidopsis, respectively. A
query sequence with more than 40% identity to the specific known genes was then considered
as a putative homologous gene. A remnant of the accD gene in rice was reported previously
(Hiratsuka et al. 1989) but could not be detected by Katayama and Ogihara (1996) or Ogihara
et al. (2002) using Southern hybridization. We were not able to locate it in the rice cp
genome either. We used the reviews of cp genes by Millen et al. (2001) and Martin et al.
(2002) as guides to confirm our BLAST searches, especially for those genes lost or with
unknown functions.
After careful comparison and annotation, a total of 98 protein-coding genes was found in
the cp genomes of the 12 sampled species (Table 2). The lengths as well as the presence or
absence of those genes in each taxon are presented in Appendix 1. An open reading frame
homologous to a known gene was given the same name to facilitate comparison and alignment.
For some unannotated genes filtered by using BLASTX search, their positions in the
corresponding genomes were indicated. We excluded pseudogenes and genes duplicated in the
inverted repeat regions. The cp-encoded RNA genes were previously shown to be problematic
in early cp phylogeny (Martin et al. 1998; Lockhart et al. 1999) and in the present study
as well (data not shown). Therefore, RNA genes were excluded from analysis.
Alignment of All Cp Genes and Phylogenetic Analyses
Amino acid sequences of each gene from the 12 taxa were first
aligned one by one using GeneDoc (Nicholas and Nicholas 1997)
with minor adjustments. The alignment was then used as a guide
for aligning the corresponding nucleotide sequences. Unknown
sites, start and stop codons, and regions difficult to align were
removed from each gene alignment. All aligned individual gene
sequences were then assembled using the Text Editor in MEGA 2.1
(Kumar et al. 2001). Gaps were completely deleted from the as-
sembled alignment concatenated from the 61 cp protein-coding
genes common to the 12 sampled taxa (see also Results). The
working data file (in MEGA format) is shown in Appendix 2,
available in the Supplementary Material Section at the JME Web
site.
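The concatenation and complete gap deletion step can be illustrated with a short, self-contained sketch. The function names and toy sequences are hypothetical, and the deletion here is applied site by site; it does not attempt the codon-aware handling that a package such as MEGA may apply to coding data.

```python
def concatenate_alignments(gene_alignments):
    """Join per-gene alignments (dicts of taxon -> aligned sequence) end to end."""
    taxa = sorted(gene_alignments[0])
    concatenated = {taxon: "" for taxon in taxa}
    for alignment in gene_alignments:
        if set(alignment) != set(taxa):
            raise ValueError("every gene must be present in every taxon")
        if len({len(seq) for seq in alignment.values()}) != 1:
            raise ValueError("sequences within a gene alignment must have equal length")
        for taxon in taxa:
            concatenated[taxon] += alignment[taxon]
    return concatenated


def complete_deletion(alignment, missing="-?N"):
    """Drop every column in which any taxon has a gap or an unknown site."""
    taxa = list(alignment)
    n_sites = len(alignment[taxa[0]])
    keep = [i for i in range(n_sites)
            if all(alignment[t][i].upper() not in missing for t in taxa)]
    return {t: "".join(alignment[t][i] for i in keep) for t in taxa}


if __name__ == "__main__":
    gene1 = {"taxonA": "ATG-CC", "taxonB": "ATGACC"}  # toy data
    gene2 = {"taxonA": "GGT", "taxonB": "GNT"}
    cleaned = complete_deletion(concatenate_alignments([gene1, gene2]))
    print(cleaned)
```

Only genes shared by all taxa can be concatenated this way, which is why the analysis is restricted to the genes common to the 12 sampled genomes.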
Table 2. Scientific names, classification, and NCBI accessions of species in the dataset
Classification^a | Scientific name | NCBI accession No. (version date)^b | Reference
Bryophyte
  Marchantiaceae | Marchantia polymorpha | NC_001319 (Aug 2002) | Ohyama et al. (1986)
Pteridophyte
  Psilotaceae | Psilotum nudum | AP004638 (Nov 2002) | Wakasugi et al. (2000)
Gymnosperm
  Pinaceae | Pinus thunbergii | NC_001631 (Sep 2002) | Wakasugi et al. (1994)
Angiosperms
 Monocots
  Poaceae
   Andropogoneae | Zea mays | NC_001666 (Sep 2002) | Maier et al. (1995)
   Oryzeae | Oryza sativa | NC_001320 (Sep 2002) | Hiratsuka et al. (1989)
   Triticeae | Triticum aestivum | NC_002762 (Sep 2002) | Ikeo and Ogihara (2000)
 Dicots
  Eudicots
   Caryophyllidae
    Chenopodiaceae | Spinacia oleracea | NC_002202 (Aug 2002) | Schmitz-Linneweber et al. (2001)
   Asteridae
    Solanaceae | Nicotiana tabacum | NC_001879 (Sep 2002) | Shinozaki et al. (1986)
   Rosidae
    Brassicaceae | Arabidopsis thaliana | NC_000932 (Sep 2002) | Sato et al. (1999)
    Onagraceae | Oenothera elata subsp. hookeri | NC_002693 (Sep 2002) | Hupfer et al. (2000)
    Fabaceae
     Papilionoideae
      Loteae | Lotus japonicus | NC_002694 (Sep 2002) | Kato et al. (2000)
      Trifolieae | Medicago truncatula | AC093544^c (Nov 2001) | Lin et al. (2001)
a Ranks of species follow the NCBI's Taxonomy Guide.
b Data modified from http://megasun.bch.umontreal.ca/ogmp/projects/other/cp_list.html (vers. 20 Dec 2002).
c No gene annotation in this accession.
Nucleotide sequence divergence between a pair of taxa (or groups) was calculated in terms
of the numbers of substitutions per synonymous site (Ks) or per nonsynonymous site (Ka),
using the Pamilo–Bianchi–Li method implemented in MEGA 2.1. The divergence value between
two groups is presented as average distance ± standard error, obtained from the option
"Compute Between Groups Means" in MEGA 2.1. The average distance between two groups is the
arithmetic mean of all pairwise distances between taxa in the intergroup comparisons. To
date the divergence between the monocot and the dicot lineages, Saitou and Nei's (1987)
neighbor-joining (NJ) method and the Ka values (not Ks, because substitutions at the third
codon positions are saturated across sampled land plant lineages; see Results) were used to
reconstruct the phylogenetic trees, rooted at the top of the Pinus lineage. Because the six
sampled dicots (Table 2) represent the two large clades (the rosids and the asterids) of
core eudicots and one of the remaining four small core eudicot clades, they can be used to
infer the age or diversification date of core eudicots. The NJ trees reconstructed from Ka
values and Ks values were rooted at the monocot lineage (see Results). Relative support for
each node was evaluated using the bootstrap test and the interior branch test implemented
in MEGA 2.1 with 2000 replicates. The latter test is constructed based on the interior
branch length and its standard error. If the confidence value from this test is higher than
95% for a given branch, then the inferred length for that branch is considered
significantly greater than 0 (Kumar et al. 2001). To compare the evolutionary rates of
sampled fern, pine, monocot, and dicot lineages, Tajima's (1993) relative rate test
implemented in MEGA 2.1 was applied. Because the method does not distinguish between Ka and
Ks, the first and second codon positions of the combined 61 cp protein-coding genes were
used instead.
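For readers unfamiliar with the relative rate test, the following sketch shows the simplest (one-degree-of-freedom) form of Tajima's statistic as commonly described: with an outgroup, count sites unique to each ingroup lineage (m1, m2) and compare them with a chi-square statistic, (m1 − m2)²/(m1 + m2). The function name and toy sequences are hypothetical, and this is not the MEGA implementation.

```python
import math


def tajima_relative_rate(seq1: str, seq2: str, outgroup: str):
    """Return (m1, m2, chi2, p) for two ingroup sequences and one outgroup."""
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if not {a, b, o} <= valid:
            continue  # skip gaps and ambiguous sites
        if a != b and b == o:
            m1 += 1  # change unique to lineage 1
        elif a != b and a == o:
            m2 += 1  # change unique to lineage 2
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m1, m2, chi2, p_value


if __name__ == "__main__":
    m1, m2, chi2, p = tajima_relative_rate("AAAAAAAAAA", "CCCCAAAAAA", "AAAAAAAAAA")
    print(m1, m2, chi2, round(p, 4))
```

Sites where all three sequences differ, or where the two ingroup sequences agree, carry no information about which ingroup lineage changed and are ignored.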
Calibration Points
To date the divergence between the monocots and the dicots, three split events (see Figs. 1
and 4) with reliable fossil dates were used as reference nodes: (C1) the Psilotum
(fern)–seed plant split (400–420 Myr old [Pryer et al. 2001]); (C2) the Pinus
(conifer)–angiosperm split (280–310 Myr old); and (C3) the maize–wheat split (50–60 Myr
old). Since uncertainties about the age of the reference node were a probable reason behind
the discrepancies among previous estimates of angiosperm origin (Bremer 2000; Sanderson and
Doyle 2001), we have carefully examined the dates of our calibration nodes.
Fossil Dates
Psilotum has been repeatedly suggested as a member of ferns by molecular data (e.g.,
Nickrent et al. 2000; Pryer et al. 2001), but the architecture of its sperm cell suggests
that Psilotum is an early divergent fern (Renzaglia et al. 2001) with relatively remote
affinities to Ophioglossaceae (a basal fern family) and Equisetaceae (sphenopsids). Kenrick
and Crane (1997) considered that the basal dichotomy of Euphyllophytina occurred in the
early–mid Devonian (ca. 400–420 Myr ago) and resulted in two clades: one containing the
extinct Psilophyton and the other ferns, horsetails, and seed plants. We took this
splitting date as the lower bound for the divergence between Psilotum and seed plants.
Pinus is a genus of Pinaceae, which contains over 230 species and is the largest and most
basal family of conifers (Hart 1987; Price et al. 1993; Chaw et al. 1997). Delevoryas and
Hope (1973) and Miller (1977, 1988) proposed that the Triassic (206–248 Myr ago) period may
represent a time when modern conifers were evolving. Cladistic and stratigraphic analyses
of living seed plants (Doyle and Donoghue 1987; Crane 1988; Doyle 1998) suggested that
diversification of modern seed plants occurred from the lower Pennsylvanian to the upper
Triassic (215–310 Myr ago). The earliest fossil evidence of trees bearing the typical
conifer bisaccate pollen that germinates distally dates from the late Carboniferous to
early Permian (ca. 250–290 Myr ago) and conifer relatives are known from ca. 310 Myr ago
(Rai et al. 2003). Gymnosperms and angiosperms are the two major taxa of seed plants,
distinct since the end of the Carboniferous, 300 Mya (Bow et al. 2003). From the above
considerations, we took 280–310 Myr as an upper bound for the split between the conifer and
the angiosperm lineages.
Fossil leaves of rice (belonging to the grass family Poaceae) have been described from the
upper Eocene, about 40 Myr ago (Stebbins 1981), and the earliest unequivocal evidence of
grass fossils (including spikelets and inflorescence with pollen) was found in
Paleocene–Eocene deposits, about 50–60 Myr ago (Crepet and Feldman 1991). Initial radiation
of the grass family was suggested to be 65 Myr ago (Stebbins 1987; Thomasson 1987). Bremer
(2000) regarded the 50–70 Myr ago estimate of a maize–wheat divergence used by Wolfe et al.
(1989) as rather uncertain. Moreover, phylogenetic analyses of the cp rpl16 intron
sequences (Zhang 2000), eight character sets (GPWG 2001), cp genome structure (Ogihara
et al. 2002), and cp genomic comparison (Matsuoka et al. 2002) indicated that in the grass
family, Oryzoideae (rice) and Pooideae (wheat) diverged after the subfamily Panicoideae
(maize), which was preceded by four other subfamilies (Zhang 2000). Therefore, we took
50–60 Myr as a reasonable estimate of the maize–wheat split.
Results
Cp Genome Data
The concatenated lengths of all known cp functional protein-coding genes (Appendix 1) in
the 12 sampled species (Table 2) range from 58,095 bp in Triticum to 71,509 bp in
Marchantia; the average is 63,661 ± 4,764 bp. Sixty-one cp protein-coding genes, which
encode two envelope membrane proteins (cemA, ycf9), 1 maturase (matK), 1 protease (clpP),
34 proteins of the photosynthetic light reactions (atpA, atpB, atpE, atpF, atpH, atpI,
petA, petB, petD, petG, petL, petN, psaA, psaB, psaC, psaI, psaJ, psbA, psbB, psbC, psbD,
psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT), 18 ribosomal proteins (rpl2,
rpl14, rpl16, rpl20, rpl32, rpl33, rpl36, rps2, rps3, rps4, rps7, rps8, rps11, rps12,
rps14, rps15, rps18, rps19), 4 RNA polymerase subunits (rpoA, rpoB, rpoC1, rpoC2), and 1
cytochrome c biogenesis protein (ccsA), are common to all 12 taxa. After elimination of
unknown sites, regions difficult to align, start and stop codons, and all gaps, 39,507
sites were used for comparison and tree reconstruction.
As shown in Table 3 (the first row), the 12 cp genomic sequences are AT-rich. This bias is
particularly strong at the third codon positions, primarily because of the high T
nucleotide contents. These data are consistent with the high AT content found earlier in
the plastid genome (Whitfeld and Bottomley 1983). Across the 11 tracheophytes nucleotide
base compositions are homogeneous at the first and second codon positions (χ² test, p =
0.793 and 0.981) but not so at the third codon positions (p < 0.001). The G content is
particularly high at the first codon positions in all taxa and Marchantia much prefers the
use of synonymous codons ending with A or T.
The mean Ka/Ks ratio for all species pairs is 0.19. The difference in mean Ka/Ks ratio
between the monocot (0.156) and the dicot (0.158) lineages is small. These data are
suggestive of stringent selective constraints on amino acid substitutions and correlate
well with the observed higher GC contents at the first and second positions (Table 3).
The Inferred Phylogenetic Trees
Figure 1A was simplified from the topology of the maximum parsimony (MP) trees
reconstructed with multiple genes by Qiu et al. (1999) and by P. S. Soltis et al. (1999).
Figure 1B is an NJ tree reconstructed with Ka values using Marchantia as the outgroup. The
topology of this tree strongly indicates that, to the exclusion of the fern (Psilotum)
lineage, the seed plants form a monophyletic clade, within which the conifer (Pinus)
lineage and the angiosperms comprise two separate subgroups. The sampled angiosperms are
subdivided into two well-supported lineages, the monocots and the eudicots. Both bootstrap
and interior branch test values for the above-mentioned major tracheophyte lineages are
100%, and the latter test yielded a higher percentage support for the rice + wheat clade.
The phylogenetic relationships of the monocot lineage and the six core eudicots generally
agree well with those in recent multigene trees (Fig. 1A [Qiu et al. 1999; PS Soltis et al.
1999; DE Soltis et al. 2000]) except that in our NJ tree the Caryophyllales (represented by
Spinacia) and the asterids (represented by Nicotiana) are well resolved as sister clades.
This relationship was previously revealed in the trees made by Wolfe et al. (1989) and
Goremykin et al. (2003).
The NJ tree reconstructed from the Ks values appears to be unreliable because it placed
Arabidopsis as basal to the remaining dicots (data not shown), contrary to the most recent
multigene phylogenies of angiosperms (Mathews and Donoghue 1999; Qiu et al. 1999; PS Soltis
et al. 1999; DE Soltis et al. 2000; Chase et al. 2000). These data caused us to question
whether the third codon position, where most synonymous substitutions occur, is saturated
with substitutions.
To assess levels of sequence saturation with the concatenated cp genes, pairwise
uncorrected numbers of transitions and transversions (uncorrected P) were plotted against
corrected (Kimura's two-parameter) sequence distance (Fig. 2). Sixty-six paired points
[12 × (12 − 1)/2] are presented in Fig. 2. The curves of both uncorrected transitions and
transversions
Table 3. Nucleotide base composition (%) of the concatenated 61 cp protein-coding genes in Marchantia and 11 sampled tracheophytes
Codon position^a | A | C | G | T | p^b
All | 33.2/29.2 ± 0.4 (1.4%) | 14.4/18.4 ± 0.5 (2.7%) | 17.8/21.5 ± 0.4 (1.9%) | 34.6/31.0 ± 0.5 (1.6%) | 0.000
1st | 30.3/28.2 ± 0.4 (1.4%) | 16.9/19.6 ± 0.3 (1.5%) | 29.1/30.2 ± 0.3 (1.0%) | 23.7/22.0 ± 0.3 (1.4%) | 0.793
2nd | 29.1/27.2 ± 0.3 (1.1%) | 20.8/21.7 ± 0.2 (0.9%) | 17.4/19.1 ± 0.3 (1.6%) | 32.6/32.0 ± 0.3 (0.9%) | 0.981
3rd | 40.1/32.3 ± 0.8 (2.5%) | 5.3/13.8 ± 1.0 (7.2%) | 7.0/15.0 ± 0.8 (5.3%) | 47.5/39.0 ± 1.1 (2.8%) | 0.000
a The start and stop codons were not included in analysis. Data on Marchantia are before the slash and the average of 11 sampled tracheophytes is after the slash and presented as mean ± SE (coefficient of variation).
b Probability (p) was based on χ² tests for homogeneity across the 11 sampled tracheophytes using PAUP 4.0b1 (Swofford 1998).
Fig. 2. Uncorrected pairwise sequence
divergence (P distance) plotted against
corrected distances (Kimura two-pa-
rameter) for transitions (Ts) and trans-
versions (Tv) at first and second codon
positions (A) and third codon position
(B). Each plot presents 66 data points.
against sequence divergence at the first two codon
positions were nearly linear (Fig. 2A). In contrast, the
curves at the third codon position revealed a signifi-
cant trend toward asymptotic saturation (Fig. 2B),
indicating that substitutions at the third codon posi-
tion are saturated and not suitable for inferring
phylogenetic relationships among the sampled taxa
or for dating purposes. For this reason, we used only
the NJ tree based on the Ka values.
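The saturation check described above compares uncorrected differences with a corrected distance for every pair of taxa. A minimal sketch follows, assuming the standard Kimura two-parameter formula d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitional and transversional differences; the function names and toy sequences are hypothetical.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_proportions(seq1: str, seq2: str):
    """Return (P, Q): proportions of transitions and transversions between two sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous sites
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared


def k2p_distance(p: float, q: float) -> float:
    """Kimura two-parameter distance from transition (p) and transversion (q) proportions."""
    term1, term2 = 1.0 - 2.0 * p - q, 1.0 - 2.0 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("distance undefined: sequences too divergent (saturated)")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


def saturation_points(sequences):
    """For every pair of taxa, return (name1, name2, P, Q, corrected distance)."""
    points = []
    for (n1, s1), (n2, s2) in combinations(sequences.items(), 2):
        p, q = ts_tv_proportions(s1, s2)
        points.append((n1, n2, p, q, k2p_distance(p, q)))
    return points
```

Plotting P and Q against the corrected distance gives a near-linear trend when sites are unsaturated; a curve that flattens toward a plateau indicates that multiple hits are hiding substitutions. With 12 taxa, `saturation_points` yields 12 × 11/2 = 66 pairs.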
Because Fig. 1B did not resolve the relationships
among sampled eudicots, we reconstructed a phylo-
genetic tree of eudicots using the three monocots as
the outgroup. The NJ tree based on the Ks values
yielded a reasonable topology (i.e., in agreement with
the phylogenetic relationships of the orders of flow-
ering plants compiled by APG [1998]) for the sampled
six eudicots (Fig. 3A), whereas the NJ tree based on
Ka values did not (data not shown). Based on the
limited number of eudicots, Fig. 3A suggests that the
six core eudicots first split (at node B) into two well-
supported monophyletic clades, the rosids (repre-
sented by Oenothera, Arabidopsis, Lotus, and Medi-
cago) and the asterids + Caryophyllales (represented
by Nicotiana and Spinacia, separately).
Both NJ trees reconstructed from Ka (Fig. 1B) and Ks (Fig. 3A) values suggested a close
relationship between the Ehrhartoideae (rice) and the Pooideae (wheat) with maize as an
outgroup, but the bootstrap values for the rice–wheat clade are low to moderate. Recently,
using the NJ method with the variable sites of 98 genes (including not only all
protein-coding but also RNA genes) common to the cp genomes of these three cereals and
rooting at the Nicotiana lineage, Matsuoka et al. (2002) also placed maize as sister to the
rice–wheat clade.
Phylogenetic Distribution of Cp Genes During Tracheophyte Evolution
We examined a total of 98 protein-coding genes (Appendix 1) that are present among the 12
studied taxa. Figure 1B also presents the protein-coding gene numbers held in each sampled
species and specific gene loss, transfer, and retention in the 11 lineages of
tracheophytes. Although Martin et al. (2002) have done a similar evolutionary analysis for
deeper groups, Fig. 1B is focused on the tracheophyte lineages, using Marchantia as the
outgroup, with 5 additional core eudicots and 1 fern.
Compared with the cp genome of the bryophyte (Marchantia), those of tracheophytes have lost
three genes: two transporters (cysA and cysT) and one with unknown function (cys66) (Fig.
1B). During tracheophyte evolution, the fern (Psilotum) and angiosperm lineages have
parallel losses of three chlorophyll biosynthesis genes, chlB, chlL, and chlN. The seed
plant lineage has commonly transferred the rpl21 gene (Martin et al. 2002 and references
cited therein) to its nucleus. In the spinach and Arabidopsis lineages that gene, however,
has been unusually replaced by a nuclear RPL21c gene of mitochondrial origin (Martin et al.
1990; Gallois et al. 2001).
Within seed plants, the conifer (Pinus) lineage has uniquely lost all 11 NADH dehydrogenase
subunit genes (ndhA–K; 4 are completely missing and 7 are pseudogenes [Wakasugi et al.
1994]) but has gained a new gene of unknown function, ycf68 (Martin et al. 2002). The
angiosperm lineage has further lost two genes, one, psaM, involved in the photosynthetic
light reaction and the other, ycf12, of uncertain function.
Within angiosperms the three grasses have lost three genes: one metabolism related, accD
(Hiratsuka et al. 1989; Maier et al. 1995; Ogihara et al. 2002), and two genes of unknown
function, ycf1 and ycf2. However, we found that the grass lineage has also recruited nine
novel genes; one of them, ycf68, is shared with the pine lineage, and the remaining eight,
ycf69–76, are unique. Functions of these genes are not known yet and they have no
detectable homology to prokaryotic genes (Martin et al. 2002). Except for spinach, all
sampled eudicots have lost the translational initiation factor 1 (infA). According to an
extensive survey of more than 300 diverse angiosperms by Millen et al. (2001), the infA
gene of the cp genome has repeatedly become defunct in about 24 separate angiosperm
lineages, including almost all rosid species.
Fig. 3. Relative branch lengths (based on Ka values) of monocots and dicots, using
Marchantia (A), Psilotum (B), and Pinus (C) as outgroup, respectively. Gray branches are
with Arabidopsis, Spinacia, and Nicotiana excluded (for their slower rates). Branch lengths
are Ka values per 100 sites. D Rooted NJ tree of the core eudicot lineages based on Ks
values, using the three grasses as outgroup. Italicized numbers denote bootstrap
percentages (bootstrap test before the slash, interior branch test after the slash).
Branch lengths are Ks values per 100 sites.
Nucleotide Substitution Rates
Before applying molecular calibration, we assessed
the assumption of rate constancy. Fig. 1B shows that
the branches from the calibration point C1 leading to
the Psilotum (fern) lineage and the Pinus lineage are
not equal in length. The NJ trees in Figs. 3B, C, and
D, using Marchantia, Pinus, and Psilotum as the
outgroup, respectively, also indicate that the Ka rates
in the monocot and the dicot lineages are unequal.
The monocot lineage has evolved faster than the dicots,
by 39.6, 37.3, and 32.3%, respectively, for the
three outgroups. In Fig. 1B the branches from node
A leading to Arabidopsis, Spinacia, and Nicotiana are
strikingly shorter than those leading to the other
three dicots and the monocots. Tajima's relative rate
test using rice, Marchantia, Psilotum, and Pinus as
outgroups, respectively, confirmed this observation
(all p's < 0.001). However, exclusion of the above
three slower dicot lineages (gray lines in Figs. 3B–D)
led to even higher estimates of divergence dates (data
not shown). We therefore used the entire dataset.
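The relative rate test mentioned above can be illustrated with a minimal sketch of Tajima's one-degree-of-freedom test for three aligned sequences (two ingroup taxa and one outgroup). The function name and the toy sequences are hypothetical and are not taken from the dataset analyzed here; the sketch only shows the form of the statistic.

```python
import math


def tajima_relative_rate(seq1, seq2, outgroup):
    """Sketch of Tajima's 1D relative rate test.

    m1 counts sites where only seq1 differs (seq2 == outgroup != seq1);
    m2 counts sites where only seq2 differs (seq1 == outgroup != seq2).
    Under equal rates, (m1 - m2)^2 / (m1 + m2) is approximately
    chi-square distributed with 1 degree of freedom.
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, o in zip(seq1, seq2, outgroup):
        if "-" in (a, b, o):  # skip alignment gaps
            continue
        if b == o and a != o:
            m1 += 1
        elif a == o and b != o:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m1, m2, chi2, p_value


# Toy (hypothetical) alignment: seq2 carries more lineage-specific changes.
m1, m2, chi2, p_value = tajima_relative_rate(
    "AAAAAAAAAC", "CCCCCAAAAA", "AAAAAAAAAA"
)
print(m1, m2, round(chi2, 3), round(p_value, 3))
```

With long concatenated alignments, a large imbalance between m1 and m2 yields a large statistic and a small p value, which is the pattern reported for the faster and slower lineages.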
By applying the equation r = K/(2T), where K is
the distance and T is the divergence time between the
two taxa compared, nonsynonymous rates were calibrated
and are shown in Table 4. Based on the three
divergence events, C1, C2, and C3, and the dataset
with all six dicots, the Ka rates are 0.215–0.225 × 10^-9,
0.232–0.257 × 10^-9, and 0.164–0.197 × 10^-9 substitutions
per nonsynonymous site per year, respectively.
Clearly, these three calibrated Ka rates are
unequal, differing from one another by from 8%
[(0.232 − 0.215)/0.215] to nearly 42% [(0.232 − 0.164)/
0.164], and the conifer–angiosperm Ka rate is the
highest.
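The calibration step can be sketched as follows, using the Ka distances and fossil date ranges listed in Table 4. The function and variable names are illustrative assumptions; only the formula r = K/(2T) and the input numbers come from the text.

```python
def calibrate_rate(k_per_100_sites, t_min_myr, t_max_myr):
    """Return (low, high) rates in substitutions/site/year via r = K / (2T)."""
    k = k_per_100_sites / 100.0  # Table 4 reports K per 100 sites
    low = k / (2.0 * t_max_myr * 1e6)   # oldest date gives the slowest rate
    high = k / (2.0 * t_min_myr * 1e6)  # youngest date gives the fastest rate
    return low, high


def relative_excess(a, b):
    """Percentage by which rate a exceeds rate b."""
    return (a - b) / b * 100.0


# (Ka per 100 sites, youngest fossil date, oldest fossil date) from Table 4
CALIBRATIONS = {
    "C1": (18.03, 400, 420),  # fern-seed plant divergence
    "C2": (14.40, 280, 310),  # conifer-angiosperm divergence
    "C3": (1.97, 50, 60),     # maize-wheat divergence
}

rates = {name: calibrate_rate(*vals) for name, vals in CALIBRATIONS.items()}
for name, (low, high) in rates.items():
    print(f"{name}: {low * 1e9:.3f}-{high * 1e9:.3f} x 10^-9")

# Spread between the lower-bound rates, as in the bracketed arithmetic above.
print(round(relative_excess(rates["C2"][0], rates["C1"][0]), 1))
print(round(relative_excess(rates["C2"][0], rates["C3"][0]), 1))
```

Comparing the lower bounds of the three ranges reproduces the roughly 8% and 42% differences quoted in the text.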
Dates of the Monocot–Dicot Divergence and the
Origin of Core Eudicots
Molecular Clock or Rate Constancy Method. The
date of the monocot–dicot divergence was estimated
by applying the equation T = K/(2r). As indicated in
Table 4, based on the entire dataset and the calibration
points C1, C2, and C3, three time estimates for
the monocot–dicot divergence, 206 ± 5–217 ± 6,
180 ± 7–200 ± 7, and 237 ± 5–285 ± 9 Myr, were
obtained. These estimates suggest that the monocot–dicot
divergence took place 220 ± 40 Myr ago.
Using either the Ka or the Ks rates of the maize–wheat
divergence and the mean Ka values (see node B
in Fig. 1B and Fig. 4) of all six eudicots or the Ks
values between the rosid clade and the asterid +
Caryophyllales clade (see node B in Fig. 3A), the
divergence for core eudicots was estimated to be
154 ± 7–185 ± 8 and 149 ± 2–181 ± 3 Myr ago (Table
4), respectively. These two estimates are close to each
other, and their average is 170 Myr ago.
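The constant-rate dating of the monocot–dicot divergence can be sketched in the same way, applying T = K/(2r) to the Ka distances and rate ranges listed in Table 4. The names below are illustrative assumptions, and the sketch ignores the standard errors reported in the table.

```python
def constant_rate_date(k_per_100_sites, rate_low, rate_high):
    """Return (youngest, oldest) divergence time in Myr via T = K / (2r)."""
    k = k_per_100_sites / 100.0  # Table 4 reports K per 100 sites
    youngest = k / (2.0 * rate_high) / 1e6  # fastest rate, youngest date
    oldest = k / (2.0 * rate_low) / 1e6     # slowest rate, oldest date
    return youngest, oldest


# (monocot-dicot Ka per 100 sites, low rate, high rate) from Table 4
MONOCOT_DICOT = {
    "C1": (9.30, 0.215e-9, 0.225e-9),
    "C2": (9.28, 0.232e-9, 0.257e-9),
    "C3": (9.35, 0.164e-9, 0.197e-9),
}

dates = {name: constant_rate_date(*vals) for name, vals in MONOCOT_DICOT.items()}
for name, (young, old) in dates.items():
    print(f"{name}: {young:.0f}-{old:.0f} Myr")
```

The three calibrations give clearly different ranges, which is the disagreement the text attributes to unequal rates among lineages.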
Li–Tanimura Method. Figure 4 was simplified
from the phylogenetic tree in Fig. 1B, with all branch
lengths indicated. The branch length of core eudicots
was calculated as the mean length of the branches
leading from their emergence point (node B) to the six
core eudicots. We then used the Li–Tanimura method,
which uses lineages in which the molecular clock
holds better than the others, to estimate the divergence
time at points A and B. For example, we know
that the branching date for Pinus (node C2) is 280–310
Myr ago and want to estimate the branching
dates between the monocot and the dicot lineages.
The distances from node C2 to Pinus, monocots, and
Table 4. Estimates of the monocot–dicot divergence and the age of core eudicots based on the constant rate method
Outgroup    Calibration event (fossil dates; Myr)        K^a               Rate^b (10^-9)  K^a               Time (Myr)
Monocot–dicot divergence
Marchantia  C1: Fern–seed plant divergence (400–420)     Ka: 18.03 ± 0.22  0.215–0.225     Ka: 9.30 ± 0.24   206 ± 5–217 ± 6
Psilotum    C2: Conifer–angiosperm divergence (280–310)  Ka: 14.40 ± 0.29  0.232–0.257     Ka: 9.28 ± 0.34   180 ± 7–200 ± 7
Pinus       C3: Maize–wheat divergence (50–60)           Ka: 1.97 ± 0.23   0.164–0.197     Ka: 9.35 ± 0.29   237 ± 5–285 ± 9
Origin of core eudicots
Pinus       C3: Maize–wheat divergence (50–60)           Ka: 1.97 ± 0.23   0.164–0.197     Ka: 6.08 ± 0.26   154 ± 7–185 ± 8
Monocots    C3: Maize–wheat divergence (50–60)           Ks: 12.10 ± 0.11  1.008–1.210     Ks: 36.06 ± 0.51  149 ± 2–181 ± 3
a K denotes the number of substitutions per 100 synonymous (Ks) or nonsynonymous (Ka) sites between a pair of taxa or groups.
b Rate (r) is defined as the number of substitutions per site per year, r = K/(2T) (Li and Graur 1991).
dicots are 6.05, 9.02 (=3.91 + 4.22 + 0.89), and 7.66
(=3.91 + 0.90 + 2.85), respectively. Since the
monocot lineage has a longer branch length than do
the Pinus and dicot lineages, it is not used. Based on
the branch length of the dicot lineage, the monocot–dicot
divergence (node A in Fig. 1B and Fig. 4) was
estimated to be 137–152 Myr ago, which is derived
from (280 or 310) × (0.90 + 2.85)/7.66; the origin of
core eudicots (node B in Fig. 1B and Fig. 4) was
estimated as 104–115 Myr ago, which is calculated
from (280 or 310) × 2.85/7.66. As shown in Fig. 4 the
distance from node C1 to dicots is 10.4 (=2.74 +
3.91 + 3.75), and we assume that the molecular clock
along the dicot lineage is approximately constant.
Similarly, using the branching date of Psilotum, 400–420
Myr ago (node C1 in Fig. 1 and Fig. 4), the
monocot–dicot divergence (node A in Fig. 1 and Fig.
4) was estimated to be (400 or 420) × (0.90 + 2.85)/10.4
= 144–151 Myr ago, and the origin of core
eudicots was estimated to be (400 or 420) × 2.85/10.4
= 110–115 Myr ago. Table 5 shows that these dates
are very close to those estimated from C2. Combining
estimates calibrated from both C1 and C2, we
estimated that monocots and dicots diverged 140–150
Myr ago and that the core eudicot lineages originated
100–115 Myr ago.
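The proportional branch-length arithmetic above can be written out as a short sketch. It assumes, following the sums given in the text, that 2.74, 3.91, 0.90, and 2.85 are the Ka branch lengths for C1 to C2, C2 to node A, node A to node B, and node B to the dicot tips, respectively; the constant and function names are hypothetical.

```python
# Ka branch lengths (per 100 sites) along the dicot lineage, as read from
# the sums quoted in the text (an interpretation of Fig. 4, not new data).
C1_TO_C2 = 2.74
C2_TO_A = 3.91
A_TO_B = 0.90
B_TO_DICOT_TIPS = 2.85


def scaled_age(calib_age_myr, node_to_tips, calib_to_tips):
    """Scale a calibration age by the fraction of the path below the node."""
    return calib_age_myr * node_to_tips / calib_to_tips


def node_ages(calib_range_myr, calib_to_tips):
    """Return {node: (young, old)} ages in Myr for nodes A and B."""
    node_to_tips = {"A": A_TO_B + B_TO_DICOT_TIPS, "B": B_TO_DICOT_TIPS}
    return {
        node: tuple(scaled_age(t, d, calib_to_tips) for t in calib_range_myr)
        for node, d in node_to_tips.items()
    }


C2_TO_TIPS = C2_TO_A + A_TO_B + B_TO_DICOT_TIPS  # 7.66 in the text
C1_TO_TIPS = C1_TO_C2 + C2_TO_TIPS               # 10.4 in the text

from_c2 = node_ages((280, 310), C2_TO_TIPS)
from_c1 = node_ages((400, 420), C1_TO_TIPS)
for label, ages in (("C2", from_c2), ("C1", from_c1)):
    for node, (young, old) in ages.items():
        print(f"calibrated by {label}, node {node}: {young:.0f}-{old:.0f} Myr")
```

Because only the dicot path is used, the fast-evolving monocot branch does not enter the calculation, which is the design choice that distinguishes this approach from the constant-rate estimate.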
Discussion
Rate Variation Among Tracheophyte Lineages
Our phylogenetic analyses (Figs. 1B, 3, and 4) indicate
that nonsynonymous substitution rates of cp
genomes are unequal among the six eudicot lineages,
between the two angiosperm lineages (i.e.,
monocots and dicots), and among the tracheophyte
lineages (i.e., all sampled seed plants and a fern,
Psilotum). These observations were confirmed by
Tajima's relative rate test using Marchantia as the
outgroup and the first two coding positions (data
not shown).
Using 40 cp proteins, Goremykin et al. (1997)
found that the average substitution rates (equivalent
to the Ka rate) along the branches from the
common node (equivalent to node C1 in our
Fig. 1B) of seed plants to Nicotiana and to Pinus
were quite similar. However, our cp genomic data
(Fig. 1B) suggest that the former branch is significantly
longer than the latter when Psilotum is
used as the outgroup (Tajima's relative rate test:
p < 0.001).
As revealed in Fig. 1B and Figs. 3B–D, the grass
lineage has evolved much faster than the dicots at
nonsynonymous (Ka) sites. In addition, Fig. 1B (Ka rate)
and Fig. 3A (Ks rate) also indicate that among the
six annual dicots sampled, Nicotiana and Spinacia
evolved more slowly than the rest. Extensive
rate variation among annual plants has also been
observed in other cp genes at nonsynonymous sites
(Wolfe et al. 1987, 1989). In general, our results echo
the observation of Bousquet et al. (1992)
on the rbcL gene of seed plant lineages. They
found that the annual form evolved more rapidly
on average than the perennial form (represented in
our study by Psilotum and Pinus) and that the
grass family has the fastest evolutionary rate. Comparing
the rbcL and ndhF loci in the grass family,
Gaut et al. (1997) found that at Ka sites rate variation
was not correlated between those two plastid
loci. Most recently, examining the whole cp genomes
(106 genes) of maize, rice, and wheat,
Matsuoka et al. (2002) also found variation in Ka
rates. The Ka rate variation seems to correlate well
with the evolutionary divergence pattern depicted
in the cp genomic NJ tree (Fig. 1B), which shows
that the angiosperm lineage has evolved faster than
the gymnosperm lineage and that the latter in turn
has evolved more rapidly than the fern lineage.
Since the number of completely sequenced cp genomes
is quickly increasing, this trend may be retested soon.
Table 5. Ages of nodes (Myr) inferred from the phylogenetic tree
in Fig. 3 using the Li–Tanimura method (1987)
Node  C1 (400–420 Myr)  C2 (280–310 Myr)
C1    –^a               380–421
C2    295–309           –
A     144–151           137–152
B     110–115           104–115
a Nonapplicable.
Fig. 4. A phylogenetic tree simplified from Fig. 1B. Nodes and
lineages correspond to those in Fig. 1B. C1 was used to estimate the
divergence dates of C2, the monocot–dicot divergence (node A),
and the origin of core eudicots (node B) (refer to Table 5 and the
text for detail). The number on each branch is the Ka value per 100
sites.
Significant rate variation in the cp genomes of the
tracheophyte lineages is also consistent with the finding
of P. S. Soltis et al. (2002), who studied one
nuclear and three plastid genes using MP analyses. In
summary, the molecular clock hypothesis does not
hold for the Ka rates among the cp genomes of
tracheophyte plants.
Reference Fossil Dates and the Phylogenetic Tree
Obtained
In the Data and Methods section we have carefully
cross-examined the three fossil dates by adopting
updated phylogenies and documented fossil records.
Bremer (2000, pp 4709, 4710) suggested that in
phylogenetic dating rate calibration rather than
unequal substitution rates is the major source of
error and is behind the discrepancies in earlier estimates
of monocot and flowering plant evolution.
Indeed, in Table 4 the three calibrated rates based
on the molecular clock are discrete, and the obtained
dates for the monocot–dicot divergence do not
agree with one another. To evaluate whether the fossil
dates and the cp Ka rates corresponded well with
each other with respect to the two dating methods,
we also used the divergence rates and the Ka distances
(Table 4) from the fern–seed plant and
conifer–seed plant splits to date each other's divergence.
The rate constancy method led to an estimate of
350–390 Myr ago for the former event and 320–335
Myr ago for the latter. These two estimates differ
widely from the fossil records. In contrast, the divergence
times (Table 5) of these two events estimated
from the Li–Tanimura method are highly
compatible with the paleobotanical data.
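The cross-check described above can be sketched by dating each calibration node from the other under both approaches. The Ka distances are those of Table 4 and the dicot path lengths are those quoted for Fig. 4; the dictionary and function names are illustrative assumptions. Under rate constancy, substituting r = K/(2T) into T = K/(2r) reduces to scaling the calibration date by the ratio of the two distances.

```python
FOSSIL_MYR = {"C1": (400, 420), "C2": (280, 310)}  # fossil date ranges
KA_PER_100 = {"C1": 18.03, "C2": 14.40}            # pairwise Ka (Table 4)
DICOT_PATH = {"C1": 10.40, "C2": 7.66}             # node-to-dicot Ka (Fig. 4)


def clock_date(target, calib):
    """Constant-rate date of `target` using the rate calibrated at `calib`."""
    ratio = KA_PER_100[target] / KA_PER_100[calib]
    return tuple(t * ratio for t in FOSSIL_MYR[calib])


def li_tanimura_date(target, calib):
    """Date of `target` scaled along the dicot lineage only."""
    ratio = DICOT_PATH[target] / DICOT_PATH[calib]
    return tuple(t * ratio for t in FOSSIL_MYR[calib])


for target, calib in (("C1", "C2"), ("C2", "C1")):
    clock = clock_date(target, calib)
    scaled = li_tanimura_date(target, calib)
    print(
        f"{target} from {calib}: clock {clock[0]:.0f}-{clock[1]:.0f}, "
        f"Li-Tanimura {scaled[0]:.0f}-{scaled[1]:.0f}, "
        f"fossil {FOSSIL_MYR[target][0]}-{FOSSIL_MYR[target][1]} Myr"
    )
```

The clock-based ranges come out slightly different from the rounded figures in the text because the text uses the rounded rates of Table 4, but the qualitative contrast with the fossil ranges is the same.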
Sanderson and Doyle (2001) proposed that (1)
biases in the data or the statistical estimation
method used, (2) variation in rate across sites which
causes sequence divergences to be estimated incorrectly,
and (3) incorrect phylogenies are the
underlying sources of error in molecular dating.
P. S. Soltis et al. (2002) added that "inadequate sampling
of taxa...can compound the problem." The
same concern could be raised about the results we
present here. However, the effect of these problems
is likely to have been considerably reduced by the
sampling of 12 evolutionarily successive land plants
(Table 2; including all three living subclasses of
eudicots), the use of 61 genes (>39,000 bp long)
with different functions from the complete cp genomes,
and the highly reliable NJ tree (Fig. 1B),
which is consistent with the NJ tree of Goremykin
et al. (1997), inferred from concatenating 14,295
amino acids of cp genomes.
Comparison of Estimates from the Molecular Clock
and Li–Tanimura Methods
Tables 4 and 5 show that the dates of the monocot–dicot
split and the origin of core eudicots estimated
by the rate constancy and Li–Tanimura methods
differ greatly, with estimates from the former method
predating the latter by 50 Myr. Estimates calibrated
from nodes C1 and C2 using the molecular clock
method vary more than those obtained from the
Li–Tanimura method.
In the rate constancy method we used the arithmetic
mean of all pairwise Ka distances between
monocots and dicots to estimate the divergence date
(Table 4). As a result, the obtained dates for the
monocot–dicot split (220 Myr) and for the origin of
the core eudicots (170 Myr) appear to be severely
overestimated because the three high-rate grasses
were included in the distance calculation. This was
also the case in most previous estimates (Wolfe et al.
1989; Martin et al. 1989, 1993; Brandl et al. 1992;
Laroche et al. 1995; Yang et al. 1999), which not only
used the molecular clock hypothesis but also included
one (maize) or several fast-evolving grass species (or
annual Liliales [such as Ramshaw 1972]).
Together with the preceding age estimates for the
monocot–dicot split and the origin of core eudicots,
we concluded that the Li–Tanimura method can
substantially reduce the effect of rate variation among
lineages and provide an estimate more in line with
known fossil data.
Comparison of Our Estimates with Previous Estimates
Since our estimates based on the rate constancy
method seem unreliable, we shall compare only estimates
obtained from the Li–Tanimura method with
those from other methods. Goremykin et al. (1997)
used a very similar framework of the Li–Tanimura
method (1987) and claimed that their approach is
independent of the rate fluctuation on the grass
(high rate) and Marchantia (low rate) branches.
They estimated the divergence time between the
Zea–Oryza lineage and the Nicotiana lineage to be 160
Myr ago, which predates ours by about 10–20 Myr
(Table 5). Based on our cp genome data (Figs. 1B and
3A), the Nicotiana lineage has the slowest Ka and Ks
rates among the sampled dicots but was used as half
of the denominator by Goremykin et al. (1997) in
estimating the monocot–dicot divergence time. Since
our estimates of the dates for the monocot–dicot
lineage and the origin of core eudicots were based on
the mean branch length of six dicots, our data should
be more reliable than data based on single species.
In order to reduce the effects of unequal rates,
Bremer (2000) used the mean branch lengths from a
group of terminal taxa to their common node (which
has a known fossil age) for calculating the change
rate (distance/age by Bremer's definition). Using the
rbcL gene, the MP tree of monocots, and the eight
reference nodes with known fossil dates, Bremer
(2000) estimated the split between Acorus, presumably
the basalmost extant monocot (APG 1998; Chase
et al. 2000), and the remaining monocots at 134 Myr
ago. According to the integrated and widely used
phylogenetic tree for the orders of flowering plants
(APG 1998), the separation of the monocot lineage
from the other magnoliids predated the branching-off
of eudicots. Therefore, Bremer's estimate is compatible
with ours for the monocot–dicot split (140–150
Myr ago) and the core eudicot divergence (100–115
Myr ago).
Using a single gene (rbcL) and the NPRS method,
two genes (plus 18S rRNA) and maximum
likelihood analyses, and the calibration date of
Marchantia (450 Myr ago), Sanderson (1997) and
Sanderson and Doyle (2001) estimated that crown
angiosperms originated 160 and 140–190
Myr ago, respectively. Combining a three-gene dataset
(rbcL, atpB, and 18S rRNA), the NPRS
method, and the split between Fagales and Cucurbitales
(84 Myr ago), Wikström et al. (2001)
proposed the origin of the extant angiosperms to be
158–179 Myr old and that of eudicots to be 131–147
Myr old. These estimates are in good agreement
with ours, as the dicots we sampled are all eudicots.
Recent multigene analyses of angiosperm evolution
have revealed that the monocot–dicot divergence
was preceded by five living basal dicot lineages, the
Amborellaceae, the Nymphaeales, and a group including
Illiciaceae, Trimeniaceae, and Austrobaileyaceae
(i.e., the so-called ANITA group) (Qiu et al.
1999; PS Soltis et al. 1999; DE Soltis et al. 2000; see
Fig. 1A), and an extinct basal angiosperm lineage,
the Archaefructaceae (Sun et al. 2002). Therefore,
previous estimates for the angiosperm origin based on
the monocot–dicot split have underestimated the age
of angiosperms themselves. The above authors' estimates
are consistent with ours if we postulate that
approximately 20 (=160 − 140) to 40 (=190 − 150)
Myr separates the angiosperm origin and the split
between the ancestors of the monocot and eudicot
lineages.
Our estimated date for the origin of core eudicot
lineages is 100–115 Myr ago (Table 5). This is earlier
than the many documented fossil-based estimates for
core eudicots, such as a possible Rhamnaceae/Rosaceae
(both are rosids, represented here by our samples:
Lotus, Medicago, Arabidopsis, and Oenothera)
from the early Cenomanian (94–97 Myr [Basinger
and Dilcher 1984]), 89 Myr for the Capparales (represented
by our sample: Arabidopsis), 84 Myr for
Myrtales (Magallón et al. 1999) (represented by our
sample: Oenothera), and 83 Myr for the Caryophyllales
clade (represented by our sample: Spinacia)
(Magallón et al. 1999). In addition, our estimate for
the age of core eudicots is reasonably younger than the
fossil age of a basal eudicot, Tetracentraceae, from
the Barremian (110–118 Myr ago [Magallón et al.
1999]). Collectively, our cp genomic data indicate
that the core eudicots' age is also older than known
fossil records indicate.
Conclusions
We observed significant Ka rate variation in cp genome
data among major tracheophyte lineages.
Therefore, the rate constancy method is not appropriate
for dating the divergence between monocots
and dicots or the age of eudicots, especially if
fast-evolving monocots are included. Using cp genome
data, we demonstrated that the Li–Tanimura method
gives estimates that better reflect the known evolutionary
sequence of tracheophyte lineages and correspond
well with the fossil records of calibration
points we used.
Combining our estimates calibrated by two known
fossil nodes and the Li–Tanimura method, we propose
that the monocot lineage branched off from
dicots 140–150 Myr ago, in the late Jurassic to early
Cretaceous, and that the core eudicots radiated
100–115 Myr ago, between the Albian and the Aptian of
the Cretaceous. These estimates are in accordance with
those of Sanderson (1997) and P. S. Soltis et al.
(2002), who analyzed one to three genes and used
MP and ML branch lengths with the NPRS method.
In summary, methods that accommodate unequal
rates give smaller estimates than the rate constancy
method and appear to agree well not only with one
another, but also with the recently documented fossil
evidence.
Our results confirm all previous conclusions that
molecular data indicate a pre-Cretaceous origin for
angiosperms, but our estimates for the monocot–dicot
divergence postdate previous estimates based
on the molecular clock hypothesis by at least 50 Myr
(=200 − 150 Myr ago).
Acknowledgments. We thank Robert Friedman for critical comments
on an early version of the manuscript and Yoshihiro
Matsuoka and Shu-Shin Wu for help with the gene group assignment
for the three grasses and other taxa. We also thank the two
reviewers for their critical and valuable comments and suggestions. This
work was supported in part by National Science Council Grant
NSC912311B001103, and Academia Sinica Grant IB91 to S.M.C.,
and NIH Grant GM30998 to W.H.L.
Appendix
Appendix Table A1 continues on next page.
Taxon
Gene^a  Marchantia  Psilotum  Pinus  Triticum  Oryza  Zea  Lotus  Medicago  Arabidopsis  Spinacia  Nicotiana  Oenothera
accD
951b
933
966
c
1,506
1,512
1,467
1,569
1,539
1,317
(ORF
316)
atpA
1,524
1,527
1,485
1,515
1,524
1,524
1,533
1,536
1,524
1,296
1,524
1,518
atpB
1,479
1,479
1,479
1,497
1,497
1,497
1,497
1,497
1,497
1,497
1,497
1,497
atpE
408
402
414
414
414
414
402
402
399
405
402
402
atpF
555
555
555
552
543
552
555
552
555
555
555
555
atpH
246
246
246
246
246
246
246
246
246
246
246
246
atpI
747
747
747
744
744
744
744
738
750
744
744
744
ccsA
963
933
963
969
966
966
972
972
987
972
942
960
(ORF
320)
(ORF320)
(ycf5)
(ORF321)
(ORF321)
(ycf5)
(ycf5)
(ycf5)
(ycf5)
(ycf5)
(ycf5)
cemA
1,305
1,350
786
693
693
693
690
690
690
702
690
690
(ORF
434)
(ycf10)
(ORF261)
(ORF230)
(ycf10)
(ORF229)
(ycf10)
chlB
1,542
1,533
(ORF
513)
chlL
870
876
(frxC)
chlN
1,398
1,404
(108667110064)
chlP
612
597
591
651
651
651
591
588
591
591
591
750
(ORF
203)
(ORF216)
cysA
1,113
(mbpX
)
cysT
867
(ORF
288)
infA
237
243
237
342
324
324
177
Wd
matK
1,113
1,512
1,548
1,629
1,629
1,635
1,527
1,521
1,581
1,518
1,530
1,539
(ORF
370i)
(ORF542)
ndhA
1,107
1,116
1,089
1,089
1,089
1,092
1,077
1,083
1,098
1,092
1,092
(ndh1)
ndhB
1,504
1,491
Ye
1,533
1,533
1,533
1,164
1,164
1,164
1,533
1,533
1,533
(ndh2)
ndhC
363
363
Y
363
363
363
363
363
363
363
363
363
(ndh3)
ndhD
1,500
1,545
Y
1,503
1,503
1,503
1,494
1,494
1,503
1,383
1,503
1,503
(ndh4)
ndhE
303
321
Y
306
306
306
306
306
306
306
306
306
(ndh4L)
ndhF
2,079
2,223
2,220
2,205
2,217
2,244
2,235
2,241
2,229
2,223
2,211
(ndh5)
ndhG
576
567
531
531
531
531
531
531
531
531
531
(ORF
191)
Table A1. Continued
ndhH
1,179
1,182
Y
1,182
1,182
1,182
1,182
1,182
1,182
1,182
1,182
1,182
(ORF392)
(ORF393)
ndhI
552
498
Y
543
537
543
486
486
519
513
504
498
(frxB)
(ORF178)
ndhJ
510
477
480
480
480
477
477
477
477
477
477
(ORF
169)
(ORF480)
ndhK
732
624
Y
738
741
747
693
684
678
846
744
744
(psbG)
(psbG)
(psbG)
(psbG)
petA
963
966
960
963
963
963
963
963
963
963
963
957
petB
648
648
648
648
648
705
648
648
648
648
636
648
petD
483
483
543
483
483
483
483
483
483
483
483
483
petE
114
114
114
114
114
114
114
114
114
114
114
114
(ORF
37)
(petG)
(petG)
(petG)
(petG)
(petG)
(petG)
(petG)
(petG)
(petG)
petL
96
96
189
96
96
96
96
96
96
96
96
96
(ORF
31)
(ORF62b)
(ORF31)
(ORF31)
(ycf7)
(ORF31)
petN
90
90
90
90
90
90
90
90
90
90
90
90
(5168
5257)
(ORF29)
(ycf6)
(ORF29)
(ORF29)
(ycf6)
(ycf6)
(ycf6)
(ycf6)
(ycf6)
(ycf6)
psaA
2,253
2,253
2,262
2,253
2,253
2,253
2,253
2,277
2,253
2,253
2,253
2,256
psaB
2,205
2,205
2,205
2,205
2,205
2,208
2,205
2,205
2,205
2,205
2,205
2,205
psaC
246
246
246
246
246
246
246
246
246
246
246
246
(frxA)
psaI
111
111
159
111
111
111
105
105
114
102
111
105
(ORF
36b)
(ORF36)
psaJ
129
129
135
129
135
129
135
135
135
135
135
132
(ORF
42b)
(ORF44)
psaM
99
99
93
(ORF
32)
psbA
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
psbB
1,527
1,527
1,527
1,527
1,527
1,527
1,527
1,527
1,527
1,527
1,527
1,527
psbC
1,422
1,386e
1,422
1,422
1,422
1,422
1,422
1,422
1,422
1,422
1,386
1,422
(1422)
psbD
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
1,062
psbE
252
252
252
252
252
252
252
252
252
252
252
252
psbF
120
120
120
120
120
120
120
120
120
120
120
120
psbH
225
225
228
222
222
222
222
219
222
222
222
222
(ORF
74)
psbI
111
111
158
111
111
111
111
111
111
111
111
111
(ORF
36a)
(83988508)
psbJ
123
123
123
123
123
123
123
123
123
123
123
123
(ORF
40)
(ORF40)
psbK
168
177
180
186
186
186
186
186
180
180
186
180
(ORF
55)
(ORF98)
psbL
117
117
117
117
117
117
117
117
117
117
117
117
(ORF
38)
psbM
105
105
114
105
105
105
105
105
105
105
105
105
(ORF34)
psbN
132
132
132
132
132
132
132
132
132
132
132
132
(ORF43)
psbT
108
99
108
117
108
102
102
108
102
102
105
108
(ORF35)
(ORF35)
(ORF33)
rbcL
1,428
1,428
1,428
1,434
1,434
1,431
1,428
1,428
1,440
1,428
1,434
1,428
rpl2
612
834
831
822
822
822
825
792
825
819
825
825
(ORF203)
rpl14
369
369
369
372
372
372
369
369
369
366
372
369
rpl16
432
423
405
411
411
411
408
408
408
408
405
408
rpl20
351
345
360
360
360
360
366
360
354
387
387
393
rpl21
351
390
rpl22
360
351
429
447
450
447
483
600
468
414
rpl23
276
273
276
282
282
282
282
282
282
W
282
282
(ORF42)
rpl32
210
183
213
192
192
180
153
180
159
174
168
156
(ORF69)
(ORF63)
rpl33
198
198
207
201
201
201
201
201
201
201
201
201
rpl36
114
114
114
114
114
114
114
114
114
114
114
114
(sec
X)
rpoA
1,023
1,023
1,008
1,020
1,014
1,020
1,002
1,002
990
1,008
1,014
1,104
rpoB
3,198
3,201
3,228
3,321
3,228
3,228
3,213
3,213
3,219
3,213
3,213
3,219
rpoC1
2,055
2,025
2,091
2,052
2,049
2,052
2,049
2,061
2,043
2,034
2,067
2,040
rpoC2
4,161
4,227
3,675
4,440
4,542
4,584
3,999
4,145
4,031
4,086
4,179
4,161
rps2
708
720
705
711
711
711
711
711
711
711
711
711
rps3
654
663
654
720
720
675
657
636
657
657
657
657
rps4
609
600
597
606
606
606
606
606
606
606
606
612
rps7
468
468
468
471
471
471
468
474
468
468
468
468
rps8
399
399
399
411
411
411
405
405
405
405
405
417
rps11
393
393
393
432
432
432
417
417
417
417
426
435
rps12
372
372
372
369
375
375
372
372
372
372
372
372
rps14
303
303
300
312
312
312
303
303
303
303
303
303
rps15
267
261
267
273
273
237
273
276
267
273
264
264
rps16
258
189
258
243
240
267
258
267
rps18
228
228
303
513
492
513
315
297
306
306
306
306
rps19
279
279
279
282
282
282
279
279
279
279
279
279
ycf1f
3,207
5,112
5,271
Y
1,032
5,502
1,053
7,305
(ORF1068)
(ORF1756)
(ORF350)
ycf2g
6,411
6,942
6,165
6,897
5,658
6,885
6,396
6,843
6,843
(ORF2136)
(ORF2054)
ycf3
51
3e
516
510
513
510
513
381
381
381
498
507
477
(ORF167)
(ORF169)
(ORF170)
(ORF170)
(37
8)
ycf4
555
555
555
558
558
558
603
576
555
555
555
558
(ORF184)
(ORF184)
(ORF185)
(ORF185)
Table A1. Continued
ycf9
189
189
189
189
189
189
189
189
189
189
189
189
(ORF62)
(ORF62)
(ORF62)
(ORF62)
ycf12
102
102
102
(ORF33)
(ORF33)
ycf15
300
204
234
192
264
Y
(ORF99)
(140818141021)
(ORF77)
(9084891039)
ycf66
408
(ORF135)
ycf68
228
435
402
405
(ORF75a)
(9299193425)
(ORF133)
(ORF133)
ycf69
177
216
177
396
(124696124872)
(ORF72)
(ORF58)
(ORF131)
ycf70
129
270
210
(1453814666)
(ORF91)
(ORF69)
ycf71
153
249
225
(8077380925)
(ORF82)
(ORF75)
ycf72
414
414
414
(8104881461)
(ORF137)
(ORF137)
ycf73
750
750
522
(8375884507)
(ORF249)
(ORF173)
ycf74
150
330
150
(9446794616)
(ORF109)
(ORF49)
ycf75
192
192
(ORF63)
(ORF63)
ycf76
255
258
258
(124382124636)
(ORF85)
(ORF85)
Total No. genes  87  81  73  84  85  86  77  74  79  79  80  78
Total length  71,509  68,355  60,470  58,095  58,677  58,581  61,908  60,296  63,543  67,839  64,551  70,110
Total No. genes  98
Average length, 63,661 ± 4,764
a Gene names follow those of Martin et al. (2002) and Swiss-Prot Protein Knowledgebase (2003) and each NCBI accession of a given taxon (refer to Table 2).
b Gene length (bp) is given after each gene name under each species. Within parentheses are the position ranges (where no annotation was available but a putatively respective gene homologue was detected using the BLASTX in NCBI), or original gene names, or ORF names in a given taxon, respectively.
c Absence of the gene in a given chloroplast genome.
d Pseudogene.
e The gene length we used was different from the NCBI annotation of a given species due to an earlier stop or longer reading frame detected.
f Martin et al. (2002) considered that this gene is not related to prokaryotic genes and designated it ycf78.
g A FtsH-like protein gene designated ycf77 by Martin et al. (2002).
References
APG (Angiosperm Phylogeny Group) (1998) An ordinal classification
for the families of flowering plants. Ann Mo Bot Gard
85:531–553
Basinger JF, Dilcher DL (1984) Ancient bisexual flowers. Science
224:511–513
Bousquet J, Strauss SH, Doerksen AH, Price RA (1992) Extensive
variation in evolutionary rate of rbcL gene sequences among
seed plants. Proc Natl Acad Sci USA 89:7844–7848
Bowe LM, Coat G, dePamphilis CW (2000) Phylogeny of seed
plants based on all three genomic compartments: Extant gymnosperms
are monophyletic and Gnetales' closest relatives are
conifers. Proc Natl Acad Sci USA 97:4092–4097
Brandl R, Mann W, Sprinzl M (1992) Estimation of the monocot–dicot
age through t-RNA sequences from the chloroplast. Proc
R Soc Lond B 249:13–17
Bremer K (2000) Early Cretaceous lineages of monocot flowering
plants. Proc Natl Acad Sci USA 97:4707–4711
Chase MW, et al. (1993) Phylogenetics of seed plants: An analysis
of nucleotide sequences from the plastid gene rbcL. Ann Mo
Bot Gard 80:528–580
Chase MW, et al. (2000) Higher-level systematics of the monocotyledons:
An assessment of current knowledge and a new
classification. In: Wilson KL, Morrison DA (eds) Monocots:
Systematics and evolution. Commonwealth Scientific and Industrial
Research Organization, Collingwood, Australia, pp 3–16
Chaw SM, Zharkikh HA, Sung HM, Lau TC, Li WH (1997)
Molecular phylogeny of extant gymnosperms and seed plant
evolution: analysis of nuclear 18S rRNA sequences. Mol Biol
Evol 14:56–68
Chaw SM, Parkinson CL, Cheng Y, Vincent TM, Palmer JD
(2000) Seed plant phylogeny inferred from all three plant genomes:
Monophyly of extant gymnosperms and origin of
Gnetales from conifers. Proc Natl Acad Sci USA 97:4086–4091
Clegg MT, Gaut BS, Learn GH Jr, Morton BR (1994) Rates and
patterns of chloroplast DNA evolution. Proc Natl Acad Sci
USA 91:6795–6801
Crane PR (1988) Major clades and relationships in higher
gymnosperms. In: Beck CB (ed) Origin and evolution of gymnosperms.
Columbia University Press, New York, pp 218–272
Crepet WL, Feldman GD (1991) The earliest remains of grasses in
the fossil record. Am J Bot 78:1010–1014
Cronquist A (1988) The evolution and classification of flowering
plants, 2nd ed. New York Botanical Garden, Bronx, NY
Darwin C, Darwin F, Seward AC (eds) (1903) More letters from
Charles Darwin. D. Appleton, New York
Delevoryas T, Hope RC (1973) Fertile coniferophyte remains from
the Late Triassic Deep River Basin, North Carolina. Am J Bot
60:810818
Doyle JA (1992) Revised palynological correlations of the lower
Potomac Group (USA) and the Cocobeach sequence of Gabon
(Barremian-Aptian). Cretaceous Res 13:337349
Doyle JA (1998) Molecules, morphology, fossils, and the rela-
tionship of angiosperms and Gnetales. Mol Phylogenet Evol
9:448462
Doyle JA, Donoghue MJ (1987) The origin of angiosperms: a
cladistic approach. In: Friis EM, Chaloner WG, Crane PR (eds)
The origins of angiosperms and their biological consequences.
Cambridge University Press, Cambridge, pp 1749
Gallois JL, Achard P, Green G, Mache R (2001) The Arabidopsis
chloroplast ribosomal protein L21 is encoded by a nuclear gene
of mitochondrial origin. Gene 274:179185
Gantt JS, Baldauf SL, Caline PJ, Weeden NF, Palmer JD (1991)
Transfer of rpl22 to the nucleus greatly preceded its loss from
the chloroplast and involved the gain of an intron. EMBO
J 10:30734078
Gaut BS, Muse SV, Clark WD, Clegg MT (1992) Relative rates of
nucleotide substitution at the rbcL locus of monocotyledonous
plants. J Mol Evol 35:292303
Gaut BS, Muse SV, Clegg MT (1993) Relative rates of nucleotide
substitution in the chloroplast genome. Mol Phylogenet Evol
2:8996
Gaut BS, Clark LG, Wendel JF, Clegg MT, Muse SV (1997)
Comparisons of the molecular evolutionary process at rbcL and
ndhF in the grass family (Poaceae). Mol Biol Evol 14:769777
Goremykin VV, Hansmann S, Martin WF (1997) Evolutionary
analysis of 58 proteins encoded in six completely sequenced
chloroplast genomes: Revised molecular estimates of two seed
plant divergence times. Pl Syst Evol 206: 337351
Goremykin VV, Hirsch-Ernst KI, Wölfl S, Hellwig FH (2003)
Analysis of the Amborella trichopoda chloroplast genome sequence
suggests that Amborella is not a basal angiosperm. Mol
Biol Evol 20:1499–1505
GPWG (Grass Phylogeny Working Group) (2001) Phylogeny and
subfamilial classification of the grasses (Poaceae). Ann Mo Bot
Gard 88:373–457
Gu Z, Cavalcanti ARO, Chen FC, Bouman P, Li WH (2002) Extent
of gene duplication in the genomes of Drosophila, nematode,
and yeast. Mol Biol Evol 19:256–262
Hallick RB, Bairoch A (1994) Proposals for the naming of chloroplast
genes. III. Nomenclature for open reading frames
encoded in chloroplast genomes. Plant Mol Biol Rep 12:S29–S30
Hart JA (1987) A cladistic analysis of conifers: Preliminary results.
J Arnold Arbor 68:296–307
Herendeen PS, Crane PR (1995) The fossil history of the monocotyledons.
In: Rudall PJ, Cribb PJ, Cutler DF, Humphries CJ
(eds) Monocotyledons: Systematics and evolution. Royal Botanic
Gardens, Kew, pp 1–21
Hiratsuka J, et al. (1989) The complete sequence of the rice (Oryza
sativa) chloroplast genome: Intermolecular recombination between
distinct tRNA genes accounts for a major plastid DNA
inversion during the evolution of the cereals. Mol Gen Genet
217:185–194
Hughes NF (1994) The enigma of angiosperm origins. Cambridge
University Press, Cambridge
Hupfer H, Swiatek M, Hornung S, Herrmann RG, Maier RM, Chiu
WL, Sears B (2000) Complete nucleotide sequence of the Oenothera
elata plastid chromosome, representing plastome I of
the five distinguishable euoenothera plastomes. Mol Gen Genet
263:581–585
Ikeo K, Ogihara Y (2000) Triticum aestivum chloroplast, complete
genome (unpublished)
Katayama H, Ogihara Y (1996) Phylogenetic affinities of the
grasses to other monocots as revealed by molecular analysis of
chloroplast DNA. Curr Genet 29:572–581
Kato T, Kaneko T, Sato S, Nakamura Y, Tabata S (2000) Complete
structure of the chloroplast genome of a legume, Lotus
japonicus. DNA Res 7:323–330
Kenrick P, Crane PR (1997) The origin and early evolution of land
plants. Nature 389:33–39
Kumar S, Tamura K, Jakobsen IB, Nei M (2001) MEGA 2: Molecular
evolutionary genetics analysis software. Arizona State
University, Tempe
Laroche J, Li P, Bousquet J (1995) Mitochondrial DNA and
monocot–dicot divergence time. Mol Biol Evol 12:1151–1156
Li WH, Graur D (1991) Fundamentals of molecular evolution.
Sinauer Associates, Sunderland, MA
Li WH, Tanimura M (1987) The molecular clock runs more slowly
in man than in apes and monkeys. Nature 326:93–96
Lin S, Wu H, Jia H, Zhang P, Dixon R, May G, Gonzales R, Roe
BA (2000) Medicago truncatula variety Jema Long A-17 chloroplast,
complete sequence (unpublished)
Lockhart PJ, Howe CJ, Barbrook AC, Larkum AWD, Penny D
(1999) Spectral analysis, systematic bias, and the evolution of
chloroplasts. Mol Biol Evol 16:573–576
Magallón S, Sanderson MJ (2001) Absolute diversification rates in
angiosperm clades. Int J Org Evol 55:1762–1780
Magallón S, Crane PR, Herendeen PS (1999) Phylogenetic pattern,
diversity, and diversification of eudicots. Ann Mo Bot Gard
86:297–372
Maier RM, Neckermann K, Igloi GL, Kössel H (1995) Complete
sequence of the maize chloroplast genome: Gene content, hotspots
of divergence and fine tuning of genetic information by
transcript editing. J Mol Biol 251:614–628
Martin W, Gierl A, Saedler H (1989) Molecular evidence for
pre-Cretaceous angiosperm origin. Nature 339:46–48
Martin W, Lagrange T, Li YF, Bisanz-Seyer C, Mache R (1990)
Hypothesis for the evolutionary origin of the chloroplast ribosomal
protein L21 of spinach. Curr Genet 18:553–556
Martin W, Lydiate D, Brinkmann H, Forkmann G, Saedler H,
Cerff R (1993) Molecular phylogenies in angiosperm evolution.
Mol Biol Evol 10:140–162
Martin W, Stoebe B, Goremykin V, Hansmann S, Hasegawa M,
Kowallik KV (1998) Gene transfer to the nucleus and the
evolution of chloroplasts. Nature 393:162–165
Martin W, et al. (2002) Evolutionary analysis of Arabidopsis,
cyanobacterial, and chloroplast genomes reveals plastid phylogeny
and thousands of cyanobacterial genes in the nucleus. Proc
Natl Acad Sci USA 99:12246–12251
Mathews S, Donoghue MJ (1999) The root of angiosperm phylogeny
inferred from duplicate phytochrome genes. Science
286:947–950
Matsuoka Y, Yamazaki Y, Ogihara Y, Tsunewaki K (2002) Whole
chloroplast genome comparison of rice, maize, and wheat: implications
for chloroplast gene diversification and phylogeny of
cereals. Mol Biol Evol 19:2084–2091
Millen RS, Olmstead RG, Adams KL, Palmer JD, Lao NT, Heggie
L, Kavanagh TA, Hibberd JM, Gray JC, Morden CW, Calie
PJ, Jermiin LS, Wolfe KH (2001) Many parallel losses of infA
from chloroplast DNA during angiosperm evolution with
multiple independent transfers to the nucleus. Plant Cell
13:645–658
Miller Jr CN (1977) Mesozoic conifers. Bot Rev 43:217–280
Miller Jr CN (1988) The origin of modern conifer families. In: Beck
CB (ed) Origin and evolution of gymnosperms. Columbia
University Press, New York, pp 448–486
Muse SV, Gaut BS (1997) Interlocus comparisons of the nucleotide
substitution process in the chloroplast genome. Genetics
146:393–399
Nicholas KB, Nicholas HB Jr (1997) GeneDoc: Analysis and visualization
of genetic variation.
http://www.cris.com/Ketchup/genedoc.shtml
Niklas KJ, Tiffney BH, Knoll AH (1983) Patterns in vascular
land plant diversification. Nature 303:614–616
Nickrent DL, Parkinson CL, Palmer JD, Duff RJ (2000) Multigene
phylogeny of land plants with special reference to bryophytes
and the earliest land plants. Mol Biol Evol 17:1885–1895
Ogihara Y, Isono K, Kojima T, Endo A, Hanaoka M, Shiina T,
Terachi T, Utsugi S, Murata M, Mori N, Takumi S, Ikeo K,
Gojobori T, Murai R, Murai K, Matsuoka Y, Ohnishi Y, Tajiri
H, Tsunewaki K (2002) Structural features of a wheat plastome
as revealed by complete sequencing of chloroplast DNA. Mol
Gen Genomics 266:740–746
Ohyama K, Fukuzawa H, Kohchi T, Shirai H, Sano T, Sano S,
Umesono K, Shiki Y, Takeuchi M, Chang Z, Aota S, Inokuchi
H, Ozeki H (1986) Chloroplast gene organization deduced from
complete sequence of liverwort Marchantia polymorpha chloroplast
DNA. Nature 322:572–574
Palmer JD (1985a) Comparative organization of chloroplast genomes.
Annu Rev Genet 19:325–354
Palmer JD (1985b) Evolution of chloroplast and mitochondrial
DNA in plants and algae. In: MacIntyre RJ (ed) Molecular
evolutionary genetics. Plenum Press, New York, pp 131–240
Parkinson CL, Adams KL, Palmer JD (1999) Multigene analyses
identify the three earliest lineages of extant flowering plants.
Curr Biol 9:1485–1488
Price RA, Thomas J, Strauss SH, Gadek PA, Quinn CJ, Palmer JD
(1993) Familial relationships of the conifers from rbcL sequence
data. Am J Bot 80:172
Pryer KM, Schneider H, Smith AR, Cranfill R, Wolf PG, Hunt JS,
Sipes SD (2001) Horsetails and ferns are a monophyletic group
and the closest living relatives to seed plants. Nature 409:618–622
Qiu YL, Lee J, Bernasconi-Quadroni F, Soltis DE, Soltis PS, Zanis
M, Chen Z, Savolainen V, Chase MW (1999) The earliest angiosperms:
Evidence from mitochondrial, plastid and nuclear
genomes. Nature 402:404–407
Rai HS, O'Brien HE, Reeves PA, Olmstead RG, Graham SW
(2003) Inference of higher-order relationships in the cycads
from a large chloroplast data set. Mol Phylogenet Evol 29:350–359
Ramshaw JAM, Richardson DL, Meatyard BT, Brown RH, Richardson
M, Thompson EW, Boulter D (1972) The time of
origin of the flowering plants determined by using amino acid
sequence data of cytochrome C. New Phytol 71:773–779
Renzaglia KS, Johnson TH, Gates HD, Whittier DP (2001) Architecture
of the sperm cell of Psilotum. Am J Bot 88:1151–1163
Rost B (1999) Twilight zone of protein sequence alignments.
Protein Eng 12:85–94
Saitou N, Nei M (1987) The neighbor-joining method: A new
method for reconstructing phylogenetic trees. Mol Biol Evol
4:406–425
Sanderson MJ (1997) A nonparametric approach to estimating
divergence times in the absence of rate constancy. Mol Biol
Evol 14:1218–1231
Sanderson MJ, Doyle JA (2001) Sources of error and confidence
intervals in estimating the age of angiosperms from rbcL and
18S rDNA data. Amer J Bot 88:1499–1516
Sato S, Nakamura Y, Kaneko T, Asamizu E, Tabata S (1999)
Complete structure of the chloroplast genome of Arabidopsis
thaliana. DNA Res 6:283–290
Schmitz-Linneweber C, Maier RM, Alcaraz JP, Cottet A, Herrmann
RG, Mache R (2001) The plastid chromosome of spinach
(Spinacia oleracea): Complete nucleotide sequence and gene
organization. Plant Mol Biol 45:307–315
Shinozaki K, et al. (1986) The complete nucleotide sequence of
tobacco chloroplast genome: Its gene organization and expression.
EMBO J 5:2043–2049
Soltis DE, et al. (2000) Angiosperm phylogeny inferred from 18S
rDNA, rbcL, and atpB sequences. Bot J Linn Soc 133:381–461
Soltis PS, Soltis DE, Chase MW (1999) Angiosperm phylogeny
inferred from multiple genes: A research tool for comparative
biology. Nature 402:402–404
Soltis PS, Soltis DE, Savolainen V, Crane PR, Barraclough TG
(2002) Rate heterogeneity among lineages of tracheophytes:
Integration of molecular and fossil data and evidence for
molecular living fossils. Proc Natl Acad Sci USA 99:4430–4435
Stebbins GL (1981) Coevolution of grasses and herbivores. Ann
Mo Bot Gard 68:75–76
Stebbins GL (1987) Grass systematics and evolution: Past, present
and future. In: Soderstrom TR, Hilu KW, Campbell CS,
Barkworth ME (eds) Grass systematics and evolution.
Smithsonian Institution Press, Washington, DC, pp 359–367
Stewart WN, Rothwell GW (1993) Paleobotany and the evolution
of plants, 2nd ed. Cambridge University Press, Cambridge
Stoebe B, Martin W, Kowallik KV (1998) Distribution and nomenclature
of protein-coding genes in 12 chloroplast genomes.
Plant Mol Biol Rep 16:243–255
Sun G, Ji Q, Dilcher DL, Zheng S, Nixon KC, Wang X (2002)
Archaefructaceae, a new basal angiosperm family. Science
296:899–904
Swiss-Prot Protein Knowledgebase (2003) List of chloropl