genome evolution: it's all relative

2
T he final battle of the War of 1812 between Britain and the United States was the Battle of New Orleans. It was fought on 8 January 1815, but this date by itself is not particularly informative (and most students don’t remember it). What is instructive is that the battle took place after the Treaty of Ghent, which ended the war — so the timing shows that the battle had no effect on the conclusion of the war. In other words, the relative time of the battle (before or after the treaty) is more interesting to the historian than is the absolute date. Two groups have now applied similar logic to identifying and dating gene-duplication events in yeast 1 and in the flowering plant Arabidopsis thaliana (page 433 of this issue 2 ). Rather than trying to pin down the precise number of years since the last big duplication event, the authors instead used evolutionary trees to work out the relative time, before or after other events in evolu- tionary history. Organisms frequently duplicate their genes. This can happen during the produc- tion of sex cells (meiosis), when segments of one chromosome are frequently swapped with the complementary segments in its matching pair; errors in this process can lead to the copying of one or a few genes at a time. The same outcome may result from trans- position, where copies of genes are inserted in other parts of the genome. Alternatively, every gene in the genome can be copied simultaneously, in a single wholesale dupli- cation event. This latter case is particularly common in flowering plants, but is thought to have happened in animals 3,4 and fungi 5 as well. Hybridization — mating between two species — frequently precedes whole- genome duplication in plants. Duplicate genes provide raw material for gene diversification; an extra gene copy can acquire a new function, or the two copies can specialize by dividing up the ancestral functions. But this, together with the fact that extraneous genes are gradually lost over time, can make detection of the duplication event difficult. By analysing the evolutionary history of genes, however, Bowers et al. 2 have found at least three whole-genome duplications in the history of Arabidopsis, while Langkjaer et al. 1 document at least one major event in the recent history of the yeast Saccharomyces cerevisiae. Most previous studies have assessed genome duplication by using absolute time, analogous to trying to find the exact date of a battle. The average mutation rate of two genes that are assumed to have originated by duplication is estimated by counting the differences in base pairs — the building- blocks of genes — between the two, with each mutation interpreted as a tick of a mol- ecular clock. The mutation rate speeds up, slows down and varies among genes, so it is clock-like only on average, but the intrinsic variability can be accommodated statistically; estimated rates and times of mutation for each gene can have standard errors attached to them. If the estimated divergence times of all gene pairs in a species fall into a single broad group within acceptable confidence limits, a genome duplication is inferred. Many interesting evolutionary questions are raised by knowing the time of genome duplications relative to other events in evolutionary or geological time. So the molecular clock is generally calibrated to determine how many years are represented by the average mutation. Calibration involves comparing genes in two species, and estimating the time since their common ancestor. The problem is that the variance of the mutation process is modest compared with the uncertainty introduced by calibration. The date of the common ancestor of two species is estimated from a fossil, or from a geological event such as the separation of two continents. For example, the common ancestor of maize and rice must have lived somewhat after the origin of the grass family, for which the earliest fossils are pollen grains. Pollen that is unquestionably from a grass occurs in sediments dated to 55 million years ago, but there are also grains that could be grasses in rocks that are about 70 million years old 6 . If the maize/rice ancestor lived after that, then a reasonable date might be 50 or 60 million years ago. The calibration of the clock, in other words, includes what can only NATURE | VOL 422 | 27 MARCH 2003 | www.nature.com/nature 383 news and views It’s all relative Elizabeth A. Kellogg By constructing evolutionary trees of genes, researchers have detected three big genome duplications in the history of the plant Arabidopsis, and one in the recent history of yeast. A B C D E Species Species Species Species Species F Species A A B C D E D D F E F F F A C F F a b c d Genome duplication Figure 1 Effect of genome duplication on the history of individual genes. a, The evolutionary history of six hypothetical species, A–F. An ancestral species contained the green, red and yellow genes, and gave rise to species A–C, which contain one of each of these genes. Genome duplication later occurred in the ancestral species that gave rise to D, E and F. b, An evolutionary tree based on any one gene will show the duplication as a branch point (arrow) leading to two lines of descent, one to each copy of the gene. The tree will be the same for every gene in the genome. c, If a particular gene is sequenced only from species A, C and F, the two copies from F will appear more closely related to each other than to the gene from species C; so the duplication must have occurred after C and F diverged. d, An evolutionary tree that includes only one gene of a duplicate pair may still permit the relative time of duplication to be determined. In the tree shown, sequences were obtained for the one copy in species A, the two copies in F and one of the two copies in D. One copy from F appears more closely related to the gene from D than to the other copy from F; so the duplication in F must have occurred before D and F diverged. © 2003 Nature Publishing Group

Upload: elizabeth-a

Post on 23-Jul-2016

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Genome evolution: It's all relative

The final battle of the War of 1812between Britain and the United Stateswas the Battle of New Orleans. It was

fought on 8 January 1815, but this date byitself is not particularly informative (andmost students don’t remember it). What isinstructive is that the battle took place afterthe Treaty of Ghent, which ended the war —so the timing shows that the battle had noeffect on the conclusion of the war. In otherwords, the relative time of the battle (beforeor after the treaty) is more interesting to thehistorian than is the absolute date. Twogroups have now applied similar logic toidentifying and dating gene-duplicationevents in yeast1 and in the flowering plantArabidopsis thaliana (page 433 of this issue2). Rather than trying to pin down theprecise number of years since the last bigduplication event, the authors instead usedevolutionary trees to work out the relativetime, before or after other events in evolu-tionary history.

Organisms frequently duplicate theirgenes. This can happen during the produc-tion of sex cells (meiosis), when segments of one chromosome are frequently swappedwith the complementary segments in itsmatching pair; errors in this process can leadto the copying of one or a few genes at a time.The same outcome may result from trans-position, where copies of genes are insertedin other parts of the genome. Alternatively,every gene in the genome can be copiedsimultaneously, in a single wholesale dupli-cation event. This latter case is particularlycommon in flowering plants, but is thoughtto have happened in animals3,4 and fungi5

as well. Hybridization — mating betweentwo species — frequently precedes whole-genome duplication in plants.

Duplicate genes provide raw material forgene diversification; an extra gene copy canacquire a new function, or the two copies can specialize by dividing up the ancestralfunctions. But this, together with the factthat extraneous genes are gradually lost overtime, can make detection of the duplicationevent difficult. By analysing the evolutionaryhistory of genes, however, Bowers et al.2

have found at least three whole-genomeduplications in the history of Arabidopsis,while Langkjaer et al.1 document at least onemajor event in the recent history of the yeastSaccharomyces cerevisiae.

Most previous studies have assessed

genome duplication by using absolute time,analogous to trying to find the exact date of a battle. The average mutation rate of twogenes that are assumed to have originated by duplication is estimated by counting thedifferences in base pairs — the building-blocks of genes — between the two, witheach mutation interpreted as a tick of a mol-ecular clock. The mutation rate speeds up,slows down and varies among genes, so it isclock-like only on average, but the intrinsicvariability can be accommodated statistically;estimated rates and times of mutation foreach gene can have standard errors attachedto them. If the estimated divergence times ofall gene pairs in a species fall into a singlebroad group within acceptable confidencelimits, a genome duplication is inferred.

Many interesting evolutionary questionsare raised by knowing the time of genomeduplications relative to other events in evolutionary or geological time. So the molecular clock is generally calibrated to

determine how many years are representedby the average mutation. Calibrationinvolves comparing genes in two species, and estimating the time since their commonancestor.

The problem is that the variance of themutation process is modest compared withthe uncertainty introduced by calibration.The date of the common ancestor of twospecies is estimated from a fossil, or from ageological event such as the separation of two continents. For example, the commonancestor of maize and rice must have livedsomewhat after the origin of the grass family,for which the earliest fossils are pollen grains.Pollen that is unquestionably from a grassoccurs in sediments dated to 55 million yearsago, but there are also grains that could begrasses in rocks that are about 70 millionyears old6. If the maize/rice ancestor livedafter that, then a reasonable date might be 50or 60 million years ago. The calibration of theclock, in other words, includes what can only

NATURE | VOL 422 | 27 MARCH 2003 | www.nature.com/nature 383

news and views

It’s all relativeElizabeth A. Kellogg

By constructing evolutionary trees of genes, researchers have detectedthree big genome duplications in the history of the plant Arabidopsis, andone in the recent history of yeast.

A B C D ESpecies Species Species Species Species

FSpecies

AA

B C D E DD

F E FF FA C F F

a

b c d

Genome duplication

Figure 1 Effect of genome duplication on the history of individual genes. a, The evolutionary history ofsix hypothetical species, A–F. An ancestral species contained the green, red and yellow genes, and gaverise to species A–C, which contain one of each of these genes. Genome duplication later occurred in theancestral species that gave rise to D, E and F. b, An evolutionary tree based on any one gene will showthe duplication as a branch point (arrow) leading to two lines of descent, one to each copy of the gene.The tree will be the same for every gene in the genome. c, If a particular gene is sequenced only fromspecies A, C and F, the two copies from F will appear more closely related to each other than to the genefrom species C; so the duplication must have occurred after C and F diverged. d, An evolutionary treethat includes only one gene of a duplicate pair may still permit the relative time of duplication to bedetermined. In the tree shown, sequences were obtained for the one copy in species A, the two copies in F and one of the two copies in D. One copy from F appears more closely related to the gene from Dthan to the other copy from F; so the duplication in F must have occurred before D and F diverged.

© 2003 Nature Publishing Group

Page 2: Genome evolution: It's all relative

be called a fudge factor, in this case of 10 or 15million years — close to 20%.

The precise date of major genome dup-lications (measured by a molecular clock)can thus be compared with major events inevolutionary history (generally measured by a different molecular clock), using one ormore calibration points (fossils). The errorof the estimate is high, so correlations are difficult, if not impossible, to demonstraterigorously.

Langkjaer et al.1 and Bowers et al.2 circum-vent this problem by using relative time.Bowers et al. compare pairs of genes in Ara-bidopsis with those in cabbage (Brassica; fromthe same family), cotton (from a differentfamily), pine (a seed plant, but not a flower-ing plant) and moss (a very distant relative),and for each gene they compute an evolu-tionary tree — the gene’s pedigree. From the pattern of the evolutionary tree, they candetermine when a duplication occurred relative to the evolutionary origin of otherspecies (Fig. 1). The evolutionary tree (seeFig. 2b on page 436) shows a clear duplication event, affecting many genes in the genome,that occurred before the Brassica/Arabidopsissplit, and before the members of the familyBrassicaceae started to diverge. Similarly,Langkjaer et al. show that the yeast genomewas duplicated before the divergence of Saccharomyces and Kluveromyces.

After duplication, one copy of many ofthe genes in a duplicated genome segment islost. Once duplicate segments have beenidentified, comparisons between the twoallow the gene composition of the commonancestor to be estimated (Fig. 2). Havingdone this, duplicated regions that are evenmore ancient become apparent — pairs ofgenes and gene regions that were not initiallyidentified because too many puzzle pieceswere missing. At the same time, it is possibleto identify the pattern and relative rate ofgene loss. Repeating their evolutionaryanalysis for the newly identified duplicatedsegments, Bowers et al. were able to identify amore ancestral duplication event early in theevolution of the flowering plants, after the

ancestor of cotton and Arabidopsis (whichare both dicotyledonous plants) divergedfrom the ancestor of rice and maize (whichare monocotyledons). Another round ofanalyses revealed a duplication that was stillmore ancient, possibly occurring before theorigin of the seed plants.

A historian, trying to dissect cause andeffect, needs to know the relative times ofbattles and treaties. Similarly, the biologistneeds to know the relative times of geneduplications, speciation events, majorspecies diversifications, and events of Earthhistory. Approaches that involve the con-struction of evolutionary trees are designedspecifically to assess relative time. Incorpo-rating such an approach into future genomestudies will undoubtedly lead to a clearer picture of the role of gene and genome duplication in the evolutionary process. By

increasingly dense sampling of evolutionarytrees, even without complete genomesequences for every species, it is possible to distinguish single-gene duplications fromwhole-genome duplication. So the approachholds the promise of dissecting the dynamicprocesses by which genes and genomesevolve. ■

Elizabeth A. Kellogg is in the Department of Biology,University of Missouri-St Louis, 8001 NaturalBridge Road, St Louis, Missouri 63121, USA.e-mail: [email protected]. Langkjaer, R. B., Cliften, P. F., Johnston, M. & Piskur, J. Nature

421, 848–852 (2003).

2. Bowers, J. E., Chapman, B. A., Rong, J. & Paterson, A. H. Nature

422, 433–438 (2003).

3. Gu, X., Wang, Y. & Gu, J. Nature Genet. 31, 205–209 (2002).

4. McLysaght, A., Hokamp, K. & Wolfe, K. H. Nature Genet. 31,

200–204 (2002).

5. Wolfe, K. H. & Shields, D. C. Nature 387, 708–713 (1997).

6. Jacobs, B. F., Kingston, J. D. & Jacobs, L. L. Ann. Missouri Bot.

Garden 86, 590–643 (1999).

Chaos and control are often seen asopposite poles of the spectrum. But thetheory of how to control dynamical

chaos is evolving, and, in Physical ReviewLetters, Wei, Zhan and Lai1 present a welcomecontribution.

Chaos is a feature in all sciences: fromlasers and meteorological systems, to chem-ical reactions (such as the Belouzov–Zhabotinski reaction) and the biology of living organisms. In most deterministicdynamical systems that display chaoticbehaviour, selecting the initial conditionscarefully can drive the system along a trajec-tory towards much simpler dynamics, suchas equilibrium or periodic behaviour. Butsensitive dependence on initial conditions— the well-known ‘butterfly effect’ — andthe effects of noise in the system mean that inpractice this is not so easy to do.

The aim of chaos control is to be able toperturb chaotic systems so as to ‘remove’ or atleast ‘control’ the chaos. For example, in a spatially extended system, the aim may be to achieve regular temporal and/or spatialbehaviour. Techniques introduced2,3 anddeveloped by several researchers over the pastdecade have sought to make unstable behav-iour robust against both noise and uncertain-ties in initial conditions by stabilizing the system (using feedback3, for instance) close todynamically unstable trajectories. These tech-niques have been very successful in controllingchaos, at least for low-dimensional systems.

Synchronization is a good example of a

chaos-control problem: synchronizing anarray of coupled (interdependent) systems— such as the coherent power-output froman array of lasers — is of interest for techno-logical applications. In biology, synchro-nization of coupled systems is a commonlyused model4, and the presence, absence ordegree of synchronization can be an impor-tant part of the function or dysfunction of a biological system. For example, epilepticseizures are associated with a state of thebrain in which too many neurons are syn-chronized for the brain to function correctly.

In the simplest case, synchronization of two identical coupled systems (such asperiodic oscillators) can be achieved throughtheir coupling as long as it is strong enoughto overcome the divergence of trajectorieswithin either individual system. The re-quired strength is indicated by the most positive Lyapunov exponent of the system: aLyapunov exponent is an exponential rate ofconvergence or divergence of trajectories of a dynamical system, and the most positiveLyapunov exponent measures the fastestpossible rate of divergence of trajectories. Inparticular, the fact that the individual sys-tems have chaotic dynamics before they arecoupled together means that the most posi-tive Lyapunov exponent is greater than zero,and there is always a threshold below whichsynchronization cannot be achieved.

Synchronization in more general arrayscan be done similarly, although with localcoupling this can only be achieved with a

news and views

384 NATURE | VOL 422 | 27 MARCH 2003 | www.nature.com/nature

Figure 2 Duplicated chromosomal segments,showing some gene pairs. This pattern ofduplication suggests that all seven genes mayhave been present and in the same order in thecommon ancestor.

Duplicated segments

Inferred ancestral chromosomal segment

Nonlinear dynamics

Synchronization from chaosPeter Ashwin

It isn’t easy to create a semblance of order in interconnected dynamicalsystems. But a mathematical tool could be the means to synchronizesystems more effectively — and keep chaos at bay.

© 2003 Nature Publishing Group