Extreme Conservation and Non-Neutral Evolution of the cpmA Circadian Locus in a Globally Distributed Chroococcidiopsis sp. from Naturally Stressful Habitats

Research article

Volodymyr Dvornyk* and Akhee Sabiha Jahan
School of Biological Sciences, University of Hong Kong, Pokfulam, Hong Kong S.A.R., People’s Republic of China

*Corresponding author: E-mail: [email protected].

Associate editor: Jennifer Wernegreen

DNA sequences reported in this article have been deposited in GenBank under accession numbers HQ113413–HQ113425, HQ113427–HQ113442, HQ113448–HQ113458, and JQ669824–JQ669833.

Mol. Biol. Evol. doi:10.1093/molbev/mss191. Advance Access publication July 25, 2012. MBE Advance Access published August 25, 2012. http://mbe.oxfordjournals.org/

© The Author 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]

Abstract

Cyanobacteria are among the most ancient organisms known to have circadian rhythms. The cpmA gene is involved in controlling the circadian output signal. We studied polymorphism and divergence of this gene in six populations of a stress-tolerant cyanobacterium, Chroococcidiopsis sp., sampled in extreme habitats across the globe. Despite high haplotype diversity (0.774), nucleotide diversity of cpmA is very low (π = 0.0034): the gene appears to be even more conserved than housekeeping genes. Even though the populations were sampled thousands of kilometers apart, they manifested virtually no genetic differentiation at this locus (FST = 0.0228). Using various tests for neutrality, we determined that evolution of cpmA departs significantly from the neutral model and is governed by episodic positive selection.

Key words: circadian phase modifier A, Chroococcidiopsis, selection, diversity, populations.

Introduction

The circadian system controls the expression of many genes in a genome and thus is essential for maintaining cellular homeostasis. Circadian rhythms have been observed in cyanobacteria since the 1990s, making them among the simplest known organisms to possess circadian rhythmicity (Kondo et al. 1993). This circadian programming can enhance the ability of an organism to anticipate important cyclic changes in the environment and generate appropriate responses (Woelfle et al. 2004). Cyanobacteria are a major model system in prokaryotes for analyzing and testing predictions about circadian rhythms and possible effects of evolutionary forces on them (Mackey et al. 2011).

Having an internal clock that can match the external light/dark cycle is thought to be advantageous for cyanobacteria as photosynthetic organisms (Johnson et al. 1998; Ouyang et al. 1998). Cyanobacteria are quite versatile and have a high adaptive potential that allows them to occupy the most extreme habitats on our planet (Whitton 1987). Given its significance for controlling essential physiological processes, such as cell division, regulation of nitrogen fixation, and photosynthesis (Johnson and Golden 1999), the circadian system is important for adaptability of cyanobacteria.

Macroevolutionary studies of various circadian genes showed that their evolution was shaped by many factors, including duplications, lateral transfers, and various types of selection (Dvornyk et al. 2003; Dvornyk 2006, 2009; Baca et al. 2010). However, very little is known about the factors that govern evolution of circadian genes at the population level. Previously, we reported multiple duplications and an elevated neutral mutation rate in the core circadian genes, kaiB and kaiC, from a filamentous cyanobacterium Nostoc linckia under acute environmental stress (Dvornyk et al. 2002). Apart from this, no other reports are available.

The cpmA (circadian phase modifier) gene is an element of the output pathway of the circadian clock (Katayama et al. 1999). Mutations in the cpmA gene alter the phasing of the circadian rhythm of a restricted subset of genes, which results in severe growth defects (Katayama et al. 1999). The gene was reported to consist of two domains of similar length (N- and C-terminal domains), which harbor several highly conserved regions of potential circadian function (Dvornyk 2006). The C-terminal domain shares partial homology with the N-terminus of the AIR carboxylase (AIRC) (pfam00731) and the NCAIR mutase (COG0041) (Dvornyk 2006).

Chroococcidiopsis is a coccoid unicellular cyanobacterium first reported from hot deserts (Fewer et al. 2002). Chroococcidiopsis is not only capable of surviving extreme desiccation but also shows a remarkable resistance to high doses of ultraviolet radiation (Cordoba-Jabonero et al. 2005). The resemblance of morphological characteristics of Chroococcidiopsis to Proterozoic microfossils suggests that it is one of the evolutionarily oldest cyanobacteria (Friedmann 1980; Friedmann and Ocampo-Friedmann 1995).

This study was designed to examine polymorphism and diversity of the cpmA gene in the Chroococcidiopsis strains sampled from various extremely stressful habitats around the Earth and to determine factors that govern microevolution of this gene.


Materials and Methods

Environmental Samples and Cyanobacterial Strains

Environmental samples were collected from four locations representative of extreme arid climate regimes (Bahl et al. 2011; Caruso et al. 2011). Major environmental parameters and geographical coordinates are given in table 1. To standardize the nature of samples collected, hypoliths were targeted. These comprise biofilms that colonize the ventral surface of quartz, a relatively inert and ubiquitous substrate in deserts worldwide. Hypoliths develop independently of the surrounding soil and are dominated by cyanobacteria (Pointing et al. 2009). The sampling sites were Antarctic Dry Valleys (McKelvey Valley), Canadian Arctic (Devon Island), cold desert (Taklimakan, China), and hot desert (Kalahari, Southern Africa). Climate was delineated according to the standards based on the long-term mean annual precipitation and temperature (Peel et al. 2007).

The collected environmental samples were subcultured to establish cyanobacterial cultures containing predominantly Chroococcidiopsis sp. The cultures were maintained on BG-11 medium in conical flasks at 25 °C in sterilized incubators with constant light. In addition to the environmental cultures, a strain of Chroococcidiopsis ATCC 27900 was used in the analyses for identification purposes and comparisons.

DNA Extraction, Amplification, Cloning, and Sequencing

Genomic DNA was isolated by the cetyltrimethylammonium bromide DNA extraction protocol with minor modifications (Stewart and Via 1993).

The polymerase chain reaction (PCR) primers were designed using Net Primer (http://www.premierbiosoft.com/netprimer) and the cpmA nucleotide sequences of the following cyanobacterial species: Acaryochloris marina MBIC11017 (GenBank accession no. NC_009925, locus tag: AM1_4350), Trichodesmium erythraeum IMS101 (NC_008312, locus tag: Tery_4722), Microcystis aeruginosa NIES-843 (NC_010296, locus tag: MAE_62660), Thermosynechococcus elongatus BP-1 (NC_004113, locus tag: tll1189), and Nostoc sp. PCC 7120 (BA000019, locus tag: alr3885).

The PCR was conducted using forward primer 5F (5′-ACCGGATTTCCCGAAGTGATTTGG-3′) and reverse primer 4R (5′-GCCGCACCAAATCCATTATC-3′). The master mix of 50 µl contained 38.5 µl of water, 5 µl of 10× RED Taq buffer (Sigma-Aldrich), 2 µl of dNTPs (Sigma-Aldrich), 1 µl of each primer, 0.3 µl of Red Taq polymerase (Sigma-Aldrich), 2 µl of MgCl2, and 1.2 µl of Tween-20. The PCR was conducted using the following profile: denaturation at 92 °C for 1 min, annealing at 50 °C, and extension at 72 °C, both for 2 min, for 40 cycles. The PCR products were then purified using a Qiagen purification kit and cloned into a pDrive cloning vector (Qiagen, Inc.) following the manufacturer’s protocol.

The cloned gene fragments were isolated and sequenced using an automated sequencer (3730xl DNA Analyzer, Applied Biosystems) in the Genome Research Center (University of Hong Kong) using the dye primer method (Applied Biosystems).

Data Analyses

The sequence chromatograms were proofread and trimmed for unresolved bases. The obtained nucleotide sequences were aligned using ClustalW (Thompson et al. 1994) as implemented in the BioEdit software v. 7.0.9 (Hall 1999).

Phylogenetic Analysis and Taxonomical Identification of Sequences

The likelihood ratio test as implemented in the ProtTest 3.0 software (Darriba et al. 2011) was used to determine the most appropriate substitution model for the data set. According to the Bayesian information criterion, the LG model (Le and Gascuel 2008) with gamma distribution (α = 0.55) turned out to fit our data best.

Table 1. Characteristics of the Four Sampling Sites Used in This Study.

Sampling Site | Population Name | Coordinates | Environmental Conditions at the Site
McKelvey Valley, Antarctica | AC (chasmolith), AE (endolith), AH (hypolith) | S77°24.595′ E161°11.747′ | Strong winds; ice cap deposits; mean annual rainfall <100 mm; average yearly temperature −20 °C to −35 °C
Canadian Arctic Circle, Devon Island | CH (hypolith) | N75°23.224′ W89°40.335′ | High levels of solar radiation; variation in salinity; mean annual rainfall <500 mm; average yearly temperature less than −5 °C
Taklimakan Desert, China | TH (hypolith) | N38°24.233′ E88°53.806′ | Large fluctuation in temperature; shifting sand; salt accumulation; mean annual rainfall <25 mm; average temperature less than −20 °C (winter) and <40 °C (summer)
Kalahari Desert, Botswana | SH (hypolith) | S22°11.0′ E29°7.0′ | Dry air; shifting sands; large fluctuation in temperature; mean annual rainfall <200 mm; average yearly temperature more than −11 °C (winter) and >40 °C (summer)

Because the cpmA gene was amplified from enriched but nonaxenic cultures, it was necessary to verify which of the obtained cpmA gene sequences belong to Chroococcidiopsis sp. This was done by constructing a maximum likelihood phylogenetic tree using the CpmA sequences of various cyanobacteria (supplementary table S1, Supplementary Material online) and that of Chroococcidiopsis thermalis PCC 7203 (GenBank accession no. HQ113459.1) as a control for comparison. The statistical significance of the tree was evaluated by bootstrap resampling with 100 replications. Sequences that appeared on the tree outside the clade that contained the control sequence and had 100% bootstrap support were assumed not to belong to Chroococcidiopsis sp. According to the sequencing of ~4 kbp of the SSU/ITS/LSU region, the strains used are all Chroococcidiopsis, which is currently considered a monospecific genus (Bahl et al. 2011). Finally, a total of 50 sequences were retained for the analyses. Of these sequences, 26 were unique haplotypes. This analysis was conducted using PhyML 3.0 (Guindon and Gascuel 2003).

Analysis of Sequence Polymorphism and Population Genetic Diversity

The sequences were grouped into six populations according to their location and type of microbial community (table 1). Intra- and interpopulation nucleotide diversity was computed assuming the Kimura two-parameter model (Kimura 1980). The analyses were carried out using MEGA5 (Tamura et al. 2011).
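For readers unfamiliar with the Kimura two-parameter correction, the following is a minimal sketch of the pairwise distance it defines, d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. It is an illustration only, not the MEGA5 implementation; the function name and the handling of gaps (sites with non-ACGT characters are skipped) are assumptions.

```python
import math

# Purine-purine and pyrimidine-pyrimidine changes are transitions.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites where either sequence
    has a gap or an ambiguous base are ignored (an assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if frozenset((a, b)) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

At the very low divergence reported here, the corrected distance is nearly identical to the raw proportion of differing sites; the correction matters mainly for more divergent sequences.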

For each population, the following parameters were calculated: average number of nucleotide differences (K), number of haplotypes (H), haplotype diversity (Hd), and θ per site (Watterson 1975). These analyses were performed using DnaSP v5.10 (Librado and Rozas 2009). Interpopulation differentiation was estimated by FST computed with the analysis of molecular variance (AMOVA) (Excoffier et al. 1992) as implemented in the Arlequin 3.5 software (Excoffier et al. 2005) and with Jost’s D (Jost 2008) as implemented in SPADE (Chao and Shen 2010).
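The summary statistics listed above can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not a reproduction of DnaSP: sequences are assumed to be aligned, gap-free strings, θ is computed from the number of segregating sites (rather than the total number of mutations), and π is the uncorrected per-site average. The function name and the dictionary keys are hypothetical.

```python
from collections import Counter
from itertools import combinations


def diversity_stats(seqs):
    """Basic diversity statistics for a list of aligned, gap-free sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must have equal length")

    # H: number of distinct haplotypes; Hd: Nei's haplotype diversity.
    counts = Counter(seqs)
    h = len(counts)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    # K: average number of nucleotide differences over all pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # S: segregating sites; Watterson's theta per site = S / a1 / length.
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    a1 = sum(1 / i for i in range(1, n))

    return {
        "H": h,
        "Hd": hd,
        "K": k,
        "pi": k / length,
        "S": s,
        "theta_W": s / a1 / length,
    }
```

A high Hd combined with a low π, as reported for cpmA, arises when many haplotypes exist but each differs from the others at only one or two sites.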

The recombination parameter R and the minimum number of recombination events in the sample, Rm, were estimated according to Hudson (Hudson 1987). The genetic association between polymorphic sites in the whole sample was measured by the ZnS statistic (Kelly 1997), and the effect of intragenic recombination on the observed DNA variation was estimated by the ZZ statistic (Rozas et al. 2001). The confidence intervals for the above estimates were obtained by coalescent simulations with 10,000 replicates under an assumption of no recombination. DnaSP v5.10 (Librado and Rozas 2009) was used for these computations.

Tests for Neutrality and Positive Selection

Correspondence of the obtained data to the neutral expectations was examined using several estimates: Tajima’s D (Tajima 1989), Fay and Wu’s H (Fay and Wu 2000), Fu’s Fs test of selective neutrality (Fu 1997), Fu and Li’s D* and F* (Fu and Li 1993), and Achaz’ Y/Y* (Achaz 2008) as implemented online at http://wwwabi.snv.jussieu.fr/achaz/neutralitytest.html. To detect positive selection, we applied a compound DHEW test (Zeng et al. 2007), which combines Tajima’s D, Fay and Wu’s normalized H, and Ewens–Watterson estimates of neutrality. The synonymous (dS) and nonsynonymous (dN) diversity and the dN/dS ratio were calculated using the modified Nei–Gojobori method (Nei and Gojobori 1986). The analyses were conducted using DnaSP v5.10 (Librado and Rozas 2009).
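To make the first of these statistics concrete, the sketch below computes Tajima’s D from three summary values using the standard formulas of Tajima (1989). It is not the DnaSP implementation, and the function name and interface are hypothetical. Supplied with the sample size (50 sequences), the number of segregating sites (35), and the species-level K (1.74) reported in this article, it returns a strongly negative value of roughly −2.6. Any small gap from the tabulated value may reflect rounding of K or the use of the total number of mutations instead of segregating sites; that explanation is an assumption, not something stated in the text.

```python
import math


def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s, and the
    average number of pairwise differences k (Tajima 1989)."""
    if n < 4:
        raise ValueError("n must be at least 4 for a nonzero variance")
    if s < 1:
        raise ValueError("at least one segregating site is required")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Summary values taken from this article (n = 50, S = 35, K = 1.74).
d_species = tajimas_d(50, 35, 1.74)
```

A negative D indicates an excess of rare variants relative to the neutral equilibrium expectation, a pattern that is compatible with both population expansion and selection.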

Analysis of Population History

We analyzed the data for signs of historical population size changes using two estimates: Fu’s Fs (Fu 1997) and the R2 statistic (Ramos-Onsins and Rozas 2002). Extensive computer simulations suggested that these tests are the most robust for detecting population growth/decline (Ramos-Onsins and Rozas 2002). We also used a Bayesian analysis as implemented in LAMARC 2.1.6 to compute the exponential population growth rate, g (Kuhner 2006). The growth rate relates to the scaled time-dependent mutation parameter θ as follows: $\theta_t = \theta_{\text{present time}} \exp(-gt)$, where t is time before present (Kuhner 2006).
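The exponential growth relation can be explored numerically with a short sketch. The functions below simply evaluate the relation between θ at time t and θ at present and invert it; they do not reproduce the Bayesian estimation performed by LAMARC. The function names are hypothetical, and t is assumed to be expressed in the mutational time units used for g.

```python
import math


def theta_at_time(theta_present, g, t):
    """Theta at time t before present under exponential growth rate g."""
    return theta_present * math.exp(-g * t)


def time_to_fraction(g, fraction):
    """Time before present at which theta was `fraction` of its present
    value (0 < fraction <= 1), obtained by inverting the relation above."""
    if g <= 0 or not 0 < fraction <= 1:
        raise ValueError("requires g > 0 and 0 < fraction <= 1")
    return -math.log(fraction) / g


# With g in the hundreds, theta halves within roughly 0.001 time units
# going back from the present.
t_half = time_to_fraction(900, 0.5)
```

A positive g means that θ, and hence the effective population size it scales with, was smaller in the past, which is how a large positive estimate is read as evidence of population growth.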

Results

Nucleotide Polymorphism and Intrapopulation Diversity of the cpmA Locus in Chroococcidiopsis sp.

The phylogenetic analysis of the obtained cpmA sequences of Chroococcidiopsis sp. yielded a completely unresolved tree (supplementary fig. S1, Supplementary Material online). For the determined 26 unique haplotypes (alleles), there were 35 segregating sites out of 513 sequenced. Among the polymorphisms, 11 were synonymous and 23 nonsynonymous. One polymorphism, a singleton at position 283, could be either synonymous or nonsynonymous, depending on the evolutionary path. Figure 1 shows the distribution of the polymorphism along the sequenced region of the gene.

FIG. 1. The distribution of nucleotide polymorphism along the 50 partial sequences of the Chroococcidiopsis sp. cpmA gene. Sliding window of 100 bp with increments of 10 bp. The putative functional domains are shaded with tints of gray. The two hydrophobic motifs (positions 376–408 and 433–513) are depicted by black boxes.
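A sliding-window profile of the kind described in the figure caption could be produced along the following lines. This is a minimal sketch assuming aligned, gap-free sequences and uncorrected per-site diversity; the function name and the returned tuple layout (1-based window start, window end, mean diversity) are hypothetical and do not correspond to any particular program.

```python
from collections import Counter


def sliding_window_pi(seqs, window=100, step=10):
    """Per-window nucleotide diversity along an alignment.

    Returns a list of (start, end, pi) tuples with 1-based, inclusive
    window coordinates. Only full-length windows are reported.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    pair_count = n * (n - 1) / 2

    # Fraction of sequence pairs that differ at each alignment column.
    site_diversity = []
    for column in zip(*seqs):
        identical_pairs = sum(c * (c - 1) / 2 for c in Counter(column).values())
        site_diversity.append(1 - identical_pairs / pair_count)

    windows = []
    for start in range(0, len(site_diversity) - window + 1, step):
        chunk = site_diversity[start:start + window]
        windows.append((start + 1, start + window, sum(chunk) / window))
    return windows
```

For a 513-bp alignment with a 100-bp window and a 10-bp step, this scheme yields 42 full windows, the last one covering positions 411–510.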


The average total nucleotide diversity of the gene for the whole species (π ± SE) was 0.0034 ± 0.0005; the synonymous diversity was 0.0036, almost the same as the nonsynonymous diversity (0.0033). The total DNA diversity of the two gene domains was slightly different: 0.0030 and 0.0036 for N- and C-terminal domains, respectively. The dN values were a bit higher in the N-terminal domain (0.0037 ± 0.0013 and 0.0032 ± 0.0008, respectively), whereas the rate of synonymous substitutions was significantly higher at the C-terminal domain (0.0049 ± 0.0014) than at the N-terminal domain (0.0010 ± 0.0010). Interestingly, six replacement substitutions versus only one synonymous substitution occurred in one of the two hydrophobic motifs at the C-terminal domain (positions 126–146 and 155–171 in the translated CpmA protein sequence of Chroococcidiopsis sp.). Some of these replacements are radical. For example, highly hydrophobic cysteine at position 138 is replaced by hydrophilic positively charged arginine, and hydrophobic nonpolar leucine at position 159 is replaced by polar hydrophilic serine.
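One way to put a number on how radical such replacements are is to compare residue hydropathy before and after the change. The sketch below uses the Kyte–Doolittle hydropathy index; the choice of this particular scale is an assumption made for illustration, since the text does not state which scale underlies the hydrophobic/hydrophilic classification, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy index (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_shift(original, replacement):
    """Change in hydropathy caused by an amino acid replacement
    (one-letter codes); negative values mean a loss of hydrophobicity."""
    return KYTE_DOOLITTLE[replacement] - KYTE_DOOLITTLE[original]


# The two replacements singled out in the text.
cys_to_arg = hydropathy_shift("C", "R")
leu_to_ser = hydropathy_shift("L", "S")
```

On this scale both replacements move the residue from the hydrophobic side to the hydrophilic side, and the cysteine-to-arginine change is the larger of the two shifts.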

All studied populations and the whole species manifested very low genetic diversity at the cpmA locus (table 2). The highest value of total nucleotide diversity (π = 0.0049) was determined in populations AH and SH, and the lowest in population Canadian Arctic Circle (CH; π = 0.0021). Populations AH and SH also showed the highest rate of nonsynonymous substitutions (0.0056 and 0.0053, respectively), whereas the highest rate at synonymous sites (0.0070) was recorded for population AE. Overall, the populations from Canadian Arctic Circle (CH) and Taklimakan Desert (TH) were the least polymorphic.

On the other hand, all populations manifested high haplotype diversity, which ranged from 0.641 to 0.993, averaging 0.774 for the whole species.

Recombination and Linkage Disequilibrium

The analysis yielded the overall R value of 61.2 and Rm value of 2. The values of ZnS and ZZ were 0.0247 and −0.0092, respectively. The coalescent simulations showed that Rm, ZnS, and ZZ are not significant. These results suggest that intragenic recombination is not a significant factor for the observed nucleotide variation at the cpmA gene.

Between-Population Diversity

The between-population differentiation at the cpmA locus was very low (table 3). The genetic distances between the populations ranged from 0.0024 to 0.0049 with the mean interpopulation distance value of 0.0034 ± 0.0006, that is, the same as the mean intrapopulation nucleotide diversity. The matrix of pairwise FST and Jost’s D values (table 3) provides further support for the low between-population differentiation. The average interpopulation FST was only 0.0228 and Jost’s D was only 0.011. Overall, the results of AMOVA indicated that 97.72% of the variability resided within populations.
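The logic behind an FST near zero can be illustrated with a simple sequence-based estimator. The sketch below uses the Hudson-type form FST = 1 − Hw/Hb, where Hw and Hb are the mean numbers of pairwise differences within and between populations. This is a deliberately simpler technique than the AMOVA used in the study, so its values would not match Arlequin output exactly; the function names are hypothetical, and populations with fewer than two sequences are skipped (an assumption).

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise(pairs):
    """Mean number of differences over an iterable of sequence pairs."""
    pairs = list(pairs)
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def hudson_fst(populations):
    """FST = 1 - Hw/Hb for a dict mapping population name to sequences."""
    groups = [seqs for seqs in populations.values() if len(seqs) >= 2]
    if len(groups) < 2:
        raise ValueError("need at least two populations with two sequences")
    h_within = sum(mean_pairwise(combinations(g, 2)) for g in groups) / len(groups)
    between = [mean_pairwise(product(g1, g2)) for g1, g2 in combinations(groups, 2)]
    h_between = sum(between) / len(between)
    if h_between == 0:
        raise ValueError("no differences between populations")
    return 1 - h_within / h_between
```

When sequences drawn from different populations differ no more than sequences drawn from the same population, Hw and Hb are about equal and the estimate is close to zero; sampling noise can even push it slightly below zero, which is why small negative pairwise values can appear in a table of FST estimates.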

Tests for Neutrality

The results of all tests for neutrality suggested that the cpmA locus did not follow neutral expectations at the species level (table 4). The compound DHEW test, which analyzes data for presence of positive selection (Zeng et al. 2007), yielded a highly significant P value of 0.0034. There were signs of non-neutral evolution and positive selection in particular populations too. The dN/dS ratio was above 1 in three populations: AC, AH, and SH (table 4).

Population History

The results of both the Fu’s Fs (Fu 1997) and R2 (Ramos-Onsins and Rozas 2002) tests showed significant departure from the constant population size. The Fs and R2 values (−30.0980 and 0.0244, respectively) were far outside their 95% confidence intervals as determined by coalescent simulations: (−4.2967 to 4.7621) and (0.0501 to 0.1966), respectively. Along with the highly significant Tajima’s D (table 4), these results point to the population expansion of Chroococcidiopsis sp. Further support for this conclusion comes from the Bayesian estimates of the exponential population growth parameter, g, which ranged from 851 to 933 for all six studied populations.

Table 2. Genetic Diversity Parameters of the Studied Populations.

Population | H | Hd | K | dS | dN | θη | π
AC | 5 | 0.933 | 1.67 | 0.0033 | 0.0035 | 0.0043 | 0.0033
AE | 6 | 0.893 | 1.93 | 0.0070 | 0.0026 | 0.0053 | 0.0038
AH | 6 | 0.889 | 2.56 | 0.0034 | 0.0056 | 0.0072 | 0.0049
CH | 6 | 0.641 | 1.08 | 0.0023 | 0.0020 | 0.0044 | 0.0021
SH | 3 | 0.833 | 2.50 | 0.0038 | 0.0053 | 0.0053 | 0.0049
TH | 5 | 0.647 | 1.40 | 0.0030 | 0.0026 | 0.0048 | 0.0028
Average for the species | 4.3 | 0.774 | 1.74 | 0.0036 | 0.0033 | 0.0152 | 0.0034

NOTE: H, number of haplotypes; Hd, haplotype (gene) diversity; K, average no. of differences; θη, theta (per site) from the total number of mutations; π, intrapopulation diversity; dS, synonymous substitutions; dN, nonsynonymous substitutions.

Table 3. Genetic Differentiation between the Studied Populations.

Population | AC | AE | AH | CH | SH | TH
AC | | −0.0064/0 | 0.0127/0.103 | 0.0018/0.153 | 0.0201/0.125 | 0.0065/0.141
AE | 0.0035 | | 0.0370/0.068 | 0.0266/0.086 | 0.0355/0.066 | 0.0109/0.075
AH | 0.0042 | 0.0046 | | 0.0466*/0.185 | −0.0237/0.167 | 0.0254/0.175
CH | 0.0026 | 0.0030 | 0.0037 | | 0.0889/0.033 | 0.0034/0
SH | 0.0041 | 0.0044 | 0.0049 | 0.0035 | | 0.0515/0.029
TH | 0.0030 | 0.0033 | 0.0040 | 0.0024 | 0.0038 |

NOTE: Below diagonal, pairwise between-population genetic distance; above diagonal, pairwise FST values/Jost’s D.
*P < 0.01

Discussion

Low Level of Intra- and Interpopulation Nucleotide Diversity at the cpmA Locus of Chroococcidiopsis sp.

Our results provide further support for very low nucleotide diversity at circadian genes of cyanobacteria. Despite the extreme environments and the very large geographical distances between the sampling locations, both the intrapopulation (table 2) and interpopulation diversity (table 3) of Chroococcidiopsis sp. at the cpmA locus were quite low. Previously, we studied microevolution of the two core circadian genes, kaiB and kaiC, in a filamentous cyanobacterium Nostoc linckia from the environmentally contrasting slopes of the ecological model microsites, Evolution Canyons I and II (Israel) (Dvornyk et al. 2002). These genes manifested an approximately 1,000-fold higher mutation rate in the cyanobacterial strains from the environmentally stressful south-facing slopes when compared with the strains from the temperate north-facing slopes.

Data on intra- and interpopulation DNA diversity of protein-coding bacterial genes are very limited and fragmentary; some available information on several housekeeping genes of some bacterial species in comparison with the respective values for cpmA is provided in table 5. The cpmA gene appears to be even less polymorphic than housekeeping genes, which are thought to be extremely conserved due to their significance for basic functions of an organism (Jordan et al. 2002). For example, the total nucleotide diversity of the cpmA gene is at least 2-fold lower than that of the most conserved housekeeping genes among those presented in table 5, rplL and gyrA of Clostridium perfringens.

More comparable evidence comes from the study of another cyanobacterium, toxic Microcystis aeruginosa (Tanabe et al. 2007). The average nucleotide diversity reported for seven housekeeping genes of this prokaryote was 0.023, ranging from 0.013 (recA) to 0.043 (pgi), which is up to 10-fold higher than that for the cpmA gene (table 5).

A recent multilocus study using the 16S rRNA-5.8S ITS-23S rRNA gene region sequences of globally distributed Chroococcidiopsis sp. reported differentiation between the ecotypes of the cyanobacterium from cold and hot deserts (Bahl et al. 2011). It was suggested that this differentiation was driven by adaptation to specific environmental conditions rather than by geographic isolation. We did not find any such differentiation at the cpmA locus. This inconsistency may be due to the much higher conservation of cpmA even when compared with the rRNA genes.

The observed extreme conservation of the circadian gene,cpmA, may result from two main factors. First, cpmA belongsto the AIRC superfamily. Members of this superfamily are ofcritical importance, as they are involved into de novo biosyn-thesis of purines, an essential component of DNA (Luckensand Buchanan 1959). Taking account of this, cpmA mayT

able

4.R

esul

tsfo

rT

ests

ofN

eutr

alit

y.

Pop

ula

tion

DH

F sD

*F*

YY

*E

DH

EWd

N/d

S

AC

�1.

3369

,P

=0.

060.

9361

,P

=0.

783

�2.

5180

,P

=0.

012

�1.

3683

,P>

0.10

�1.

4501

,P>

0.10

NA

NA

0.90

440.

7331

1.06

AE

�1.

3593

,P

=0.

100�

0.28

82,

P=

0.23

06�

2.72

62,

P=

0.01

2�

1.36

04,

P>

0.10

�1.

5029

,P>

0.10

�2.

4510

,P

=0.

0419

�1.

0615

,P

=0.

1746

0.71

670.

5792

0.37

AH

�1.

4219

,P

=0.

076

0.96

85,

P=

0.88

99�

1.54

90,

P=

0.11

5�

1.37

48,

P>

0.10

�1.

5486

,P>

0.10

�0.

7952

,P

=0.

2173

�1.

3463

,P

=0.

0910

0.06

280.

3723

1.64

CH

�1.

9819

,P

=0.

006

0.62

75,

P=

0.66

98�

2.90

45,

P=

0.00

5�

2.49

88,

P<

0.02

�2.

6914

,P<

0.02

NA

NA

0.43

860.

3416

0.87

SH�

0.79

68,

P=

0.17

90.

2531

,P

=0.

3782

0.46

11,

P=

0.50

1�

0.79

68,

P>

0.10

�0.

7529

,P>

0.10

�1.

2248

,P

=0.

1922

NA

0.15

251

1.39

TH

�1.

8391

,P

=0.

015

0.76

43,

P=

0.73

24�

1.43

01,

P=

0.08

3�

2.13

69,

P<

0.05

�2.

3170

,P<

0.05

NA

NA

0.54

890.

4369

0.87

Ave

rage

for

the

spec

ies�

2.64

43,

P<

0.00

1�

1.74

18,

P=

0.04

9�

30.0

980,

P<

0.00

1�

4.77

81,

P<

0.02

�4.

7860

,P<

0.02

�2.

4699

,P

=0.

001

�2.

3260

,P

=0.

002

0.01

430.

0034

0.92

NO

TE.—

D,T

ajim

a’s

D(T

ajim

a19

89);

H,F

ayan

dW

u’s

H(F

ayan

dW

u20

00);

F s,F

u’s

F ste

stof

sele

ctiv

ene

utra

lity

(Fu

1997

);D

*an

dF*

,Fu

and

Li’s

D*

and

F*te

stst

atis

tics

(Fu

and

Li19

93);

Yan

dY

*,A

chaz

’Yan

dY

*te

stst

atis

tics

(Ach

az20

08);

E,te

stfo

rdi

rect

iona

lse

lect

ion

(Pva

lues

)(Z

eng

etal

.20

06);

DH

EW,

com

pou

nd

test

for

pos

itiv

ese

lect

ion

(Pva

lues

)(Z

eng

etal

.20

07);

dN/d

S,ra

tio

ofno

nsyn

onym

ous

tosy

nony

mou

ssu

bsti

tuti

ons;

NA

,no

tav

aila

ble.

5

cpmA Locus in Chroococcidiopsis . doi:10.1093/molbev/mss191 MBE at Indian Institute O

f Chem

ical Biology (Iicb) on N

ovember 5, 2012

http://mbe.oxfordjournals.org/

Dow

nloaded from

Page 6: Extreme Conservation and Non-Neutral Evolution of the CpmA Circadian Locus in a Globally Distributed Chroococcidiopsis Sp. From Naturally Stressful Habitats

actually be considered as a housekeeping gene. Furthermore, the circadian system as a whole is critically important for maintenance of intracellular homeostasis and adaptation to the environment (Johnson 2005). According to estimates obtained by different methods, the proportion of genes that are expressed rhythmically in the cyanobacterial genome varies from 2% in Synechocystis sp. PCC 6803 (Kucho et al. 2005) to up to 30% in Synechococcus elongatus PCC 7942 (Liu et al. 1995). The importance of the circadian system implies the importance of all its elements. Indeed, mutations in cpmA significantly reduce the growth rate of the cyanobacterium (Katayama et al. 1999).

The high haplotype diversity of the cpmA gene is not particularly surprising. It has been frequently reported for many genes in bacteria (table 5 and references therein). However, high haplotype diversity usually results in significant interpopulation differentiation of bacteria (Silva et al. 2005; Vinuesa et al. 2005), which is not the case for Chroococcidiopsis sp. Even Jost’s D test (Jost 2008), which is particularly sensitive to “private” alleles, failed to detect significant differentiation between the populations of Chroococcidiopsis sp. (table 3).
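To make the contrast between high haplotype diversity and low nucleotide diversity concrete, the following minimal sketch computes both statistics from an aligned sample. It assumes gap-free sequences of equal length and uses the standard definitions (Nei's haplotype diversity and mean pairwise differences per site); the function names and the toy alignment are hypothetical and are not taken from the study, which used dedicated population genetics software.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)  # identical sequences count as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Nucleotide diversity (pi): mean number of pairwise differences per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical data): three haplotypes that differ by single mutations.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(f"Hd = {hd:.3f}, pi = {pi:.4f}")  # Hd = 0.833, pi = 0.1000
```

In a longer gene, haplotypes that differ from one another by only one or two substitutions keep Hd high while pi stays small, which is the pattern described for cpmA.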

One possible reason for the high gene diversity of bacteria and, particularly, Chroococcidiopsis sp. may be the existence of multiple ecotypes in the same habitat, which are adapted to ecological microniches (Tanabe et al. 2007). This assumption is plausible, given the extreme environmental conditions and quite pronounced fluctuations in environmental factors (temperature, light, humidity, etc.) at the sampling sites.

Non-Neutral Evolution of the cpmA Locus in Chroococcidiopsis sp.

cpmA is one of the genes that control the circadian output in cyanobacteria (Katayama et al. 1999). There is limited information about its physiological function. It was reported that inactivation of cpmA results in a significantly earlier phase and lower amplitude of rhythmic expression of a core circadian gene, kaiA, and a severe decrease in growth of a cyanobacterium (Katayama et al. 1999). Based on these data, it was suggested that cpmA might contribute to basic cell metabolism (Katayama et al. 1999). Although there is no direct evidence of whether cpmA itself has adaptive significance, the signals of positive selection in the whole gene and, specifically, the character of some replacement substitutions in its hydrophobic motifs suggest that this possibility should not be excluded. Moreover, this seems likely considering the adaptive significance of the circadian system as a whole (Johnson et al. 1998; Ouyang et al. 1998).

Although data about primarily non-neutral evolution and positive selection in various genes of bacteria are abundant, they are quite limited with respect to the circadian genes. As to prokaryotes, positive selection at the population level was documented for two core circadian genes, kaiB and kaiC, in a filamentous cyanobacterium, Nostoc linckia, from two model ecological microsites known as Evolution Canyons I and II (Dvornyk et al. 2002). Results of other studies (Dvornyk et al. 2004; Dvornyk 2005) suggested positive selection at the above-species level for two other circadian genes, ldpA

Table 5. Polymorphism of Some Housekeeping and Symbiotic Genes of Bacteria in Comparison with cpmA of Chroococcidiopsis sp.

Species Genes Hd dS dN θ(gene)/θ(site) π References

Clostridium perfringens pfoS — 0.042 0.004 9.349/— 0.014 Rooney et al. (2006)

rplL — 0.019 0.004 4.415/— 0.007

gyrA — 0.029 0.001 9.868/— 0.007

Bradyrhizobium elkanii recA 0.263 — — — 0.0350 Perrineau et al. (2011)

dnaK 0.238 — — — 0.0257

glnII 0.238 — — — 0.0386

Bradyrhizobium canariense recA 0.928 0.1003 0.0016 —/0.0246 0.0271 Vinuesa et al. (2005)

atpD 0.902 0.0473 0.0012 —/0.0103 0.0127

glnII 0.863 0.0455 0.0014 —/0.0108 0.0117

Rhizobium gallicum glnII 0.939 — — —/0.0248 0.0230 Silva et al. (2005)

atpD 0.923 — — —/0.0256 0.0209

nifH 0.903 — — —/0.0442 0.0458

nodB 0.855 — — —/0.0876 0.1243

Microcystis aeruginosa ftsZ 0.939 — — — 0.026 Tanabe et al. (2007)

glnA 0.948 — — — 0.025

gltX 0.969 — — — 0.023

gyrB 0.938 — — — 0.017

pgi 0.956 — — — 0.043

recA 0.957 — — — 0.013

tpi 0.951 — — — 0.019

Chroococcidiopsis sp. cpmA 0.774 0.0036 0.0033 7.814/0.0152 0.0034

NOTE.—atpD, beta subunit of ATPase; ftsZ, cell division protein FtsZ; glnA, glutamine synthetase; gltX, glutamyl-tRNA synthetase; gyrB, DNA gyrase subunit B; pgi, glucose-6-phosphate isomerase; recA, recombination protein RecA; tpi, triosephosphate isomerase; gyrA, DNA gyrase subunit A; pfoS, regulatory protein PfoS; rplL, 50S ribosomal protein; dnaK, heat shock chaperone protein; glnII, glutamine synthetase II; nifH, dinitrogenase reductase; nodB, N-acetylglucosamine deacetylase; θ(gene), theta (per gene) from the total number of mutations; θ(site), theta (per site) from the total number of mutations; —, not provided.


and sasA, which are elements of the circadian input and output, respectively. On the other hand, the previous macroevolutionary study of cpmA in prokaryotes did not detect any positive selection on the gene (Dvornyk 2006). The discrepancies between the results of the cpmA studies at the micro- and macrolevels may be due to the fairly low intensity and episodic character of selection pressure on the gene. As such, the existing methods may not be powerful enough to detect positive selection at the above-species level unless this selection is sufficiently strong and operates over a reasonably long evolutionary period. Alternatively, selection may be population specific, as was suggested for the human period 2 gene (Cruciani et al. 2008).

Data about selection on circadian genes in eukaryotes are more abundant. In this respect, the period gene of Drosophila has received much attention owing to the report of a latitudinal cline in its threonine–glycine repeat polymorphism in European Drosophila melanogaster, which was implicated in selection for clock temperature compensation (Costa et al. 1992). Most of the studies that subsequently examined this and other polymorphisms of period in different populations and species of Drosophila reported various types of selection that might shape the observed polymorphism (Kliman and Hey 1993; Rosato et al. 1994; Sawyer et al. 1997, 2006). Natural selection was also reported to operate on another Drosophila circadian gene, timeless (Tauber et al. 2007). Recently, six clock-associated genes of a plant, Populus tremula, were found to experience positive selection (Hall et al. 2011).

Altogether, the above data suggest that selection is among the most important factors operating on circadian genes of prokaryotes and eukaryotes. Evidence for positive selection further supports the previous reports about the adaptive significance of the circadian clock system (Johnson et al. 1998; Ouyang et al. 1998).

Small Effective Population Size and Mutation-Drift Equilibrium

The results favor a relatively small effective population size of Chroococcidiopsis sp. We found high linkage disequilibrium and a very low recombination rate in our data set. Linkage disequilibrium at neutral sites of a haploid organism depends on the effective population size Ne and the recombination rate c and is determined by 1/(2Ne·c). Because c is very small, Ne should also be small to explain the high disequilibrium. Using the obtained values for ZnS (0.0247) and c (0.0004), Ne was estimated to be only ~50,000 individuals. Populations with small Ne should experience significant genetic drift, which reduces their diversity. Indeed, the observed nucleotide diversity of the studied populations was very low. However, despite the very large distances between the sampled populations, they show almost no genetic differentiation. This suggests virtually unlimited gene flow between them.
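The arithmetic behind the Ne estimate can be checked directly. The short sketch below assumes, as the text does, that the observed ZnS can be equated with the expected disequilibrium 1/(2Ne·c), and solves that relation for Ne; the function name is hypothetical.

```python
def effective_population_size(zns, c):
    """Solve LD = 1 / (2 * Ne * c) for Ne, taking ZnS as the measure of LD."""
    return 1.0 / (2.0 * zns * c)


# Values reported in the text: ZnS = 0.0247, c = 0.0004
ne = effective_population_size(zns=0.0247, c=0.0004)
print(f"Ne ~ {ne:,.0f}")  # Ne ~ 50,607
```

The result, about 50,600, agrees with the rounded figure of roughly 50,000 individuals.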

We also examined the population history of the data, that is, whether there is any departure from demographic equilibrium. One of the signs of a recent expansion is an excess of singleton mutations, which can be measured by Tajima’s D or Fu’s Fs. Both statistics yielded statistically significant negative values (table 4). Two other tests, R2 (Ramos-Onsins and Rozas 2002) and g (Kuhner 2006), also supported recent expansion of the Chroococcidiopsis sp. populations. In addition, the obtained results suggest that Chroococcidiopsis sp. probably consists of many relatively small populations occupying various environmental microniches within a particular habitat.
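The link between an excess of singletons and a negative Tajima’s D can be illustrated with a small sketch. It implements the standard formula of Tajima (1989) for gap-free, equal-length sequences and does not compute P values; the function name and both toy samples are hypothetical, and this is not the software pipeline used in the study.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D from an alignment of equal-length, gap-free sequences."""
    n = len(seqs)
    # S: number of segregating (polymorphic) sites
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    # k: mean number of pairwise differences over all sequence pairs
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical sample in which every variant is a singleton (expansion-like pattern)
singletons = ["AAAAAAAA", "AAAAAAAA", "CAAAAAAA", "ACAAAAAA", "AACAAAAA", "AAACAAAA"]
# Hypothetical sample in which every variant is at intermediate frequency
balanced = ["AAAAAAAA"] * 3 + ["CCCCAAAA"] * 3

d_singletons = tajimas_d(singletons)
d_balanced = tajimas_d(balanced)
print(f"D (singletons) = {d_singletons:.3f}, D (balanced) = {d_balanced:.3f}")
```

Both toy samples have four segregating sites, yet the singleton-rich sample gives a negative D and the intermediate-frequency sample a positive one, which is why significantly negative values are read as a signal of expansion or selection.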

Such a population structure of Chroococcidiopsis sp. is not unusual. Many bacteria, particularly endosymbiotic and parasitic ones, may have an extremely large population size at the global scale and a small effective size at the population level due to bottlenecks experienced by bacterial populations during transmission between hosts (Sharp et al. 2005). Given the harsh environmental conditions of their habitats, micropopulations of Chroococcidiopsis sp. may experience a bottleneck during their recovery after various climatic extremes, such as low temperature or high UV.

Concluding Remarks

This study provides compelling evidence for extremely high conservation and non-neutral evolution of a circadian locus, cpmA, in a stress-tolerant cyanobacterium, Chroococcidiopsis sp., from extreme environments around the globe. The signs of positive selection detected in some of the studied populations and in the whole species suggest that cpmA evolution is likely adaptive in nature.

Supplementary Material

Supplementary figure S1 and table S1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org).

Acknowledgments

The authors express their deepest gratitude to Dr Stephen Pointing (University of Hong Kong) for providing the Chroococcidiopsis sp. strains and his expert opinion on the article, and Dr Claus Vogl (University of Veterinary Medicine Vienna) for his comments on the early draft of the manuscript. This work was supported by grant 10208127 from the University of Hong Kong.

References

Achaz G. 2008. Testing for neutrality in samples with sequencing errors. Genetics 179:1409–1424.

Baca I, Sprockett D, Dvornyk V. 2010. Circadian input kinases and their homologs in cyanobacteria: Evolutionary constraints vs architectural diversification. J Mol Evol. 70:453–465.

Bahl J, Lau MC, Smith GJ, et al. (12 co-authors). 2011. Ancient origins determine global biogeography of hot and cold desert cyanobacteria. Nat Commun. 2:163.

Caruso T, Chan Y, Lacap DC, Lau MC, McKay CP, Pointing SB. 2011. Stochastic and deterministic processes interact in the assembly of desert microbial communities on a global scale. ISME J. 5:1406–1413.

Chao A, Shen T-J. 2010. Program SPADE (Species Prediction And Diversity Estimation). Program and User’s Guide. Distributed by the author. Available from: http://chao.stat.nthu.edu.tw.


Cordoba-Jabonero C, Zorzano MP, Selsis F, Patel MR, Cockell CS. 2005. Radiative habitable zones in martian polar environments. Icarus 175:360–371.

Costa R, Peixoto AA, Barbujani G, Kyriacou CP. 1992. A latitudinal cline in a Drosophila clock gene. Proc R Soc Lond B. 250:43–49.

Cruciani F, Trombetta B, Labuda D, Modiano D, Torroni A, Costa R, Scozzari R. 2008. Genetic diversity patterns at the human clock gene period 2 are suggestive of population-specific positive selection. Eur J Hum Genet. 16:1526–1534.

Darriba D, Taboada GL, Doallo R, Posada D. 2011. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics 27:1164–1165.

Dvornyk V. 2005. Molecular evolution of ldpA, a gene mediating circadian input signal in cyanobacteria. J Mol Evol. 60:105–112.

Dvornyk V. 2006. Subfamilies of cpmA, a gene involved in circadian output, have different evolutionary histories in cyanobacteria. Microbiology 152:75–84.

Dvornyk V. 2009. The circadian clock gear in cyanobacteria: assembled by evolution. In: Ditty JL, Mackey S, Johnson CH, editors. Bacterial circadian programs. Berlin-Heidelberg: Springer. p. 241–258.

Dvornyk V, Deng HW, Nevo E. 2004. Structure and molecular phylogeny of sasA genes in cyanobacteria: insights into evolution of the prokaryotic circadian system. Mol Biol Evol. 21:1468–1476.

Dvornyk V, Vinogradova ON, Nevo E. 2002. Long-term microclimatic stress causes rapid adaptive radiation of kaiABC clock gene family in a cyanobacterium, Nostoc linckia, from the "Evolution Canyons" I and II, Israel. Proc Natl Acad Sci U S A. 99:2082–2087.

Dvornyk V, Vinogradova ON, Nevo E. 2003. Origin and evolution of circadian clock genes in prokaryotes. Proc Natl Acad Sci U S A. 100:2495–2500.

Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 1:47–50.

Excoffier L, Smouse PE, Quattro JM. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479–491.

Fay JC, Wu CI. 2000. Hitchhiking under positive Darwinian selection. Genetics 155:1405–1413.

Fewer D, Friedl T, Budel B. 2002. Chroococcidiopsis and heterocyst-differentiating cyanobacteria are each other’s closest living relatives. Mol Phylogenet Evol. 23:82–90.

Friedmann EI. 1980. Endolithic microbial life in hot and cold deserts. Orig Life. 10:223–235.

Friedmann EI, Ocampo-Friedmann R. 1995. A primitive cyanobacterium as pioneer microorganism for terraforming Mars. Adv Space Res. 15:243–246.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu Y-X, Li W-H. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Guindon S, Gascuel O. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 52:696–704.

Hall D, Ma XF, Ingvarsson PK. 2011. Adaptive evolution of the Populus tremula photoperiod pathway. Mol Ecol. 20:1463–1474.

Hall TA. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acids Symp Ser. 41:95–98.

Hudson RR. 1987. Estimating the recombination parameter of a finite population model without selection. Genet Res. 50:245–250.

Johnson CH. 2005. Testing the adaptive value of circadian systems. Methods Enzymol. 393:818–837.

Johnson CH, Golden SS. 1999. Circadian programs in cyanobacteria: adaptiveness and mechanism. Annu Rev Microbiol. 53:389–409.

Johnson CH, Golden SS, Kondo T. 1998. Adaptive significance of circadian programs in cyanobacteria. Trends Microbiol. 6:407–410.

Jordan IK, Rogozin IB, Wolf YI, Koonin EV. 2002. Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome Res. 12:962–968.

Jost L. 2008. GST and its relatives do not measure differentiation. Mol Ecol. 17:4015–4026.

Katayama M, Tsinoremas NF, Kondo T, Golden SS. 1999. cpmA, a gene involved in an output pathway of the cyanobacterial circadian system. J Bacteriol. 181:3516–3524.

Kelly JK. 1997. A test of neutrality based on interlocus associations. Genetics 146:1197–1206.

Kimura M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol. 16:111–120.

Kliman RM, Hey J. 1993. DNA sequence variation at the period locus within and among species of the Drosophila melanogaster complex. Genetics 133:375–387.

Kondo T, Strayer CA, Kulkarni RD, Taylor W, Ishiura M, Golden SS, Johnson CH. 1993. Circadian rhythms in prokaryotes: luciferase as a reporter of circadian gene expression in cyanobacteria. Proc Natl Acad Sci U S A. 90:5672–5676.

Kucho K, Okamoto K, Tsuchiya Y, Nomura S, Nango M, Kanehisa M, Ishiura M. 2005. Global analysis of circadian expression in the cyanobacterium Synechocystis sp. strain PCC 6803. J Bacteriol. 187:2190–2199.

Kuhner MK. 2006. LAMARC 2.0: maximum likelihood and Bayesian estimation of population parameters. Bioinformatics 22:768–770.

Le SQ, Gascuel O. 2008. An improved general amino acid replacement matrix. Mol Biol Evol. 25:1307–1320.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu Y, Tsinoremas NF, Johnson CH, Lebedeva NV, Golden SS, Ishiura M, Kondo T. 1995. Circadian orchestration of gene expression in cyanobacteria. Genes Dev. 9:1469–1478.

Luckens LN, Buchanan JM. 1959. Biosynthesis of the purines. XXIV. The enzymatic synthesis of 5-amino-1-ribosyl-4-imidazolecarboxylic acid 5’-phosphate from 5-amino-1-ribosylimidazole 5’-phosphate and carbon dioxide. J Biol Chem. 234:1799–1805.

Mackey SR, Golden SS, Ditty JL. 2011. The itty-bitty time machine genetics of the cyanobacterial circadian clock. Adv Genet. 74:13–53.

Nei M, Gojobori T. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol. 3:418–426.

Ouyang Y, Andersson CR, Kondo T, Golden SS, Johnson CH. 1998. Resonating circadian clocks enhance fitness in cyanobacteria. Proc Natl Acad Sci U S A. 95:8660–8664.

Peel MC, Finlayson BL, McMahon TA. 2007. Updated world map of the Koppen-Geiger climate classification. Hydrol Earth Syst Sci. 11:1633–1644.


Perrineau MM, Le Roux C, de Faria SM, de Carvalho Balieiro F, Galiana A, Prin Y, Bena G. 2011. Genetic diversity of symbiotic Bradyrhizobium elkanii populations recovered from inoculated and non-inoculated Acacia mangium field trials in Brazil. Syst Appl Microbiol. 34:376–384.

Pointing SB, Chan Y, Lacap DC, Lau MC, Jurgens JA, Farrell RL. 2009. Highly specialized microbial diversity in hyper-arid polar desert. Proc Natl Acad Sci U S A. 106:19964–19969.

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.

Rooney AP, Swezey JL, Friedman R, Hecht DW, Maddox CW. 2006. Analysis of core housekeeping and virulence genes reveals cryptic lineages of Clostridium perfringens that are associated with distinct disease presentations. Genetics 172:2081–2092.

Rosato E, Peixoto AA, Barbujani G, Costa R, Kyriacou CP. 1994. Molecular polymorphism in the period gene of Drosophila simulans. Genetics 138:693–707.

Rozas J, Gullaud M, Blandin G, Aguade M. 2001. DNA variation at the rp49 gene region of Drosophila simulans: evolutionary inferences from an unusual haplotype structure. Genetics 158:1147–1155.

Sawyer LA, Hennessy JM, Peixoto AA, Rosato E, Parkinson H, Costa R, Kyriacou CP. 1997. Natural variation in a Drosophila clock gene and temperature compensation. Science 278:2117–2120.

Sawyer LA, Sandrelli F, Pasetto C, Peixoto AA, Rosato E, Costa R, Kyriacou CP. 2006. The period gene Thr-Gly polymorphism in Australian and African Drosophila melanogaster populations: implications for selection. Genetics 174:465–480.

Sharp PM, Bailes E, Grocock RJ, Peden JF, Sockett RE. 2005. Variation in the strength of selected codon usage bias among bacteria. Nucleic Acids Res. 33:1141–1153.

Silva C, Vinuesa P, Eguiarte LE, Souza V, Martinez-Romero E. 2005. Evolutionary genetics and biogeographic structure of Rhizobium gallicum sensu lato, a widely distributed bacterial symbiont of diverse legumes. Mol Ecol. 14:4033–4050.

Stewart CN Jr, Via LE. 1993. A rapid CTAB DNA isolation technique useful for RAPD fingerprinting and other PCR applications. Biotechniques 14:748–750.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–596.

Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: Molecular Evolutionary Genetics Analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 28:2731–2739.

Tanabe Y, Kasai F, Watanabe MM. 2007. Multilocus sequence typing (MLST) reveals high genetic diversity and clonal population structure of the toxic cyanobacterium Microcystis aeruginosa. Microbiology 153:3695–3703.

Tauber E, Zordan M, Sandrelli F, et al. (13 co-authors). 2007. Natural selection favors a newly derived timeless allele in Drosophila melanogaster. Science 316:1895–1898.

Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.

Vinuesa P, Silva C, Werner D, Martinez-Romero E. 2005. Population genetics and phylogenetic inference in bacterial molecular systematics: the roles of migration and recombination in Bradyrhizobium species cohesion and delineation. Mol Phylogenet Evol. 34:29–54.

Watterson GA. 1975. On the number of segregating sites in genetical models without recombination. Theor Pop Biol. 7:256–276.

Whitton BA. 1987. Survival and dormancy of algae. In: Hennis Y, editor. Survival and dormancy of microorganisms. New York: John Wiley. p. 109–167.

Woelfle MA, Ouyang Y, Phanvijhitsiri K, Johnson CH. 2004. The adaptive value of circadian clocks: an experimental assessment in cyanobacteria. Curr Biol. 14:1481–1486.

Zeng K, Fu YX, Shi S, Wu CI. 2006. Statistical tests for detecting positive selection by utilizing high-frequency variants. Genetics 174:1431–1439.

Zeng K, Shi S, Wu CI. 2007. Compound tests for the detection of hitchhiking under positive selection. Mol Biol Evol. 24:1898–1908.
