transcriptome prowling, sequence characterization, and … · transcriptome prowling, sequence...

15
Mol Genet Genomics (2007) 278:539–553 DOI 10.1007/s00438-007-0270-9 123 ORIGINAL PAPER Transcriptome proWling, sequence characterization, and SNP-based chromosomal assignment of the EXPANSIN genes in cotton Chuanfu An · Sukumar Saha · Johnie N. Jenkins · Brian E. ScheZer · Thea A. Wilkins · David M. Stelly Received: 27 April 2007 / Accepted: 15 June 2007 / Published online: 28 August 2007 © Springer-Verlag 2007 Abstract The knowledge of biological signiWcance asso- ciated with DNA markers is very limited in cotton. SNPs are potential functional marker to tag genes of biological importance. Plant expansins are a group of extracellular proteins that directly modify the mechanical properties of cell walls, enable turgor-driven cell extension, and likely aVect length and quality of cotton Wbers. Here, we report the expression proWles of EXPANSIN transcripts during Wber elongation and the discovery of SNP markers, assess the SNP characteristics, and localize six EXPANSIN A genes to chromosomes. Transcriptome proWling of cotton Wber oligonucleotide microarrays revealed that seven EXPANSIN transcripts were diVerentially expressed when there was parallel polar elongation during morphogene- sis at early stage of Wber development, suggesting that major and minor isoforms perform discrete functions dur- ing polar elongation and lateral expansion. Ancestral and homoeologous relationships of the six EXPANSIN A genes were revealed by phylogenetic grouping and comparison to extant A- and D-genome relatives of contemporary AD- genome cottons. The average rate of SNP per nucleotide was 2.35% (one SNP per 43 bp), with 1.74 and 3.99% occurring in coding and noncoding regions, respectively, in the selected genotypes. An unequal evolutionary rate of the EXPANSIN A genes at the subgenome level of tetraploid cotton was recorded. Chromosomal locations for each of the six EXPANSIN A genes were established by gene-speciWc SNP markers. Results revealed a strategy for discovering SNP markers in a polyploidy species like cotton. These markers could be useful to associate candidate genes with the complex Wber traits in MAS. Keywords Gossypium spp. · EXPANSIN · Transcriptome proWling · Single-nucleotide polymorphism · Chromosomal assignment Communicated by R. Hagemann. Nucleotide sequence data reported are available in GenBank database under the accession numbers EF644199 to EF644335. Electronic supplementary material The online version of this article (doi:10.1007/s00438-007-0270-9) contains supplementary material, which is available to authorized users. Disclaimer Mention of trade names or commercial products in this manuscript does not constitute a guarantee or warranty of the product by the US Department of Agriculture and does not imply its approval to the exclusion of other products that may also be suitable. C. An Department of Plant and Soil Sciences, Mississippi State University, Mississippi State, MS 39762, USA S. Saha (&) · J. N. Jenkins USDA-ARS, Crop Science Research Laboratory, P.O. Box 5367, Mississippi State, MS 39762, USA e-mail: [email protected] B. E. ScheZer USDA-ARS-CGRU, Stoneville, MS 38776, USA T. A. Wilkins Department of Plant Sciences, University of California, Davis, CA 95616, USA Present Address: T. A. Wilkins Department of Plant and Soil Science, Texas Tech University, Lubbock, TX 79409, USA D. M. Stelly Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843-2474, USA

Upload: trinhmien

Post on 27-Aug-2018

230 views

Category:

Documents


0 download

TRANSCRIPT

  • Mol Genet Genomics (2007) 278:539553

    DOI 10.1007/s00438-007-0270-9

    ORIGINAL PAPER

    Transcriptome proWling, sequence characterization, and SNP-based chromosomal assignment of the EXPANSIN genes in cotton

    Chuanfu An Sukumar Saha Johnie N. Jenkins Brian E. ScheZer Thea A. Wilkins David M. Stelly

    Received: 27 April 2007 / Accepted: 15 June 2007 / Published online: 28 August 2007 Springer-Verlag 2007

    Abstract The knowledge of biological signiWcance asso-ciated with DNA markers is very limited in cotton. SNPsare potential functional marker to tag genes of biologicalimportance. Plant expansins are a group of extracellularproteins that directly modify the mechanical properties ofcell walls, enable turgor-driven cell extension, and likelyaVect length and quality of cotton Wbers. Here, we reportthe expression proWles of EXPANSIN transcripts duringWber elongation and the discovery of SNP markers, assessthe SNP characteristics, and localize six EXPANSIN Agenes to chromosomes. Transcriptome proWling of cottonWber oligonucleotide microarrays revealed that seven

    EXPANSIN transcripts were diVerentially expressed whenthere was parallel polar elongation during morphogene-sis at early stage of Wber development, suggesting thatmajor and minor isoforms perform discrete functions dur-ing polar elongation and lateral expansion. Ancestral andhomoeologous relationships of the six EXPANSIN A geneswere revealed by phylogenetic grouping and comparison toextant A- and D-genome relatives of contemporary AD-genome cottons. The average rate of SNP per nucleotidewas 2.35% (one SNP per 43 bp), with 1.74 and 3.99%occurring in coding and noncoding regions, respectively, inthe selected genotypes. An unequal evolutionary rate of theEXPANSIN A genes at the subgenome level of tetraploidcotton was recorded. Chromosomal locations for each of thesix EXPANSIN A genes were established by gene-speciWcSNP markers. Results revealed a strategy for discoveringSNP markers in a polyploidy species like cotton. Thesemarkers could be useful to associate candidate genes withthe complex Wber traits in MAS.

    Keywords Gossypium spp. EXPANSIN Transcriptome proWling Single-nucleotide polymorphism Chromosomal assignment

    Communicated by R. Hagemann.

    Nucleotide sequence data reported are available in GenBank database under the accession numbers EF644199 to EF644335.

    Electronic supplementary material The online version of this article (doi:10.1007/s00438-007-0270-9) contains supplementary material, which is available to authorized users.

    Disclaimer Mention of trade names or commercial products in this manuscript does not constitute a guarantee or warranty of the product by the US Department of Agriculture and does not imply its approval to the exclusion of other products that may also be suitable.

    C. AnDepartment of Plant and Soil Sciences, Mississippi State University, Mississippi State, MS 39762, USA

    S. Saha (&) J. N. JenkinsUSDA-ARS, Crop Science Research Laboratory, P.O. Box 5367, Mississippi State, MS 39762, USAe-mail: [email protected]

    B. E. ScheZerUSDA-ARS-CGRU, Stoneville, MS 38776, USA

    T. A. WilkinsDepartment of Plant Sciences, University of California, Davis, CA 95616, USA

    Present Address:T. A. WilkinsDepartment of Plant and Soil Science, Texas Tech University, Lubbock, TX 79409, USA

    D. M. StellyDepartment of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843-2474, USA

    123

    http://dx.doi.org/10.1007/s00438-007-0270-9

  • 540 Mol Genet Genomics (2007) 278:539553

    Introduction

    Cotton (Gossypium spp.) is the leading natural Wber crop ofthe world. Approximately 90% of its value resides in theWber (lint). Although lint production has increased recently,the Wber quality has been declining over the last decade.Fiber quality, determined by length, strength, elongation,and uniformity, is the most important factor in modernspinning technology and proWtability. Botanically, Wber is asingle-celled trichome developing from individual epider-mal cells on the outer integument of cotton ovules. Thedevelopment of Wber cells undergoes four discrete, yetoverlapping stages: diVerentiation, expansion/primary cellwall (PCW) synthesis, secondary cell wall (SCW) synthe-sis, and maturation (Wilkins and Jernstedt 1999; Wilkinsand Arpat 2005). So far, many genes involved in cottonWber development have been isolated and characterized(Arpat et al. 2004). There is strong interest from biologicaland economical standpoints to identify genes for whichexpression closely parallels the rate of cotton Wber expan-sion and elongation, since these will likely include genesthat impart major inXuences on cell wall development, Wberquality, and many other plant attributes. Most of the eco-nomically important Wber traits, including elongation, arecontrolled by quantitative trait loci (QTLs). Knowledge offunctional genes underlying Wber quality QTLs is very lim-ited in cotton.

    Expansins are a large family of extracellular proteinsthat loosen the components of rigid plant cell walls andthereby allow cell expansion (Darley et al. 2001; Li et al.2002; Sampedro and Cosgrove 2005). It had been foundthat the regulation of cell wall extensibility during cellexpansion was controlled, in part, by diVerent expression ofEXPANSIN genes in tomato (Vogler et al. 2003) and cotton(Arpat et al. 2004). These functions suggest that expansinsaVect economical signiWcant Wber properties such as lengthand elongation. Following the recommendation of anad hoc working group on nomenclature of EXPASIN genefamily, we refer to -expansin as EXPANSIN A (abbrevi-ated as EXPA; Kende et al. 2004). A key research model forcell biogenesis research is the cotton Wber, each of whicharises from a single epidermal cell of the ovule integumentthat commences extensive unipolar expansion on the day ofanthesis [0 days post-anthesis (dpa)] and lasts to around20 dpa (Smart et al. 1998; Wilkins and Jernstedt 1999).Elongation of cotton Wbers is highly polar and rapid;growth rates in cultivated species exceed 2 mm/day duringpeak growth (Wilkins and Jernstedt 1999). Expression ofseveral EXPANSIN genes parallels Wber elongation (Shi-mizu et al. 1997; Orford and Timmis 1998; Ruan et al.2001; Harmer et al. 2002; Ji et al. 2003). Comprehensiveanalyses of the cotton Wber transcriptome showed thatGhEXPA1 (AF043284) is one of the top 15% most highly

    expressed genes in G. arboreum L. cv. AKA8401 (Arpatet al. 2004). Given the inferred importance of EXPANSINgenes in cotton Wber elongation and the presence of multi-gene family of EXPANSIN in other plant species (Li et al.2002; Sampedro and Cosgrove 2005), it is reasonable tohypothesize that some EXPANSIN genes could be involvedin cotton Wber development and that more of these are yetto be discovered.

    Molecular markers used in cotton genome mapping andgenetic diversity analysis have evolved from hybridization-based RFLPs (Reinisch et al. 1994; Shappley et al. 1998;Rong et al. 2004) to PCR-based markers such as RAPDs(Kohel et al. 2001), AFLPs (Abdalla et al. 2001; Mei et al.2004), and microsatellites (Zhang et al. 2002; Han et al.2004, 2006; Park et al. 2005; Frelichowski et al. 2006). Forthese markers, relatively low levels of intraspeciWc poly-morphism and limited association with candidate geneshave hampered integrated genetic mapping and importantcandidate gene mapping. DNA markers speciWc to candi-date genes will help in the association of biological impor-tant genes with complex Wber QTLs. Single-nucleotidepolymorphism (SNP), including single-base changes orindels (insertion or deletion) at speciWc nucleotide posi-tions, has been shown to be the most abundant class ofDNA polymorphism in many organisms (Kwok et al. 1996;Wang et al. 1998; Brookes 1999; Cho et al. 1999). SNPvariation analysis and SNP marker development from can-didate genes could provide valuable information regardingtheir evolution and eVects on complex traits. The antici-pated value of SNPs for analysis of candidate gene evolu-tion and their eVects on complex traits have stimulatedlarge scale SNP characterization and marker mapping inrice (Feltus et al. 2004), wheat (Mochida et al. 2003; Som-ers et al. 2003; Zhang et al. 2003; Caldwell et al. 2004),maize (Ching et al. 2002; Batley et al. 2003), soybean (Zhuet al. 2003; Kim et al. 2005), and barley (Kanazin et al.2002; Bundock et al. 2003; Bundock and Henry 2004).Most cotton sequence variation analyses have been con-Wned to a single gene family or DNA fragments for phy-logenetic analysis (Small et al. 1998; Small and Wendel2000; Cronn et al. 2002; Alvarez et al. 2005). Candidategene-based association mapping using SNP markers hasemerged as a powerful tool to determine the role of genes incomplex traits (Glazier et al. 2002). Despite the importanceof SNPs in studies on human diseases, few SNP analyseshave been carried out in plants, especially in polyploidcrops compared to other types of markers (Kanazin et al.2002; Batley et al. 2003; Neale and Savolainen 2004). Theresearch on SNP analysis in cotton is almost nil due to thelarge genome size (Grover et al. 2004), tetraploid nature,the presence of high repetitive DNA content (Zhao et al.1998), and paucity of information on genomic sequences(Chee et al. 2004). Sequence-tagged sites (STS) sequencing

    123

  • Mol Genet Genomics (2007) 278:539553 541

    results indicated that the rate of variation per nucleotidewas 0.35% between G. hirsutum and G. barbadense, andthe variation per nucleotide was 0.14 and 0.37% withinthese two species, respectively (Rong et al. 2004).

    To further explore the expression patterns and the func-tion of EXPANSIN isoforms, we examined the expressionproWling of EXPANSIN related transcriptome using themicroarray technique (Arpat et al. 2004). Here, we applySNP analysis and marker-based gene chromosomal assign-ment to the six well-characterized cotton EXPANSIN Agenes reported by Harmer et al. (2002). The results func-tionally discriminate isoforms of the EXPANSIN genefamily and provide useful strategy for distinguishing homo-eologous sequences in allotetraploid cotton SNP discovery.The chromosomal locations and SNP markers derived fromthe six EXPANSIN A genes would be useful in integratedgenetic mapping and Wber quality-related QTLs analysis incotton.

    Materials and methods

    Plant materials, DNA and RNA isolation

    Two populations of TM-1 (G. hirsutum L.) were grown asbiological pools under a 30/21C day/night temperatureregime in a greenhouse at UC Davis. Flowers were taggedon the day of anthesis (0 dpa) to collect developing bollsfrom the Wrst fruiting positions at 5, 8, 10, 14, 17, 21, and24 dpa between 8:00 a.m. and 10:00 a.m. to eliminate diur-nal eVects. In order to minimize sampling variation, theharvesting scheme devised by Arpat et al. (2004) was usedand materials stored at 80C. Fibers were separated fromfrozen ovules/locules in liquid nitrogen using a mortar andpestle (Smart et al. 1998) to generate 23 pools represent-ing 818 diVerent bolls. Two total RNA samples were pre-pared independently from 0.5 g of Wber following amodiWcation of the hot borate method (Wilkins and Smart1996) as described (http://www.cfgc.ucdavis.edu/).

    Four tetraploid species, TM-1, HS46 and MARCABU-CAG8US-1-88 (MAR; G. hirsutum L., [AD]1), 379 (G.barbadense L., [AD]2), G. tomentosum Nuttall ex Seemann([AD]3), G. mustelinum Miers ex Watt ([AD]4), andtwo diploid genome species, G. arboreum L. (A2) andG. raimondii Ulbrich (D5), were used for PCR ampliWcationand SNP marker identiWcation. TM-1 and 379 served asgenetic standards for G. hirsutum and G. barbadense,respectively. HS46 and MAR are the two parents ofrecombinant inbred lines developed for mapping projects(Shappley et al. 1998; Ulloa et al. 2005). G. tomentosumand G. mustelinum are two wild cotton species. G. arbor-eum and G. raimondii are extant relatives of speciesthat donated the A and D genomes of the original AD

    allotetraploid that gave rise to the modern 52-chromosomeGossypium species (Brubaker et al. 1999; Wendel andCronn 2003). Two kinds of genetic stocks were used forchromosomal assignment of EXPANSIN A genes by dele-tion analysis: (1) Quasi-isogenic hypoaneuploid interspe-ciWc F1 hybrid chromosome substitution stocks, eachinvolving chromosomally identiWed primary monosomy,monotelodisomy, or tertiary monosomy (Liu et al. 2000;Ulloa et al. 2005), and (2) quasi-isogenic euploid CS-Blines, i.e. BC5S1-derived interspeciWc backcrossed chro-mosome substitution lines of 379 in TM-1 (Stelly et al.2005; Jenkins et al. 2006; Saha et al. 2006a, b). The pri-mary monosomic plants (2n = 51) lacked an entire chro-mosome of the normal G. hirsutum complement, whereasthe monotelodisomic plants (2n = 52) lacked most or all ofjust one G. hirsutum chromosome arm. Tertiary monoso-mics were deWcient in one of the two reciprocally translo-cated G. hirsutum chromosomes described in detail byBrown et al. (1981) and Menzel et al. (1985). Thus, eachtertiary monosomic plant lacked the centric segment fromone G. hirsutum chromosome and the distal acentricsegment from the other chromosome involved in thetranslocation. We used two sets of monosomic andmonotelodisomic F1 interspeciWc hybrids, from G. hirsu-tum aneuploids crossed with 379 (G. barbadense) versusG. tomentosum; and one set of tertiary monosomichybrids, from crosses with G. tomentosum only. DNAsamples of the diploid species (G. arboreum and G. rai-mondii) were kindly provided by Dr. John Yu (USDA-ARS, Crop Germplasm Research Unit, College Station,TX, USA). Genomic DNA of the diVerent genotypes andspecies were isolated from young leaves of individualplants using a DNeasy Plant Maxi Kit (Qiagen Inc., Valen-cia, CA, USA).

    Microarray expression analysis

    Cotton Wber 70-mer cDNA microarrays, based on the cot-ton Wber EST database, were fabricated by the UC DavisCotton Functional Genomics Center (Arpat et al. 2004;Table 1). Microarrays were printed by spotting oligonucle-otides (40 M) suspended in 50% DMSO on CorningGAPSII slides in duplication, arranged in 23 23 subar-rays using the OmniGrid arrayer (Genomic Solutions, AnnArbor, MI, USA) equipped with 16 (4 4) MicroQuill pins(Majer Precision Engineering, Tempe, AZ, USA). Slidepost-processing utilized UV exposure to chemically cross-link oligomers according to the slide manufacturersinstructions. Fluorescently tagged amino-allyl-labeledcDNA probes were generated by reverse transcription ofcotton Wber total RNA (30 g), which was mixed with ref-erence mRNA spike mix (10, 50, 200, 400, 1000, and 2000pg Alien Spike mRNAs 1 through 6, respectively, Stratagene,

    123

    http://www. cfgc.ucdavis.edu/

  • 542 Mol Genet Genomics (2007) 278:539553

    La Jolla, CA, USA), 5 g poly-T21VQ primers (Operon,Huntsville, AL, USA), 0.2 mM each of dATP, dCTP, anddGTP, and a 2:3 ratio of 0.2 mM amino-allyl-dUTP/dUTPmix (Sigma, St Louis, MO, USA) and SuperScript II (Invi-trogen, Carlsbad, CA, USA) in a total volume of 40 l andincubated at 42C for 4 h. Aminoallyl-labeled cDNA waspuriWed using the QiaQuick PCR puriWcation kit (QiagenInc., Valencia, CA, USA), and labeled with Cy3 or Cy5monofunctional dyes (Amersham, Piscataway, NJ, USA)according to the manufacturers instructions. LabeledcDNAs were quality controlled by measuring absorbance at260, 280, 555, and 650 nm. Concentrations of cDNAs werecalculated for each probe, and the frequency of incorpora-tion (FOI) for dye molecules was expressed as the numberof dye molecules per 100 bp. Probes having an FOI lessthan 20 were discarded. Printed slides were pre-hybridizedfor 45 min at 42C in a sealed coplin jar containing a 50 mlsolution of 5 SSC, 0.1% SDS, and 0.1 g/l BSA, andthen washed twice for 10 s each in Wltered ddH2O and driedwith Wltered compressed air. The Wnal hybridization solu-tion (50 l) contained amino-labeled cDNA in 30% form-amide, 5 SSC, 0.1% SDS, and 0.1 g/l sheared salmonsperm DNA and was applied to the slide surface by usingmSeries LifterSlips (Erie ScientiWc, Portsmouth, NH,USA). After incubating at 42C for 1620 h in a humidiWedhybridization chamber (Genetix, Boston, MA, USA), theLifterSlips were removed by dipping slides in 2 SSC at42C. Slides were washed sequentially at 42C in 2 SSC,0.2% SDS for 5 min, 0.5 SSC for 3 min, four times in0.1 SSC for 1 min per wash, and 0.05 SSC for 10 s,followed by a dip in ddH2O and drying with Wltered com-pressed air. Slides were scanned at a 10 m resolutionusing an AVymetrix Array Scanner 428 (AVymetrix, SantaClara, CA, USA) and signal intensities were quantiWed withImaGene 4.2 (BioDiscovery, El Segundo, CA, USA). The

    non-linear approach was employed for microarray dataanalysis by GNU statistical computing language R version2.0.1. A bi-directional double-loop experimental designwas adopted for microarray analysis (Kerr and Churchill2001; Glonek and Solomon 2004) to analyze all possiblesigniWcant changes in gene expression between any twodevelopmental time points. The results were interpretedusing the 5 dpa time point as the common reference. Con-trol experiments from the same population of plants indi-cated no signiWcant biological or technical variability(Arpat et al. 2004). Comparison of log 2 signal intensityratios derived from direct and indirect paths for each devel-opmental time point were well within acceptable rangesindicating good experimental reliability (Knig et al. 2004).

    PCR primer design, ampliWcation, cloning and sequencing

    A total of 13 primer pairs were designed for six cottonEXPANSIN A genes (Table 2) using Primer 3 software(http://www.frodo.wi.mit.edu/). For each gene, there was atleast one primer pair designed to amplify 400800 bp forsequencing. Gene-speciWcity of each primer was testedusing BLASTN against cotton genomic sequences in Gen-Bank. Pfu polymerase (Stratagene, La Jolla, CA, USA) wasused for PCR ampliWcation according to the manufacturersprotocol on a PTC-225 Peltier Thermal Cycler (MJResearch Inc, Waltham, MA, USA). The PCR productswere excised from agarose gels following electrophoresis,puriWed using QIAEX II gel extraction kit (Qiagen Inc,Valencia, CA, USA), and cloned into the pCR4-TOPO vec-tor (Invitrogen, Carlsbad, CA, USA) after adding the 3-Ato the puriWed DNA fragment according to the manufac-turers protocol. Plasmid DNA isolated from kanamycin-resistant colonies was bi-directionally sequenced with ABIPrism BigDye Terminator Cycle Sequencing Kit v3.1(Applied Biosystems, Foster City, CA, USA) on an ABIautomated sequencer. PCR products were initially cloned toenable distinction among related loci and alleles expectedin allotetraploid cotton, including pairs of homoeologousAt and Dt loci, and any additional duplications or heterozy-gosity. In this study, 12 recombinant colonies weresequenced for each EXPANSIN A gene generated from eachgenotype to avoid possible complications due to PCRrecombination (Cronn et al. 2002). Forward and reversematched sequences from at least three clones were used todetermine the sequence for each duplicated copy (Cedroniet al. 2003; Rong et al. 2004).

    Sequence analysis and SNP primer design

    Alignment of diverse sequences was conducted usingClustalX (Thompson et al. 1997), and phylogenetic relation-ships were analyzed by the neighbor joining (NJ) method

    Table 1 List of seven cotton Wber EXPANSIN transcripts and relatedmicroarray oligonucleotide probes from the Ga (G. arboreum) cottonWber dbEST (http://www.cfgc.ucdavis.edu)

    a Probe generated from EST GA_Ed0103C11f exactly matched ESTDQ023525, AF043284, AY189969, and the two EXPANSIN A genes[AF512539 (GhEXPA1), AF512540 (GhEXPA2)] for which SNP anal-ysis was performed in this study

    Probe database reference EST name

    Cotton12_16168_001a GA_Ed0103C11f

    Cotton12_12779_001 GA_Eb0009P18

    GA_Ed0108H10f GA_Ed0108H10f

    gi|908878|gb|108878|GA_Eb0026E18f GA_Eb0026E18

    gi|7502129|gb|AW667749|GA_Ea0010H23 GA_Ea0010H23

    gi|8381468|gb|BE054412|GA_Ea0032B18 GA_Ea0032B18

    gi|907832|gb|107832|GA_Eb0025K12f GA_Eb0025K12

    123

    http://www.cfgc.ucdavis.eduhttp://www.frodo.wi.mit.edu/

  • Mol Genet Genomics (2007) 278:539553 543

    using MEGA version 3.1 (Kumar et al. 2004). The putativeassignment of a sequence to a particular locus or subge-nome was based on phylogenetic analysis and the relation-ship to diploid ancestral species (A2 and D5) of thetetraploid cotton. DnaSP 4.0 software (Rozas et al. 2003)was used to identify SNPs based on a comparative align-ment of sequences at a putative locus, and estimate ofnucleotide and haplotype diversity. Final average nucleo-tide diversities () were calculated from all pairwise com-parisons (Tajima 1983; Nei 1987). Interspecies SNPprimers were designed based on single nucleotide diVer-ences in the sequences at a putative locus between TM-1and 379 or TM-1 and G. tomentosum for chromosomalassignment of SNP markers. Primers were selected toanneal immediately upstream or downstream of the SNPsite as the forward or reverse primer, respectively, so thatthe polymorphism could be detected by one-base extensiontechnology with the ABI Prism SNaPshotTM multiplex kit.Selected primers were evaluated by Primer 3 (http://www.frodo.wi.mit.edu/) using criteria of a primer length of1826 nt (20 nt as the optimum), an optimum annealingtemperature of 50C, and a 4060% GC content.

    SNP genotyping

    The ABI Prism SNaPshotTM multiplex kit and an ABI 3100capillary electrophoresis system were used for screeningSNP markers following a slight modiWcation of the manu-facturers protocol. PCR products ampliWed using Pfu poly-merase were puriWed by incubation with SAP and Exo I(2 units of SAP and 4 units of Exo I in a 20 l PCR reactionvolume) at 37C for 1 h and then at 75C for 15 min. Thethermal cycle reaction mixture (7 l reaction system in 384-well plates) contained 1.5 l of SNaPshot multiplex readyreaction mix, 0.5 l of puriWed PCR product, 0.2 l of SNPprimer (10 M), and 4.3 l of distilled water. The thermalcycle reaction was carried out for 25 cycles at 96C for10 s, 50C for 5 s, and 60C for 30 s. After treatment withSAP (1 unit) at 37C for 1 h, and followed by incubatingthe reactions at 75C for 15 min, 1 l of 10-fold dilutedSNaPshot product was mixed with 0.2 l of GeneScan-120LIZ size standard, and 8.8 l of Hi-Di formamide, dena-tured at 95C for 5 min, and then loaded onto the ABI 3100Genetic Analyzer (Applied Biosystems, Foster City, CA,USA) in SNaShot mode for SNP marker analysis.

    Table 2 PCR primers designed from six EXPANSIN A cotton Wber genes

    Gene NCBI accessiona Primer Primer sequence (5!3)

    GhEXPA1 AF512539 GhEXPA1 F: CCCCACGAGAACACTTTGAT

    R: CTAATGGCACTTGCTTGCCT

    GhEXPA2 AF512540 GhEXPA2-1 F: GTCAGCCAATTGTTTGAGCTA

    R: TAGATAAAGCATAGTTAGGGG

    GhEXPA2-2 F: ACAGCCACCAACTTTTGTCCC

    R: AGTTTGTCCGAATTGCCAAC

    GhEXPA3 AF512541 GhEXPA3-1 F: GTATGCTTTTTGGTATGCAG

    R: GTGTGTCGGTGGAAAATG

    GhEXPA3-2 F: TTTGACAATGGCTTGAGCTG

    R: TCCGTTACTGGTTGTGACGA

    GhEXPA4 AF512542 GhEXPA4-1 F: TACGCCCGATATTCAACACA

    R: CGTTTGCGCACTTAATCTCA

    GhEXPA4-2 F: TGAGATTAAGTGCGCAAACG

    R: GCCAGTTTTGACCCCAGTTA

    GhEXPA4-3 F: TGAAGGTGAAGGGAACCAAC

    R: CCCAACCCCCATTTTTACTT

    GhEXPA5 AF512543 GhEXPA5-1 F: GTGCCCACCCAATAATTAAA

    R: ATGTTAATCGTACCTCCGAT

    GhEXPA5-2 F: TGATTTTCGAAGGGTGCCAT

    R: TATTGCATGCTCCCAAACAC

    GhEXPA6 AF512544 GhEXPA6-1 F: CTGTTGTTTGTTCGCAGGAA

    R: GAAGCAGCAAAAGGCAAAAC

    GhEXPA6-2 F: TGGCCACTCCTACTTCAACC

    R: GACCCGAAAGTCCCACTACA

    GhEXPA6-3 F: GTTTTGCCTTTTGCTGCTTC

    R: GAGAGGCTTTGTCCGTTGAG

    In this report we used the standard nomenclature of the EXPANSIN genes (Kende et al. 2004)a The NCBI accession numbers represent the original genes iso-lated from G. hirsutum L. cv Siokra 14 (Harmer et al. 2002)

    123

    http://www.frodo.wi.mit.edu/http://www.frodo.wi.mit.edu/

  • 544 Mol Genet Genomics (2007) 278:539553

    Chromosomal assignment

    The deletion analysis method was used for identifyingchromosomal locations of six EXPANSIN A genes (Liuet al. 2000; Ulloa et al. 2005). All of the aneuploid chromo-some substitution F1 lines, except the particular aneuploidline missing a speciWc chromosome or chromosome arm,would expectedly show two alleles originating from bothparents, similar to an F1 heterozygous locus. The absence ofthe TM-1 (G. hirsutum) allele in any one of the aneuploidF1 plant indicated that the missing chromosome or chromo-some arm was the most likely location of the gene of inter-est. In euploid CS-B stocks, the absence of the TM-1 allelebut presence of the 379 allele suggested that the gene wasprobably located in the substituted chromosome or chromo-some arm.

    Results

    Fiber transcriptome expression proWles of EXPANSIN genes

    A search of >38,000 cotton (G. arboreum) Wber ESTsrevealed the presence of seven diVerent EXPANSIN tran-scripts (Table 1). In silico expression analysis of Wber ESTsrevealed that cell wall-related genes, including two EXPAN-SIN genes (two Ga_Ea clones in Table 1), represent, by far,the most abundant gene transcripts during this stage of devel-opment (Arpat et al. 2004). Developmental transcriptomeproWles of all EXPANSIN genes were extracted from micro-array data spanning 524 dpa cotton Wbers. Changes in thetemporal expression of all seven EXPANSIN genes in devel-oping Wbers of the tetraploid TM-1, relative to 5 dpa areshown in Fig. 1. The expression proWles were an average ofthe At- and Dt-homoeologs. In general, developmental regu-lation of EXPANSIN gene expression (Fig. 1) closely paral-lels that of the growth rate during the period of rapid polarelongation (Wilkins and Arpat 2005). The EXPANSIN geneswere diVerentially regulated and could be classiWed as majorand minor isoforms. Temporal patterns revealed two majorexpression trends. After 17 dpa, expression decreased dra-matically for the four major EXPANSIN isoforms that aremost abundant during rapid Wber elongation, speciWcallyGA_Eb0026E18, GA_Ea0032B18, GA_Ed0103C11f, andGA_Eb0025K12 (Fig. 1). By contrast, expression remainedrelatively high after 21 dpa and well into the stage of second-ary cell wall synthesis of relatively minor isoforms,GA_Eb0009P18, GA_Ed0108H10f, and GA_Ea0010H23.Thus, the Wber EXPANSIN genes were diVerentiallyexpressed and followed discrete temporal patterns. Theirexpression overlapped the termination of elongation and theonset of secondary cell wall biogenesis.

    Sequence characterization of six EXPANSIN A genes

    Sequence data were obtained from cloned PCR fragmentsampliWed from EXPANSIN A genes in selected cotton geno-types using gene-speciWc primer pairs (Table 2). First, theidentities of all the sequences in our results were conWrmed tothe respective original gene by BLASTN. Phylogenetic analy-ses were then used to discriminate homoeologous genesequences based on the relationship with the ancestrally relatedextant genomes A2 and D5. Within each genome cluster,sequences from the available genotypes were classiWed to oneputative locus based on the clustering results of the phylogram.No additional duplicated loci were found within either of theAt- or Dt-genomes for any of the six EXPANSIN A genes. Anylocus-speciWc sequence diVerence identiWed between theselected genotypes was considered as an SNP. Genome assign-ments and phylogenetic analyses of all PCR-ampliWed genefragments are summarized in supplementary Fig. 1.

    In total, 52.9 kb of DNA sequences (33.7 and 19.2kbfrom coding and noncoding regions, respectively) wereacquired from both genomes after eliminating overlappingsequences (Table 3). The available sequence from A- andD-genomes was 31.7 and 21.2 kb, respectively. BLASTNanalysis showed that ampliWed gene segments, ranging insize from 340 to 607 bp, had 98100% nucleotide identityto the target gene. A total of 222 SNPs, including 120 sin-gle-base changes and 102 indels, were identiWed in 134

    Fig. 1 Developmentally regulated cotton Wber. EXPANSIN genes arediVerentially expressed during the period of primary expansion andrapid polar elongation from anthesis (0 dpa) to 24 dpa Wbers. Theexpression of major isoforms of EXPANSIN (GA_Eb0026E18,GA_Ea0032B18, GA_Ed0103C11f, and GA_Eb0025K12) during po-lar elongation decreased dramatically after 17 dpa, while the expres-sion of relatively minor isoforms (GA_Eb0009P18, GA_Ed0108H10f,and GA_Ea0010H23) remained relatively high after 21 dpa and wellinto the stage of secondary cell wall synthesis. Peak growth occurredaround 1417 dpa. Log 2 expression ratios were obtained from micro-array data and expressed relative to the 5 dpa time point

    00+E05.1-

    00+E00.1-

    10-E00.5-

    00+E00.0

    10-E00.5

    00+E00.1

    00+E05.1

    00+E00.2

    42127141018

    sisehtna tsop syaD

    Exp

    ress

    ion

    ratio

    f11C3010dE__AG

    81P9000bE__AG

    f01H8010dE__AG

    81E6200bE__AG

    32H0100aE__AG

    81B2300aE__AG

    21K5200bE__AG

    123

  • Mol Genet Genomics (2007) 278:539553 545

    Tab

    le3

    Dis

    trib

    utio

    n an

    d ty

    pes

    of S

    NP

    s in

    PC

    R a

    mpl

    icon

    s of

    six

    cot

    ton

    EX

    PA

    NSI

    N A

    gen

    es

    aW

    hen

    the

    num

    ber

    of S

    NPs

    in th

    e am

    pliW

    ed f

    ragm

    ent w

    as le

    ss th

    an tw

    o, h

    aplo

    type

    ana

    lysi

    s w

    as n

    ot c

    ondu

    cted

    . Site

    s w

    ith a

    lignm

    ent g

    aps

    wer

    e no

    t con

    side

    red

    for

    hapl

    otyp

    te a

    naly

    sis

    bH

    mea

    ns h

    aplo

    type

    num

    ber

    cH

    d re

    fers

    to h

    aplo

    type

    div

    ersi

    tyd

    G. t

    omen

    tosu

    m a

    nd G

    . rai

    mon

    dii h

    ave

    one

    gap

    in th

    e co

    ding

    reg

    ion,

    whi

    ch w

    as n

    ot c

    onsi

    dere

    d in

    the

    amin

    o ac

    id c

    hang

    e an

    alys

    ise

    The

    re w

    as o

    ne tr

    iall

    elic

    SN

    P (

    A/C

    /G)

    site

    . Her

    e, w

    e re

    cord

    ed it

    at t

    he A

    /G tr

    ansi

    tions

    pos

    ition

    fT

    hirt

    y W

    ve g

    aps

    appe

    ared

    bet

    wee

    n G

    . tom

    ento

    sum

    and

    G. r

    aim

    ondi

    i wer

    e no

    t con

    side

    r in

    the

    amin

    o ac

    id c

    hang

    e an

    alys

    is

    Fra

    gmen

    tG

    enom

    eN

    o. o

    f av

    iaila

    ble

    sequ

    ence

    s

    Hap

    loty

    pesa

    Tra

    nsiti

    ons

    Tra

    nsve

    rsio

    nsIn

    dels

    Cod

    ing

    regi

    onT

    otal

    Hb

    Hdc

    A/G

    T/C

    A/T

    G/C

    A/C

    G/T

    AC

    GT

    Len

    gth

    (bp)

    No.

    of

    SNP

    s (n

    on-

    syno

    nym

    ous

    chan

    ges)

    Len

    gth

    (bp)

    No.

    of

    SNP

    s

    EX

    PA1

    A4

    30.

    833

    (0.

    222

    31

    00

    01

    11

    01

    392

    4(3)

    607

    8

    D7

    50.

    857

    0.13

    74

    40

    20

    20

    00

    139

    27(

    6)60

    713

    EX

    PA2-

    1A

    1N

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /AN

    /A

    D7

    30.

    667

    0.16

    01

    00

    01

    00

    00

    035

    82(

    1)47

    42

    EX

    PA2-

    2A

    44

    1.00

    0

    0.17

    74

    10

    10

    11

    10

    043

    56(

    3)51

    19

    D7

    50.

    857

    0.13

    74

    20

    20

    21

    00

    043

    510

    (5)d

    511

    11

    EX

    PA3-

    1A

    73

    0.52

    4

    0.20

    91

    00

    11

    01

    20

    041

    62(

    1)59

    56

    D1

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    EX

    PA3-

    2A

    63

    0.60

    0

    0.21

    01

    00

    01

    00

    00

    044

    42(

    2)53

    12

    D5

    30.

    700

    0.21

    81

    10

    01

    00

    00

    044

    42(

    1)52

    73

    EX

    PA4-

    1A

    74

    0.81

    0

    0.13

    00

    20

    01

    00

    00

    129

    33(

    1)55

    24

    D1

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    N/A

    EX

    PA4-

    2A

    66

    1.00

    0

    0.09

    61

    22

    20

    00

    00

    038

    34(

    3)53

    87

    D6

    50.

    933

    0.12

    20

    32

    01

    10

    00

    038

    32(

    0)53

    87

    EX

    PA4-

    3A

    3N

    /AN

    /A0

    01

    00

    00

    00

    020

    40(

    0)34

    01

    D7

    50.

    905

    0.10

    31

    13

    10

    10

    00

    020

    43(

    2)34

    07

    EX

    PA5-

    1A

    74

    0.71

    4

    0.18

    03e

    00

    11

    11

    00

    131

    06(

    5)42

    08

    D5

    30.

    700

    0.21

    81

    20

    00

    011

    127

    1031

    037

    (1)f

    447

    43

    EX

    PA5-

    2A

    64

    0.86

    7

    0.12

    90

    01

    02

    01

    00

    031

    32(

    2)50

    94

    D3

    20.

    667

    0.31

    40

    11

    00

    01

    00

    031

    32(

    2)51

    03

    EX

    PA6-

    1A

    54

    0.90

    0

    0.16

    13

    40

    20

    10

    01

    044

    08(

    4)57

    211

    D7

    50.

    857

    0.13

    74

    21

    00

    10

    00

    044

    07(

    5)57

    28

    EX

    PA6-

    2A

    53

    0.70

    0

    0.21

    81

    00

    00

    10

    00

    026

    52(

    2)41

    72

    D6

    20.

    333

    .215

    11

    00

    00

    02

    01

    265

    2(2)

    420

    5

    EX

    PA6-

    3A

    74

    0.71

    4

    0.18

    12

    30

    01

    212

    28

    1920

    42(

    2)59

    149

    D7

    40.

    810

    0.13

    02

    12

    00

    21

    00

    120

    44(

    3)55

    09

    123

  • 546 Mol Genet Genomics (2007) 278:539553

    amplicons (Table 3). Transitions accounted for 69 (57.5%)and transversions for 51 (42.5%) of the total 120 single-base changes. The ratio of A/G to T/C transitions was1.23:1, with no signiWcant diVerence between the four typesof transversions. Analysis of indel sequences indicated abias toward A and T nucleotides, which is similar tothe A nucleotide bias in maize (Batley et al. 2003). Theaverage rate of SNPs per nucleotide was 2.35%, with 1.74and 3.99% occurring in the coding regions and noncodingregions, respectively, in selected genotypes. Based on theaverage rate of SNP per nucleotide, a higher nucleotidechange rate was discovered in the D-genome (2.90%) thanin the A-genome (1.98%). Furthermore, an uneven distribu-tion of SNPs was observed among the six genes, suggestingthat the occurrence of SNP varies among the six EXPAN-SIN A genes. Amplicons of EXPA2-2 and EXPA5-1 in G.tomentosum and G. raimondii had a 46- and 35-nucleotidedeletion, respectively, thereby contributing to a higher rateof nucleotide polymorphism in the Dt-genome of tetraploidspecies. One putative triallelic SNP site (A/C/G) was dis-covered in the A-genome amplicon of EXPA5-1, similar totriallelic SNP loci observed in soybean ESTs (Van et al.2005). The simple sequence repeat (TA) motif is presentin the intron of the EXPA5-1 amplicon, as previouslyreported (Kumar et al. 2006). The 119 cSNPs included 83single-base changes and 36 indels in the 33.7 kb of codingsequence. Among the single-base changes, 26, 31, and 26were detected in the Wrst, second, and third codon positions,respectively. Results revealed that a total of 56 out of the119 cSNPs (47%) were nonsynonymous changes, providedthat the aforementioned fragment deletions of EXPA2-2and EXPA5-1 in G. tomentosum and G. raimondii were notconsidered.

    Detailed results of putative haplotype organization areincluded in the supplementary Tables 122. The number ofSNP haplotypes present in each group was determined when

    the available sequence number exceeded two and more thanone SNP variable site presented in the aligned sequences.Results showed that the haplotype number ranged from twoto six out of the maximum seven available genotypesequences, with an estimated haplotype diversity that variedbetween 0.33 and 1 (Table 3). The relatively high haplotypenumber indicated the distinct and diverse SNP charactersamong the selected cotton species, especially at the inter-species level, which represents valuable sources of germ-plasm diversity for upland cotton improvement.

    Nucleotide diversity

    Nucleotide diversity () across all possible comparisonsamong the eight genotypes was measured using availablesequences from A- or D-genomes. Because only twosequences were compared at a time, and no segregating pop-ulations were considered, we only report the value for thenucleotide diversity assay, which measures the average num-ber of nucleotide diVerences per site between two sequences(Nei 1987). The mean value of in the A- and D-genome issummarized in Table 4. Results showed that nucleotidediversity was lowest among the three G. hirsutum lines (TM-1, HS46, and MAR), revealing that polymorphisms amongthe six EXPANSIN A genes were much higher at the interspe-ciWc level than intraspeciWc level in Gossypium species. Thelowest nucleotide diversity was found between HS46 andMAR in both At- and Dt-genomes, observed as 0.0207 and0.1018, respectively. Nucleotide diversities ( values) amongthe three G. hirsutum lines in the Dt-genome were signiW-cantly higher than those in the At-genome, indicating thatloci in the Dt-genome exhibit a faster evolutionary rateamong the six EXPANSIN A genes compared to the At-genome in tetraploid cotton.

    The independent and incongruent evolution of the twosubgenomes (At and Dt) was also revealed by the diVerent

    Table 4 The nucleotide diversity ( values 102) among six PCR-ampliWed cotton EXPANSIN A gene fragments in A- and D-genomes

    value, a measure of the average number of nucleotide diVerences per site between two sequences. The numbers below and above the diagonalrepresent A- and D-genome sequences pairwise comparisons, respectivelya G. arboreum and G. raimondii were considered related to the diploid ancestral A- and D-genome progenitors of tetraploid species; TM-1 wasconsidered as genetic standard of G. hirsutum and 379 was includes as a representative sample of G. barbadense; HS46 and MAR (abbreviationof MARCABUCAG8US-1-88) were used as two diverse G. hirsutum lines. One accession of G. mustelinum and G. tomentosum were included torepresent another two allotetraploid species

    Genotypea G. arboreum TM-1 HS46 MAR 379 G. mustelinum G. tomentosum

    G. raimondii 0.5622 0.5457 0.7105 0.6166 0.7731 0.6812

    TM-1 0.5137 0.1243 0.2167 0.5916 0.6417 0.4699

    HS46 0.4759 0.031 0.1018 0.5087 0.5491 0.4080

    MAR 0.5175 0.071 0.0207 0.5994 0.6694 0.4160

    379 0.6353 0.2964 0.4073 0.3826 0.3659 0.3663

    G. mustelinum 0.5461 0.3103 0.2363 0.2967 0.3841 0.4595

    G. tomentosum 0.6689 0.3494 0.3027 0.3029 0.3209 0.392

    123

  • Mol Genet Genomics (2007) 278:539553 547

    phylogenetic topologies detected in the duplicated genes orfragments of the selected tetraploid species. The highestnucleotide diversity in the A-genome was observed betweenthe ancestral A-genome species (G. arboreum) and G.tomentosum ( = 0.6689), while in the D-genome, the high-est nucleotide diversity was obtained from G. raimondii(ancestral D-genome donor) and G. mustelinum ( = 0.7731)(Table 4). When considering all other genotypes, the diVer-ent phylogenetic patterns of At- and Dt-genomes becameeven more evident. Analyses of the six EXPANSIN A genessuggest that polyploid speciation in cotton was accompaniedby an independent subgenome molecular evolution.

    Chromosomal assignment

    We used single-nucleotide extension with primers shown inTable 5 to genotype 25 gene-speciWc SNP markers in the

    six EXPANSIN A genes. The chromosomal location of eachSNP locus was delimited by deletion analysis with one ormore hypoaneuploid or euploid stocks, and the results werecompared for SNPs within and across the six EXPANSIN Agenes. The individual SNP loci were assigned to the longarms of chromosomes 20, 10, 9, 1, and 3 (Table 5). For thefour genes (GhEXPA3, GhEXPA4, GhEXPA5, and GhEX-PA6) and the At-genome locus of gene GhEXPA2, repre-sented here by multiple SNPs, chromosomal assignmentswere concordant among SNPs within a gene. SNP markersExp1-1_Gbmt_193F, and Exp2-1_Gbmt_378R from geneGhEXPA1 and Dt-genome locus of gene GhEXPA2, respec-tively, were both localized to the chromosome arm 20Lo,which indicated the chromosome locations of the twogenes. Chromosomes 10 and 20 are homoeologous, so it ispossible that we detected At- and Dt- homoeologous loci ofGhEXPA2 on the long arms of chromosomes 10 and 20. In

    Table 5 Chromosomal locations of six EXPANSIN A genes in cotton

    a The nomenclature of the SNP markers followed the order: ampliWed fragment, polymorphic character, SNP site position in the ampliWed frag-ment, and forward or reverse primer. For example, Exp2-2_Gbmt_422R means: this marker is located at position 422 of ampliWed fragmentEXPA2-2; this SNP site is polymorphic between TM-1 and 379, G. mustelinum or G. tomentosum; it is a reverse ampliWcation primerb 20Lo means on the long arm of chromosome 20

    Gene SNP markera SNP primer sequence (5!3) Chromosome locationb

    Aneuploid G. barbadense

    Aneuploid G. tomentosum

    Euploid CS-B

    GhEXPA1 Exp1-1_Gbmt_193F GTCCGAATTGCCAACCAGC 20Lo 20Lo N/A

    GhEXPA2 Exp2-1_Gbmt_378R CACTTTTCTTCTTTTTGTTCAGT 20Lo 20Lo N/A

    Exp2-2_Gbt_58F CAGCAGGCACTACATTGTAG 10Lo 10Lo 10

    Exp2-2_Gbt_59R AGCGATGGCAGGACTATCACA 10Lo 10Lo 10

    Exp2-2_Gbt_93F CATCGCTGGCAGTCACTTT 10Lo 10Lo 10

    Exp2-2_Gbt_108R AGAGCAATGCTTACCTTAACGG 10Lo 10Lo 10

    Exp2-2_Gbt_175F CTGGACATAGGTAGCCATCCTGTT 10Lo 10Lo 10

    Exp2-2_Gb_182R GATATAACGTCAGTGTCCATCAAG N/A N/A 10

    Exp2-2_Gbt_312F TGACACCCTGCAAAAGGT 10Lo 10Lo 10

    Exp2-2_Gbmt_345R AAACTCAATTCAAATCATCAC 10Lo 10Lo 10

    Exp2-2_Gbt_415F ACGATTCCAGCTCGATATTC 10Lo 10Lo 10

    Exp2-2_Gbmt_422R CCGAACCGGCATTCTTGC 10Lo 10Lo 10

    Exp2-2_Gbt_508R ACAGCCACCAACTTTTGTCC N/A 10Lo 10

    GhEXPA3 Exp3-1_Gb_489F TGATCTCTCTCAGCCTATTTTT 10Lo N/A 10

    Exp3-2_Gb_372F TCTGTATTGGGCAATGTGTT N/A N/A 10

    GhEXPA4 Exp4-1_Gbmt_65F GCCATTATTGAAAAGTGCAG 9Lo 9Lo 9

    Exp4-1_Gbmt_147R ATGGTGTTTGAATTTTTTT 9Lo N/A 9

    Exp4-1_Gbm_412F GCATTCACCACAGCCAAAAA 9Lo N/A 9

    Exp4-2_Gbmt_244F CAAAATCTTTCCCCTTTTACT 9Lo 9Lo 9

    GhEXPA5 Exp5-1_Gmt_205R CTACCAACTCAGGTGCGATTAC N/A 1 N/A

    Exp5-2_Gbt_444F ATACGTCATTAAATTTTCCC 1Lo N/A 1

    GhEXPA6 Exp6-1_Gb_77F CTCGCGATGGTCTCTGGT 3Lo N/A N/A

    Exp6-1_Gbm_89F GGTCTCTGGTGTTCAGGGATAT 3Lo N/A N/A

    Exp6-1_Gbmt_96R TTGCATGTGCATTAGTCCAA 3Lo 3Lo N/A

    Exp6-1_Gb_156R CTAAAAGATGGCTTCATTTGAAGC 3Lo N/A N/A

    123

  • 548 Mol Genet Genomics (2007) 278:539553

    a few cases where aneuploid analysis could not be con-Wrmed by euploid CS-B lines or visa versa, chromatinlosses during backcrossing or other types of cytologicalabnormalities in the development of these cytogeneticstocks could explain the results, suggesting that these par-ticular cytogenetic stocks warrant further characterization.

    Discussion

    Cotton Wber elongation is the net result of the complexinterplay between cell turgor and cell wall extensibility.The elongation is coupled with the expression of manygenes, among which EXPANSIN is one of most highlyexpressed (Arpat et al. 2004). To more fully explore anddiVerentiate among EXPANSIN genes, we characterizedEXPANSIN transcriptome expression proWles during cottonWber polar elongation, assessed SNPs and developed SNPmarkers in the six well-characterized cotton EXPANSIN Agenes and chromosomally assigned each of them. Successof the SNP marker development approach used here indi-cates a workable strategy to distinguish homoeologoussequences and thus SNP marker development for geneswithin multigene family. The chromosome localizationresults will contribute to the comparative map of cottonchromosomes and facilitate research that speciWcally testswhether any of the six EXPANSIN A genes impact Wberquality, e.g., by enabling Wber quality SNP-QTL associa-tion analysis.

    EXPANSIN transcriptome expression proWling

    Our proWling of EXPANISN transcripts by quantitativeanalysis of expression cDNA microarrays was inferablymore accurate than previous proWling eVorts (Shimizu et al.1997; Orford and Timmis 1998; Ruan et al. 2001; Harmeret al. 2002; Ji et al. 2003), which relied on visually compar-ing the brightness of the bands from RT-PCR or RNA gelblotting assays. The microarray-based expression proWlesrevealed parallels between seven diVerent EXPANSIN tran-scripts and polar elongation during Wber morphogenesis.The time diVerence between the highest rate of cotton Wberelongation (approximately 1012 dpa; Wilkins and Arpat2005) and EXPANSIN transcriptome expression peak(around 17 dpa) indicated that the correlation betweenEXPANSIN gene transcripts accumulation and maximumWber growth rate was limited, as was observed in tomato forLeExp2 and LeExp18 gene expression (Caderas et al.2000). In addition to being developmentally regulated, ourresults showed that the stage-speciWc EXPANSIN genes arealso diVerentially expressed (Fig. 1). Wilkins and Arpat(2005) proposed that the members of gene families likeEXPANSIN were temporally regulated, indicating that

    closely related genes play functional roles in cotton Wberdevelopment, e.g., stage-speciWc expression. Wilkins andArpat (2005) divided the EXPANSIN isoforms into majorand minor groups, which were responsible for primarilypolar elongation and lateral expansion, respectively. In thiscase, transcripts GA_Eb0026E18, GA_Ea0032B18,GA_Ed0103C11f, and GA_Eb0025K12 were the majorisoforms, and the minor isoforms came from the other threetranscripts in the EXPANSIN multigene family (Fig. 1).

    In cotton, more than 60 EXPANSIN genes or gene-likesequences have been deposited to the NCBI database so far.Considering the EXPANSIN ESTs in Ga (G. arboreum)cotton Wber dbEST (http://www.cfgc.ucdavis.edu), it is rea-sonable to conclude that cotton EXPANSIN genes are also acomplex superfamily. The EXPANSIN transcriptome pro-Wles assay made the study of EXPANSIN genes involved incotton Wber development possible. BLASTN analysisshowed that the probe generated from EST GA_Ed0103C11f exactly matched EST DQ023525, AF043284,AY189969, and the two EXPANSIN A genes [AF512539(GhEXPA1), AF512540 (GhEXPA2)] for which SNP analy-sis was performed in this study. So, GhEXPA1 or GhEX-PA2 may be classiWed as the major isoforms of EXPANSINand function primarily in polar elongation, which may wellaccount for the bulk of genetic variability associated withmajor QTLs for Wber quality (Harmer et al. 2002; Wilkinsand Arpat 2005). The other probes seem unique to theremaining six transcripts because there is no signiWcantidentiWed match in the database. The gap between tran-scriptome assay and gene isolation and characterization ofEXPANSIN in cotton indicated the complexity of this multi-gene family and furthermore their important role in Wberdevelopment. Cloning new gene members from theEXPANSIN superfamily is still possible and necessary forstudying developing cotton Wber. Considering the tetraploidnature of upland cotton, discerning the expression amongall the EXPANSIN genes and their homoeologous genes isrecommended.

    SNP marker discovery in tetraploid cotton

    Polyploidy complicates the identiWcation of SNPs by thefact that highly similar sequences exist in diVerent subge-nomes (homoeologous sequences) or duplicate copieswithin a genome (paralogous sequences; Mochida et al.2003; Somers et al. 2003). So far, two common methodolo-gies have been employed to overcome this barrier. One islocus-speciWc PCR ampliWcation and the other is an in sil-ico approach based on clustering that distinguishes paralogs(Richert et al. 2002; Mochida et al. 2003; Somers et al.2003; Caldwell et al. 2004). However, the high sequenceconservation of homoeologous loci that occurs in bothintergenic and genic regions of tetraploid cotton hampered

    123

    http://www.cfgc.ucdavis.edu

  • Mol Genet Genomics (2007) 278:539553 549

    the design of genome-speciWc PCR primers for SNP identi-Wcation (Grover et al. 2004).

    Phylogenetic analyses of duplicated low- and single-copysequences in cotton showed that homoeologs exhibit inde-pendent evolution. Cronn et al. (1999) proposed that mostgenes duplicated during polyploidization in Gossypium wereexpected to exhibit independent evolution in the allopoly-ploid nucleus. The phylogenetic analysis on six cotton MYBtranscriptional factors supported this conclusion (Cedroniet al. 2003). Results of this study, which were obtained in 13ampliWed fragments from six EXPANSIN A genes, also indi-cated an independent evolution pattern. Here, we identiWedSNPs based on the theory of independent gene evolution inthe At- and Dt-genomes of tetraploid species, combinedwith an integrated phylogenetic approach that incorporatedorthologous sequences of ancestral diploid species repre-sented in the closest living descendants (extant species) ofthe donor species as a base reference. Putative SNP identiW-cation was validated by deletion analysis for chromosomalassignment of the six EXPANSIN A genes. The success ofthis approach was facilitated by the inbred nature of theplant materials (highly inbred lines or doubled haploid line)and use of gene-speciWc PCR primers, which avoided prob-lems inherent to heterozygous alleles and orthologoussequences. Sequence assembly of G. arboreum Wber ESTsgenerated around 14,000 candidate genes, which oVer a vastpotential for identifying genes that aVect Wber quality (Arpatet al. 2004). The identiWcation of candidate genes isrelatively diYcult in tetraploid species. Only a very smallpercent (5%) of cotton Wber genes have been geneticallymapped (Rong et al. 2004) due to the low polymorphismnature. Even more disappointing is the fact that

  • 550 Mol Genet Genomics (2007) 278:539553

    sequence divergence than in A-At comparisons (Senchinaet al. 2003). Reinisch et al. (1994) reported that Dt-genomeRFLP marker polymorphism levels were 10% higher thanthose of the At-genome. Moreover, inferences of the loca-tion of QTLs such as Wber-related traits (Jiang et al. 1998;Lacape et al. 2005), disease resistance (Wright et al. 1998),and leaf morphology (Jiang et al. 2000) repeatedly implieda higher evolutionary rate in the Dt-genome than in the At-genome of upland cotton. However, independent evolutionof duplicated low- and single-copy sequences after poly-ploid formation in cotton has also been found in severalindependent phylogenetic analyses (Cronn et al. 1999;Small and Wendel 2000; Cedroni et al. 2003). These obser-vations collectively indicate that the evolutionary forces onthe two genomes may have been fundamentally diVerent,and result in independent evolution of the homoeologousloci in tetraploid cotton.

    Chromosomal locations of six EXPANSIN A genes

    We utilized interspeciWc SNP markers by single nucleotideextension technology and deletion analysis (Liu et al. 2000)for chromosomal assignment of six EXPANSIN A genes. Incotton, the development of candidate gene markers basedon conventional PCR methods has been limited by the pau-city of genomic sequence data and relatively high levels ofmonomorphism for PCR amplicon mobility (Chee et al.2004; Kumar et al. 2006). SNP markers derived from can-didate genes would facilitate not only chromosomal assign-ment of these genes, but also QTL mapping and discoveryof the roles of candidate genes in complex traits. In thisstudy, we did not detect the chromosomal locations ofduplicated homoeologous copies of GhEXPA1, GhEXPA3,GhEXPA4, GhEXPA5, and GhEXPA6 genes. This could bedue to the incomplete genome coverage in cytogeneticstocks, lacking homoeologous duplicated sequences, andrelatively low interspeciWc nucleotide diversity. Our resultsrevealed that all six EXPANSIN A genes may share a poly-ploid duplication event like GhEXPA2 genes on the longarm of two homoeologous chromosomes 10 and 20 in tetra-ploid cotton. However, segmental duplication, an indepen-dent duplication event after polyploidization, might alsoplay a role in gene evolution in cotton. For example, GhEX-PA2 and GhEXPA3 both localize to the long arm of chro-mosome 10. Pfeil et al. (2004) suggested that the fate of theduplicated loci could be one of the several forms: (1) long-term maintenance of the same or similar function; (2) diver-gence in function; (3) loss of one copy; or (4) intralocus orinterlocus gene conversion. Our results, showing chromo-somal locations of some EXPANSIN A genes in non-homo-eologous chromosomes, support earlier studies thatpolyploid nature in cotton created some unique avenues forresponse to selection (Wright et al. 1998; Rong et al. 2004).

    However for rpb2 gene evolution in Gossypium speciesPfeil et al. (2004) reported that single gene or segmentalduplication was more likely the cause than ancient poly-ploidy. Rong et al. (2004) observed several duplicationevents within each subgenome, in addition to homoeologusduplication in cotton. They suggested that this could be dueto retrotransposition, or present-day cotton may be derivedfrom a putative ancestor containing six or seven chromo-somes. Segmental duplications, as a part of polyploidiza-tion events, account for 12 out of 21 EXPANSIN genes inArabidopsis and 16 out of 44 in rice (Sampedro et al.2005). IdentiWcation of chromosomal locations of SNPmarkers using deletion analysis further conWrmed the trueallelic nature of the SNP markers and supported the meritof the strategy for discovery of SNP based on the phyloge-netic and comparative analysis of sequences from tetraploidand closely related ancestral diploid species sequences forother candidate genes in allotetraploid cotton.

    Fiber length is critical to Wber quality in the global cot-ton market. Identifying the genetic variants, diagnosticmarkers, and chromosomal locations of candidate genesthat are associated with Wber cell elongation is likely toaccelerate cotton improvement. IdentiWcation of SNPs forsix EXPANSIN A genes may also help in functional genom-ics analysis because any changes in the nucleotides of thecoding or regulatory regions of the gene may also havefunctional consequences. Paterson et al. (2003) reportedone Wber length-related QTL on chromosome 20 with anLOD threshold of 3.75. They also found another Wberlength QTL, signiWcant in water-limited treatment, locatedon chromosome 9. We located one SNP marker speciWc toGhEXPA1 and GhEXPA2, respectively, on the long arm ofchromosome 20 and four diVerent SNP markers derivedfrom GhEXPA4 on the long arm of chromosome 9(Table 5). A QTL associated with Wber elongation wasdetected on chromosome 9 (Mei et al. 2004). Another QTLfor Wber elongation was mapped on chromosome 3 (Ulloaet al. 2005), the same chromosome where we assigned fourSNP markers generated from diVerent regions of geneGhEXPA6. Furthermore, QTLs aVecting Wber elongationwere found on chromosome 1 and 20 where GhEXPA1,GhEXPA2, and GhEXPA5 genes were also localized (Cheeet al. 2005). Lacape et al. (2005) reported the association ofWber length QTLs with chromosome 3 and 10, and Wberelongation with chromosome 9, 10, and 20. QTL analysisby newly developed microsatellite markers suggested thatloci on chromosome 3, 9, and 10 aVect the Wber elongation,and 2.5 and 50% Wber span length, respectively (Frelichow-ski et al. 2006). The coincidence of chromosomal locationsof the six EXPANSIN A genes and Wber length and elonga-tion QTLs leaves open the possibility that the EXPANSIN Agenes aVect the important QTLs, especially consideringtheir functional role in Wber cell expansion and elongation

    123

  • Mol Genet Genomics (2007) 278:539553 551

    (Smart et al. 1998; Arpat et al. 2004). SNP makers derivedfrom candidate genes may be useful in exploring the rolesof candidate genes in the complex traits.

    In conclusion, we characterized EXPANSIN transcrip-tome expression proWles during cotton Wber polar elonga-tion, developed an SNP marker discovery strategy intetraploid cotton, characterized SNPs in the six cottonEXPANSIN A genes, and localized each of them to thechromosome. The overlap of the EXPANSIN A genes chro-mosomal locations and previously reported Wber qualityQTLs for Wber length and elongation may reXect the puta-tive roles of some EXPANSIN A genes in these QTLs.These markers will have potential as candidate gene mark-ers in marker-assisted selection program of Wber traits.

    Acknowledgments We gratefully acknowledge the technical help ofMr. Douglas A. Dollar and the suggestions from Dr. Z. T. Burive andMs. Yufang Guo. We also thank Dr. John Yu for providing G. arbore-um and G. raimondii DNA samples.

    References

    Abdalla AM, Reddy OUK, Pepper AE (2001) Genetic diversity andrelationships of diploid and tetraploid cottons using AFLP. TheorAppl Genet 102:222229

    Alvarez I, Cronn R, Wendel JF (2005) Phylogeny of the New Worlddiploid cottons (Gossypium L., Malvaceae) based on sequenc-es of three low-copy nuclear genes. Plant Syst Evol 252:199214

    Arpat AB, Waugh M, Sullivan JP, Gonzales M, Frisch D, Main D,Wood T, Leslie A, Wing RA, Wilkins TA (2004) Functional ge-nomics of cell elongation in developing cotton Wbers. Plant MolBiol 54:911929

    Batley J, Barker G, OSullivan H, Edwards KJ, Edwards D (2003)Mining for single nucleotide polymorphisms and insertions/dele-tions in maize expressed sequence tag data. Plant Physiol 132:8491

    Brookes AJ (1999) The essence of SNPs. Gene 234:177186Brown MS, Menzel MY, Hasenkampf CA, Naqi S (1981) Chromo-

    some conWgurations and orientations in 58 heterozygous translo-cations in Gossypium hirsutum. J Hered 72:161168

    Brubaker CL, Paterson AH, Wendel JF (1999) Comparative geneticmapping of allotetraploid cotton and its diploid progenitors. Ge-nome 42:184203

    Bundock PC, Christopher JT, Eggler P, Ablett G, Henry RJ, Holton TA(2003) Single nucleotide polymorphisms in cytochrome P450genes from barley. Theor Appl Genet 106:676682

    Bundock PC, Henry RJ (2004) Single nucleotide polymorphism,haplotype diversity and recombination in the Isa gene of barley.Theor Appl Genet 109:543551

    Caderas D, Muster M, Vogler H, Mandel T, Rose JKC, McQueen-Ma-son S, Kuhlemeier C (2000) Limited correlation between EX-PANSIN gene expression and elongation growth rate. PlantPhysiol 123:13991413

    Caldwell KS, Dvorak J, Lagudah ES, Akhunov E, Luo MC, Wolters P,Powell W (2004) Sequence polymorphism in polyploid wheat andtheir D-genome diploid ancestor. Genetics 167:941947

    Cedroni ML, Cronn RC, Adams KL, Wilkins TA, Wendel JF (2003)Evolution and expression of MYB genes in diploid and polyploidcotton. Plant Mol Biol 51:313325

    Chee PW, Draye X, Jiang CX, Decanini L, Delmonte TA, BredhauerR, Smith CW, Paterson AH (2005) Molecular dissection of inter-speciWc variation between Gossypium hirsutum and Gossypiumbarbadense (cotton) by a backcross-self approach: I. Fiber elon-gation. Theor Appl Genet 111:757763

    Chee PW, Rong J, Williams-Coplin D, Schulze SR, Paterson AH(2004) EST derived PCR-based markers for functional genehomologues in cotton. Genome 47:449462

    Ching A, Caldwell KS, Jung M, Dolan M, Smith OS, Tingey S, Morg-ante M, Rafalski AJ (2002) SNP frequency, haplotype structureand linkage disequilibrium in elite maize inbred lines. BMC Gen-et 3:19

    Cho RJ, Mindrinos M, Richards DR, Sapolsky RJ, Anderson M,Drenkard E, Dewdney J, Reuber TL, Stammers M, Federspiel N,Theologis A, Yang WH, Hubbell E, Au M, Chung EY, LashkariD, Lemieux B, Dean C, Lipshutz RJ, Ausubel FM, Davis RW,Oefner PJ (1999) Genome-wide mapping with biallelic markersin Arabidopsis thaliana. Nat Genet 23:203207

    Cronn R, Cedroni M, Haselkorn T, Grover C, Wendel JF (2002) PCR-mediated recombination in ampliWcation products derived frompolyploid cotton. Theor Appl Genet 104:482489

    Cronn RC, Small RL, Wendel JF (1999) Duplicated genes evolve inde-pendently after polyploid formation in cotton. Proc Natl Acad SciUSA 96:1440614411

    Darley CP, Forrester AM, McQueen-Mason SJ (2001) The molecularbasis of plant cell wall expansion. Plant Mol Biol 47:179195

    Feltus FA, Wan J, Schulze SR, Estill JC, Jiang N, Paterson AH (2004)An SNP resource for rice genetics and breeding based on sub-species: India and Japonica genome alignments. Genome Res14:18121819

    Frelichowski JE, Palmer MB, Main D, Tomkins JP, Cantrell RG, StellyDM, Yu J, Kohel RJ, Ulloa M (2006) Cotton genome mappingwith new microsatellites from Acala Maxxa BAC-ends. MolGenet Genomics 275:479491

    Glazier AM, Nadeau JH, Aitman TJ (2002) Finding genes that underliecomplex traits. Science 298:23452349

    Glonek G, Solomon PJ (2004) Factorial and time course designs forcDNA microarray experiments. Biostatistics 5:89111

    Grover CE, Kim H, Wing RA, Paterson AH, Wendel JF (2004) Incon-gruent patterns of local and global genome size evolution in cot-ton. Genome Res 14:14741482

    Han Z, Wang C, Song X, Guo W, Gou J, Li C, Chen X, Zhang T (2006)Characteristics, development and mapping of Gossypium hirsu-tum derived EST-SSR in allotetraploid cotton. Theor Appl Genet112:430439

    Han ZG, Guo WZ, Song XL, Zhang TZ (2004) Genetic mapping ofEST-derived microsatellites from the diploid Gossypium arbore-um in allotetraploid cotton. Mol Genet Genomics 272:308327

    Harmer SE, Orford SJ, Timmis JN (2002) Characterisation of six -expansin genes in Gossypium hirsutum (upland cotton). MolGenet Genomics 268:19

    Holland JB, Helland SJ, Sharopova N, Rhyne DC (2001) Polymor-phism of PCR-based markers targeting exons, introns, promoterregions, and SSRs in maize and intron and repeat sequences inoat. Genome 44:10651076

    Jenkins JN, McCarty JC, Saha S, Gutierrez O, Hayes R, Stelly DM(2006) Genetic eVects of thirteen Gossypium barbadense L. chro-mosome substitution lines in topcrosses with Upland cotton culti-vars: I. Yield and yield components. Crop Sci 46:11691178

    Ji SJ, Lu YC, Feng JX, Wei G, Li J, Shi YH, Fu Q, Liu D, Luo JC, ZhuYX (2003) Isolation and analysis of genes preferentially ex-pressed during early cotton Wber development by subtractive PCRand cDNA array. Nucleic Acids Res 31:25342543

    Jiang C, Wright RJ, El-Zik K, Paterson AH (1998) Polyploid formationcreated unique avenues for response to selection in Gossypium(cotton). Proc Natl Acad Sci USA 95:44194424

    123

  • 552 Mol Genet Genomics (2007) 278:539553

    Jiang C, Wright RJ, Woo SS, DelMonte TA, Paterson AH (2000) QTLanalysis of leaf morphology in tetraploid Gossypium (cotton).Theor Appl Genet 100:409418

    Kanazin V, Talbert H, See D, DeCamp P, Nevo E, Blake T (2002) Dis-covery and assay of single-nucleotide polymorphisms in barley(Hordeum vulgare). Plant Mol Biol 48:529537

    Kende H, Bradford KJ, Brummell DA, Cho H-T, Cosgrove DJ, Flem-ing AJ, Gehring C, Lee Y, McQueen-Mason SJ, Rose JKC, Voe-senek LACJ (2004) Nomenclature for members of the expansinsuperfamily of genes and proteins. Plant Mol Biol 55:11051110

    Kerr MK, Churchill GA (2001) Bootstrapping cluster analysis: assess-ing the reliability of conclusions from microarray experiments.Proc Natl Acad Sci USA 98:89618965

    Kim MY, Van K, Lestart P, Moon JK, Lee SH (2005) SNP identiWcationand SNAP marker development for a GmNARK gene controllingsupernodulation in soybean. Theor Appl Genet 110:10031010

    Kohel RJ, Yu J, Park YH, Lazo GR (2001) Molecular mapping andcharacterization of traits controlling Wber quality in cotton.Euphytica 121:163172

    Knig R, Baldessari D, Pollet N, Niehrs C, Eils R (2004) Reliability ofgene expression ratios for cDNA microarrays in multiconditionalexperiments with a reference design. Nucleic Acids Res 32:e29

    Kumar P, Paterson AH, Chee PW (2006) Predicting intron sites byaligning cotton ESTs with Arabidopsis genomic DNA. J CottonSci 10:2938

    Kumar S, Tamura K, Nei M (2004) MEGA3: integrated software formolecular evolutionary genetics analysis and sequence align-ment. Brief Bioinformatics 5:150163

    Kwok PY, Deng Q, Zakeri H, Nickerson DA (1996) Increasing theinformation content of STS-based genome maps: identifyingpolymorphisms in mapped STSs. Genomics 31:123126

    Lacape JM, Nguyen TB, Courtois B, Belot JL, Giband M, Gourlot JP,Gawryziak G, Roques S, Hau B (2005) QTL analysis of cottonWber quality using multiple Gossypium hirsutum Gossypiumbarbadense backcross generations. Crop Sci 45:123140

    Li Y, Darley CP, Ongaro V, Fleming A, Schipper O, Baldauf SL,McQueen-Mason SJ (2002) Plant expansins are a complex multi-gene family with an ancient evolutionary origin. Plant Physiol128:854864

    Liu Q, Brubaker CL, Green AG, Marshall DR, Sharp PJ, Singh SP(2001) Evolution of the FAD2-1 fatty acid desaturase 5 UTR in-tron and the molecular systematics of Gossypium (Malvaceae).Am J Bot 88:92102

    Liu S, Saha S, Stelly DM, Burr B, Cantrell RG (2000) Chromosomalassignment of microsatellite loci in cotton. J Hered 91:326332

    Mei M, Syed NH, Gao W, Thaxton PM, Smith CW, Stelly DM, ChenZJ (2004) Genetic mapping and QTL analysis of Wber-relatedtraits in cotton (Gossypium). Theor Appl Genet 108:280291

    Menzel MY, Richmond KL, Dougherty BJ (1985) A chromosometranslocation breakpoint map of the Gossypium hirsutum genome.J Hered 76:406414

    Mochida K, Yamazaki Y, Ogihara Y (2003) Discrimination of homo-eologous gene expression in hexaploid wheat by SNP analysis ofcontigs grouped from a large number of expressed sequence tags.Mol Genet Genomics 270:371377

    Neale DB, Savolainen O (2004) Association genetics of complex traitsin conifers. Trends Plant Sci 9:325330

    Nei M (1987) Molecular evolutionary genetics. Columbia UniversityPress, New York

    Nickerson DA, Taylor SL, Weiss KM, Clark AG, Hutchinson RG,Stengard J, Salomaa V, Vartiainen E, Boerwinkle E, Sing CF(1998) DNA sequence diversity in a 9.7-kb region of the humanlipoprotein lipase gene. Nat Genet 19:233240

    Orford SJ, Timmis JN (1998) SpeciWc expressioin of an expansin geneduring elongation of cotton Wbers. BBAGene Struct Expr1398:342346

    Park YH, Alabady MS, Ulloa M, Sickler B, Wilkins TA, Yu J, StellyDM, Kohel RJ, El-Shihy OM, Cantrell RG (2005) Genetic map-ping of new cotton Wber loci using EST-derived microsatellite inan interspeciWc recombinant inbred line cotton population. MolGenet Genomics 274:428441

    Paterson AH, Saranga Y, Menz M, Jiang C, Wright RJ (2003) QTLanalysis of genotype environment interactions aVecting cottonWber quality. Theor Appl Genet 106:384396

    Pfeil BE, Brubaker CL, Craven LA, Crisp MD (2004) Paralogy and or-thology in the Malvaceae rpb2 gene family: investigation of geneduplication in Hibiscus. Mol Biol Evol 21:14281437

    Reinisch AJ, Dong J, Brubaker CL, Stelly DM, Wendel JF, PatersonAH (1994) A detailed RFLP map of cotton, Gossypiumhirsutum Gossypium barbadense: chromosome organizationand evolution in a disomic polyploid genome. Genetics 138:829847

    Richert AM, Premstaller A, Gebhardt C, Oefner PJ (2002) Genotypingof SNPs in a polyploid genome by pyrosequencing. Biotech-niques 32:592593,596598,600

    Rong J, Abbey C, Bowers JE, Brubaker CL, Chang C, Chee PW,Delmonte TA, Ding X, Garza JJ, Marler BS, Park C, Pierce GJ,Rainey KM, Rastogi VK, Schulze SR, Trolinder NL, Wendel JF,Wilkins TA, Williams-Coplin TD, Wing RA, Wright RJ, Zhao X,Zhu L, Pateson AH (2004) A 3347-locus genetic recombinationmap of sequence-tagged sites reveals features of genome organi-zation, transmission and evolution of cotton (Gossypium). Genet-ics 166:389417

    Rozas J, Snchez-DelBarrio JC, Messeguer X, Rozas R (2003) DnaSP,DNA polymorphism analysis by the coalescent and other meth-ods. Bioinformatics 19:24962497

    Ruan YL, Llewellyn DJ, Furbank RT (2001) The control of single-celled cotton Wber elongation by developmentally reversible gat-ing of plasmodesmata and coordinated expression of sucrose andK+ transporters and expansion. Plant Cell 13:4760

    Saha S, Jenkins JN, Wu J, McCarty JC, Gutierrez OA, Percy RG, Cant-rell RG, Stelly DM (2006a) EVects of chromosome-speciWc intro-gression in Upland cotton on Wber and agronomic traits. Genetics172:19271938

    Saha S, Raska DA, Stelly DM (2006b) Upland cotton (Gossypiumhirsutum L.) Hawaiian cotton (G. tomentosum Nutt. ex Seem.)F1 hybrid hypoaneuploid chromosome substitution series. J Cot-ton Sci 12:263272

    Salmaso M, Faes G, Segala C, Stefanini M, Salakhutdinov I, ZyprianE, Toepfer R, Grando MS, Velasco R (2004) Genome diversityand gene haplotypes in the grapevine (Vitis vinifera L.), as re-vealed by single nucleotide polymorphisms. Mol Breeding14:385395

    Sampedro J, Cosgrove DJ (2005) The expansin superfamily. GenomeBiol 6:242

    Sampedro J, Lee Y, Carey RE, dePanphilis C, Cosgrove DJ (2005) Useof genomic history to improve phylogeny and understanding ofbirths and deaths in a gene family. Plant J 44:409419

    Senchina DS, Alwarez I, Cronn RC, Liu B, Rong J, Noyes RD, Pater-son AH, Wing RA, Wilkins TA, Wendel JF (2003) Rate variationamong nuclear genes and the age of polyploidy in Gossypium.Mol Biol Evol 20:633643

    Shappley ZW, Jenkins JN, Meredith WR, McCarty JC (1998) AnRFLP linkage map of Upland cotton, Gossypium hirsutum L.Theor Appl Genet 97:756761

    Shimizu Y, Aotsuka S, Hasegawa O, Kawada T, Sakuno TF, Hayashi T(1997) Changes in the level of mRNAs for cell wall-related en-zymes in growing cotton Wber cells. Plant Cell Physiol 38:375378

    Small RL, Ryburn JA, Cronn RC, Seelanan T, Wendel JF (1998) Thetortoise and the hare: choosing between noncoding plastome andnuclear Adh sequences for phylogeny reconstruction in a recentlydiverged plant group. Am J Bot 85:13011315

    123

  • Mol Genet Genomics (2007) 278:539553 553

    Small RL, Wendel JF (2000) Copy number lability and evolutionarydynamics of the Adh gene family in diploid and tetraploid cotton(Gossypium). Genetics 155:19131925

    Smart LB, Vojdani F, Maeshima M, Wilkins TA (1998) Genes in-volved in osmoregulation during turgor-driven cell expansion ofdeveloping cotton Wbers are diVerentially regulated. Plant Physiol116:15391549

    Somers DJ, Kirkpatrick R, Moniwa M, Walsh A (2003) Mining single-nucleotide polymorphisms from hexaploid wheat ESTs. Genome49:431437

    Stelly DM, Saha S, Raska DA, Jenkins JN, McCarty JC, Gutierrez OA(2005) Registration of 17 Upland (Gossypium hirsutum) cottongermplasm lines disomic for diVerent G. barbadense chromo-some or arm substitutions. Crop Sci 45:26632665

    Tajima F (1983) Evolutionary relationship of DNA sequences in Wnitepopulations. Genetics 105:437460

    The Arabidopsis Genome Initiative (2000) Analysis of the genome se-quence of Xowering plant Arabidopsis thaliana. Nature 408:796815

    Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG(1997) The ClustalX windows interface: Xexible strategies formultiple sequence alignment aided by quality analysis tools.Nucleic Acids Res 24:48764882

    Ulloa M, Saha S, Jenkins JN, Meredith WR, McCarty JC, Stelly DM(2005) Chromosomal assignment of RFLP linkage groups harbor-ing important QTLs on an intraspeciWc cotton (Gossypium hirsu-tum L.) joinmap. J Hered 96:132144

    Van K, Hwang EY, Kim MY, Park HJ, Lee SH, Cregan PB (2005)Discovery of SNPs in soybean genotypes frequently used as theparents of mapping populations in the United States and Korea.J Hered 96:529535

    Vogler H, Caderas D, Mandel T, Kuhlemeier C (2003) Domains of ex-pansin gene expression deWne growth regions in the shoot apex oftomato. Plant Mol Biol 53:267272

    Wang DG, Fan JB, Siao CJ, Berno A, Young P, Sapolsky R, GhandourG, Perkins N, Winchester E, Spencer J, Kruglyak L, Stein L, HsieL, Topaloglou T, Hubbell E, Robinson E, Mittmann M, Morris MS,Shen N, Kilburn D, Rioux J, Nusbaum C, Rozen S, Hudson TJ,

    Lipshutz R, Chee M, Lander ES (1998) Large-scale identiWcation,mapping, and genotyping of single-nucleotide polymorphisms inthe human genome. Science 280:10771082

    Wendel JF, Cronn C (2003) Polyploidy and the evolutionary history ofcotton. Adv Agron 78:139186

    Wilkins TA, Arpat AB (2005) The cotton Wber transcriptome. Physio-logia Plantarum 124:295300

    Wilkins TA, Jernstedt JA (1999) Molecular genetics of developing cot-ton Wber. In: Basra AS (ed) Cotton Wber: development biology,quality improvement, and textile processing. Binghamton, NewYork, pp 231269

    Wilkins TA, Smart LB (1996) Isolation of RNA from plant tissue. In:Krieg PA (ed) A laboratory guide to RNA: isolation, analysis, andsynthesis. Wiley-Liss, New York, pp 2141

    Wilkins TA, Wan CY, Lu CC (1994) Ancient origin of the vacuolarH+-ATPase 69-kilodalton catalytic subunit superfamily. TheorAppl Genet 89:514524

    Wright RJ, Thaxton PM, El-Zik KM, Paterson AH (1998) D-subge-nome bias of Xcm resistance genes in tetraploid Gossypium (cot-ton) suggests that polyploid formation has created novel avenuesfor evolution. Genetics 149:19871996

    Zhang J, Guo W, Zhang T (2002) Molecular linkage mapping of allo-tetraploid cotton (Gossypium hirsutum L. Gossypium barba-dense L.) with a haploid population. Theor Appl Genet 105:11661174

    Zhang W, Gianibelli MC, Ma W, Rampling L, Gale KR (2003) Identi-Wcation of SNPs and development of allele-speciWc PCR markersfor -gliadin alleles in Triticum aestivum. Theor Appl Genet107:130138

    Zhao X, Si Y, Hanson RE, Crane CF, Price HJ, Stelly DM, Wendel JF,Paterson AH (1998) Dispersed repetitive DNA has spread to newgenome since polyploid formation in cotton. Genome Res 8:479492

    Zhu YL, Song QJ, Hyten DL, Van Tassell CP, Matukumalli LK,Grimm DR, Hyatt SM, Fickus EW, Young ND, Cregan PB (2003)Single-nucleotide polymorphisms in soybean. Genetics163:11231134

    123

    Transcriptome proWling, sequence characterization, and SNP-based chromosomal assignment of the EXPANSIN genes in cottonAbstractIntroductionMaterials and methodsPlant materials, DNA and RNA isolationMicroarray expression analysisPCR primer design, ampliWcation, cloning and sequencingSequence analysis and SNP primer designSNP genotypingChromosomal assignment

    ResultsFiber transcriptome expression proWles of EXPANSIN genesSequence characterization of six EXPANSIN A genesNucleotide diversityChromosomal assignment

    DiscussionEXPANSIN transcriptome expression proWlingSNP marker discovery in tetraploid cottonSNP features in cottonUnequal evolutionary rate of the At- and Dt-genome in cottonChromosomal locations of six EXPANSIN A genes

    References

    /ColorImageDict > /JPEG2000ColorACSImageDict > /JPEG2000ColorImageDict > /AntiAliasGrayImages false /DownsampleGrayImages true /GrayImageDownsampleType /Bicubic /GrayImageResolution 150 /GrayImageDepth -1 /GrayImageDownsampleThreshold 1.50000 /EncodeGrayImages true /GrayImageFilter /DCTEncode /AutoFilterGrayImages true /GrayImageAutoFilterStrategy /JPEG /GrayACSImageDict > /GrayImageDict > /JPEG2000GrayACSImageDict > /JPEG2000GrayImageDict > /AntiAliasMonoImages false /DownsampleMonoImages true /MonoImageDownsampleType /Bicubic /MonoImageResolution 600 /MonoImageDepth -1 /MonoImageDownsampleThreshold 1.50000 /EncodeMonoImages true /MonoImageFilter /CCITTFaxEncode /MonoImageDict > /AllowPSXObjects false /PDFX1aCheck false /PDFX3Check false /PDFXCompliantPDFOnly false /PDFXNoTrimBoxError true /PDFXTrimBoxToMediaBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXSetBleedBoxToMediaBox true /PDFXBleedBoxToTrimBoxOffset [ 0.00000 0.00000 0.00000 0.00000 ] /PDFXOutputIntentProfile (None) /PDFXOutputCondition () /PDFXRegistryName (http://www.color.org?) /PDFXTrapped /False

    /Description >>> setdistillerparams> setpagedevice