author's personal copy - marie nydam · 2018. 5. 24. · 2005), a cellular slime mold...

10
This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution and sharing with colleagues. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier’s archiving and manuscript policies are encouraged to visit: http://www.elsevier.com/authorsrights

Upload: others

Post on 21-Feb-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

This article appeared in a journal published by Elsevier. The attachedcopy is furnished to the author for internal non-commercial researchand education use, including for instruction at the authors institution

and sharing with colleagues.

Other uses, including reproduction and distribution, or selling orlicensing copies, or posting to personal, institutional or third party

websites are prohibited.

In most cases authors are permitted to post their version of thearticle (e.g. in Word or Tex form) to their personal website orinstitutional repository. Authors requiring further information

regarding Elsevier’s archiving and manuscript policies areencouraged to visit:

http://www.elsevier.com/authorsrights

Page 2: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

Author's personal copy

Molecular evolution of a polymorphic HSP40-like protein encoded in thehistocompatibility locus of an invertebrate chordate

Marie L. Nydam a,⇑, Tinya A. Hoang b, Katie M. Shanley a, Anthony W. De Tomaso a

a Department of Molecular, Cellular, and Developmental Biology, University of California, Santa Barbara, CA 93106, United Statesb Department of Biology, Stanford University, Stanford, CA 94305, United States

a r t i c l e i n f o

Article history:Received 12 December 2012Revised 9 March 2013Accepted 12 March 2013Available online 28 March 2013

Keywords:Botryllus schlosseriInvertebrateHsp40DnaJAllorecognition

a b s t r a c t

Allorecognition, the ability to distinguish self from non-self, occurs in most organisms. Despite the ubiq-uity of the allorecognition process, the genetic basis for allorecognition remains unexplored in most taxaoutside vertebrates and flowering plants. The allorecognition system in the colonial ascidian Botryllus sch-losseri is a notable exception. We have recently identified a polymorphic gene within the fuhc locus thatmay play a role in allorecognition. The encoded protein, called Hsp40-L, is a Type II member of the J-pro-tein family which usually functions as a co-chaperone with Hsp70. While many of the residues that inter-act with Hsp70 are conserved in Hsp40-L, it may not be a housekeeping protein because it is surprisinglypolymorphic and expressed in the ampullae, the site of allorecognition. While the majority of the Hsp40-Lprotein appears to evolve under purifying selection, a section of the C-terminal region likely experiencesbalancing/directional selection, characteristic of other allorecognition proteins.

� 2013 Elsevier Ltd. All rights reserved.

1. Introduction

Allorecognition, which allows an individual to distinguish kin/self from non-kin/non-self, occurs on every branch of the tree oflife (Grosberg and Plachetzki, 2010). This system of recognitioncontrols different biological functions, such as the immunologicalresponse against pathogens in vertebrates (Flajnik and Kasahara,2010), the prevention of inbreeding in plants (Nasrallah, 2002),and fusion between closely related individuals in invertebrates(Buss, 1982).

Despite the ubiquity of the allorecognition process, the geneticbasis for allorecognition is not well-understood in most taxa out-side vertebrates (MHC) and plants (SI). The genetic determinantsof allorecognition are being uncovered in a few systems: a bacte-rium (Gibbs et al., 2008), a colonial ascidian (De Tomaso et al.,2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glasset al., 2000), a hydroid (Nicotra et al., 2009), and a solitary ascidian(Harada et al., 2008). Yet lack of knowledge about the proteins thatinteract with these determinants limits our understanding of thegenetic and molecular basis of the allorecognition process.

The allorecognition system in the colonial ascidian Botryllus sch-losseri is an excellent model to ask these questions. Allorecognitionoccurs when terminal projections of an extracorporeal vasculature,

called ampullae, come into contact between juxtaposed colonies.During this process, the tunic is dissolved, and the epithelium ofthe juxtaposed ampullae touch. If colonies are compatible, theampullae will fuse, forming a parabiosis between the two colonies.If they are incompatible, the ampullae will undergo a rejectionreaction which prevents vascular fusion. The outcome is deter-mined by a single, highly polymorphic locus called the fuhc (for fu-sion/histocompatibility), with the rules that genotypes whichshare one or both fuhc alleles will fuse, while those sharing nonewill reject. There are no modifying loci for allorecognition in Bot-ryllus, providing a genetically tractable system for revealing thegenes involved.

Positional cloning of the fuhc delineated a region >350 Kbwhich encodes several genes involved in the allorecognition re-sponse. A candidate fuhc gene (cfuhc) was identified, and polymor-phisms of this gene predicted outcome by themselves (De Tomasoet al., 2005).

Two other genes that play a role in allorecognition responseshave also been identified encoded near the fuhc, called fester anduncle fester. Fester is encoded approximately 300 Kb from the fuhc,and functional studies suggest that fester acts as a receptor in boththe fusion and rejection responses, but is not a genetic determinantof histocompatibility (i.e. individuals with identical fester allelescan reject each other). Uncle fester represents a partial duplicationof the fester locus, with the genomic region encoding the COOH halfof the protein (exons 4–9) nearly identical to fester’s exons 6–11,but uncle fester’s Exons 1–3 do not appear to be related to any festersequence (McKitrick et al., 2011). In contrast to fester, uncle fester is

0145-305X/$ - see front matter � 2013 Elsevier Ltd. All rights reserved.http://dx.doi.org/10.1016/j.dci.2013.03.004

⇑ Corresponding author. Current address: Division of Science and Mathematics,Centre College, Danville, KY 40422, United States. Tel.:+1 607 227 2903; fax: +1 859236 7925.

E-mail addresses: [email protected] (M.L. Nydam), [email protected] (A.W. De Tomaso).

Developmental and Comparative Immunology 41 (2013) 128–136

Contents lists available at SciVerse ScienceDirect

Developmental and Comparative Immunology

journal homepage: www.elsevier .com/locate /dc i

Page 3: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

Author's personal copy

a monomorphic protein that plays a role in initiating the rejectionresponse between incompatible individuals, but is not involved inthe fusion response (McKitrick et al., 2011). Thus both a ligand (thecandidate fuhc) and two putative receptors (one polymorphic andthe other monomorphic), are linked in the B. schlosseri genome.

For both fuhc and fester, the majority of the protein evolves viapurifying selection: dN/dS < 1 (Nydam and De Tomaso, 2012; Ny-dam et al., 2013). However, certain regions in each protein experi-ence directional/balancing selection: dN/dS > 1. A similar pattern iswidespread in MHC (Major Histocompatibility Complex) proteins;the regions with dN/dS values greater than 1 are the regions thatbind antigens (Cohen, 2002; Garrigan and Hedrick, 2003 and refer-ences within; Campos et al., 2006; Wan et al., 2006; Worley et al.,2006; Bos et al., 2008; Koutsogiannouli et al., 2009). Understandingthe evolution of these and other proteins in the B. schlosseri allorec-ognition system will help provide insight into the structural inter-actions involved in a histocompatibility response.

We have recently identified another gene in the fuhc locus,approximately 8 Kb 30 of the fuhc transmembrane gene that dis-plays similar characteristics. The gene encodes a highly conservedprotein of the HSP40 family, with a J domain and zinc finger, andwe have named it Hsp40-L. However, while DnaJ proteins are con-sidered housekeeping proteins, Hsp40-L has an unusual level anddistribution of polymorphism. While the function of Hsp40-L iscurrently under investigation, this polymorphism coupled to itsphysical location in the fuhc locus and expression in the ampullae,the site of allorecognition, makes it a candidate gene for involve-ment in this process. Here we characterized the structural domainsand their amino acid conservation in relation to other Hsp40 andHsp40-like proteins, to determine whether the Botryllus Hsp40-Lis a likely co-chaperone of Hsp70. We also tested whether theHsp40-L protein evolves in the same manner as fuhc and fester,and identified regions of the protein that may experience direc-tional/balancing selection. By analyzing the Hsp40-L protein andother proteins that potentially interact with the genetic determi-nant of allorecognition, we can increase our understanding of thegenetic and molecular basis of the allorecognition process.

2. Results

2.1. Location of Hsp40-L expression

mRNA localization was determined in adults by in situ hybrid-ization using a probe in a unique portion of the Hsp40-L gene. Asshown in Fig. 1C and D, expression is limited to the epitheliumof the ampullae, and not seen in other parts of the vasculature,blood, or other tissues. One issue with in situ hybridization in B.schlosseri is the tunic, which often shows non-specific binding toprobes, and can range from barely detectable (Fig. 1A) to quitestrong (Fig. 1B–D). In addition, the ampullae and vasculature areembedded in the tunic, and the vascular epithelium is juxtaposedwith the tunic matrix. In these experiments we took advantageof the fact that during fixation the vascular epithelium will oftenpull away from the tunic, leaving an imprint of the ampullae inthe tunic matrix. This allows us to discriminate non-specific bind-ing to the tunic versus specific binding in the epithelium of theampullae (compare sense control in Fig. 1B to specific binding inFig. 1E and D). In summary, Hsp40-L is expressed in the epitheliumat the tips of the ampullae, the site of allorecognition.

2.2. Genetic localization, domain characterization and amino acidconservation

B. schlosseri Hsp40-L is a gene spanning 2095 bp that contains nointrons, and is encoded in the opposite orientation of the cfuhc,

terminating 8 Kb from the 30 end of the cfuhc tm gene. The termi-nal 1654 bp of Hsp40-L are encoded on the same fosmid as thecfuhc genes, and the remainder of the gene was identified in ourtranscriptome database (Rodriguez et al., in preparation), and con-firmed by RT-PCR. There is an intervening gene which encodes aWD-40 superfamily member with high identity to retinoblas-toma-binding protein 5, and is expressed. We have not yet deter-mined if it is polymorphic.

Hsp40-L encodes a predicted protein that is 521 amino acidslong and contains a clear J domain at its N-terminus. This domainis 69 amino acids long (Fig. 2) and can be classified as a J domainbecause it fulfills four criteria (Walsh et al., 2004). First, JPRED 3predicts that Hsp40-Lhas four helices that correspond to the helicesin mammalian Hdj1 and Escherichia coli DnaJ (Fig. 2: Helix I: aminoacid positions 4–7, Helix II: amino acid positions 16–30, Helix III:amino acid positions 40–55, Helix IV: amino acid positions 60–69). Second, the second helix contains an abundance of basic ami-no acids (e.g. arginine and lysine); third, the region between thesecond and third helices contains an HPD motif (histidine, prolineand aspartic acid); and fourth, the second and third helices haveconserved hydrophobic amino acids which are known to stabilizepacking of the helices in mammalian Hdj1 and E. coli DnaJ (Walshet al., 2004).

Several conserved amino acid residues in the J domain areknown to play a role in Hsp70 interactions (Hennessy et al.,2000). The most frequently conserved motif is HPD (histidine, pro-line and aspartic acid), which falls between Helix II and III; this mo-tif is conserved in nearly all Hsp40 proteins, regardless of type. TheHPD motif in Hsp40-L is found amino acid positions 31–33 (Fig. 2).Other residues that could be interacting with charged residues inHsp70 include glutamic acid and two aspartic acids (negativelycharged) as well as seven lysines and one arginine (positivelycharged). The positively charged amino acids could be interactingwith negatively charged residues in Hsp70 (Gassler et al., 1998).In B. schlosseri Hsp40-L, both aspartic acids, five lysines and thearginine are conserved. The remaining two lysines are substitutedfor an arginine (positively charged to positively charged) and a glu-tamic acid (positively charged to negatively charged).

Hsp40-L possesses a 17 amino acid glycine-phenylalanine (G/F)rich region from amino acid positions 99–116. We determined thatthis region qualifies as G/F rich by comparing the percentage of Gand F residues in the region to G/F rich regions in other Type II pro-teins cited in Hennessy et al., 2000 (Arabidopsis thaliana AAB91418,Burrelia burgdorferi AAC66991, Cryptococcus curvatus CAA72798,Escherichia coli P36659, Homo sapiens BAA08495, Mus musculusQ61712, Saccharomyces cerevisiae P39101, and Thermus thermophi-lus Q56237). The percentage of G/F in the B. schlosseri Hsp40-L G/Fregion is 17.6%/17.6%, whereas the average percentage of G/F in theG/F regions of the other eight sequences is 14.0%/7.86%. Becausethe percentages of glycine and phenylalanine in the focal regionof B. schlosseri Hsp40-L are higher than in well-characterized G/Frich regions of Hsp40-like proteins from other species, we deter-mined that this region in B. schlosseri Hsp40-L qualifies as G/F rich.

B. schlosseri Hsp40-L contains no zinc finger regions that containfour repeats of cysteine-XX-cysteine-X-glycine-X-glycine(CXXCXGXGX), which would be expected after the G/F-rich region.Therefore, the C-terminal region begins at amino acid position 117and continues to the end of the protein. Based on the JPRED 3 anal-ysis, Hsp40-L retains significant homology with other Hsp40-likeproteins until amino acid position 350, where most of the otherprotein sequences end (Supplementary Material 1 and 2). In partic-ular, amino acid positions 308–338 (a zinc-finger domain) are con-served in the alignment, which includes sequences from taxa asdiverse as the apicomplexan Plasmodium vivax, cellular slime moldDictyostelium discoideum, mosquito Culex quinquefasciatus andpufferfish Takifugu rubripes (Supplementary Material 1 and 2).

M.L. Nydam et al. / Developmental and Comparative Immunology 41 (2013) 128–136 129

Page 4: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

Author's personal copy

Considerable conservation also exists in the C-terminal region ofthe protein alignment (amino acid positions 480–502); these ami-no acids are also conserved within B. schlosseri. These residues mayencode another zinc-finger domain (although it is not predicted byall conserved domain algorithms), and the entire protein is con-served with another ascidian protein from Ciona intestinalis (calledZn-finger (UI-like)-8), and dnaJ subfamily C, member 21 from mul-tiple primate species and zebrafish.

In summary, Hsp40-L has a J domain, and a glycine-phenylala-nine rich region, but lacks a cysteine-glycine repeat region. TypeII proteins have a J domain and a glycine-phenylalanine rich regionbut not a cysteine-glycine repeat region (Cheetham and Caplan,1998). Hsp40-L is therefore a Type II protein. We call this proteinHsp40-L, as Hsp40 is reserved for Type I proteins only, and -J- couldbe confused with other protein names that also contain the letter J(Walsh et al., 2004; Hennessy et al., 2005).

2.3. Amino acid diversity across Hsp40-L

Amino acid diversity across B. schlosseri Hsp40-L is shown inFig. 3. The mean diversity across the whole protein is 0.053. Outof 449 amino acid positions, 379 (84%) are completely conserved(diversity = 0.05). Amino acid positions 340 through the end ofthe sequence are slightly more diverse: the average diversity valuefor amino acid positions 1–339 is 0.05 and for amino acid positions340–449 is 0.06.

The same analyses were performed on the following Hsp40-Lsequences from GenBank: bacteria Aeromonas veronii bv. Sobria(PopSet 325305397, Silver et al., 2011), Bacillus licheniformis, B.

sp., Bacillus subtilis subsp. Inaquosorum, B. subtilis subsp. Spizizenii(PopSet 284154897, Connor et al., 2010), Vibrio coralliilyticus (Pop-Set 298112845, Pollock et al., 2010), butterflies Heliconius cyndocordula, H. melpomene melpomene, H. heurippa (PopSet300077621, Salazar et al., 2010), Heliconius erato emma, H. eratofavorinus, H. erato emma x favorinus (PopSet 295150361, Counter-man et al., 2010), fly Drosophila melanogaster (PopSet 30421311,Bettencourt BR, Lerman DN, Feder ME unpublished data), and mos-quito Anopheles gambiae (PopSet 56067752, Turner et al., 2005).

Nine of the 14 Hsp40 data sets obtained from GenBank havecomplete or nearly complete amino acid conservation (diversityvalue = 0.05); the mode of the data sets from GenBank is 0.05.However, the mean diversity value of the GenBank data set is0.085. The greater mean diversity of the Genbank data as comparedto that of B. schlosseri can be attributed to five data sets with highacross-the-protein diversity. Anopheles gambiae has an across-the-protein diversity value of 0.119, and A. veronii bv. Sobria has anacross-the-protein diversity value of 0.347. The three Heliconiuserato subspecies had across-the-protein diversity values between0.078 and 0.107.

2.4. Recombination

Intragenic recombination in Hsp40-L for all populations to-gether was assessed by calculating Rm, the minimum number ofrecombination events in DnaSP 5.10.01 (Librado and Rozas, 2009)and the correlation between physical distance and three measuresof linkage disequilibrium: r2, D0 and G4 in program permute (Wil-son and McVean, 2006). Intragenic recombination has occurred

A

B

C

D F

E

Fig. 1. Spatial localization of Hsp40-L expression by in situ hybridization reveals expression in the epithelium of the ampullae. Panels A and B show negative controlhybridization using the Hsp40-L sense probe, and Panels C and D show Hsp40-L expression detected by the antisense probe. Probes are red, and tissue is counterstained withDAPI to reveal nuclei (blue). As described in the text, during fixation, we often see a retraction of the ampulla tissue from the overlaying tunic (Panel E vs. Panel F). In addition,the tunic often shows variable non-specific staining, and can range from nothing (Panel A, the epithelium of a single ampullae is outlined) to strong staining (arrows in panelsB-D). Because the ampullae retract in some cases, we can easily discriminate between true staining of the ampullae vs. background staining in the juxtaposed tunic. Epithelialstaining can be seen using the antisense probes (panels C and D), but is not detected using the sense probe (Panel B). The separation between tunic and ampullar epitheliumcan be seen in panel D (arrow with asterisk). A DIC image of a retracted ampulla is shown in Panel E, while the normal juxtaposition of tunic and ampulla is shown in anunfixed sample in Panel F.

130 M.L. Nydam et al. / Developmental and Comparative Immunology 41 (2013) 128–136

Page 5: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

Author's personal copy

within Hsp40-L. Minimum numbers of recombination events (Rm)are as follows – Eel Pond MA: 5, Sandwich MA: 3, Monterey CA:21, and Santa Barbara CA: 18. Significant negative correlations ex-ist between physical distance and all three measures of linkage dis-equilibrium (p = 0.001 in all cases).

2.5. Tests of selection: x statistics

Within each B. schlosseri population, 13–22 codons have >95%posterior probability of selection: 13 in Eel Pond MA and SandwichMA, 17 in Monterey CA, and 22 in Santa Barbara CA. Between 68–92% of these codons are in the region of elevated diversity identi-fied by DIVAA (positions 339–449). The positions of these focal co-dons along the protein are shown in Supplementary Material 3.

Mann–Whitney U tests in R 2.15.0 determine that positions339–449 do not have higher x (dN/dS) values than the rest of theprotein (Eel Pond MA: W = 14696, p-value = 1.0, Sandwich MA:W = 14386, p-value = 1.0, Monterey CA: W = 11,757, p-value = 1.0,Santa Barbara CA: W = 14,751, p-value = 1.0). The J domain doeshave lower x values than the rest of the protein for the Massachu-setts populations (Eel Pond MA: W = 10,570, p-value < 0.01, Sand-wich MA: W = 10,820, p-value = 0.01, Monterey CA: W = 11,855,

p-value = 0.11, Santa Barbara CA: W = 11,969, p-value = 0.13). TheG/F rich region does not have different x values than the rest ofthe protein (Eel Pond MA: W = 2999, p-value = 0.11, SandwichMA: W = 4313, p-value = 0.41, Monterey CA: W = 4080, p-va-lue = 0.70, Santa Barbara CA: W = 2946, p-value = 0.09).

For the proteins obtained from GenBank, between 0 and 4 co-dons within each taxon have a >95% probability of selection. Theaverage number of codons with a >95% probability of selectionacross all data sets is 1.1.

2.6. Tests of selection: polymorphism statistics

2.6.1. Hsp40-Lh, p, number of haplotypes and haplotype diversity for each

population are shown in Supplementary Material 4. Tajima’s D val-ues for each population are shown in Table 1. Coalescent simula-tions given h and segregating sites give similar p-values (andalways with the same qualitative result), so only results from thesimulations given h will be shown. No populations had Tajima’sD values significantly different from zero.

Fu and Li’s D⁄ and F⁄ values for each population are shown in Ta-ble 1. Coalescent simulations given h and segregating sites give

Fig. 2. An alignment of the J domain of B. schlosseri Hsp40-L. Only unique protein sequences are included. The boxes below the alignment correspond to Helices I-IV of E. coliDnaJ and mammalian Hdj1. The asterisks mark the conserved HPD region.

M.L. Nydam et al. / Developmental and Comparative Immunology 41 (2013) 128–136 131

Page 6: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

Author's personal copy

similar p-values (and always with the same qualitative result), soonly results from the simulations given h are shown. No popula-tions had D⁄ or F⁄ values significantly different from zero.

2.6.2. Housekeeping genesh, p, number of haplotypes and haplotype diversity for each

population are shown in Supplementary Material 4. Values for Taj-ima’s D, Fu and Li’s D⁄ and F⁄ can be found in Supplementary Mate-rial 5. Similar to Hsp40-L, we see very few significant values for thehousekeeping loci. For mtCOI, two populations are significant forall three statistics. For 60S ribosomal protein L10, one populationis significant for all three statistics. For vasa, one population is sig-nificant for one statistic. None of the other housekeeping loci havepopulations with significant values for any of the three statistics.

2.7. Tests of selection: distribution of polymorphism within and amongpopulations

2.7.1. Hsp40-LAMOVA analyses show that while a substantial amount of var-

iation is found within populations (80%), there is also some varia-tion among groups (West Coast vs. East Coast) (15%) and amongpopulations within groups (5%) (Table 2). The variation amongpopulations within groups, although small, is significant. Signifi-cant differentiation exists among populations among groups: over-all Fst is significant for all loci, and a majority of pairwise Fst valuesare significant.

2.7.2. Housekeeping genesAMOVA results from all housekeeping genes are presented in

Supplementary Material 6. The overall pattern is similar to

Hsp40-L, with the majority of the variation present within popula-tions, but statistically significant variation among populationswithin groups, and among populations among groups.

3. Discussion

3.1. Structure of Hsp40-L and implications for function

Hsp40 and Hsp40-like proteins function as co-chaperones ofHsp70 (Qiu et al., 2006). Hsp70 proteins bind short polypeptidesand aid in their degradation, folding and translocation (Bukauand Horwich, 1998). Hsp40 and Hsp40-like proteins control theactivity of Hsp70 proteins by binding to the ATPase domains ofHsp70 proteins via their J domains (Walsh et al., 2004). This cata-lyzes ATP hydrolysis, resulting in Hsp70 binding tightly to thepolypeptide substrate (Jensen and Johnson, 1999). Because Hsp40and Hsp40-like proteins regulate Hsp70 function, they play animportant role in protein translation, folding, unfolding, transloca-tion and degradation (Qiu et al., 2006). In addition to catalyzingATP hydrolysis of Hsp70, Type I and II Hsp40 proteins can bindto non-native substrates and present them to Hsp70 proteins (Luand Cyr, 1998). Certain Hsp40 and Hsp40-like proteins bind poly-peptides via the cysteine-glycine repeat region (zinc finger-like do-main) (Banecki et al., 1996; Szabo et al., 1996). Hsp40-like proteinsthat lack this domain (Type II) can also bind polypeptides, usingthe C-terminal region alone (Fan et al., 2004).

Hsp40-L contains a typical J domain with conservation at partic-ular positions that are known to interact directly with Hsp70 (e.g.the HPD motif) or thought to interact directly with Hsp70 (e.g. fivelysines and one arginine thought to interact with negativelycharged residues in Hsp70) (Hennessy et al., 2000). SeveralHsp40 and Hsp40-like proteins can also act as independent chap-erones; DnaJ in E. coli, Sec63, Sci1 and Jem1 in yeast, and ERdj1–5 in mammalian act as a chaperone of the Hsp70 homolog BiP/Kar2 (Qiu et al., 2006). BiP/Kar2 proteins reside in the endoplasmicreticulum and aid in protein folding and assembly. We do notknow if Hsp40-L can act as an independent chaperone. The mam-malian Tpr2 interacts with both Hsp90 and Hsp70 (Brychzyet al., 2003), but the Type II B. schlosseri Hsp40-L does not appear

0 100 200 300 400

0.05

0.07

0.09

0.11

Amino Acid Position

Div

ersi

ty V

alue

Fig. 3. Graph of amino acid diversity values across the Hsp40-L protein. The blackline at diversity value = 0.05 is composed of hundreds of black dots; the vastmajority of Hsp40-L proteins are completely conserved (diversity value = 0.05). Thelabels at the top of the graph refer to the three regions of the protein. J = J domain,G/F = glycine-phenylalanine rich region.

Table 1Tajima’s D, Fu and Li’s D⁄ and F⁄ statistics for Hsp40-L.

Population Tajima’s D Fu Li’s D⁄ Fu and Li’s F⁄

Eel Pond, MA �0.54 �0.35 �0.47Sandwich, MA 0.61 0.28 0.19Monterey, CA �0.17 0.25 0.13Santa Barbara, CA �0.04 0.25 0.20

No value is significant at the a = 0.05 level.

Table 2AMOVA, Fixation Indices and Pairwise Fst values for Hsp40-L. Asterisks denotep < 0.05.

Source of variation df Sum ofSquares

Variancecomponents

Percentageof variation

Among groups 1 128.31 3.31 15.24Among populations

within groups2 64.48 1.02 4.71

Within populations 54 940.55 17.42 80.05

Fixation Indices Value p-valueFct 0.15 0.34 ± 0.005Fsc 0.06 0.04⁄ ± 0.002Fst 0.20 0⁄ ± 0

Pairwise Fst Fst p-valueEel Pond vs.

Sandwich0.08 0.06 ± 0.01

Eel Pond vs.Monterey

0.22 0 ± 0⁄

Eel Pond vs. SantaBarbara

0.27 0 ± 0⁄

Sandwich vs.Monterey

0.15 0 ± 0⁄

Sandwich vs. SantaBarbara

0.18 0 ± 0⁄

Monterey vs. SantaBarbara

0.03 0.14 ± 0.01

132 M.L. Nydam et al. / Developmental and Comparative Immunology 41 (2013) 128–136

Page 7: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

Author's personal copy

to have a TPR central clamp domain, which is specific to Hsp90interactions.

Hsp40-L contains a G/F rich domain, as all Type I and II proteinsdo. The function of this domain is not entirely certain. This regioncan occupy many conformational states (Huang et al., 1999), but itis not always simply a flexible linker between the J domain and thezinc-finger or C-terminal domains (Craig et al., 2006). This regionmay play a role in substrate specificity and binding to Hsp70(Yan and Craig, 1999; Fan et al., 2004). Specifically, this G/F regionmay affect Hsp70’s specificity for substrates by influencing theconformation of the J domain or by interacting directly withHsp70 (Craig et al., 2006). In support of this idea, the J domain ofE. coli DnaJ varies structurally depending on whether the G/F re-gion is present (Huang et al., 1999) and the glycine/phenylala-nine-rich region facilitates Hsp70’s binding to the r-32 peptide(Wall et al., 1995).

Hsp40-L lacks a cysteine/glycine repeat domain, which excludesit from the Type I classification of proteins. However, Type I and IIproteins may not have distinct functional roles. The cysteine/gly-cine repeat domain binds non-native substrates for presentationto Hsp70 proteins, but Type II proteins can also bind these sub-strates. The yeast protein Ydj1 binds polypeptides using a cys-teine/glycine repeat domain and the C-terminal region, but Sis1protein does so using only the C-terminal region; no difference inbinding ability between these two proteins was detected (Fanet al., 2004).

The longest section of the Hsp40-L protein is the C-terminal re-gion. In Hsp40 and Hsp40-like proteins, the C-terminal region of-ten contains a conserved carboxy-terminal domain (CTD) (Fanet al., 2004). The C-terminal regions of Hsp40 and Hsp40-like pro-teins are less well conserved and characterized than the otherthree regions (Cheetham and Caplan, 1998) and are involved insubstrate specificity (Fan et al., 2004). According to our JPRED 3alignment, Hsp40-L retains significant homology with otherHsp40-like proteins until amino acid position 350, where manyof the other sequences end. Many of these residues, which corre-spond to the amino acid positions 180–352 of the yeast proteinSis1, are conserved across eukaryotes; these residues are importantfor maintaining the structure of Domains 1 and 2 within the C-ter-minal region (Sha et al., 2000). A portion of the Hsp40-L C-terminalregion further downstream (amino acid positions 480–502) is alsoconserved across a wide diversity of organisms. The sections of theC-terminal region that are not conserved may enable Hsp40-L tobind to B. schlosseri-specific substrates.

We have shown that Hsp40-L is expressed in a particular por-tion of the organism, the ampullae. The expression of chaperoneproteins has not been well-studied (Hageman and Kampinga,2009). A recent study of the expression of the human HSPH/HSPA/DNAJ family shows that these genes do not show tissue-spe-cific expression, except for testis-specific DNAJ members (Hag-eman and Kampinga, 2009).

3.2. Selection and diversity across Hsp40-L

The majority of the Hsp40-L protein experiences purifying selec-tion (x < 1). Only 13–22 codons have a > 95% chance of experienc-ing directional/balancing selection (x > 1). The AMOVA/FST

analyses also support this conclusion, as they reveal similar pat-terns between Hsp40-L and the housekeeping genes. Housekeepinggenes play important roles in the functioning of the organism andare therefore expected to be under purifying selection. However,none of the polymorphism statistics for any of the populationsare significantly different from zero. Using these statistics, we can-not therefore reject the null hypothesis that Hsp40-L evolvesneutrally.

All other Hsp40 and Hsp40-like proteins that we studied fromGenBank also appear to evolve under purifying selection. Thesedata sets have between zero and four codons with a greater than95% probability of selection (x > 1), much lower than the 13–22codons in B. schlosseri Hsp40-L.

More support for purifying selection comes from the DIVAAanalyses. The majority of the Hsp40 and Hsp40-like proteins westudied have similar across-the-protein diversity values, includingB. schlosseri Hsp40-L. However, B. schlosseri Hsp40-L is unique inone respect; the diversity values spike at the C-terminal end ofthe protein, whereas diversity values vary randomly in the otherHsp40 and Hsp40-like proteins, if they vary at all.

Several analyses show that Hsp40-L evolves differently from theother proteins in the fuhc locus. The amino acid diversity of B. sch-losseri Hsp40-L is more similar to Hsp40 and Hsp40-like proteinsfrom other organisms than to other proteins in the fuhc locus.The across-the-protein diversity value for Hsp40-L is 0.053,whereas the diversity value for fuhc is 0.145. AMOVA/FST analysesin fuhc and fester point towards balancing selection (more variationwithin populations at these genes than in housekeeping genes),whereas AMOVA/FST analyses in Hsp40-L point towards purifyingselection (similar patterns of variation in this gene as in house-keeping genes). However, these analyses use the entire coding se-quence, whereas x statistics find codons with > 95% probability ofpositive selection in fuhc, fester, and Hsp40-L (34–45 in fuhc, 15–27in fester, 13–22 in Hsp40-L) (Nydam and De Tomaso, 2012; Nydamet al., 2013). It is therefore likely that directional or balancingselection acts on certain codons in all three of these proteins, ahallmark of allorecognition proteins.

The differences between fester/fuhc and Hsp40-L, specifically thesegregation of polymorphic residues to the C terminus, as well asits expression in the cytoplasm, imply that the latter is not an allo-recognition determinant (like fuhc) or a receptor that interactswith an allorecognition determinant (like fester). The process ofallorecognition promotes diversity and directional/balancing selec-tion in the proteins directly involved in histocompatibility. B. sch-losseri individuals must be able to recognize kin from hundredsof unrelated individuals that they could encounter; allorecognitionligands would be expected to exhibit substantial variation acrossthe entire protein, which Hsp40-L clearly does not.

Proteins that are involved in allorecognition can still have littlevariation, and may be downstream players in the process (i.e. re-cruited by fuhc or fester). The protein uncle fester, which is involvedin initiating the rejection response, is monomorphic at the aminoacid level (McKitrick et al., 2011). This may be the case with theN-terminal half of Hsp40-L. In contrast, the C-terminal region ofHsp40 and Hsp40-like proteins plays a role in binding non-nativesubstrates for presentation to Hsp70. The variation present in theC-terminal region of the B. schlosseri HSP40-L protein and the direc-tional/balancing selection likely acting on this region are uniqueand not seen in other Hsp40 or Hsp40-like proteins. This regionmay encode another zinc-finger domain, although this is not foundin all conserved domain searches, and is hypothetical at this point.A signature of directional/balancing selection is compatible withbiological roles other than allorecognition, and the location ofHsp40-L within an allorecognition locus is not proof that Hsp40-Lis involved in allorecognition. However, this concentrated poly-morphism suggests that Hsp40-L may bind a rapidly evolving andpolymorphic substrate involved in the allorecognition process,and that allorecognition in B. schlosseri may be due to a haplotypethat includes both the secreted and membrane bound fuhc candi-dates, as well as the Hsp40-L protein. As both candidate fuhc genesand Hsp40-L are encoded in a region that is approximately 43 Kb inlength, identifying recombinants within this region will not be triv-ial, and the functional role Hsp40-L plays is still under investiga-tion. Hsp40-L is encoded within the fuhc locus, expressed in the

M.L. Nydam et al. / Developmental and Comparative Immunology 41 (2013) 128–136 133

Page 8: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

Author's personal copy

ampullae, and shows an unusual distribution of polymorphismsuggests this protein plays some role in allorecognition specificity.

4. Material and methods

4.1. In situ hybridization

Whole mount in situ hybridization was carried out using previ-ously published methods (Nyholm et al., 2006). The 400 bp probewas amplified from a unique portion of the Hsp40-L gene usingthese two primers (sense 50-GGTAGTTCCGAAAGTGATTACGAC-30;antisense 50-TCTTTCGTTGTGTTTTGCTAACTC-30), cloned, se-quenced, and transcribed as previously described (Nyholm et al.,2006).

4.2. Sampling

14–20 colonies were collected from floating docks in each offour populations in 2007 and 2008: Eel Pond, MA, Falmouth, MA,Quissett, MA, Sandwich, MA, Monterey, CA, Santa Barbara, CA,and Seattle, WA. Single systems were dissected from individualcolonies and flash-frozen with liquid nitrogen and stored at�80 �C. B. schlosseri has recently been shown to be a complex offive genetically distinct and divergent clades (Lopez-Legentilet al., 2006; Bock et al., 2012). All our samples are Clade A.

4.3. Amplification and sequencing

4.3.1. Hsp40-LTotal RNA was extracted from frozen tissue using a NucleoSpin�

RNA Extract II kit (Macherey–Nagel). cDNA was synthesized fromthe RNA using a reaction mixture of 1 lL of Invitrogen SuperscriptIII, 1 lL of oligo dTT18 and 0.4 lL 25 mM dNTP. The reaction mix-ture was heated to 70 �C for five minutes, put on ice for two min-utes and heated to 50 �C for 90 min. Polymerase chain reaction(PCR) amplification was performed on 1488 bp of the Hsp40-L geneusing 0.6 lL DMSO, 4 lL 5� Phusion buffer, 0.15 lL Hot-start Phu-sion Taq, 0.4 lL of custom-made primers 5’DnaJ (50-GTATTAAGGT-CATTTCTTAAACCC-30) and DnaJWR (50-AACACTTGATTACATTTACCGCAAC-30), 12.29 lL H20 and 2 lL ofcDNA. The cycling conditions included an initial denaturation of98 �C for 3 min, 40 cycles of 98 �C for 30 s, 50 �C for 30 s, 72 �Cfor 45 s, and a final extension step of 72 �C for 3 min. PCR productswere visualized on 1% agarose gels with 0.4 lg/mL of ethidiumbromide.

Each PCR product was cleaned using SpinPrep™ PCR Clean-UpKit (Novagen). The cleaned products were A-tailed using 1 lL10X NEB Standard buffer, 0.08 lL 25 mM dNTP, 0.08 lL NEB Taqand 9.0 lL template. The reaction mixtures were incubated at72 �C for 20 min. These products were ligated into pGEM T vectors(Promega) using 1.0 lL 10X ligase buffer, 1.0 lL T4 ligase, 0.5 lLpGEM vector and 7.5 lL of template. The ligation reactions werecarried out at room temperature overnight. The ligated productswere cleaned by N-butanol precipitation. One microliter of cleanedplasmid was electroporated into 20 lL of Top10 electrocompetentE. coli cells at 1.56 kV, 2.5 kV resistance and 186 resistance timing.One milliliter of SOC was added to the electroporated cells, and thecells were shaken at 225 rpm for one hour at 37 �C. Twenty micro-liters of the cells were plated onto agar plates containing ampicil-lin, X-gal and IPTG. Cells were grown overnight at 37 �C. Whitecolonies were selected, and the gene was amplified using T7 andSP6 primers (34 cycles, 53 �C annealing).

The PCR products were purified using ExoSAP-IT (USB) and cy-cle sequenced in four directions using primers T7 and SP6, or5’DnaJ and WR, and DnaJ_F3 (50-GGAATTAGAGAGAATCGCAACC-

GAAC-30) and F14DnaJR (50-ATTGAGAAACTCTTTCGTTGCAAC-30).The products were cleaned using a 75% isopropanol precipitationand sequenced using a Prism 3100 Genetic Analyser (AppliedBiosystems).

All sequences have been submitted to GenBank (AccessionNumbers JQ864255-JQ864312). Sequences were edited, trimmedand aligned with Aligner (CodonCode Corporation, Dedham, MA).Individuals from the following populations were sequenced: EelPond, MA, Sandwich, MA, Monterey, CA, and Santa Barbara, CA.

4.3.2. Housekeeping genesWe amplified 13 housekeeping genes (12 nuclear genes and

mitochondrial cytochrome oxidase I) to determine whether thepattern of population structure and the values of polymorphismstatistics for the Hsp40-L gene were similar to genes not presumedto be under directional/balancing selection. Polymorphism statis-tics with values statistically less than zero could be due to selectiveor demographic processes (e.g. recent population growth). Butdemographic processes would affect all genes, not just those in-volved in allorecognition. Mitochondrial cytochrome oxidase I(mtCOI) is a gene commonly used for population structure analysesin botryllid ascidians (e.g. Lopez-Legentil et al., 2006; Bock et al.,2012). Two of the 12 nuclear loci were found in GenBank (adult-type muscle actin 2, Accession # FN178504.1 and vasa, Accession# FJ890989.1) and the other 10 were located in the De Tomaso Lab-oratory B. schlosseri EST database (40S ribosomal protein 3A, 60Sribosomal protein L6, 60S ribosomal protein L8, 60S ribosomal pro-tein L10, 60S ribosomal protein L13, heat shock cognate 71 kdaprotein, cytoplasmic actin 2, ADP/ATP translocase 3, Hsp90 beta,and vigilin).

Template for PCR amplification was generated as describedabove for the Hsp40-L gene. Primers and thermocycling conditionsfor each gene are available from the authors. vasa PCR productswere cloned as described above for Hsp40-L. The PCR products ofthe other nuclear loci were sequenced directly. PCR products wereincubated with 0.25 lL each of Exonuclease I and Shrimp AntarcticPhosphatase at 37 �C for 30 min, followed by 90 �C for 10 min.

Purified PCR products were sequenced with a Big Dye Termina-tor Cycle sequencing kit and a 96 capillary 3730xl DNA Analyzer(Applied Biosystems) at the UC Berkeley Sequencing Facility. Forsequences that were obtained by direct sequencing of PCR prod-ucts (all nuclear sequences minus vasa), we reconstruct haplotypesusing PHASE implemented in DnaSP 5.10.01 (Librado and Rozas,2009). All sequences have been submitted to GenBank (Hsp40L:XXXXXX, 40S ribosomal protein 3A: JQ596880-JQ596936, 60S ribo-somal protein L6: JQ596937-JQ597084, 60S ribosomal protein L8:JQ597085-JQ597174, 60S ribosomal protein L10: JQ597175-JQ597294, 60S ribosomal protein L13: JQ597595-JQ597716, heatshock cognate 71 kda protein: JQ597295-JQ597430, cytoplasmicactin 2: JQ597431-JQ597548, ADP/ATP translocase 3: JQ597549-JQ597594, Hsp90 beta: JQ597717-JQ597826, adult-type muscle ac-tin 2: JQ597827-JQ597974, mtCOI: JN083237-JN083303, vasa:JN083304-JN083376, vigilin: JQ597975-JQ598070). Sequenceswere edited, trimmed and aligned with Aligner (CodonCode Corpo-ration, Dedham, MA). Individuals from the following populationswere sequenced: Falmouth, MA, Quissett, MA, Sandwich, MA, Mon-terey, CA, Santa Barbara, CA, and Seattle, WA.

4.4. Domain characterization and amino acid conservation

We classified Hsp40-L (Type I, II, or III) based on informationfound in the literature about the amino acid composition of eachdomain (Hennessy et al., 2000; Walsh et al., 2004). We used JPRED3 (Cole et al., 2008) to predict the secondary structure of the entireprotein. For secondary structure prediction, we used a 1,618 bp se-quence generated from fosmid and Illumina sequencing rather

134 M.L. Nydam et al. / Developmental and Comparative Immunology 41 (2013) 128–136

Page 9: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

Author's personal copy

than the 1,347 bp sequence obtained from our Sanger sequencing.For the J domain specifically, we determined whether Hsp40-L con-tains certain amino acids that interact with Hsp70 proteins (Hen-nessy et al., 2000).

4.5. Amino acid diversity across Hsp40-L

We employed the program DIVAA (Rodi et al., 2004) to calculatethe amino acid variation across the B. schlosseri Hsp40-L protein.The same analyses were performed on the following Hsp40-L se-quences from GenBank: bacteria Aeromonas veronii bv. Sobria (Pop-Set 325305397, Silver et al., 2011), Bacillus licheniformis, B. sp.,Bacillus subtilis subsp. Inaquosorum, B. subtilis subsp. Spizizenii (Pop-Set 284154897, Connor et al., 2010), Vibrio coralliilyticus (PopSet298112845, Pollock et al., 2010), butterflies Heliconius cyndo cordu-la, H. melpomene melpomene, H. heurippa (PopSet 300077621, Sala-zar et al., 2010), Heliconius erato emma, H. erato favorinus, H. eratoemma x favorinus (PopSet 295150361, Counterman et al., 2010), flyDrosophila melanogaster (PopSet 30421311, Bettencourt BR, Ler-man DN, Feder ME unpublished data), and mosquito Anophelesgambiae (PopSet 56067752, Turner et al., 2005).

A diversity value of 1 at a particular position means that anyamino acid is as likely as any other to be present at that position.A diversity value of 0.05 is consistent with 100% conservation atthat site (1/20 possible amino acids).

4.6. Recombination

Intragenic recombination in Hsp40-L for all populations to-gether was assessed by calculating Rm, the minimum number ofrecombination events in DnaSP 5.10.01 (Librado and Rozas, 2009)and the correlation between physical distance and three measuresof linkage disequilibrium: r2, D0 and G4 in program permute (Wil-son and McVean, 2006).

4.7. Tests of selection: x statistics

x (dN/dS) values and associated 95% HPD regions across Hsp40-Lfrom B. schlosseri and the 14 taxa mentioned in the ‘‘Amino aciddiversity across Hsp40-L’’ section (and an additional taxon: sun-flower Helianthus annus (PopSet 328796638, Strasburg et al.,2011)) were estimated using the program omegaMap 0.5 (Wilsonand McVean, 2006). omegaMap runs were carried out using the re-sources of the Computational Biology Service Unit at Cornell Uni-versity which is partially funded by the Microsoft Corporation.We chose 250,000 iterations for each run, with thinning set to1000. We used an improper inverse distribution for l (rate of syn-onymous transversion), and j (transition-transversion ratio), andan inverse distribution for x (selection parameter) and q (recom-bination rate). Initial parameter values for l and j were 0.1, and3.0, respectively. x and q priors were set between 0.01 and 100.An independent model was used for x, so that x values were al-lowed to vary across sites. The number of iterations discarded asburnin varied across runs, but was determined by plotting thetraces of l and j; iterations affected by the starting value of theparameter were discarded. Two independent runs were conductedfor each population. These two runs were combined in all cases,after it was determined that the mean and 95% highest posteriordensity (HPD) regions for each parameter in the two runs matchedclosely. We also calculated the posterior probability of selectionper codon across the proteins.

Mann–Whitney U tests in R 2.15.0 were used to determinewhether amino acid positions 339–449 had higher x values thanthe rest of the protein (one-sided test), whether the J domainhad lower x values than the rest of the protein (one-sided test),

and whether the G/F rich region had different x values than therest of the protein (two-sided test).

4.8. Tests of selection: polymorphism statistics

The summary statistics h, p, number of haplotypes and haplo-type diversity were calculated in DnaSP 5.10.01 (Librado and Ro-zas, 2009) for B. schlosseri Hsp40-L and all B. schlosserihousekeeping genes.

We also employed Tajima’s D (Tajima, 1989) and Fu and Li’s D⁄

and F⁄ (Fu and Li, 1993) test statistics for the same data set. Statis-tical significance of D, D⁄, and F⁄ were determined using 1000 coa-lescent simulations in DnaSP. We performed two sets of coalescentsimulations: based on h and segregating sites. Estimates of pergene recombination (R) for each population were made in DnaSPand were then imported into the simulations.

4.9. Tests of selection: distribution of polymorphism within and amongpopulations

We characterized population structure within B. schlosseri forHsp40-L and all housekeeping genes using an analysis of molecularvariance (AMOVA), fixation indices (Fct, Fsc and Fst), and pairwise Fst

values between all populations in Arlequin 3.5.1.2 (Excoffier andLischer, 2010). The two groups in these analyses were U.S. EastCoast and U.S. West Coast.

Acknowledgements

This work was funded by National Science Foundation GrantIOS-0842138, and National Institute of Health Grant AI041588 toA.W. De Tomaso and by the Margaret Davis Collison Scholarshipto T. Hoang. We thank Kathi Ishizuka, Tanya McKitrick, and KarlaPalmeri for technical advice.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, inthe online version, at http://dx.doi.org/10.1016/j.dci.2013.03.004.

References

Banecki, B., Liberek, K., Wall, D., Wawrzynow, A., Georgopoulos, C., Bertoli, E.,Tanfani, F., Zylicz, M., 1996. Structure-function analysis of the zinc finger regionof the DnaJ molecular chaperone. J. Biol. Chem. 271, 14840–14848.

Benabentos, R., Hirose, S., Sucgang, R., Curk, T., Katoh, M., Ostrowski, E., Strassmann,J., Queller, D., Zupan, B., Shaulsky, G., Kuspa, A., 2009. Polymorphic members ofthe lag-gene family mediate kin-discrimination in Dictyostelium. Curr. Biol. 19,567–572.

Bock, D.G., MacIsaac, H.J., Cristescu, M.E., 2012. Multilocus genetic analysesdifferentiate between widespread and spatially restricted cryptic species in amodel ascidian. Proc. Roy. Soc. Lond. B. http://dx.doi.org/10.1098/rspb.2011.2610.

Bos, D.H., Gopurenko, D., Williams, R.N., DeWoody, J.A., 2008. Inferring populationhistory and demography using microsatellites, mitochondrial DNA, and majorhistocompatibility complex (MHC) genes. Evolution 62, 1458–1468.

Brychzy, A., Rein, T., Winklhofer, K.F., Hartl, F.U., Young, J.C., Obermann, W.M.J.,2003. Cofactor Tpr2 combines two TPR domains and a J domain to regulate theHsp70/Hsp90 chaperone system. EMBO 22, 3613–3623.

Bukau, B., Horwich, A.L., 1998. The Hsp70 and Hsp60 chaperone machines. Cell 92,351–366.

Buss, L., 1982. Somatic cell parasitism and the evolution of somatic tissuecompatibility. Proc. Natl. Acad. Sci. 79, 5337–5341.

Campos, J.L., Posada, D., Moran, P., 2006. Genetic variation at MHC, mitochondrial,and microsatellite loci in isolated populations of Brown trout (Salmo trutta).Conserv. Genet. 7, 515–530.

Cheetham, M.E., Caplan, A.J., 1998. Structure, function and evolution of DnaJ:conservation and adaption of chaperone function. Cell Stress Chaperon 3, 28–36.

Cohen, S., 2002. Strong positive selection and habitat-specific amino acidsubstitution patterns in MHC from an estuarine fish under intense pollutionstress. Mol. Biol. Evol. 19, 1870–1880.

M.L. Nydam et al. / Developmental and Comparative Immunology 41 (2013) 128–136 135

Page 10: Author's personal copy - Marie Nydam · 2018. 5. 24. · 2005), a cellular slime mold (Benabentos et al., 2009), fungi (Glass et al., 2000), a hydroid (Nicotra et al., 2009), and

Author's personal copy

Cole, C., Barber, J.D., Barton, G.J., 2008. The JPRED 3 Secondary structure predictionserver. Nucl. Acids. Res. 36, W197–201.

Connor, N., Sikorski, J., Rooney, A.P., Kopac, S., Koeppel, A.F., Burger, A., Coles, S.G.,Perry, E.B., Krizanc, D., Field, N.C., Slaton, M., Cohan, F.M., 2010. Ecology ofspeciation in the genus Bacillus. Appl. Environ. Microbiol. 76, 1349–1358.

Counterman, B.A., Araujo-Perez, F., Hines, H.M., Baxter, S.W., Morrison, C.M.,Lindstrom, D.P., Papa, R., Ferguson, L., Joron, M., Ffrench-Constant, R.H., Smith,C.P., Nielsen, D.M., Jiggins, C.D., Reed, R.D., Halder, G., Mallet, J., McMillan, W.O.,2010. Genomic hotspots for adaptation: the population genetics of Mullerianmimicry in Heliconius erato. PLoS Genet. 6, E1000796.

Craig, E.A., Huang, P., Aron, R., Andrew, A., 2006. The diverse roles of J-proteins, theobligate Hsp70 co-chaperone. Rev. Physiol. Biochem. Pharmacol. 156, 1–21.

De Tomaso, A.W., Nyholm, S.V., Palmeri, K.J., Ishizuka, K.J., Ludington, W.B., Mitchel,K., Weissman, I.L., 2005. Isolation and characterization of a protochordatehistocompatibility locus. Nature 438, 454–459.

Excoffier, L., Lischer, H.E.L., 2010. Arlequin suite ver. 3.5: a new series of programs toperform population genetics analyses under Linux and Windows. Mol. Ecol. Res.10, 564–567.

Fan, C.Y., Lee, S., Ren, H.Y., Cyr, D.M., 2004. Exchangeable chaperone modulescontribute to specification of type I and type II Hsp40 cellular function. Mol.Biol. Cell. 15, 761–773.

Flajnik, M.F., Kasahara, M., 2010. Origin and evolution of the adaptive immunesystem. Nat. Rev. Gen. 11, 47–59.

Fu, Y.X., Li, W.H., 1993. Statistical tests of neutrality of mutations. Genetics 133,693–709.

Garrigan, D., Hedrick, P.W., 2003. Detecting adaptive molecular polymorphism:lessons from the MHC. Evolution 57, 1707–1722.

Gassler, C.S., Buchberger, A., Laufer, T., Mayer, M.P., Schroder, H., Valencia, A., Bukau,B., 1998. Mutations in the DnaK chaperone affecting interaction with the DnaJcochaperone. Proc. Natl. Acad. Sci. USA 95, 15229–15234.

Gibbs, K.A., Urbanowski, M.L., Greenberg, E.P., 2008. Genetic determinants of selfidentity and social recognition in bacteria. Science 321, 256–259.

Glass, N.L., Jacobson, D.J., Shiu, P.K., 2000. The genetics of hyphal fusion andvegetative incompatibility in filamentous ascomycete fungi. Ann. Rev. Gen. 34,165–186.

Grosberg, R.G., Plachetzki, D., 2010. Marine invertebrates: genetics of colonyrecognition. ‘‘Marine invertebrates: genetics of colony recognition’’. In: Breed,M.D., Moore, J. (Eds.), Encyclopedia of Animal Behavior. Academic Press, Oxford.

Hageman, J., Kampinga, H.W., 2009. Computational analysis of the human HSPH/HPSA/DNAJ family and cloning of a human HSPH/HSPA/DNAJ expression library.Cell Stress Chaperon 14, 1–21.

Harada, Y., Takagaki, Y., Sunagawa, M., Saito, H., Yamada, L., Taniguchi, H., Shoguchi,E., Sawada, H., 2008. Mechanism of self-sterility in a hermaphroditic chordate.Science 320, 548–550.

Hennessy, F., Nicoll, W.S., Zimmermann, R., Cheetham, M., Blatch, G.L., 2005. Not allJ domains are created equal: Implications for the specificity of Hsp40–Hsp70interactions. Prot. Sci. 14, 1697–1709.

Hennessy, F., Cheetham, M.E., Dirr, H.W., Blatch, G.L., 2000. Analysis of the level ofconservation of the J domain among the various types of DnaJ-like proteins. CellStress Chaperon 5, 347–358.

Huang, K., Flanagan, J.M., Prestegard, J.H., 1999. The influence of C-terminalextension on the structure of the ‘‘J-domain’’ in E. coli DnaJ. Prot. Sci. 8, 203–214.

Jensen, R.E., Johnson, A.E., 1999. Protein translocation: is Hsp70 pulling my chain?Curr. Biol. 9, R779–R782.

Koutsogiannouli, E.A., Moutou, K.A., Sarafidou, T., Stamatis, C., Spyrou, V., Mamuris,Z., 2009. Major histocompatibility complex variation at class II DQA locus in thebrown hare (Lepus europaeus). Mol. Ecol. 18, 4631–4649.

Librado, P., Rozas, J., 2009. DnaSP v5: a software for comprehensive analysis of DNApolymorphism data. Bioinformatics 25, 1451–1452.

Lopez-Legentil, S., Turon, X., Planes, S., 2006. Genetic structure of the star sea squirt,Botryllus schlosseri introduced into southern European harbours. Mol. Ecol. 15,3957–3967.

Lu, Z., Cyr, D.M., 1998. Protein folding activity of Hsp70 is modified differentially bythe hsp40 co-chaperones Sis1 and Ydj1. J. Biol. Chem. 273, 27824–27830.

McKitrick, T.R., Muscat, C.C., Pierce, J.D., Bhattacharya, D., De Tomaso, A.W., 2011.Allorecognition in a basal chordate consists of independent activating andinhibitory pathways. Immunity 34, 616–626.

Nasrallah, J.B., 2002. Recognition and rejection of self in plant reproduction. Science296, 305–308.

Nicotra, M.L., Powell, A.E., Rosengarten, R.D., Moreno, M., Grimwood, J., Lakkis, F.G.,Dellaporta, S.L., Buss, L.W., 2009. A hypervariable invertebrate allodeterminant.Curr. Biol. 19, 583–589.

Nydam, M.L., De Tomaso, A.W., 2012. The fester locus in Botryllus schlosseriexperiences selection. BMC Evol. Biol. 12, 249.

Nydam, M.L., Taylor, A.A., De Tomaso, A.W., 2013. Evidence for selection on achordate histocompatibility locus. Evolution 67, 487–500.

Nyholm, S.V., Passegue, E., Ludington, W.B., Voskoboynik, A., Mitchel, K., Weissman,I.L., De Tomaso, A.W., 2006. Fester, a candidate allorecognition receptor from aprimitive chordate. Immunity 25, 163–173.

Pollock, F.J., Morris, P.J., Willis, B.L., Bourne, D.G., 2010. Detection and quantificationof the coral pathogen Vibrio coralliilyticus by real-time PCR with TaqManfluorescent probes. Appl. Environ. Microbiol. 76, 5282–5286.

Qiu, X.B., Shao, Y.M., Miao, S., Wang, L., 2006. The diversity of the DnaJ/Hsp40family, the crucial partners for Hsp70 chaperones. Cell. Mol. Life Sci. 63, 2560–2570.

Rodi, D.J., Mandava, S., Makowski, L., 2004. DIVAA: analysis of amino acid diversityin multiple aligned protein sequences. Bioinformatics 20, 3481–3489.

Salazar, C., Baxter, S.W., Pardo-Diaz, C., Wu, G., Surridge, A., Linares, M.,Bermingham, E., Jiggins, C.D., 2010. Genetic evidence for hybrid traitspeciation in heliconius butterflies. PLoS Genet. 6, E10000930.

Sha, B., Lee, S., Cyr, D., 2000. The crystal structure of the peptide-binding fragmentfrom the yeast Hsp40 protein Sis1. Structure 8, 799–807.

Silver, A.C., Williams, D., Faucher, J., Horneman, A.J., Gogarten, J.P., Graf, J., 2011.Complex evolutionary history of the Aeromonas veronii group revealed by hostinteraction and DNA sequence data. PLoS ONE 6, E16751.

Strasburg, J.L., Kane, N.C., Raduski, A.R., Bonin, A., Michelmore, R., Rieseberg, L.H.,2011. Effective population size is positively correlated with levels of adaptivedivergence among annual sunflowers. Mol. Biol. Evol. 28, 1569–1580.

Szabo, A., Korszun, R., Hartl, F.U., Flanagan, J., 1996. A zinc finger-like domain of themolecular chaperone DnaJ is involved in binding to denatured proteinsubstrates. EMBO J. 15, 408–417.

Tajima, F., 1989. Statistical method for testing the neutral mutation hypothesis byDNA polymorphism. Genetics 123, 585–595.

Turner, T.L., Hahn, M.W., Nuzhdin, S.V., 2005. Genomic islands of speciation inAnopheles gambiae. PLoS Biol. 3, E285.

Wall, D., Zylicz, M., Georgopoulos, C., 1995. The conserved G/F motif of the DnaJchaperone is necessary for the activation of the substrate binding properties ofthe DnaK chaperone. J. Biol. Chem. 270, 2139–2144.

Walsh, P., Bursac, D., Law, Y.C., Cyr, D., Lithgow, T., 2004. The J-protein family:modulating protein assembly, disassembly and translocation. EMBO Rep. 5,567–571.

Wan, Q.H., Zhu, L., Wu, H., Fang, S.G., 2006. Major histocompatibility complex classII variation in the giant panda (Ailuropoda melanoleuca). Mol. Ecol. 15, 2441–2450.

Wilson, D.J., McVean, G., 2006. Estimating diversifying selection and functionalconstraint in the presence of recombination. Genetics 172, 1411–1425.

Worley, K., Carey, J., Veitch, A., Coltman, D.W., 2006. Detecting the signature ofselection on immune genes in highly structured populations of wild sheep (Ovisdalli). Mol. Ecol. 15, 623–637.

Yan, W., Craig, E.A., 1999. The glycine-phenylalanine-rich region determines thespecificity of the yeast Hsp40 SisI. Mol. Cell. Biol. 19, 7751–7758.

136 M.L. Nydam et al. / Developmental and Comparative Immunology 41 (2013) 128–136