

Page 1: Identification of diversified genes that contain immunoglobulin-like variable regions in a protochordate

nature immunology • volume 3 no 12 • december 2002 • www.nature.com/natureimmunology

ARTICLES

1200

John P. Cannon1,2, Robert N. Haire3 and Gary W. Litman1–3

Published online 4 November 2002; doi:10.1038/ni849

The evolutionary origin of adaptive immune receptors is not understood below the phylogenetic level of the jawed vertebrates. We describe here a strategy for the selective cloning of cDNAs encoding secreted or transmembrane proteins that uses a bacterial plasmid (Amptrap) with a defective β-lactamase gene. This method requires knowledge of only a single target motif that corresponds to as few as three amino acids; it was validated with major histocompatibility complex genes from a cartilaginous fish. Using this approach, we identified families of genes encoding secreted proteins with two diversified immunoglobulin-like variable (V) domains and a chitin-binding domain in amphioxus, a protochordate. Thus, multigenic families encoding diversified V regions exist in a species lacking an adaptive immune response.

1Immunology Program, H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Avenue, Tampa, FL 33612, USA. 2All Children’s Hospital, 801 Sixth Street South, St. Petersburg, FL 33701, USA. 3Department of Pediatrics, University of South Florida College of Medicine, Children’s Research Institute, 140 Seventh Avenue South, St. Petersburg, FL 33701, USA. Correspondence should be addressed to G.W.L. ([email protected]) at USF address.

Identification of diversified genes that contain immunoglobulin-like variable regions in a protochordate

The immunoglobulin superfamily (IgSF) is one of the most extensively diversified gene families in the animal kingdom and contains many variant structures that have assumed different biological roles during evolution. Because of the versatility of the Ig fold, IgSF orthologs often share only a few, localized, short regions of amino acid identity among species1,2. Particular focus has been directed to the members of the IgSF that are involved in immune function. These proteins contain different types of Ig domains, including the variable (V), intermediate (I) and constant (C) domains. Each domain family is distinguished by tertiary structural features deduced from crystallographic analysis that can be represented by specific sequence motifs and the spacing between two conserved cysteine residues that form an intradomain disulfide bond. Although I-type domains appear to have an ancient origin that likely predated the emergence of adaptive immunity in jawed vertebrates2, V- and C-type domains are associated more closely with antigen recognition by immune receptors. Vertebrate Ig and T cell antigen receptor (TCR) molecules both contain V regions that undergo somatic recombination to produce the diversity characteristic of adaptive immunity. C-type domains are found at the COOH termini of Ig, TCR and major histocompatibility complex (MHC) class I and II molecules. Orthologs of the IgSF genes associated with either adaptive immunity or MHC class I and II have not been identified in chordate species that occupy phylogenetic positions below that of the cartilaginous fishes3,4. However, because of the high variability associated with immune receptors, it is possible that such molecules exist in these species.

Currently, a number of approaches are in use for the identification of homologous genes in diverse model systems. In humans, mice, fruit flies (Drosophila melanogaster), nematodes (Caenorhabditis elegans) and several other species, high representation expressed sequence tag (EST) databases and evolving approaches for scanning genomic sequences have facilitated the identification of distantly related genes; however, this technique can be used only for species that have been analyzed by large-scale genomic experiments5–8. Polymerase chain reaction (PCR)–based approaches have been invaluable in defining genes of immunological relevance in both higher and lower vertebrate species; however, these approaches require two separate and appropriately spaced regions of conserved amino acid sequence identity9–15. In addition, the necessity of introducing sequence ambiguities in the design of oligonucleotide primers often results in high artifactual content, which can overwhelm efforts to amplify appropriate gene sequences.

To improve the overall efficiency of cloning of cDNAs that encode secreted proteins or membrane-bound receptors and identify homologous genes possessing only minimal sequence relatedness, we engineered a plasmid-based selection vector, termed Amptrap, developed a cloning strategy using this vector that requires only minimal information about the gene of interest and validated its effectiveness. Using the Amptrap cloning strategy, we have identified multiple families of V region–containing molecules, which also possess chitin-binding domains, in amphioxus (Branchiostoma floridae), a protochordate species that lacks an adaptive immune system.

Results

A selection-based cloning strategy

The Amptrap bacterial plasmid vector was engineered to express a mature β-lactamase enzyme in which the NH2-terminal secretion signal peptide has been deleted; the absence of this region precludes the secretion of β-lactamase and results in sensitivity to growth on ampicillin. Secretion of β-lactamase is restored if a cDNA sequence that is inserted 5′ and in-frame to the β-lactamase coding sequence encodes both a methionine start codon (ATG) and a signal peptide immediately downstream from the start codon. By plating on ampicillin, the cloning of cDNAs that encode either intracellular or nuclear proteins, or any other sequence lacking a signal peptide, is selectively eliminated. Such irrelevant sequences can greatly reduce the efficiency of recovery of target clones in degenerate low-stringency PCR amplifications10,16.

© 2002 Nature Publishing Group http://www.nature.com/natureimmunology

We combined Amptrap selection with degenerate 5′-RACE (rapid amplification of cDNA ends) PCR to allow cloning and selection of secreted or membrane proteins based on conservation of a single short amino acid motif (Fig. 1). Specifically, full-length, double-stranded cDNA was synthesized from various tissues so that each cDNA sequence contained a common, specific anchor sequence at the 5′ end of the sense strand (see Methods)17. A 5′-RACE PCR reaction was then done in which the specific anchor sequence served as the 5′ primer-binding site and was coupled with a 3′ degenerate antisense primer complementing a short region of predicted amino acid sequence identity. After PCR amplification, products were cloned directionally into the Amptrap vector. Only those cDNA clones that contained a start codon and signal sequence, fused in-frame to the codons complemented by the 3′ PCR primer, were able to grow on ampicillin.

In many instances (see below), the approximate distance between a single conserved priming site and the NH2-terminal signal peptide could be predicted, thus permitting size selection and further elimination of irrelevant sequences; PCR products in the range of ∼200 to >800 bp were cloned and selected. Thus, cDNAs successfully cloned with the Amptrap strategy met five criteria for positive selection: enrichment of 5′ ends of cDNAs during first strand synthesis; the requirement for a methionine start codon in the inserted cDNA; the requirement for a signal peptide open-reading frame (ORF) downstream of the start codon; the requirement for conserved amino acid codons being in-frame with the start codon and ORF signal peptide; and the requirement for a specified distance between the 5′ end of the cDNA and the 3′ degenerate primer binding site, which defined a basis for size selection. By requiring a start codon in the cloned sequence, the cloning of introns, intergenic DNA regions and untranslated regions of exons (all of which encumber other PCR-based gene discovery methods11,16) was minimized. Because these experiments were based on selection rather than simple screening, clones encoding irrelevant proteins were eliminated and did not appear in the pool of colonies for analysis; often, relatively few clones were recovered, but the frequency of relevant targets was high.
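The sequence-level criteria listed above (start codon, in-frame ORF, signal peptide, minimum size) can be illustrated with a short in-silico filter. The following Python sketch is not part of the reported method, which relies on biological selection on ampicillin; the function names, the use of the first ATG as the start codon, and the crude hydrophobic-run test standing in for a signal peptide are all assumptions made for illustration.

```python
# Standard genetic code, indexed by the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: residues counted as hydrophobic for the crude signal-peptide test.
HYDROPHOBIC = set("AVLIFMWC")


def translate(dna):
    """Translate the complete codons of an uppercase A/C/G/T string."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def passes_selection(insert, min_bp=200, core=7, window=30):
    """Hypothetical filter mimicking the sequence-level selection criteria.

    The insert is read 5' to 3' and is assumed to end exactly where the
    beta-lactamase coding sequence would begin.
    """
    insert = insert.upper()
    if len(insert) < min_bp:            # size selection (lower bound only)
        return False
    start = insert.find("ATG")          # simplification: first ATG is the start
    if start == -1:
        return False
    orf = insert[start:]
    if len(orf) % 3 != 0:               # fusion to the downstream gene is out of frame
        return False
    protein = translate(orf)
    if "*" in protein:                  # a stop codon would truncate the fusion
        return False
    run = 0
    for residue in protein[1:window]:   # look for a hydrophobic core near the NH2 terminus
        run = run + 1 if residue in HYDROPHOBIC else 0
        if run >= core:
            return True
    return False
```

A real analysis would substitute a dedicated signal-peptide predictor for the hydrophobic-run heuristic; the sketch only shows how the criteria combine so that any single failure eliminates a clone.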

Validation of Amptrap cloning

The selection capabilities of the directed Amptrap-PCR approach for identifying related IgSF cDNAs based on low sequence identity were validated first by the cloning and selection of MHC and β2-microglobulin (β2M) genes in the clearnose skate, Raja eglanteria, a representative cartilaginous fish (Fig. 2). The amino acid sequence Cys-X-Val (CXV) is conserved in the α3 domains of many MHC class I proteins as well as in the α2 domains of MHC class II proteins and in β2M. A 3′ primer complementing the CXV motif, in which all possible Cys and Val codons were incorporated and the second codon position was fully degenerate (NNN), was used in directed Amptrap-PCR cloning of R. eglanteria spleen cDNA (Fig. 2a). The initial PCR reaction produced a broad ethidium bromide–staining smear (Fig. 2b). Reaction products were subjected to digestion with the restriction endonuclease SfiI and size-selected by gel filtration to remove unincorporated primers and very short products before ligation to the Amptrap vector. After bacterial transformation and selection on ampicillin-kanamycin plates, ∼30 colonies were recovered and the insert sizes from half of these colonies were determined by PCR. Eight colonies contained cDNA inserts of at least ∼600 bp and were selected for nucleotide sequencing (Fig. 2c); five of the eight insert sequences indicated strong similarity to nurse shark MHC IIα (with the BLASTX alignment search tool, to BLAST expect value (E) = 8 × 10⁻⁵⁹). The failure to recover MHC class I sequences was likely due to both size selection bias and the need to change PCR cycling conditions to favor the recovery of longer transcripts (unpublished observations).
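To make the primer design concrete, the following Python sketch derives the motif-complementing core of a degenerate antisense primer for CXV using IUPAC ambiguity codes. It covers only the nine motif nucleotides; the SfiI tail of the actual CXV-Sfi primer is not given in the text and is therefore omitted, and the helper names are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes used here and the bases each one stands for.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT", "N": "ACGT"}
# Complement of each code (R = A/G pairs with Y = C/T; N pairs with N).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R", "N": "N"}


def reverse_complement(seq):
    """Reverse-complement a sequence written with the codes above."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def degeneracy(seq):
    """Number of distinct unambiguous sequences represented by seq."""
    count = 1
    for base in seq:
        count *= len(IUPAC[base])
    return count


def expand(seq):
    """List every unambiguous sequence represented by seq."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in seq))]


# Sense-strand codons: Cys = TGT/TGC (TGY), X = NNN, Val = GTN.
sense = "TGY" + "NNN" + "GTN"
# The 3' primer is antisense, so it is the reverse complement of the motif.
antisense = reverse_complement(sense)

print(antisense, degeneracy(antisense))
```

The 512-fold degeneracy of this short core indicates how permissive priming on a three-residue motif is, which is consistent with the broad smear of initial PCR products and the reliance on downstream selection.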

A similar experiment, in which an oligonucleotide primer complementing five residues was used, yielded shorter products with strong similarity to β2M. The predicted coding region of a full-length cDNA encoding the skate β2M ortholog consisted of a 111 amino acid ORF that exhibited strong similarity to various mammalian β2Ms (with the BLASTP alignment search tool, E = 10⁻¹¹ to 10⁻¹²). We concluded the following from an alignment of skate β2M and other selected β2Ms: considerable identities exist between skate β2M and the other β2Ms; several regions of identity between the other β2Ms are not shared by skate β2M; several identities are shared by skate β2M and some, but not all, other β2Ms; and, based on the sequences of the other β2Ms, it is apparent that a standard PCR-based amplification based on two contiguous three or four amino acid regions of identity would not have been possible (Fig. 2d). A variety of other genes were recovered in validation studies that used Amptrap, including mouse NKG2D (a type II transmembrane protein), a multipass membrane protein (tetraspanin from amphioxus), a mouse ortholog of the orphan human immune receptor CMRF-35 (ref. 18) and an Ig joining (J) chain from R. eglanteria.

Figure 1. Amptrap selection strategy. First-strand cDNA synthesis was done in the presence of the SMART oligoribonucleotide, which annealed to a nontemplated stretch of oligo(dC) residues added by RT to the end of the nascent cDNA. The RT enzyme completed the first strand of cDNA by adding nucleotides complementary to the SMART sequence, which included an SfiI recognition site (5′-GGCCNNNN^NGGCC). PCR was then done on the cDNA with an oligonucleotide that corresponded to the SMART 5′ sequence (5′ to its SfiI site) and a degenerate oligonucleotide, such as CXV-Sfi, that corresponded to a putative conserved motif of three to five amino acids plus a second SfiI recognition sequence. The SfiI sites at the ends of the resulting PCR products were asymmetrical and allowed directional cloning into corresponding SfiI sites in the Amptrap vector. After selection of E. coli transformants on ampicillin plates, colonies were evaluated for insert size with colony PCR. Inserts of the anticipated size range were sequenced directly, and the source clones were archived. BLA, β-lactamase gene.

Figure 1 panel labels: cDNA synthesis; PCR amplification; cloning in Amptrap vector; selection on ampicillin; colony PCR size evaluation; sequencing; clone archive.

V region–containing genes in a protochordate

Many different approaches have failed to yield evidence for diversified V region genes characteristic of the rearranging antigen binding and other receptors in species found below the phylogenetic level of the cartilaginous fish. We hypothesized that three to five amino acid motifs surrounding a single conserved tryptophan residue in the NH2-terminal Ig domains of teleost novel immune-type receptors (NITRs) possibly reflect a conserved feature of primordial immune receptors19,20. To evaluate this possibility, we used several different 3′ primers complementing this region in individual reactions to amplify and clone cDNA from the protochordate amphioxus with Amptrap-PCR. One of the primers, corresponding to the amino acid sequence Trp-Phe-Lys (WFK), produced four distinct families of PCR products encoding NH2-terminal fragments of related IgSF proteins; these four families represented ∼27% of all Amptrap cDNA inserts derived with the WFK primer. Probes representing three of the families were used successfully to screen a cDNA library for full-length clones; isolation of cDNAs representing the fourth family necessitated the use of 3′-RACE due to the presence of an internal SfiI site in the cDNA, which interfered with cloning in the λTriplEx2 vector. In addition, a fifth family of cDNAs was identified in subsequent 3′-RACE experiments with gene-specific primers based on the sequence of one of the original families of amplified products. Representative full-length cDNA clones for all five gene families encoded putative secreted proteins containing two IgSF domains at their NH2 termini and single putative chitin-binding domains at their COOH termini (Fig. 3a).
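An internal SfiI site of the kind that interfered with cloning of the fourth family can be detected computationally before a cloning strategy is chosen. The Python sketch below scans a sequence for the SfiI recognition site given in Fig. 1 (GGCCNNNN^NGGCC); the function name and the returned coordinate convention are assumptions made for illustration.

```python
import re

# SfiI recognition site: GGCC, five unspecified bases, GGCC (13 bp in total).
# The lookahead lets overlapping sites be reported as well.
SFII_SITE = re.compile(r"(?=GGCC[ACGT]{5}GGCC)")


def find_sfii_sites(seq):
    """Return (site_start, cut_index) pairs, 0-based, for each SfiI site.

    cut_index is the index of the first base after the top-strand cut,
    which falls after the eighth base of the site (GGCCNNNN^NGGCC).
    """
    seq = seq.upper()
    return [(m.start(), m.start() + 8) for m in SFII_SITE.finditer(seq)]
```

Because both GGCC half-sites are palindromic, scanning one strand is sufficient; a cDNA for which this function returns any interior site would be cut during SfiI-based directional cloning.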

The IgSF domains were of the V-type (see below) and the chitin-binding domains demonstrated characteristic spacing of cysteine residues: Cys1-X13-Cys2-X5-Cys3-X10–12-Cys4-X12–13-Cys5-X7–9-Cys6, where Xn represents a contiguous stretch of n amino acids. Such spacing is an unequivocal feature of the chitin-binding protein domain. An expressed recombinant form of one of these proteins bound chitin, and deletion of the predicted chitin-binding domain abolished binding (data not shown). An alignment of the predicted amino acid sequences of the five polypeptides, designated V region–containing chitin-binding proteins (VCBP), is shown (Fig. 3b). Except in the cases of VCBP2 and VCBP5, which share 78% identical NH2-terminal V domains, the five VCBP families share only limited amino acid sequence identity (27–41%), although localized regions of sequence similarity are apparent. The translated sequences of VCBP genes were not significantly similar to known proteins in database searches done with BLASTP; however, similarities among the chitin-binding domains of VCBPs and several chitinases were noted. An Ig-like fold has been described in chitinase A from the bacterium Serratia marcescens21; however, this domain is divergent from the Ig domains of antigen receptors and fails to meet the criteria for V regions (described below).
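The cysteine spacing given above translates directly into a pattern that can be searched for in a protein sequence. The following Python sketch expresses it as a regular expression; treating each X as any residue (including cysteine) is an assumption, as is the function name, and the test sequences in the usage line are synthetic rather than VCBP data.

```python
import re

# Cys1-X13-Cys2-X5-Cys3-X10-12-Cys4-X12-13-Cys5-X7-9-Cys6,
# with each X stretch written as a run of arbitrary residues.
CHITIN_BINDING_SPACING = re.compile(
    r"C[A-Z]{13}C[A-Z]{5}C[A-Z]{10,12}C[A-Z]{12,13}C[A-Z]{7,9}C"
)


def find_chitin_binding_spacing(protein):
    """Return the (start, end) span of the first match, or None if absent."""
    match = CHITIN_BINDING_SPACING.search(protein.upper())
    return match.span() if match else None


# Synthetic example: alanine spacers of lengths 13, 5, 11, 12 and 8.
example = "MK" + "C" + "A" * 13 + "C" + "A" * 5 + "C" + "A" * 11 + "C" + "A" * 12 + "C" + "A" * 8 + "C"
print(find_chitin_binding_spacing(example))
```

A match on spacing alone is only a screening criterion; in the work described here, assignment of the domain was supported by chitin binding of a recombinant protein.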

Figure 2. Validation of Amptrap cloning based on MHC class II and β2M genes from Raja eglanteria. (a) Location of conserved CXV motif in MHC class II. L, leader; TM, transmembrane; CYT, cytoplasmic. (b) Agarose gel analysis of the 5′-RACE PCR products; the size standard was ΦX174-HaeIII (sizes in bp: 1353, 1078, 872, 603, 310, 281, 271, 234, 194, 118 and 72). (c) Sizing of inserts from ampicillin-resistant colonies. Products selected for sequencing are indicated by asterisks (*); the size standard was ΦX174-HaeIII. (d) ClustalW40 comparison of Raja β2M with β2M sequences from humans, mice, chickens, cod, zebrafish and sturgeon. Black shading indicates identical residues at the same position in four or more sequences. Gray shading indicates functionally equivalent residues at the same position in four or more sequences.

Defining the requisite structural features of V-type domains in relation to the more ancient I-type domains is essential to further interpretation of the VCBP genes. On the basis of crystallographic analyses and compilations of vertebrate V domains, V-frame determinant residues include a set of seven residues that are canonical to both V- and I-type Ig domains11,22: Gly16, Cys23, Trp41, Leu89 (or other hydrophobic residue), Asp98, Tyr102 and Cys104. An eighth canonical position, Arg75 (with Lys or His seen less frequently at this position), is required along with Asp98 for the formation of a salt bridge characteristic of both V and I domains22. Although there are many examples of Ig and TCR V domains that lack one of the eight canonical residues and rare examples that lack two, variation at three or more residue positions has been observed only rarely in antigen receptor V domains and was used here operationally to rule out a V designation for indeterminate candidate molecules. Based on crystallographic information, the structures of both V- and I-type domains share eight distinct amino acid strands (A, B, C, C′, D, E, F and G); however, V-type domains also contain a ninth strand, C′′, which is required for the formation of the V region complementarity-determining region 2 (CDR2), a central feature of antigen binding sites. I-type domains lack the C′′ strand and, as a result, the distance between Cys23 and Cys104 in an I-type domain rarely exceeds 70 residues; in contrast, this distance ranges from 65 to 78 residues in V-type domains. The ten VCBP Ig domains have been designated V-type, based on the above considerations.
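The operational criteria described above (no more than two of the eight canonical positions varied, and a Cys23 to Cys104 distance within the V-type range) can be written as a simple check. The Python sketch below assumes that residues have already been assigned to canonical positions by alignment; the function names, the input format and the set of residues accepted as hydrophobic at position 89 are assumptions made for illustration.

```python
# Canonical V-frame positions and the residues accepted at each.
CANONICAL = {
    16: set("G"),
    23: set("C"),
    41: set("W"),
    75: set("RKH"),      # Arg, with Lys or His seen less frequently
    89: set("LIVMFA"),   # Leu or other hydrophobic residue (set is an assumption)
    98: set("D"),
    102: set("Y"),
    104: set("C"),
}


def count_missing(residues):
    """Count canonical positions whose observed residue is not accepted.

    residues maps a canonical position number to a one-letter residue.
    """
    return sum(
        1 for position, allowed in CANONICAL.items()
        if residues.get(position) not in allowed
    )


def is_v_type_candidate(residues, cys_distance, max_missing=2, v_range=(65, 78)):
    """Apply the residue-count and Cys23-Cys104 distance criteria together."""
    low, high = v_range
    return count_missing(residues) <= max_missing and low <= cys_distance <= high
```

Both thresholds are exposed as parameters because the text presents them as operational cutoffs rather than absolute rules.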

Figure 3. Characterization of five families of predicted IgSF proteins from amphioxus. (a) Schematic representation of a V region–containing chitin-binding protein (VCBP). L, leader; CBD, chitin-binding domain. (b) ClustalW40 amino acid alignment of five families of VCBP. Higher vertebrate Ig and TCR consensus residues are indicated above the alignments (V-frame residues in green; J-like motifs in purple). Cysteine residues characteristic of chitin-binding domains are shown in blue. Three or more identical residues at the same position are highlighted in red; three or more functionally equivalent residues are highlighted in yellow. (c) ClustalW40 alignment of invertebrate V- and I-type sequences with vertebrate antigen receptor V regions. Xenopus (African clawed frog), Danio (zebrafish), Heterodontus (horned shark), and Raja antigen receptors were selected as sequences representing phylogenetically lower vertebrates and their domain lengths were typical of their respective data pools. Consensus V-frame residues within each domain are highlighted in green. Sequences are arranged in descending order of domain length. Where multiple domains are listed from the same source sequence, individual domains are indicated by parentheses. FREP2, B. glabrata FREP2; FREP3, B. glabrata FREP3; FREP4, B. glabrata FREP4; FREP7, B. glabrata FREP7; HS-CD8B, human CD8β; MM-TREM2A, mouse TREM2a; CTX, X. laevis CTX; HF-IGH, Heterodontus francisci IgH; CHT1, chicken ChT1; RE-TCRD, R. eglanteria TCRδ; XL-TCRG, X. laevis TCRγ; AMALGAM, Drosophila amalgam; NITR1.1, D. rerio novel immune-type receptor 1.1; DR-IGLTCH, D. rerio IgL; CMRF35, human CMRF35; GCRTK, Geodia cydonium receptor tyrosine kinase; DR-TCRA, D. rerio TCRα; HS-SIRPB1, human SIRPβ1; MM-SIRPA1, mouse SIRPα1; IG_BOTSC, B. schlosseri soluble Ig molecule homolog; BSMBP, B. schlosseri mannose-binding protein.


Assignments of V-type domains have been made previously in protochordates and other invertebrates based on database searches done with either BLAST alignment tools or simple amino acid similarity patterns23–27. However, some of these studies failed to distinguish between V- and I-type domains or relied on criteria that did not emphasize the importance of V-frame residues in general and the canonical V residues in particular. To compare the V domains of VCBPs and other invertebrate molecules to the V domains of antigen-binding receptors using the criteria of conserved canonical residues and domain length, we generated an alignment of 34 candidate V regions that have been described in various species (Fig. 3c). It is apparent that the IgSF domains described in the fibrinogen-related proteins (FREPs)26 from the snail Biomphalaria glabrata generally lack two to four key V-frame residues and cannot be unambiguously classified as a V-type domain family. The Ig domains of the soluble immunoglobulin molecule homolog (IG_BOTSC)23 and mannose-binding protein (BSMBP)24 from Botryllus schlosseri, as well as the NH2-terminal domain of the tyrosine kinase (GCRTK)25 from the marine sponge Geodia cydonium, also would be classified as I-type, as the distance between Cys23 and Cys104 in each domain is insufficient to form the C′′ strand characteristic of V domains. However, if a (Gly16→Ser1616) is considered22, the COOH-terminal IgSF domain of GCRTK meets the minimal V-type criteria. The NH2-terminal IgSF domain of Drosophila amalgam27 represents the best example of a nonchordate domain meeting the residue and length requirement for a V-type domain. However, amalgam is present as only a single copy in the Drosophila genome and may reflect the structure of a distant ancestral form of the V domain or represent evolutionary convergence. Although lectin-mediated innate immunity is well recognized in protochordates28, the VCBP genes described here, by our criteria, represent the only multigenic families encoding diversified V regions below the phylogenetic level of jawed vertebrates.

During the segmental rearrangement process of vertebrate antigen binding receptors, short sequences known as the diversity (D) and joining (J) regions are fused to the 3′ ends of V-region genes. J regions are conserved in all rearranging antigen receptor genes and encode the amino acid consensus sequence Phe-Gly-X-Gly-Thr-X-Leu-X-Val (FGXGTXLXV). A continuous region similar to J is not present in any of the VCBP sequences thus far characterized; however, one form of VCBP4 encodes the motif FGXG after the NH2-terminal V domain. The corresponding regions of VCBP2, VCBP3, other VCBP4s and VCBP5 all encode FGXD at this position. In addition, the TXLXV motif, which represents an essentially invariant feature of both J regions and I-type Ig domains, is found in VCBP1 and VCBP3 after the COOH-terminal V domain (Fig. 3b). The identification of two separated J-related motifs in VCBP3 is notable, as a recombination event between these segments could give rise to a single structural region resembling the complete J region of the rearranging antigen binding receptors. A contiguous VJ sequence was the likely target of a recombination-activating gene–mediated transposition event that created the segmental rearrangement process, which somatically diversifies both Ig and TCR genes29,30. A precursor of this target may be reflected in the VCBP genes.
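The J consensus and its partial forms discussed above can be located in a protein sequence with a few regular expressions. The Python sketch below reports the positions of the complete consensus (FGXGTXLXV) and of the FGXG, FGXD and TXLXV fragments; the function name and output format are hypothetical, and the sequences in the test are synthetic, not VCBP data.

```python
import re

# J-related motifs named in the text; X (any residue) is written as ".".
J_MOTIFS = {
    "FGXGTXLXV": "FG.GT.L.V",
    "FGXG": "FG.G",
    "FGXD": "FG.D",
    "TXLXV": "T.L.V",
}


def find_j_like_motifs(protein):
    """Map each motif name to the 0-based start positions where it occurs.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    protein = protein.upper()
    return {
        name: [m.start() for m in re.finditer(f"(?={pattern})", protein)]
        for name, pattern in J_MOTIFS.items()
    }
```

Reporting the fragments separately mirrors the observation that the two halves of the consensus can occur at separate locations in one protein without forming a contiguous J-like region.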

Complexity and selective expression of VCBP genes

In addition to the sequence variation that was apparent in the cDNA comparisons, diversity of the VCBP genes also was evident in genomic blotting (Fig. 4a). This complexity was consistent with observations of sequence variation in individual VCBP2 cDNAs and genomic segments. We recovered eight separate VCBP2-related transcripts using Amptrap-PCR and cDNA library screening of pooled cDNA from seven specimens of amphioxus. A nucleotide alignment of these sequences was used to design VCBP2 consensus primers, which were used subsequently with the same cDNA pool to examine diversity in an additional ten PCR products from a narrower region of VCBP2-related genes. When the 18 sequences were compared in aggregate, it was apparent that the VCBP2 V-frame residues remained constant throughout the compilations, multiple substitutions were present at several positions and, overall, the patterns of sequence variation were regionalized (Fig. 4b).
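The kind of per-position comparison summarized above can be sketched in a few lines: for each column of an alignment, count how many different residues are observed. The Python example below is illustrative only; the function names are hypothetical, gap and unresolved characters are ignored by assumption, and no VCBP2 sequence data are reproduced.

```python
def column_variability(aligned, ignore="-X"):
    """Return the number of distinct residues in each alignment column.

    aligned is a list of equal-length strings; characters listed in
    ignore (gaps and unresolved residues) are not counted.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must be non-empty and share one length")
    return [
        len({residue for residue in column if residue not in ignore})
        for column in zip(*aligned)
    ]


def variable_positions(aligned, min_distinct=2):
    """Return 0-based column indices with at least min_distinct residues."""
    return [
        index for index, count in enumerate(column_variability(aligned))
        if count >= min_distinct
    ]
```

Columns with a count of one correspond to invariant positions such as the V-frame residues, whereas runs of adjacent columns with higher counts would indicate regionalized variation.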

Figure 4. Diversity of VCBP genes. (a) Hybridization of individual VCBP family-specific probes (1–5) to 10 µg of HindIII-restricted genomic DNA from a single animal. λ-HindIII size standards are indicated. (b) ClustalW40 alignment of amino acid translations of VCBP2 cDNAs isolated from a pool of seven animals. VCBP2.1-VCBP2.6, VCBP2.8: Amptrap-PCR products isolated with the WFK-Sfi primer; VCBP2.7: cDNA clone isolated by hybridization to a VCBP2 probe; PCR01-PCR10: RT-PCR fragments isolated with VCBP2-specific primers. Arrowheads indicate the positions of VCBP2-specific primers (black, primers for cDNA; white, primers for single-animal genomic DNA); V consensus is boxed at the top. Unresolvable codons at the 3′ end of the VCBP2.4 cDNA are represented by X. A stop codon in sequence PCR09 is indicated by an asterisk (*). Amino acid substitutions at the same position are highlighted in successive changes of color: no color, blue, green, yellow, red and magenta.


To initially address diversity within an organism, we amplified genomic DNA from individual amphioxus by PCR with a second pair of VCBP2 consensus primers (Fig. 4b, white arrowheads). We recovered 18 sequences encoding a 42–amino acid segment; four different sequences across this segment were identified in one animal and three were identified in another. All these genomic sequences were identical to corresponding portions of at least one cDNA sequence shown in Fig. 4b (data not shown). Notwithstanding the diversity evident in this comparison, it is essential to note that the analysis was not exhaustive, even within the VCBP2 family, and did not address the diversity that was evident in the other VCBP families. Comparisons of full copy–length cDNA and genomic sequences of VCBP genes from individual animals may determine whether or not a mechanism for somatic variation is present at this level of phylogeny.

Protochordates lack organized lymphoid tissue but show general anatomical similarities to the phylogenetically higher chordates. In situ RNA hybridization to transverse sections of adult B. floridae demonstrated selective expression of VCBP1 in scattered cells in the intestine (Fig. 5). Probes complementing corresponding regions of VCBP2 and VCBP3 demonstrated essentially superimposable hybridization patterns (data not shown). In situ hybridization showed that the genes are expressed in adult animals only in the intestine and, within the intestine, exclusively by a cell lineage localized to the intestinal walls. Notably, this species is a colloidal filter feeder, and thus its digestive tract is continuously exposed to a diverse range of potential pathogens.

Discussion

Investigations of the function of distinct IgSF molecules such as protochordate VCBPs are essential for us to understand the mechanisms by which structural variation in immune receptors has been achieved during evolution. In the case of VCBPs, which are likely secreted and contain at least two separate functional domains, two distinct modes of antigen recognition can be envisioned. Vertebrate V regions associate with self as well as foreign antigens, peptides as well as MHC class I and II (in the context of antigen presentation) and also with one another (OX2-OX2 receptor)³¹. The cDNA and genomic data we describe here provide compelling evidence for extensive structural diversity in VCBPs. Thus, it is reasonable to conclude that the overall size of the VCBP multigene family and nature of substitutions may be associated with specific V-directed recognition of foreign antigens. However, the presence of a functional chitin-binding domain is also notable, as the protochordates are likely subject to continuous challenge by chitin-containing parasitic arthropods, pathogenic bacteria and fungi. VCBPs may use their chitin-binding domains, rather than their V regions, to interact with microbial determinants, with the V regions serving to mediate another function.

Roles for chitin-binding proteins in innate immunity have been proposed. Specifically, penaeidins, a moderately diverse family of genes found in a marine invertebrate, consist of a short NH2-terminal region as well as a chitin-binding domain and exhibit both antimicrobial and antifungal activities³². Penaeidins are synthesized in hemocytes, bind to the chitin-containing invertebrate cuticle as well as to bacteria and fungi and may signal an opsonic process³³. Tachycitin, which is found in another invertebrate, also contains a short NH2 terminus as well as a chitin-binding domain and enhances the antimicrobial activity of β-defensins³⁴,³⁵. The VCBPs represent a potential bifunctional molecule in which the central core feature of the adaptive immune receptors (that is, diversified V regions) is combined with a chitin-binding domain, a moiety associated with innate immune function in invertebrates. Other roles for VCBPs in self-nonself discrimination represent additional possibilities³⁶.

The relevance of the findings presented here extends beyond the observations with the VCBPs. Additional immune-type genes may be present in modern representatives of the protochordates as well as in the more recently diverged jawless vertebrates³⁷. However, given the extensive phylogenetic distances separating these species and the jawed vertebrates, it is likely that whatever relationships exist (such as those described with VCBPs) may be limited to relatively few residues that define the essential structural features of such molecules. Immune receptors pose a difficult challenge because their function can largely depend on variation, the very effect that confounds genetic identification. Even within the related, but highly diversified, families of rearranging antigen receptor genes, relatively few positions can be considered invariant. In addition, functionally equivalent amino acid substitutions can occur at “invariant” positions. Because it is necessary to define only the first and third positions in a three amino acid motif, Amptrap-PCR is particularly well suited to testing substitutions at those positions. The strategic advantage of this approach is evident and should open new paths to investigations of the origins of immune recognition.
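The relationship between a three-residue motif and a degenerate antisense primer can be made concrete with a small back-translation sketch. It assumes the standard genetic code and IUPAC ambiguity codes; the table covers only the residues used in the primer names reported in the Methods (WFK, CXV), and all function names are hypothetical. The two full primer sequences are copied from the Methods for comparison.

```python
from math import prod

# Degenerate codons (standard genetic code) for the residues needed here;
# "X" stands for any residue and is back-translated as NNN.
DEGENERATE_CODON = {
    "W": "TGG", "F": "TTY", "K": "AAR", "C": "TGY", "V": "GTN",
    "H": "CAY", "Q": "CAR", "D": "GAY", "E": "GAR", "X": "NNN",
}
IUPAC_COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")
IUPAC_SIZE = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "N": 4}


def reverse_complement(seq):
    """Reverse complement of a sequence written with A, C, G, T, R, Y, N."""
    return seq.translate(IUPAC_COMPLEMENT)[::-1]


def antisense_core(motif):
    """Back-translate a motif and return the antisense (3' primer) core."""
    sense = "".join(DEGENERATE_CODON[aa] for aa in motif)
    return reverse_complement(sense)


def degeneracy(seq):
    """Number of distinct sequences represented by a degenerate oligo."""
    return prod(IUPAC_SIZE[base] for base in seq)


# Primer sequences as given in the Methods.
WFK_SFI = "GACTGGCCGAGGCGGCCCYTTRAACCA"
CXV_SFI = "GACTGGCCGAGGCGGCCCNACNNNRCA"

for motif, primer in (("WFK", WFK_SFI), ("CXV", CXV_SFI)):
    core = antisense_core(motif)
    print(motif, core, degeneracy(core), primer.endswith(core))
```

Under these assumptions the back-translated cores coincide with the 3′ ends of the two reported primers. A motif with an undefined middle position (CXV) costs a 64-fold increase in degeneracy at that codon, which illustrates why only the first and third positions need to be specified.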

Methods

Generation of Amptrap cloning vector G7311. A DNA fragment from pBluescript (Stratagene, La Jolla, CA), encoding the mature β-lactamase enzyme without its secretion signal sequence, was amplified by PCR with specific primers and cloned into the EcoRI site of the pBluescript vector to form pBS-BLA-C (G6934). A SacI-KpnI fragment of G6934, containing the signal sequence–deficient β-lactamase gene, was subcloned into pBK-CMV to form pBK-BLA-C (G6946). Two SfiI sites were introduced to G6946 at its SacII-BamHI sites with synthetic oligonucleotides. Because the SfiI recognition site (5′-GGCCNNNN^NGGCC-3′) is noncontiguous, the two SfiI sites added to the vector were chosen to be asymmetrical and compatible with commercially available vectors (SfiIA: 5′-GGCCATTA^TGGCC-3′; SfiIB: 5′-GGCCGCCT^CGGCC-3′; ^ denotes the site of endonuclease cleavage). The neo gene of pBK-CMV was replaced by the neo gene of pCR4 (Invitrogen, Carlsbad, CA) in order to remove an SfiI site in the pBK-CMV–derived neo gene, forming plasmid G7186. Phage P1 loxP sites were added 3′ to the β-lactamase gene in G7186 as a HindIII-XhoI fragment to form the Amptrap vector G7311. G7311 was linearized with SfiI and purified with a Chromaspin-400 column (Clontech, Palo Alto, CA).
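Why two different SfiI sites give directional cloning can be checked with a few lines of code. The sketch assumes the usual description of SfiI cleavage (top strand cut at GGCCNNNN^NGGCC, leaving a 3-nt 3′ overhang made up of the second to fourth N positions); the function names are hypothetical, and the site and primer sequences are those given in the text.

```python
import re

SFI_SITE = re.compile(r"GGCC[ACGT]{5}GGCC")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SFI_A = "GGCCATTATGGCC"  # SfiIA site from the text
SFI_B = "GGCCGCCTCGGCC"  # SfiIB site from the text
WFK_SFI = "GACTGGCCGAGGCGGCCCYTTRAACCA"  # degenerate 3' primer from the Methods


def reverse_complement(dna):
    """Reverse complement of an unambiguous DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def sfi_overhang(site):
    """Return the 3-nt 3' overhang of the top strand after SfiI cleavage.

    Assumption: the top strand is cut after base 8 and the bottom strand
    after base 5 (top-strand numbering), so bases 6-8 stay single-stranded.
    """
    if SFI_SITE.fullmatch(site) is None:
        raise ValueError(f"not an SfiI site: {site}")
    return site[5:8]


overhang_a = sfi_overhang(SFI_A)
overhang_b = sfi_overhang(SFI_B)
print(overhang_a, overhang_b)

# The two vector ends cannot anneal to each other, so the vector cannot
# simply re-close and inserts can enter in one orientation only.
print(reverse_complement(overhang_a) != overhang_b)

# The antisense primer carries the SfiIB site in reverse orientation.
print(reverse_complement(SFI_B) in WFK_SFI)
```

Because the overhang is three bases long it can never be self-complementary, and with two distinct overhangs each end of an insert is compatible with only one end of the linearized vector.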

Figure 5. Selective expression of VCBP1 in adult amphioxus intestine. (a) Hematoxylin and eosin staining of a transverse 8-µm section. Magnification: × 100. (b,c) In situ RNA hybridization of VCBP1 probes to serial 8-µm sections with (b) a sense probe or (c) an antisense probe. Magnification: × 100. The variation in the patterns of hybridization in c reflects the tangential nature of the sectioning. (d) Highlighted region of c; magnification: × 400.


RNA isolation and full-length cDNA synthesis. RNA was isolated from either a single R. eglanteria spleen or pooled dissected ventral regions of adult B. floridae. Poly(A)+ RNA was purified with magnetic oligo(dT) beads (Dynal, Oslo, Norway). Full-length cDNA was synthesized from 1 µg of poly(A)+ RNA with the SMART system (Clontech), which is based on nontemplated addition of oligo(dC) to the 3′ end of nascent first-strand cDNA by reverse transcriptase (RT)¹⁷. Briefly, first-strand cDNA was synthesized by RT in the presence of an oligoribonucleotide (SMART) containing a short oligo(rG) sequence at its 3′ end, which allowed annealing to the nontemplated oligo(dC) at the ends of first-strand products, template strand switching by RT and subsequent completion of each first-strand cDNA with a specific anchor sequence complementary to the SMART oligoribonucleotide. This reaction resulted in a final volume of 100 µl of cDNA for subsequent 5′-RACE PCR (see below). cDNA (50 µl) prepared in this manner from B. floridae was digested with SfiI and ligated into the SfiIA and SfiIB cloning sites of λTriplEx2 phage arms (Clontech). A cDNA library consisting of 2 × 10⁶ primary recombinants was derived. All experimental protocols involving skate and amphioxus were approved by the University of South Florida Institutional Animal Care and Use Committee and were in accordance with NIH guidelines for care and use of experimental animals.

Amptrap-PCR. Degenerate 5′-RACE PCR was done with 1 µl of the cDNA template described above. PCR reactions in which the SMART 5′ primer was combined with various 3′ degenerate primers (WFK-sfi: 5′-GACTGGCCGAGGCGGCCCYTTRAACCA-3′; CXV-sfi: 5′-GACTGGCCGAGGCGGCCCNACNNNRCA-3′; C[HQ]V[DE]H-sfi: 5′-TGGCCGAGGCGGCCCRTGNTCNACNTGRCA-3′; each contains the SfiI site GGCCGAGGCGGCC) were done with Taq polymerase in 20 cycles of PCR (94 °C × 30 s; 55–45 °C × 30 s, –0.5 °C per cycle; 72 °C × 2 min), followed by 20 cycles at the final annealing temperature (94 °C × 30 s; 45 °C × 30 s; 72 °C × 2 min). We have found in certain applications that it is beneficial to slightly increase primer length (where possible), decrease annealing temperature and vary extension times in order to recover amplified products in instances of low and/or competing template concentrations. After PCR, products were restricted with SfiI, purified with Chromaspin-400 or -1000 columns and ligated to SfiI-linearized Amptrap vector G7311. Ligations were used to electrotransform Escherichia coli DH10β cells with a Cell-Porator E. coli Pulser (Whatman Biometra, Göttingen, Germany). Transformants were selected on Luria-Bertani (LB)–agar medium containing 50 µg/ml of ampicillin and 25 µg/ml of kanamycin. Colony inserts were sized by PCR amplification of colonies with primers flanking the insert site of G7311. Colony PCR products of the desired size were sequenced directly (see below).
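The touchdown cycling profile described above can be written out as a per-cycle list of annealing temperatures. This is a sketch under one stated assumption: the first cycle anneals at 55 °C and each later touchdown cycle is 0.5 °C lower, so the twentieth touchdown cycle anneals at 45.5 °C before the 20 fixed cycles at 45 °C. The function name and parameter names are hypothetical.

```python
def touchdown_schedule(start=55.0, final=45.0, step=0.5,
                       touchdown_cycles=20, final_cycles=20):
    """Return the annealing temperature (deg C) for every PCR cycle."""
    # Touchdown phase: lower the annealing temperature by `step` each cycle.
    temps = [start - step * i for i in range(touchdown_cycles)]
    # Fixed phase: remaining cycles at the final annealing temperature.
    temps += [final] * final_cycles
    return temps


schedule = touchdown_schedule()
for cycle, temp in enumerate(schedule, start=1):
    # Denaturation (94 deg C, 30 s) and extension (72 deg C, 2 min) are the
    # same in every cycle, so only the annealing step is listed.
    print(f"cycle {cycle:2d}: anneal {temp:.1f} C x 30 s")
```

Starting above the expected annealing optimum and stepping down favors the most specific primer-template pairs in early cycles, which is useful with highly degenerate primers.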

Cloning of VCBP4 and VCBP5 by 3′- and 5′-RACE. Because the cDNA encoding VCBP4 contains an SfiI site, full-length VCBP4 cDNAs could not be isolated from the λTriplEx2 cDNA library. The VCBP4 cDNA was extended by 3′-RACE with the GeneRacer Kit (Invitrogen) and TITANIUM Taq polymerase (Clontech). A contiguous ORF for VCBP4 was confirmed by PCR cloning with gene-specific primers flanking the start and stop codons. The VCBP5 sequence was identified as a 3′-RACE product after amplification with VCBP2 consensus primers. The remainder of the VCBP5 cDNA was isolated by 5′-RACE with VCBP5-specific primers derived from the 3′-RACE amplified product.

Analysis of VCBP diversity by PCR with cDNA and genomic templates. Sequences from cDNA and genomic DNA templates were amplified by PCR with Taq polymerase or TITANIUM Taq polymerase and VCBP2 consensus primers. Products were cloned into a T-tailed pBluescript vector for bidirectional sequencing.

cDNA library screening. Recombinants (5 × 10⁵) from the unamplified B. floridae cDNA library (above) or an amplified R. eglanteria spleen cDNA library in λUNIZAP (Stratagene) were screened under conditions of moderate stringency (hybridization at 65 °C in 0.6 M NaCl, 0.2 M Tris at pH 8.0, 0.5% SDS, 0.1% Na4P2O7 and 20 mM EDTA; washing in 1 × SSC [150 mM NaCl and 15 mM sodium citrate] at 52 °C) with amplified PCR product probes labeled to a specific activity of 10⁹ cpm/µg of DNA. Positive plaques were purified and their inserts were excised in vivo into plasmids with either the BM25.8 bacterial strain (λTriplEx2) or the ExAssist phage excision system (λUNIZAP).

DNA sequencing. PCR products and plasmid inserts were sequenced by dideoxynucleotide chain termination³⁸ with either Thermo Sequenase (Amersham Biosciences, Piscataway, NJ) or Excel II-LC (Epicentre, Madison, WI) cycle sequencing kits with fluorescent primers. Reaction products were resolved on a LI-COR Long ReadIR 4200 DNA sequencer (LI-COR, Lincoln, NE).

Genomic hybridization analysis. B. floridae genomic DNA (10 µg/lane), derived from a single amphioxus, was restricted with HindIII, electrophoretically separated and transferred with TurboBlotter transfer apparatus (Schleicher and Schuell, Keene, NH) to ZetaprobeGT (BioRad Laboratories, Hercules, CA). Hybridization was done in ExpressHyb (Clontech) according to the manufacturer’s protocol. Probes used as above were Ig domain–containing fragments of the respective VCBP cDNAs. Exposure times were 1–3 days. After hybridization, bound probe was dissociated by washing in 1% SDS + 1 mM EDTA at 90 °C, and the single blot was rehybridized with successive probes representing VCBP1 through VCBP5.

In situ RNA hybridization. Digoxigenin-labeled riboprobes were generated by in vitro transcription of cloned VCBP cDNA fragments with T3 or T7 RNA polymerase. Probes corresponded to fragments of the Ig regions of each respective VCBP cDNA. In situ RNA hybridization to 8-µm transverse paraffin sections of adult B. floridae was done with ∼600 ng/ml of sense or antisense riboprobe in 50% formamide, 5 × SSC³⁹. Bound riboprobe was detected with alkaline phosphatase–conjugated anti-digoxigenin. After color development in nitro blue tetrazolium-X-phosphate, sections were counterstained with nuclear fast red.

Accession numbers. GenBank accession numbers for the sequences reported here are as follows: R. eglanteria J chain, AF520475; R. eglanteria β2M, AF520476; B. floridae VCBP1, AF520472; B. floridae VCBP2, AF520473; B. floridae VCBP3, AF520474; B. floridae VCBP4, AF532182; B. floridae VCBP5, AF532183. Accession numbers for previously published sequences are as follows: human β2M, NP_004039; mouse β2M, NP_033865; chicken β2M, P21611; cod β2M, CAA10761; zebrafish β2M, NP_571238; sturgeon β2M, CAB61322; B. glabrata FREP2, AAK13550; B. glabrata FREP3, AAK13548; B. glabrata FREP4, AAK13551; B. glabrata FREP7, AAK13547; human CD8β, AAB21668; mouse TREM2a, NP_112543; Xenopus laevis CTX, AAC59899; Heterodontus francisci IgH, CAA31798; chicken ChT1, CAA74391; R. eglanteria TCRδ, AAB51491; X. laevis TCRγ, AAM21540; Drosophila amalgam, NP_476579; Danio rerio NITR1.1, NP_571721; D. rerio IgL, AAG31726; human CMRF35, NP_006669; Geodia cydonium receptor tyrosine kinase, CAA66986; D. rerio TCRα, AAG31712; human SIRPβ1, NP_006056; mouse SIRPα1, CAA71375; B. schlosseri soluble Ig molecule homolog, CAA67003; B. schlosseri mannose-binding protein, CAA62217.

Acknowledgments

We thank B. Pryor for editorial assistance, R. Litman for sequence analysis, C. Andrews, J. Wahle and T. Willis for technical assistance and C. Amemiya for comments about the manuscript. Supported by NIH grant AI23338 (to G.W.L.) and the H. Lee Moffitt Cancer Center (J.P.C.).

Competing interests statement

The authors declare that they have no competing financial interests.

Received 18 June 2002; accepted 10 September 2002.

1. Barclay, A. N. et al. The Leucocyte Antigen FactsBook (Academic Press, San Diego, 1997).
2. Teichmann, S. A. & Chothia, C. Immunoglobulin superfamily proteins in Caenorhabditis elegans. J. Mol. Biol. 296, 1367–1383 (2000).
3. Flajnik, M. F. & Kasahara, M. Comparative genomics of the MHC: glimpses into the evolution of the adaptive immune system. Immunity 15, 351–362 (2001).
4. Litman, G. W., Anderson, M. K. & Rast, J. P. Evolution of antigen binding receptors. Annu. Rev. Immunol. 17, 109–147 (1999).
5. Moore, P. A. et al. BLyS: member of the tumor necrosis factor family and B lymphocyte stimulator. Science 285, 260–263 (1999).
6. Ollmann, M. et al. Drosophila p53 is a structural and functional homolog of the tumor suppressor p53. Cell 101, 91–101 (2000).
7. Nagle, D. L. et al. The mahogany protein is a receptor involved in suppression of obesity. Nature 398, 148–152 (1999).
8. Coyle, A. J. & Gutierrez-Ramos, J. C. The expanding B7 superfamily: increasing complexity in costimulatory signals regulating T cell function. Nature Immunol. 2, 203–209 (2001).
9. Rast, J. P. & Litman, G. W. T cell receptor gene homologs are present in the most primitive jawed vertebrates. Proc. Natl. Acad. Sci. USA 91, 9248–9252 (1994).
10. Rast, J. P. et al. α, β, γ, and δ T cell antigen receptor genes arose early in vertebrate phylogeny. Immunity 6, 1–11 (1997).
11. Litman, G. W., Hawke, N. A. & Yoder, J. A. Novel immune-type receptor genes. Immunol. Rev. 181, 250–259 (2001).
12. Rast, J. P., Haire, R. N., Litman, R. T., Pross, S. & Litman, G. W. Identification and characterization of T-cell antigen receptor related genes in phylogenetically diverse vertebrate species. Immunogenetics 42, 204–212 (1995).
13. Okamura, K., Ototake, M., Nakanishi, T., Kurosawa, Y. & Hashimoto, K. The most primitive vertebrates with jaws possess highly polymorphic MHC class I genes comparable to those of humans. Immunity 7, 777–790 (1997).
14. Bartl, S. & Weissman, I. L. Isolation and characterization of major histocompatibility complex class IIB genes from the nurse shark. Proc. Natl. Acad. Sci. USA 91, 262–266 (1994).
15. Sarwal, M. M., Sontag, J. M., Hoang, L., Brenner, S. & Wilkie, T. M. G protein α subunit multigene family in the Japanese puffer fish Fugu rubripes: PCR from a compact vertebrate genome. Genome Res. 6, 1207–1215 (1996).
16. Hawke, N. A., Strong, S. J., Haire, R. N. & Litman, G. W. Vector for positive selection of in-frame genetic sequences. Biotechniques 23, 619–621 (1997).
17. Chenchik, A. et al. in Gene Cloning and Analysis by RT-PCR (eds. Siebert, P. & Larrick, J. W.) 305–319 (BioTechniques Books, Westborough, MA, 1998).
18. Jackson, D. G., Hart, D. N., Starling, G. & Bell, J. I. Molecular cloning of a novel member of the immunoglobulin gene superfamily homologous to the polymeric immunoglobulin receptor. Eur. J. Immunol. 22, 1157–1163 (1992).
19. Strong, S. J. et al. A novel multigene family encodes diversified variable regions. Proc. Natl. Acad. Sci. USA 96, 15080–15085 (1999).
20. Yoder, J. A. et al. Immune-type receptor genes in zebrafish share genetic and functional properties with genes encoded by the mammalian lymphocyte receptor cluster. Proc. Natl. Acad. Sci. USA 98, 6771–6776 (2001).
21. Perrakis, A. et al. Crystal structure of a bacterial chitinase at 2.3 Å resolution. Structure 2, 1169–1180 (1994).
22. Harpaz, Y. & Chothia, C. Many of the immunoglobulin superfamily domains in cell adhesion molecules and surface receptors belong to a new structural set which is close to that containing variable domains. J. Mol. Biol. 238, 528–539 (1994).
23. Pancer, Z., Cooper, E. L. & Muller, W. E. A tunicate (Botryllus schlosseri) cDNA reveals similarity to vertebrate antigen receptors. Immunogenetics 45, 69–72 (1996).
24. Pancer, Z., Diehl-Seifert, B., Rinkevich, B. & Muller, W. E. A novel tunicate (Botryllus schlosseri) putative C-type lectin features an immunoglobulin domain. DNA Cell Biol. 16, 801–806 (1997).
25. Pancer, Z., Skorokhod, A., Blumbach, B. & Muller, W. E. G. Multiple Ig-like featuring genes divergent within and among individuals of the marine sponge Geodia cydonium. Gene 207, 227–233 (1998).
26. Zhang, S.-M., Leonard, P. M., Adema, C. M. & Loker, E. S. Parasite-responsive IgSF members in the snail Biomphalaria glabrata: characterization of novel genes with tandemly arranged IgSF domains and a fibrinogen domain. Immunogenetics 53, 684–694 (2001).
27. Seeger, M. A. & Kaufman, T. C. Characterization of amalgam: a member of the immunoglobulin superfamily from Drosophila. Cell 55, 589–600 (1988).
28. Vasta, G. R., Quesenberry, M. S., Ahmed, H. & O’Leary, N. Lectins from tunicates: structure-function relationships in innate immunity. Adv. Exp. Med. Biol. 484, 275–287 (2001).
29. Agrawal, A., Eastman, Q. M. & Schatz, D. G. Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system. Nature 394, 744–751 (1998).
30. Hiom, K., Melek, M. & Gellert, M. DNA transposition by the RAG1 and RAG2 proteins: a possible source of oncogenic translocations. Cell 94, 463–470 (1998).
31. Wright, G. J. et al. Lymphoid/neuronal cell surface OX2 glycoprotein recognizes a novel receptor on macrophages implicated in the control of their function. Immunity 13, 233–242 (2000).
32. Destoumieux, D., Muñoz, M., Bulet, P. & Bachere, E. Penaeidins, a family of antimicrobial peptides from a penaeid shrimp (Crustacea, Decapoda). Cell Mol. Life Sci. 57, 1260–1271 (2000).
33. Destoumieux, D. et al. Penaeidins, antimicrobial peptides with chitin-binding activity, are produced and stored in shrimp granulocytes and released after microbial challenge. J. Cell Sci. 113, 461–469 (2000).
34. Kawabata, S. et al. Tachycitin, a small granular component in horseshoe crab hemocytes, is an antimicrobial protein with chitin-binding activity. J. Biochem. 120, 1253–1260 (1996).
35. Iwanaga, S., Kawabata, S. & Muta, T. New types of clotting factors and defense molecules found in horseshoe crab hemolymph: their structures and functions. J. Biochem. 123, 1–15 (1998).
36. Medzhitov, R. & Janeway Jr., C. A. Decoding the patterns of self and nonself by the innate immune system. Science 296, 298–300 (2002).
37. Laird, D. J., De Tomaso, A. W., Cooper, M. D. & Weissman, I. L. 50 million years of chordate evolution: seeking the origins of adaptive immunity. Proc. Natl. Acad. Sci. USA 97, 6924–6926 (2000).
38. Sanger, F., Nicklen, S. & Coulson, A. R. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463–5467 (1977).
39. Wilkinson, D. G. In Situ Hybridization: A practical approach (Oxford University Press, New York, NY, 1999).
40. Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994).
