structure, organization, and controlled expression of the genes coding for the variable and constant...



Immunological Rev. (1977), Vol. 36

Published by Munksgaard, Copenhagen, Denmark
No part may be reproduced by any process without written permission from the author(s)

Structure, Organization, and Controlled Expression of the Genes Coding for the Variable and Constant Regions of Mouse Immunoglobulin Light Chains

ISRAEL SCHECHTER¹, YIGAL BURSTEIN² & RONALD ZEMELL¹

INTRODUCTION

The problem of antibody diversity now narrows to the DNA sequences coding for the variable (V)-regions of immunoglobulin (Ig) chains. Resolution of this problem at the molecular level requires knowledge of the structure, organization and mechanisms controlling expression of the V-genes. Gene expression is fairly well understood in prokaryotes, mainly because many mutants can be easily generated and highly sophisticated techniques have been developed to select and to characterize these mutants. These methodologies are not yet sufficiently advanced for eukaryotic cells. An alternative approach is to use the mRNA molecule to analyze the expression of a distinct gene at the levels of the genome, transcription and translation. Since the mRNA is the key material in these studies, it should be obtained at the highest purity. To this end we developed an immunoprecipitation technique that separates, from the total polysome population, the polysome fraction engaged in the synthesis of a distinct protein, to be used as a source for preparing pure mRNA (Schechter 1973, 1974). The L-chain mRNA isolated from mouse myeloma polysomes specifically precipitated with anti-L-chain antibodies was estimated to be over 95 % pure by several independent criteria. These included the precipitation of myeloma and non-myeloma polysomes, each with anti-L-chain and anti-non-L-chain antibodies; estimation of trace contamination of the pure L-chain mRNA by non-L-chain mRNAs present in large abundance in the myeloma cell (Schechter 1973); kinetic analyses of hybrid formation between the mRNA and its complementary DNA (Schechter 1975a, b); and the quantitative yield data from amino acid sequence analyses of the total proteins programmed by the mRNA (Burstein et al. 1976, Schechter & Burstein 1976a). In this review we present and discuss investigations of the Ig genes, conducted in our laboratory with the aid of these purified mRNAs. The studies center on two topics: 1) Studies on the structure and function of the Ig precursor, which is the immediate product of mRNA translation. This line of research has provided new information on the structure of the V-genes, and has stimulated new ideas concerning the organization and controlled expression of Ig genes. 2) Enumeration of Ig genes, in an attempt to decide between the germ line and somatic mutation hypotheses.

Departments of Chemical Immunology¹ and Organic Chemistry², The Weizmann Institute of Science, Rehovot, Israel. Dr. R. Zemell is a recipient of a fellowship from the Medical Research Council of Canada.
Abbreviations: MOPC-41, MOPC-63, MOPC-104E, MOPC-315 and MOPC-321 myelomas are abbreviated to M-41, M-63, M-104E, M-315 and M-321, respectively.

STRUCTURE AND PRESUMED FUNCTIONS OF IMMUNOGLOBULIN PRECURSOR PROTEINS

Early studies have shown that the mRNA molecules coding for mouse myeloma L-chains direct the cell-free synthesis of proteins larger than the mature L-chain by about 20 amino acid residues, as estimated from electrophoretic mobility in SDS-polyacrylamide gels (Milstein et al. 1972, Swan et al. 1972, Mach et al. 1973, Schechter 1973, Tonegawa & Baldi 1973). The possibility that this protein was an L-chain precursor was initially indicated by the fact that its tryptic fingerprint was composed almost entirely of L-chain peptides. To determine the position (N- or C-terminal end), precise size, and structure of the extra peptide segment in the precursor, it was necessary to subject it to radioactive amino acid sequence analysis. When we started these studies (Schechter 1973) it was not at all clear that meaningful sequence data could be obtained. The sample analyzed contained a minute amount of the labeled precursor (0.1 pmol, 2.5 ng), whose sequence was monitored, and a two-million-fold excess of the protein carrier (290 nmol, 5 mg) (Schechter & Burstein 1976a). One could envisage difficulties in recovering 0.1 pmol, which is the maximal amount of radioactive amino acid derivative released from the precursor. It could be adsorbed on the carrier, or released gradually, depending on the type of amino acid derivative released from the protein carrier at each sequencer cycle. When appropriate precautionary measures were observed, however, it was possible to determine faithfully the primary structure of the precursor, as is evident from the following findings: discrete radioactive peaks were recovered from sequencer runs of the labeled precursor; in the semi-log plot the peaks lay on a straight line (Figure 1), thus showing that they originated from one protein species (Smithies et al. 1971); and the analyses of L-chain precursors labeled with 20 radioactive amino acids were unambiguously interpreted as a distinct sequence of fairly long peptide segments in which the known sequence of the mature protein was correctly identified (e.g. Figure 2 and Burstein & Schechter 1977a).
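The sample quantities quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the four figures given in the text (0.1 pmol and 2.5 ng of precursor, 290 nmol and 5 mg of carrier); the variable names are illustrative. It suggests that the "two-million-fold" excess is a mass ratio, and it derives the molecular weights implied by the quoted amounts.

```python
# Quantities quoted in the text (Schechter & Burstein 1976a).
precursor_mol = 0.1e-12   # 0.1 pmol of labeled precursor
precursor_g = 2.5e-9      # 2.5 ng of labeled precursor
carrier_mol = 290e-9      # 290 nmol of protein carrier
carrier_g = 5e-3          # 5 mg of protein carrier

# Fold excess of carrier over precursor, by mass and by moles.
mass_excess = carrier_g / precursor_g
molar_excess = carrier_mol / precursor_mol

# Molecular weights implied by the quoted mass/mole pairs (g/mol).
precursor_mw = precursor_g / precursor_mol
carrier_mw = carrier_g / carrier_mol

print(f"mass excess:  {mass_excess:.2e}-fold")
print(f"molar excess: {molar_excess:.2e}-fold")
print(f"implied precursor M.W.: {precursor_mw:,.0f}")
print(f"implied carrier M.W.:   {carrier_mw:,.0f}")
```

The implied precursor molecular weight (about 25,000) is in line with an L-chain carrying a short extra piece.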

Amino acid sequence analyses established that the L-chain mRNA molecules program the cell-free synthesis of precursors in which extra peptide segments (19-22 residues long) precede the N-termini of the mature proteins. At this time we have determined partial sequences of the N-terminal extra pieces of two κ and two λ L-chain precursors, and complete sequences of the N-terminal extra pieces of the M-41 (κ) and M-104E (λ1) L-chain precursors (Figure 3). It should be noted that the size of the N-terminal extra piece is securely defined even though the sequence is incomplete, because the mature protein's sequence is reached after a defined number of degradation cycles (Schechter & Burstein 1976a). We shall discuss implications of the sequence data and other experiments on the structure, organization and controlled expression of Ig genes.

Figure 1. Radioactivity recovered at each sequencer cycle from M-321 L-chain precursor programmed by M-321 L-chain mRNA in the wheat germ cell-free system, and labeled with [3H]leucine. A, Radioactivity recovered at each sequencer cycle from sample without added mRNA. B, Radioactivity recovered at each sequencer cycle from sample with added mRNA. C, Data of B after correction for background and out-of-step degradation. D, Semi-log plot of corrected data. Cycle zero represents a blank cycle (without phenylisothiocyanate) which was used to wash out potential radioactive contaminants. (Horizontal axis: sequencer cycle.)
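The straight-line criterion of Figure 1D follows from a constant repetitive yield: if each degradation cycle recovers a fixed fraction of the previous one, the counts released by a single protein species decay geometrically, so their logarithm is linear in the cycle number. The sketch below illustrates that reasoning only. The count values are synthetic (generated from the model itself, not taken from Figure 1), and the starting counts, repetitive yield and function name are assumptions made for illustration; the cycle numbers are the Leu positions of the M-321 extra piece shown in Figure 3.

```python
import math

def fit_semilog(cycles, counts):
    """Least-squares fit of ln(counts) = intercept + slope * cycle.

    Returns the slope, the intercept and the largest absolute residual,
    which is near zero when the points lie on one straight line.
    """
    ys = [math.log(c) for c in counts]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    worst = max(abs(y - (intercept + slope * x)) for x, y in zip(cycles, ys))
    return slope, intercept, worst

# Cycles at which Leu appears in the M-321 extra piece (Figure 3).
leu_cycles = [6, 7, 8, 11, 12, 13]

# SYNTHETIC counts: hypothetical starting value and repetitive yield.
initial_cpm = 1000.0
repetitive_yield = 0.95
counts = [initial_cpm * repetitive_yield ** c for c in leu_cycles]

slope, intercept, worst = fit_semilog(leu_cycles, counts)
estimated_yield = math.exp(slope)  # per-cycle yield recovered from the slope

print(f"estimated repetitive yield: {estimated_yield:.3f}")
print(f"largest deviation from the line (ln units): {worst:.2e}")
```

With real data, peaks contributed by a second protein species would start from a different intercept and show up as points well off the fitted line.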

The Variable-Region Gene may be Larger than Hitherto Known; Evidence from the Structure of the N-Terminal Extra Piece of Light-Chain Precursors

In the precursor the N-terminal extra piece is coupled to the V-region of the mature L-chain. This finding immediately raises the question whether the extra piece is part of the V-region or whether it represents a new constant polypeptide segment of the Ig chain. This issue has been resolved by sequencing the precursors of L-chains of the same and different subgroups. The M-321, M-63 and M-41 L-chains are of the κ-type and they share an identical constant (C)-region. The M-321 and M-63 are of the same subgroup and differ in the V-region in eight out of 111 amino acid residues (7 %); the M-41 is of a different subgroup and differs extensively in the V-region from both M-321 (48 %) and M-63 (46 %) (McKean et al. 1973). Although the sequence data are incomplete, it is evident that the extra pieces of κ-precursors of different subgroups (M-41 versus M-321 and M-63) differ in size and sequence; in κ-precursors of the same subgroup (M-321 and M-63) the extra pieces are of the same size and so far they share an identical partial sequence (Table I). These findings strongly indicate that the N-terminal extra piece is part of the V-region; the implication at the genome level is that the V-region gene may be larger than hitherto known (Burstein & Schechter 1976).

Mature     (residues 1-5):    <Glu-Ala-Val-Val-Thr-
Precursor  (residues 1-24):   Met-Ala-Trp-Ile-Ser-Leu-Ile-Leu-Ser-Leu-Leu-Ala-Leu-Ser-Ser-Gly-Ala-Ile-Ser-Gln-Ala-Val-Val-Thr-
Mature     (residues 6-21):   Glx-Glx-Ser-Ala-Leu-Thr-Thr-Ser-Pro-Gly-Glx-Thr-Val-Thr-Leu-Thr-
Precursor  (residues 25-40):  Gln-Glu-Ser-Ala-Leu- X - X -Ser-Pro- X -Glu- X -Val- X -Leu- X -

Figure 2. Alignment of the N-terminal amino acid sequence of the mature and precursor forms of M-104E λ1 L-chain. The primary structure of the precursor is based on sequence analyses of precursor molecules labeled with 20 radioactive amino acids, and programmed by the M-104E L-chain mRNA in the wheat germ cell-free system (Burstein et al. 1976, Burstein & Schechter 1977a). Sequence of the mature L-chain is from Appella (1971) and Cesari & Weigert (1973). X indicates a position in which the amino acid residue was not identified. <Glu, pyroglutamic acid.

TABLE I
Sequence heterogeneity at the N-terminal extra piece of L-chain precursors

                              Amino acid differences in
Light chain precursors      Extra piece       Variable region**   Hypervariable regions**
compared                    No.       (%)     No.        (%)      No.       (%)
M-41(κ)/M-321(κ)            15/22     68*     53/111     48       20/31     65
M-41(κ)/M-104E(λ1)          16/22     73      64/110     58       23/30     77
M-321(κ)/M-104E(λ1)         14/20     70*     68/111     61       24/31     77
M-104E(λ1)/M-315(λ2)        3/19      16*     11/110     10       4/26      15

* Numbers represent minimal values since they are based on partial sequences given in Figure 3. The data compared are obtained from precursors labeled with the same radioactive amino acids.
** Numbers for the entire variable-region and for the hypervariable regions are based on known sequences of the mature L-chains.

In M-41 and M-321 precursors the extent of variability of the extra piece (at least 68 %) is greater than it is in the entire V-region (48 %), and it is comparable to the variability found when only the hypervariable regions (Wu & Kabat 1970) of these L-chains are examined (65 %) (Table I). This suggests that the DNA segment coding for the extra piece may be a 'hot spot' with an accelerated mutation rate, similar to the DNA segment coding for the hypervariable regions (Schechter & Burstein 1976b).
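The percentages in Table I follow directly from the difference counts, which makes the comparison easy to recompute. The sketch below is only a consistency check, not part of the original analysis. The first and last row labels are not legible in the flattened table and are assigned here from the surrounding discussion, so they should be read as an assumption; the dictionary layout and names are illustrative.

```python
# (differences, positions compared) per region, as listed in Table I.
# Row labels 1 and 4 are ASSUMED from the text discussing the table.
table_1 = {
    "M-41(kappa)/M-321(kappa)":       {"extra": (15, 22), "V": (53, 111), "hv": (20, 31)},
    "M-41(kappa)/M-104E(lambda1)":    {"extra": (16, 22), "V": (64, 110), "hv": (23, 30)},
    "M-321(kappa)/M-104E(lambda1)":   {"extra": (14, 20), "V": (68, 111), "hv": (24, 31)},
    "M-104E(lambda1)/M-315(lambda2)": {"extra": (3, 19),  "V": (11, 110), "hv": (4, 26)},
}

def percent(differences, total):
    """Percentage of differing positions, rounded to a whole number."""
    return round(100 * differences / total)

percentages = {
    pair: {region: percent(*counts) for region, counts in regions.items()}
    for pair, regions in table_1.items()
}

for pair, row in percentages.items():
    print(f"{pair:32s} extra {row['extra']:3d} %   "
          f"V-region {row['V']:3d} %   hypervariable {row['hv']:3d} %")
```

In the three κ and κ/λ comparisons the extra piece varies more than the whole V-region and about as much as the hypervariable regions, which is the pattern the text describes as a possible 'hot spot'.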

The results obtained with λ-precursors corroborate the conclusions deduced from κ-precursors. The λ1 and λ2 V-regions (110 residues long) are considered as members of the same subgroup (Dugan et al. 1973). When compared to M-104E λ1 (Appella 1971), the V-region of RPC-20 λ1 differs in one residue (Weigert et al. 1970) and the V-region of M-315 λ2 differs in 11 residues (Dugan et al. 1973). Sequence analyses of the precursors labeled with the same six radioactive amino acids (Met, Ala, Ser, Thr, Glu, Gln) enabled determination of 48-53 % of the sequence of their N-terminal extra pieces (Burstein & Schechter 1977b). Alignment of the partial sequence data (Figure 3) shows that the N-terminal extra pieces of the two λ1 precursors have (so far) identical partial sequence; the extra piece of the λ2-precursor differs from these in at least three out of 19 positions. It is also seen that the λ-extra pieces (19 residues long) are shorter than the κ-extra pieces (20 and 22 residues long). The complete sequence of the extra piece of the λ1 M-104E precursor was determined (Burstein & Schechter 1977a) and it is compared with the κ-extra pieces. Although the V-regions of λ1 and κ chains differ extensively in sequence, it is seen that the extent of sequence heterogeneity between the extra pieces of λ1 and κ (70-73 %) is somewhat larger than it is between their V-regions (58-61 %), and it approaches the variability between their hypervariable regions (77 %) (Table I).

The V-regions of mouse κ chains exhibit a high degree (up to 50 %) of sequence heterogeneity (McKean et al. 1973). In contrast, the V-regions of mouse λ chains are extremely uniform in structure. Eight of 12 λ1 chains studied are indistinguishable from each other, and the other four differ only in 1, 2 or 3 positions at the V-region (Cesari & Weigert 1973). The V-region of M-315 λ2 L-chain differs from the most deviant λ1 V-region at 14 positions (13 %). The restricted heterogeneity of the λ chains is also indicated by the fact that they comprise only 3-5 % of the normal L-chain population in mice (McIntire & Rouse 1970). This can be due to a small pool of V-genes or to restricted expression of a larger pool of λ V-genes. Clarification of this problem might be achieved by determining the structure of several λ-precursors. Evidence supporting the possibility of a larger pool of V-genes will be obtained if we find sequence heterogeneity in the extra piece of precursors of λ1 L-chains which share identical V-regions.

The ancient evolutionary origin of the N-terminal extra piece is indicated by the fact that it is present in precursors of κ and λ chains, which diverged about 170 million years ago, and it was found in the precursor of an H-chain (see below), which diverged from L-chains about 400 million years ago (Dayhoff 1972). The long ancestry of the extra piece is also indicated by findings suggesting that expression of the Ig gene is contingent on the N-terminal extra piece (see below).

The extra pieces of M-321 and M-63 contain repetition of the Leu-Leu-Leu-Trp-Val sequence (the probability that this sequence occurs by chance is 1.6 × 10 to a negative power [exponent illegible]), suggesting duplication of a short DNA-segment in the structural gene coding for these precursors. Duplication with inversion is also indicated from inverted repetition of the Phe-Leu-Leu sequence in the extra piece of the M-41 precursor (Figure 3). Duplication of a DNA-segment may be the result of crossing over between misaligned genes. Alternatively, the repetitions could have arisen from a strong selective pressure for hydrophobic residues in the extra piece (see below). To the best of our knowledge there is only one clear example, reported by Smith (1973), showing repetition of a sequence of six amino acids at positions 1-6 and 13-18 in the mature MPC-11 L-chain.

Evidence for Carboxy-Terminal Extra Piece in the Light Chain Precursor

Sequence analyses established that L-chain precursors contain N-terminal extra pieces. We now present indirect evidence suggesting an additional extra piece at the C-terminus of the L-chain precursor.

Position:        1               5                   10                  15                  20
M-41 κ:          Met-Asp-Met-Arg-Ala-Pro-Ala-Gln-Ile-Phe-Gly-Phe-Leu-Leu-Leu-Leu-Phe-Pro-Gly-Thr-Arg-Cys | mature L-chain
M-321 κ:         Met- X -Thr- X -Thr-Leu-Leu-Leu-Trp-Val-Leu-Leu-Leu-Trp-Val-Pro- X -Ser-Thr- X | mature L-chain
M-63 κ:          Met- X - X - X - X -Leu-Leu-Leu- X - X -Leu-Leu-Leu- X - X -Pro- X -Ser- X - X | mature L-chain
M-104E λ1:       Met-Ala-Trp-Ile-Ser-Leu-Ile-Leu-Ser-[-]-Leu-Leu-Ala-Leu-Ser-Ser-Gly-Ala-Ile-Ser | mature L-chain
RPC-20 λ1:       Met-Ala- X - X -Ser- X - X - X -Ser-[-]- X - X -Ala- X -Ser-Ser- X -Ala- X -Ser | mature L-chain
M-315 λ2:        Met-Ala- X -Thr-Ser- X - X - X -Ser-[-]- X - X -Ala- X - X -Ser- X -Ala-Ser-Ser | mature L-chain

Figure 3. Sequence homology between the N-terminal extra pieces of precursors of mouse immunoglobulin L-chains. Complete sequences of the extra pieces of M-104E (Burstein et al. 1976, Burstein & Schechter 1977a) and M-41 (Schechter & Burstein 1976b, Burstein & Schechter 1977a) are based on analyses of precursors labeled with 20 radioactive amino acids. The M-41 mRNA directs the synthesis of another precursor with a shorter extra piece in which the two N-terminal residues (Met-Asp) are missing; in the other 20 positions both extra pieces have identical sequences (data not shown, see Schechter & Burstein 1976c). The partial sequence of the M-321 extra piece is based on analyses of precursor labeled with 12 radioactive amino acids (Met, Ala, Val, Leu, Ile, Pro, Thr, Cys, Phe, Tyr, Trp and Ser; Schechter 1973, Schechter & Burstein 1976a, b). The partial sequence of the M-63 extra piece is based on analyses of precursor labeled with six radioactive amino acids (Met, Ala, Leu, Ile, Pro and Ser; Burstein & Schechter 1976). The partial sequences of the extra pieces of RPC-20 and M-315 are based on analyses of precursors labeled with the same six radioactive amino acids (Met, Ala, Thr, Glu, Gln and Ser; Burstein & Schechter 1977b). For maximal homology between the complete sequences of M-41 and M-104E extra pieces, a gap (indicated by a bar, shown here as [-]) was inserted in the M-104E sequence. X, an amino acid not yet identified.

In the Krebs II ascites and wheat germ cell-free systems the L-chain mRNAs often direct the synthesis of proteins smaller than the mature L-chains. However, when these proteins were isolated and sequenced, all of them were found to contain the N-terminal extra piece, followed by the sequence of the mature L-chain (Schechter et al. 1975, Schechter & Burstein 1976c). These studies strongly indicate initiation of mRNA translation at one point, with premature termination of translation at various points. Consequently, all proteins represent precursor molecules with intact N-terminal extra piece, but they differ in size because they lack portions at the C-terminus due to incomplete synthesis of the L-chain. Occasionally, however, we observed cell-free products which were even larger than the precursor. The M-321 mRNA directs the synthesis of a protein with M.W. of 28,700 (Schechter 1973). This protein was isolated, sequenced, and was found to contain the expected M-321 N-terminal extra piece, which is 20 residues long (Schechter et al. 1975). The M.W. of the mature M-321 L-chain is 24,020 (determined from primary structure of M-321 L-chain, McKean et al. 1973), i.e., the 28,700 dalton precursor is larger by about 40 residues. These findings suggest that the M-321 mRNA contains translatable information for an additional extra peptide segment at the C-terminus of the mature L-chain (Schechter et al. 1975), which is about 20 residues long (40 minus 20). The presumed C-terminal extra piece may be longer than 20 residues because of premature termination of mRNA translation in the cell-free system. The possibility that the size of the M-321 precursor was overestimated is somewhat unlikely. The M.W. was calculated from electrophoretic mobility in SDS-polyacrylamide gels in which the precursor and mature L-chain marker were applied together in the same slot of the gel (Schechter 1973). Experiments in progress in our laboratory provide further support for the existence of a C-terminal extra piece. It will be interesting to determine the structure of the C-terminal extra piece to obtain insight into the additional genetic information encoded in (or joined to) the C-gene.
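The size estimate for the presumed C-terminal extra piece can be retraced from the figures quoted in this review. The sketch below is a rough reconstruction rather than the authors' own calculation: it assumes that the extra segments have the same mean residue mass as the mature M-321 L-chain, and it takes the mature chain length (218 residues) from Table II. Variable names are illustrative.

```python
precursor_mw = 28_700      # M.W. of the large M-321 cell-free product
mature_mw = 24_020         # M.W. of the mature M-321 L-chain
mature_residues = 218      # mature M-321 chain length (Table II)
n_terminal_extra = 20      # sequenced N-terminal extra piece, in residues

# ASSUMPTION: the extra segments have the same mean residue mass
# as the mature chain (about 110 daltons per residue).
mean_residue_mass = mature_mw / mature_residues

extra_residues = (precursor_mw - mature_mw) / mean_residue_mass
c_terminal_extra = extra_residues - n_terminal_extra

print(f"mean residue mass:        {mean_residue_mass:.1f}")
print(f"total extra residues:     {extra_residues:.1f}")
print(f"presumed C-terminal part: {c_terminal_extra:.1f}")
```

Under this assumption the excess comes to a little over 40 residues, consistent with the "about 40" and "about 20" quoted in the text; a gel-based M.W. carries enough uncertainty that the result should be read as an order-of-magnitude check.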

Precursor of Immunoglobulin Heavy Chain

The N-terminal extra piece of L-chain precursors may be involved in the regulation of Ig production. It is also believed that H-chains and L-chains interact intracellularly (Askonas & Williamson 1967, Scharff et al. 1967). Therefore, it is important to determine whether the H-chain is also synthesized in the form of a precursor. Below we describe recent experiments which strongly indicate that there is a precursor of H-chain (Burstein et al. 1977b).

The mRNA coding for the alpha-H-chain (αH) was isolated from polysomes of the M-315 myeloma with the aid of the double-antibody technique (Schechter 1974). In the wheat germ cell-free system this mRNA programmed the synthesis of many protein bands ranging in size up to about 60,000 daltons. Since premature termination of mRNA translation frequently occurs in the wheat germ extract (Schechter & Burstein 1976c), we decided to study the fraction of protein products ranging in size from 60,000 to 40,000 daltons. This fraction, labeled with six radioactive amino acids (Leu, Pro, Ile, Val, Glu, Gln), was isolated from the total cell-free products. Sequence analyses of the 60,000-40,000 dalton protein fraction showed that the partial sequence determined after 18 degradation cycles (Gln-Leu-Gln-Glu-X-X-Pro-X-Leu-Val-X-Pro-X33; X is unknown) was completely homologous with the known sequence of the mature H-chain (Asp1-Val-Gln-Leu-Gln-Glu-Ser-Gly-Pro-Gly-Leu-Val-Lys-Pro-Ser15, from Francis et al. 1974). These findings establish that the αH-mRNA directs the synthesis of a precursor in which an extra piece, at least 18 residues long, precedes the N-terminus of the mature H-chain. This N-terminal extra piece is markedly hydrophobic (it contains about five leucines, two isoleucines, two valines and one proline), similar to the extra pieces of L-chain precursors. The partial sequences of the N-terminal extra pieces of the precursors of the α-H-chain and λ2-L-chain, which are produced by the M-315 myeloma, are different (unpublished data). Analyses of several H-chain precursors should clarify the relationship between this N-terminal extra piece and the V-region.
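The comparison between the partial precursor sequence and the mature H-chain can be expressed as a wildcard match, with each unidentified position (X) allowed to pair with any residue. The sketch below uses the two sequences quoted above. It assumes that the superscript on the last listed residue (X33) is its position in the precursor, and the function name is illustrative.

```python
mature_h = "Asp-Val-Gln-Leu-Gln-Glu-Ser-Gly-Pro-Gly-Leu-Val-Lys-Pro-Ser".split("-")
partial = "Gln-Leu-Gln-Glu-X-X-Pro-X-Leu-Val-X-Pro-X".split("-")

# ASSUMPTION: the last listed residue is precursor position 33.
last_precursor_position = 33

def matching_offsets(query, reference):
    """Return 0-based offsets where query fits reference; 'X' matches anything."""
    hits = []
    for offset in range(len(reference) - len(query) + 1):
        window = reference[offset:offset + len(query)]
        if all(q == "X" or q == r for q, r in zip(query, window)):
            hits.append(offset)
    return hits

offsets = matching_offsets(partial, mature_h)

# Precursor position of the first listed residue, and the mature
# position it aligns with (converted to 1-based numbering).
first_precursor_position = last_precursor_position - len(partial) + 1
first_mature_position = offsets[0] + 1

extra_piece_length = first_precursor_position - first_mature_position

print("matching offsets:", offsets)
print("precursor position", first_precursor_position,
      "aligns with mature position", first_mature_position)
print("residues preceding the mature N-terminus:", extra_piece_length)
```

The match is unique, and the offset reproduces the 18 residues stated in the text. The unidentified positions fall on Ser, Gly and Lys, none of which was among the six labeled amino acids.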

Evidence Supporting the Intracellular Synthesis of Immunoglobulin Precursors

The finding of N-terminal Met in all precursors investigated (Figure 3) suggests that these molecules represent the immediate product of mRNA translation within the cell, while the possibility that they are translational artifacts occurring in vitro is unlikely. These precursors were synthesized using the wheat germ (Burstein & Schechter 1976) and Krebs II ascites (Schechter 1973) cell-free systems, which are known to translate different mRNAs with fidelity. It could be argued that, within the cell, initiation of mRNA translation is different, resulting in direct synthesis of the mature L-chain. However, considering the role of unblocked Met in the initiation of protein synthesis in eukaryotes (Wigle & Dixon 1970), we would then have expected to find a Met residue in the precursor molecule at a position just before the N-terminal residue of the mature protein. The sequence data rule out this possibility since this position is occupied by Cys in M-41, Ser in the λ1 precursors, or by amino acids other than Met in the other precursors (Figure 3). Recent experiments, in which we proved that the N-terminal Met is the initiator residue, provide strong evidence that the precursor is synthesized within the cell. Briefly, precursors were synthesized in the presence of [35S]Met-tRNA(Met) species as sole source of label; one was the initiator-tRNA(Met), the other was the internal-tRNA(Met). The precursors were sequenced, and in all cases [35S]Met was recovered at the N-terminal position only when the initiator-[35S]Met-tRNA(Met) was used. The internal-[35S]Met-tRNA(Met) did not label the N-terminal Met but inserted [35S]Met inside the polypeptide chain (Zemell et al. 1977). It is therefore expected that translation of the L-chain mRNA should be contingent on the nucleotide sequence coding for the N-terminal extra piece. This sequence is located at the 5' end of the mRNA, where mRNA translation is initiated. In agreement, it was shown that L-chain mRNA molecules which are deficient at the 5' end are untranslatable in a cell-free system (Schechter 1975b). The finding that myeloma polysomes, incubated in vitro, synthesize precursor molecules (Milstein et al. 1972, Blobel & Dobberstein 1975), and preliminary experiments showing sequences of N-terminal extra piece on nascent chains in intact myeloma cells (Zemell, Burstein & Schechter, unpublished data), further support the notion that the precursor molecules are formed within the lymphoid cell. The vast majority of precursor molecules, however, should be short lived within the cell and processed to the mature protein (cleavage of the extra piece) rather quickly, because only mature L-chains have been isolated from myeloma tumors or body fluids (Svasti & Milstein 1972, Schechter 1973). Therefore, the cleavage step(s) converting the precursor to the mature protein may provide a point of metabolic control to regulate L-chain secretion (Burstein & Schechter 1976).

It is generally thought that secretory proteins are synthesized on microsomes. The 'signal hypothesis' postulates that signal peptides located at the N-termini of secretory proteins direct polysomes synthesizing these proteins to the endoplasmic membranes, the growing nascent chains are vectorially discharged across the microsome membrane, and the signal peptides are then cleaved (Blobel & Dobberstein 1975). It was proposed that the modified N-termini of L-chain precursors may serve as the signalling device directing free polysomes synthesizing L-chains to become attached to the endoplasmic membranes (Milstein et al. 1972, Blobel & Dobberstein 1975, Schechter et al. 1975). The marked hydrophobicity of the N-terminal extra pieces supports these proposals (Schechter & Burstein 1976b). It should be realized, however, that attempts to obtain mature L-chain by the in vitro incubation of precursors with microsomes or endoplasmic membranes have not yet been successful (Milstein et al. 1972, Blobel & Dobberstein 1975). We are currently studying the possibility that cleavage of the extra piece occurs during (rather than after) synthesis of the precursor. The sequence variability in the extra piece at the junction with the mature protein (e.g. in the M-41 extra piece the C-terminal sequence is Thr-Arg-Cys, whereas in the other extra pieces this region is occupied by different sequences) does not point to a single endoproteolytic event. Yet, the enzyme(s) involved in the maturation process may recognize common regions further removed from the point of cleavage (Berger & Schechter 1970), either on the extra piece and/or on the N-terminus of the mature L-chain.

TABLE II
Hydrophobic residues in immunoglobulin L-chain precursors, membrane bound proteins, and secretory proteins

                          Residues                     Residues
Protein                   Hydrophobic/total   (%)      Hydrophobic/total   (%)

                          Extra piece                  Mature L-chain
M-41 κ-precursor          16/22               73       97/213              46
M-321 κ-precursor         15/20               75       107/218             49
M-104E λ1-precursor       13/19               69       118/215             55

                          Hydrophobic domain           Exposed portion
glycophorin               17/23               74       60/124              48
cytochrome b5             29/40               72       40/97               41

                          Secretory proteins
lactalbumin (bovine)      60/123              49
trypsinogen (bovine)      110/229             48

Numbers for the extra pieces are calculated from data given in Figure 3. Data for the mature L-chains are from published sequences (M-41, Gray et al. 1967; M-321, McKean et al. 1973; M-104E, Appella 1971), glycophorin from Segrest et al. (1972, 1973), cytochrome b5 from Spatz & Strittmatter (1971), lactalbumin and trypsinogen from Dayhoff (1972).

Marked Hydrophobicity of the N-Terminal Extra Piece; Possible Physiological Functions of the Precursor

Despite the fact that the extra pieces exhibit sequence heterogeneity, all were found to contain large (and seldom observed) clusters of leucines: two triplets in M-321 and M-63, a quadruplet in M-41, and five closely gathered leucines in the extra piece of the M-104E L-chain precursor. This indicated that the extra piece would be quite hydrophobic (Schechter et al. 1975), as indeed was found when all hydrophobic residues (Tanford 1962) were analyzed (Schechter & Burstein 1976b). The percentage of hydrophobic residues in the extra piece (69-75 %) is considerably larger than their percentage in the mature L-chain (46-55 %) or in other secretory proteins (48-49 %) (Table II). The clustering of hydrophobic residues in a distinct region (i.e. the extra piece) of the precursor is reminiscent of the molecular topography of membrane proteins. For example, in glycophorin the hydrophobic domain embedded in the membrane and the portion of the molecule exposed to the surrounding environment contain, respectively, 74 % and 48 % apolar residues (Segrest et al. 1972, 1973). The complete sequences of the precursors further corroborate this resemblance. No charged residues occur in the hydrophobic domain of glycophorin (Segrest et al. 1972). Similarly, the entire extra piece of the M-104E λ-precursor, and a stretch of 16 amino acids (Ala5-Thr20) in the extra piece of the M-41 κ-precursor, are devoid of any charged residues (Figure 3). These structural similarities suggest that the role of the hydrophobic extra piece is to favor interaction of the precursor with cell membranes, in a manner similar to the function of the hydrophobic domain of membrane proteins. This analogy has led us to suggest the hypothesis that the Ig precursor, which seems to be the immediate product of mRNA translation within the cell, is the common intermediate for secreted Ig and for the antigen-recognizing receptor. In plasma cells, which synthesize large amounts of antibodies, most of the precursor molecules are directed to the endoplasmic membranes where they undergo the maturation process (cleavage of the extra piece) to yield mature Ig destined for secretion. In lymphocytes, most of the precursor molecules may remain as such, and are anchored via the hydrophobic extra piece in the cell-surface membrane to serve as the antigen-recognizing receptor (Schechter & Burstein 1976b).
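The structural statements about the two completely sequenced extra pieces can be checked against the sequences in Figure 3. The sketch below counts leucine runs, finds the longest stretch free of charged residues, and tallies hydrophobic residues. Two points are assumptions rather than facts from the source: the set treated as charged (Asp, Glu, Lys, Arg, His), and the hydrophobic set, which was chosen here because it reproduces the counts in Table II and may not match the criteria of Tanford (1962) exactly. Function and variable names are illustrative.

```python
# Complete extra-piece sequences as given in Figure 3.
m41 = ("Met-Asp-Met-Arg-Ala-Pro-Ala-Gln-Ile-Phe-Gly-Phe-"
       "Leu-Leu-Leu-Leu-Phe-Pro-Gly-Thr-Arg-Cys").split("-")
m104e = ("Met-Ala-Trp-Ile-Ser-Leu-Ile-Leu-Ser-Leu-Leu-Ala-"
         "Leu-Ser-Ser-Gly-Ala-Ile-Ser").split("-")

# ASSUMED residue classes (see the note above).
CHARGED = {"Asp", "Glu", "Lys", "Arg", "His"}
HYDROPHOBIC = {"Ala", "Val", "Leu", "Ile", "Met", "Phe",
               "Trp", "Tyr", "Pro", "Thr", "Cys"}

def longest_run(seq, keep):
    """Return (length, 1-based start) of the longest run where keep(residue) is true."""
    best_len, best_start, run_len, run_start = 0, 0, 0, 0
    for position, residue in enumerate(seq, start=1):
        if keep(residue):
            if run_len == 0:
                run_start = position
            run_len += 1
            if run_len > best_len:
                best_len, best_start = run_len, run_start
        else:
            run_len = 0
    return best_len, best_start

def hydrophobic_count(seq):
    """Number of residues belonging to the assumed hydrophobic set."""
    return sum(residue in HYDROPHOBIC for residue in seq)

m41_leu_run = longest_run(m41, lambda r: r == "Leu")
m41_uncharged = longest_run(m41, lambda r: r not in CHARGED)
m104e_uncharged = longest_run(m104e, lambda r: r not in CHARGED)
m104e_leu_positions = [i for i, r in enumerate(m104e, start=1) if r == "Leu"]

print("M-41 longest Leu run (length, start):", m41_leu_run)
print("M-41 longest uncharged stretch (length, start):", m41_uncharged)
print("M-104E uncharged stretch (length, start):", m104e_uncharged)
print("M-104E Leu positions:", m104e_leu_positions)
print("hydrophobic residues: M-41", hydrophobic_count(m41), "of", len(m41),
      "; M-104E", hydrophobic_count(m104e), "of", len(m104e))
```

The uncharged stretch in M-41 runs from position 5 to position 20, which corresponds to the 16-residue Ala-to-Thr segment mentioned in the text, and the whole M-104E extra piece is free of charged residues.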

A few lines of evidence supporting this hypothesis are given below. Myeloma tumors with few membrane-bound polysomes have large amounts of surface Ig and secrete little Ig, whereas tumors rich in membrane-bound polysomes have little surface Ig and secrete copious amounts of Ig (Andersson et al. 1974). Of particular interest is the report on the penicillinases of Bacillus licheniformis. This bacterium produces two types of penicillinase, a highly soluble exoenzyme (secreted to the medium) and a hydrophobic membrane-bound form, while containing only a single penicillinase structural gene. The membrane-bound enzyme differs from the exoenzyme in carrying at the N-terminus an additional peptide (25 residues long) containing phosphatidylserine. The latter grouping (containing esterified fatty acids) is responsible for anchoring the hydrophobic form of the enzyme to the cell membrane (Yamamoto & Lampen 1976).

We realize that much work is required to prove this hypothesis: demonstration that H- and L-chain precursors can associate to generate molecules with antigen binding activity, search for precursor molecules on the cell surface, etc. Nonetheless, experimental evaluation of this hypothesis should be greatly facilitated since membrane-bound Ig can be isolated, and the N-terminal sequence of this Ig can be determined and compared with known sequences of the mature Ig and of the precursor. If we find precursor molecules on the surface of B or T lymphocytes, we shall have to consider the possibility that by virtue of its sequence variability the N-terminal extra piece can also function as a new recognition system (Schechter & Burstein 1976b).

One can speculate on other possible functions of the precursor, and we mention two. 1) If it turns out that the precursor contains a hydrophobic C-terminal extra piece, this peptide could also be considered as a tool to anchor Ig chains on the surface membrane. 2) Although mature H- and L-chains recombine to generate tetrameric 7S molecules, it might be that the hydrophobic extra pieces would facilitate this process by serving as a nucleation point for chain association.

Independent Expression of the Gene Coding for the Constant Region of Immunoglobulin Light-Chain

Abundant evidence suggests that distinct genes code for the variable (N-terminal) and constant (C-terminal) parts of the Ig chain, and it has been repeatedly proposed that these genes should be joined prior to transcription to provide a single mRNA for the entire polypeptide chain (Dreyer et al. 1967, Gally & Edelman 1970). However, it was suggested that in some mouse myelomas V- or C-region protein fragments may be synthesized per se (Schubert & Cohn 1970, Kuehl & Scharff 1974). It has been reported that three clones derived from MPC-11 mouse myeloma synthesize a C-region fragment of κ L-chain of molecular weight 11,600 (Kuehl & Scharff 1974). It was suggested that these clones contain mRNA that directs the cell-free synthesis of a protein somewhat larger than the Cκ-region; this protein was tentatively identified as a precursor to the Cκ-region (Kuehl et al. 1975). These studies, however, did not determine the structure and position (N- or C-terminal) of the extra peptide in the presumed Cκ-precursor, nor did they provide evidence whether or not a single gene codes for both the fragment and the whole L-chain. Resolving this problem requires analysis of the 5' end of the mRNA, or sequence analysis of the immediate product of mRNA translation to determine if its N-terminus is similar to the V-region sequence preceding the C-region. We describe below purification of the Cκ-region mRNA, demonstrate that this mRNA programs the synthesis of a precursor of the Cκ-region, present the partial amino acid sequence of this precursor, and discuss the implications of these findings on the organization and controlled expression of the V- and C-genes (Burstein et al. 1977a).

The immunoprecipitation technique (Schechter 1974) was used to purify the Cκ-mRNA from two clones of the MPC-11 myeloma. Sequence analyses of the cell-free product labeled with ten radioactive amino acids showed that this mRNA directs the synthesis of a Cκ-precursor (M.W. about 15,000) in which an extra piece (17 residues long) precedes the N-terminal residue (Ala109) of the Cκ-region. Subsequent residues identified in the precursor show perfect sequence homology with the Cκ-region, i.e., 13 residues identified in the segment Ala18-Val42 of the precursor match the same residues in the segment Ala109-Val133 of the mature L-chain. On the other hand, the regions preceding this homology region are quite different; i.e., the extra piece segment in the Cκ-precursor (Met1-X17) bears no resemblance to any known sequence preceding residue Ala109 in whole L-chains, one of which (Asx92-Arg108) is given in Figure 4. The segment Asx92-Arg108 comprises a portion of the V-region; nonetheless, sequence data of several L-chains show that residues no. 98 to 108 are fairly conserved in mouse κ-chains (McKean et al. 1973). The sequence data and the identification of Met1 as the initiator residue (Zemell et al. 1977) establish that the mRNA coding for the Cκ-region could not have originated from the mRNA coding for the whole L-chain.

The N-terminal extra piece of the Cκ-precursor is highly enriched with hydrophobic residues (82 %, 14/17), it is 17 residues long, and contains N-terminal methionine (Figure 4). These structural features characterize the N-terminal extra pieces which are linked to the V-regions of whole L-chain precursors, i.e., they are markedly hydrophobic (69-75 % hydrophobic residues), are of comparable size (19-22 residues long), and all of them contain N-terminal methionine. In addition, the partial sequences of the Cκ-extra piece and the extra piece linked to the mature V-region in the M-321 L-chain precursor are identical, showing at least 70 % homology (Figure 4).

M-321 mature, residues 92-116:
  Asx-Gln-Asx-Pro-Trp-Thr-Phe-Gly-Ser-Gly-Thr-Lys-Leu-Glu-Ile-Lys-Arg-Ala-Asp-Ala-Ala-Pro-Thr-Val-Ser-
Cκ precursor, residues 1-25:
  Met- X -Thr-Asp-Thr-Leu-Leu-Leu-Trp-Val-Leu-Leu-Leu-Trp-Val-Pro- X -Ala-Asp-Ala-Ala-Pro-Thr-Val- X -
M-321 precursor, residues 1-25:
  Met- X -Thr- X -Thr-Leu-Leu-Leu-Trp-Val-Leu-Leu-Leu-Trp-Val-Pro- X -Ser-Thr- X - X -Ile-Val-Leu-Thr-
M-321 mature, residues 1-5 (aligned with residues 21-25 of the M-321 precursor):
  Asp-Ile-Val-Leu-Thr-

M-321 mature, residues 117-133:
  Ile-Phe-Pro-Pro-Ser-Ser-Glu-Gln-Leu-Thr-Ser-Gly-Gly-Ala-Ser-Val-Val-
Cκ precursor, residues 26-42:
   X -Phe-Pro-Pro- X - X - X - X -Leu- X - X - X - X - X - X -Val-Val-

Figure 4. Alignment of the precursor of the κ-type C-region with the precursor and mature forms of M-321 L-chain. The partial sequence of the Cκ-region precursor is based on analyses of precursor labeled with 10 radioactive amino acids (Met, Ala, Val, Leu, Pro, Thr, Phe, Trp, Asp and Arg, Burstein et al. 1977a). Details on M-321 precursor are given in Legend to Figure 3. Sequences of two regions (Asp1-Thr5, Asx92-Val133) of the mature M-321 L-chain are from McKean et al. 1973. X, an amino acid not yet identified. Arrow marks the end of the N-terminal extra piece and beginning of the mature L-chain sequence.
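The residue-by-residue comparison behind the statement that 13 identified residues of the Cκ-precursor match the mature Cκ-region can be sketched as follows. This is a minimal illustration, assuming the two segments as transcribed from Figure 4 (Ala18-Val42 of the precursor and Ala109-Val133 of mature M-321); the function name `compare_segments` is hypothetical and not part of the original work.

```python
# Segments transcribed from Figure 4 (assumption: transcription is accurate).
# "X" marks a position that was not identified in the labeled precursor.
CK_PRECURSOR_18_42 = (
    "Ala-Asp-Ala-Ala-Pro-Thr-Val-X-X-Phe-Pro-Pro-X-X-X-X-Leu-"
    "X-X-X-X-X-X-Val-Val"
)
M321_MATURE_109_133 = (
    "Ala-Asp-Ala-Ala-Pro-Thr-Val-Ser-Ile-Phe-Pro-Pro-Ser-Ser-Glu-Gln-Leu-"
    "Thr-Ser-Gly-Gly-Ala-Ser-Val-Val"
)


def compare_segments(precursor: str, mature: str) -> tuple[int, int]:
    """Return (identified positions, positions matching the mature chain)."""
    p_res = precursor.split("-")
    m_res = mature.split("-")
    if len(p_res) != len(m_res):
        raise ValueError("segments must have equal length")
    identified = 0
    matches = 0
    for p, m in zip(p_res, m_res):
        if p == "X":  # unidentified position: cannot be scored
            continue
        identified += 1
        if p == m:
            matches += 1
    return identified, matches


identified, matches = compare_segments(CK_PRECURSOR_18_42, M321_MATURE_109_133)
print(identified, matches)  # 13 13
```

Unidentified positions are skipped rather than counted as mismatches, which is why the comparison speaks only of the residues actually identified.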

These findings can be formally explained by the two genes-one Ig chain hypothesis, but they also raise the possibility that three genes may control the synthesis of one Ig-chain. According to the two gene hypothesis the extra piece-DNA (Xp-DNA) would be a constitutive part of the V-gene. We favor this model because the pattern of variability in the extra piece is closely linked to the V-region subgroups. The Cκ-region mRNA could have originated from translocation of the V-gene (that contains the Xp-DNA) to the C-gene, deletion of the entire segment of the mature V-gene, and 'end to end' repair of the remaining Xp-DNA to the C-gene. A related process is believed to occur in H-chain disease, except that short segments of the mature V-region are usually retained and are joined to the C-region of the defective H-chain (Frangione 1976). Another possibility is that only the Xp-DNA portion of the V-gene was translocated to the C-gene, and that this event is sufficient to permit transcription of the C-gene. This proposal contrasts with the view that transcription is made possible only after joining of the V- and C-genes has occurred.

The N-terminal extra piece sequences exhibit variability, and in this respect they might be considered as part of the V-region. However, this does not necessarily mean that in the genome the Xp-DNA is covalently linked to the V-gene at all times. Originally, the Xp-DNA may be separated from the mature V-gene, similar to the DNA of hypervariable regions presumed to be in episomes (Wu & Kabat 1970). This raises the intriguing speculation that in addition to the mature V- and C-genes, the Xp-DNA may represent a third distinct gene, designated Xp-gene. The presumed Xp-gene may be involved in the regulation of gene transcription: when linked to the mature V-gene it initiates a chain of events leading to whole L-chain mRNA formation; when attached to the C-gene it leads to its transcription to provide the C-region mRNA.

ENUMERATION OF IMMUNOGLOBULIN GENES

Any attempt to decide among the hypotheses advanced to explain the origin of antibody diversity must come to grips with the question of V-gene dosage. According to the somatic mutation hypothesis it is predicted that all members of the same subgroup are represented in the genome by only one V-gene copy structurally related to the subgroup prototype, i.e., the genome would contain a few V-genes. The germ line hypothesis predicts a V-gene for each member of the subgroup, i.e., the genome would contain a large number of V-genes. To resolve this issue it is necessary to determine the capacity of a nucleic acid probe corresponding to a distinct V-region to anneal and quantify V-genes coding for L-chains of the same and different subgroups.

Because of the redundancy of the genetic code many base substitutions may accumulate in the structural gene without affecting the amino acid sequence of the encoded protein. These synonymous codon changes, called neutral mutations, have been detected in prokaryotes and eukaryotes. Sequence analyses of RNA from bacteriophages MS2 and R17 showed that 4 % of the bases in their coat protein cistrons are different. Since the amino acid sequences of their coat proteins (129 residues long) are identical, all the base changes detected represent neutral mutations (Robertson & Jeppesen 1972). The variations in primary structures of histone IV (102 residues long) of different species can be explained by 0.6 % base substitutions in the structural gene. Yet, hybridization studies using a distinct histone mRNA and nuclear DNA from different species suggest 8 % base substitutions in the structural gene coding for histone IV (Farquhar & McCarthy 1973). Potentially neutral mutations which do not change the primary structure of the protein may involve up to 35 % base substitutions (Robertson & Jeppesen 1972). Therefore, it is not self-evident that V-genes of the same subgroup (coding for similar amino acid sequences) will cross hybridize. On the other hand, the reverse situation may also exist, i.e., the extent of variability may be smaller in the nucleotide sequence as compared to the amino acid sequence. This is because three bases specify one amino acid, and most amino acid replacements can be explained by one base substitution. Consequently, V-genes of different subgroups may potentially differ by 12 % base substitutions, while the amino acid sequences may differ by 36 %. Since polynucleotides with up to 15 % mismatched bases can form hybrids even under stringent conditions (Farquhar & McCarthy 1973), it is theoretically feasible that a distinct V-probe will hybridize with V-genes of different subgroups. Thus, it is clear that information on the extent of cross hybridization between V-genes is required for interpretation of the hybridization data with nuclear DNA.
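The arithmetic linking 36 % amino acid differences to 12 % base substitutions can be made explicit. The sketch below assumes, as the text does, that each amino acid replacement results from a single base change within its three-base codon and that no silent changes are present, so the figure is a minimum. The function names are hypothetical, and the 15 % tolerance is the value quoted from Farquhar & McCarthy (1973).

```python
BASES_PER_CODON = 3


def min_base_divergence(aa_divergence: float) -> float:
    """Minimum fraction of differing bases, assuming one base change per
    replaced amino acid and no silent (neutral) substitutions."""
    return aa_divergence / BASES_PER_CODON


def may_cross_hybridize(base_divergence: float, max_mismatch: float = 0.15) -> bool:
    """True if the divergence is within the mismatch tolerated in hybrids
    under stringent conditions (15 % as quoted in the text)."""
    return base_divergence <= max_mismatch


aa_diff = 0.36                      # amino acid differences between subgroups
base_diff = min_base_divergence(aa_diff)
print(f"{base_diff:.0%}")           # 12%
print(may_cross_hybridize(base_diff))
```

Neutral mutations would raise the true base divergence above this minimum, which is why the text treats cross hybridization between subgroups as only theoretically feasible.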

Cross Hybridizations Between Nucleic Acid Sequences Coding for V-Regions of the Same and Different Subgroups

A reasonable approach to determine the range of structural variation in V-gene sequences that can be quantified by a distinct V-region probe is to cross hybridize the complementary DNA (cDNA) containing the nucleotide sequence of a distinct V-gene with mRNAs coding for L-chains of the same and different subgroups.


As a starting material we used the M-321 L-chain mRNA (> 98 % pure, Schechter & Burstein 1976a). The size of this L-chain mRNA is about 440,000 daltons, as determined from electrophoretic mobility in polyacrylamide gels made in water or 98 % formamide (Schechter 1973). The M-321 L-chain mRNA served as template for the avian myeloblastosis virus reverse transcriptase to synthesize the highly labeled [3H]cDNA probe (4.4 X 10" dpm/ng). The molecular weight of the M-321 cDNA is about 280,000 (840 nucleotides long) as determined by alkaline sucrose gradient centrifugation, and from the extent of protection of the [125I]mRNA by the cDNA from ribonuclease digestion (Schechter 1975a). Specificity of the cDNA probe was ascertained from the lack of hybrid formation with a variety of non-L-chain mRNAs and rRNA at high Crt values (Figure 5). Also, the M-321 κ L-chain probe did not hybridize with λ L-chain mRNA (Schechter 1975a).

Figure 5. Cross hybridization analyses (abscissa: log Crt, mole × liter^-1 × sec). Kinetics of hybridization of M-321 cDNA with the following L-chain mRNAs: M-321p, M-321, M-63 and M-41. The controls were: globin mRNA and rRNA of rabbit reticulocytes; poly(A)-rich RNA and rRNA of mouse liver; total RNA of E. coli. Details of the mRNA preparations are given in Table III. Hybrids were assayed by staphylococcal nuclease (Kacian & Spiegelman 1974).

TABLE III
The cross hybridization of M-321 cDNA with mRNAs coding for immunoglobulin light-chains of the same and different subgroups

mRNA          Sample composition* (%)    Crt1/2 (mole × liter^-1 × sec)    Saturation    Tm
              mRNA        rRNA           Observed       Corrected**        level (%)     (°)
M-321p (κ)    95          5              2.7 × 10^-4    2.6 × 10^-4        97            87
M-321 (κ)     35          65             8.3 × 10^-4    2.9 × 10^-4        ...           87
M-63 (κ)      33          67             1.0 × 10^-3    3.3 × 10^-4        95            87
M-41 (κ)      37          63             1.2 × 10^-3    4.4 × 10^-4        75            87

* The amount of mRNA was determined from the scanning of stained polyacrylamide gels made in 98 % formamide (Schechter 1975b)

** The corrected Crt1/2 value was calculated by multiplication of the observed Crt1/2 by the fraction of mRNA in the sample (Schechter 1975a)

Of the nucleotide mass of the M-321 mRNA, 220,000 daltons are required to code for the V- and C-regions of the mature L-chain, 40,000 daltons to code for the extra pieces in the L-chain precursor (20 amino acid residues at the N-terminus and about 20 residues at the C-terminus, Schechter et al. 1975), about 65,000 daltons for the poly(A) at the 3' end of the mRNA (Brownlee et al. 1973), and a residual untranslated mass of about 115,000 daltons whose position has not yet been determined. In Figure 6 the mRNA and cDNA are aligned to scale in three possible arrangements of the mRNA: the untranslated mass was arbitrarily placed at the 3' (a) or 5' (b) end of the translated region, or equally divided between the two ends (c). Hybridization of the cDNA with mRNA molecules coding for L-chains of the same and different subgroups suggests that arrangement (c) in Figure 6 is the best approximation for the structure of the L-chain mRNA.
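The mass budget of the M-321 mRNA quoted above can be checked with a few lines of arithmetic. This is a sketch using only figures stated in the text; the average mass per nucleotide is an assumption inferred from the statement that the 280,000-dalton cDNA is 840 nucleotides long.

```python
# All masses in daltons, as quoted in the text.
MRNA_TOTAL = 440_000
components = {
    "V- and C-regions of mature L-chain": 220_000,
    "N- and C-terminal extra pieces": 40_000,
    "poly(A) at the 3' end": 65_000,
}

# Whatever is not accounted for is the untranslated mass.
untranslated = MRNA_TOTAL - sum(components.values())
print(untranslated)  # 115000

# Assumption: average mass per nucleotide inferred from 280,000 Da ~ 840 nt.
DALTONS_PER_NT = 280_000 / 840
mrna_length_nt = round(MRNA_TOTAL / DALTONS_PER_NT)
print(mrna_length_nt)  # 1320
```

On these figures roughly a quarter of the mRNA mass is untranslated, which is why its placement relative to the coding region matters for deciding how much of the V-region the cDNA covers.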

The M-321, M-63 and M-41 mouse myeloma L-chains are of the κ-type. M-321 and M-63 are of the same subgroup and differ in the V-region in only eight out of 111 amino acid residues; M-41 belongs to a different subgroup and differs from the M-321 V-region in 53 positions (McKean et al. 1973). The M-321 cDNA was annealed with the mRNAs coding for these L-chains, and the results (Figure 5) are summarized in Table III.

The cDNA annealed with the κ-type mRNAs of the same (M-321, M-63) and different (M-41) subgroups with comparable Crt1/2 values. All hybrids had identical Tm values (87°) and sharp melting profiles (Schechter 1975a). This is probably due to the formation of well-matched duplexes between nucleotide strands of the common C-region.
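The correction described in the footnote to Table III (observed Crt1/2 multiplied by the fraction of mRNA in the sample) can be sketched as below. The observed values and mRNA fractions are read from Table III; the exponents are an assumption, inferred from the footnote relation because the printed exponents are not legible in this copy. The function name is hypothetical.

```python
# (observed Crt1/2 in mole x liter^-1 x sec, fraction of mRNA in the sample)
# Exponents are inferred, not read directly (assumption).
samples = {
    "M-321p": (2.7e-4, 0.95),
    "M-321": (8.3e-4, 0.35),
    "M-63": (1.0e-3, 0.33),
    "M-41": (1.2e-3, 0.37),
}


def corrected_crt_half(observed: float, mrna_fraction: float) -> float:
    """Crt1/2 attributable to the L-chain mRNA alone."""
    return observed * mrna_fraction


corrected = {name: corrected_crt_half(*vals) for name, vals in samples.items()}
for name, value in corrected.items():
    print(f"{name}: {value:.2e}")

# Spread between the slowest and fastest hybridizing mRNA after correction.
spread = max(corrected.values()) / min(corrected.values())
print(f"max/min ratio: {spread:.2f}")
```

After correction the four values lie within a factor of two of each other, which is the sense in which the text calls the Crt1/2 values comparable.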

Considering the sizes of the cDNA (280,000 daltons) and mRNAs (440,000 daltons), and that reverse transcription starts from the 3' end of the mRNA (i.e., from the end coding for the C-region), it is seen that in all possible arrangements of the mRNA (Figure 6), the M-321 cDNA contains the entire C-region. The data quoted below suggest that the cDNA also contains a portion of the V-region.


Figure 6. Schematic representation of possible complementary regions between the cDNA and mRNA of M-321 L-chain. The M-321 mRNA (440,000 daltons), cDNA (280,000 daltons) and their divisions are drawn to scale (see text). Nucleotide sequences in the mRNA are: P, for the amino and carboxyl terminal extra pieces of the L-chain precursor; V, variable region; C, constant region; A, poly(A) at the 3' end; U, untranslated region. The location of the untranslated region is not yet known, therefore it is represented in three arbitrary arrangements (a, b, c). Nucleotide sequences in the cDNA are: H, heteropolymeric DNA; T, poly(dT).

The mRNA segment complementary to the cDNA is larger in M-63 than in M-41 because the saturation level achieved with M-63 (95 %) is larger than that achieved with M-41 (75 %). It seems unlikely that the difference at saturation between M-321 or M-63 and M-41 is due to the putative untranslated region at the 3' end of the mRNA. This is because the mouse genome contains one or very few C-genes, and therefore the nucleotide sequence adjacent to it in the 3' direction is also likely to be conserved. In view of the nearly complete complementarity between M-63 mRNA and M-321 cDNA, and the extensive amino acid variability at the V-region of M-41 compared to both M-321 and M-63, we suggest that the mismatched segment scored in the M-41 mRNA/M-321 cDNA hybrid codes for the V-region. The size of the mismatched segment in this hybrid corresponds to about 21 % (96 % minus 75 %) of the cDNA, i.e., it is about 59,000 daltons (0.21 X 280,000) or 180 bases long. Since the nucleotide mass coding for the entire V-region (110 amino acids) is 110,000 daltons, the M-321 cDNA might contain about half of the V-region. This interpretation is compatible with scheme (c) of Figure 6. Scheme (b) is somewhat less likely because it predicts representation of the entire V-region in the cDNA; scheme (a) is incompatible with the experimental results since it predicts that the cDNA should cross-hybridize to completion with both M-41 and M-63 mRNAs, i.e., 97 % saturation should have been achieved with the M-41 mRNA. This analysis suggests that V-genes of different subgroups do not cross-hybridize. Furthermore, it seems that V-regions with similar amino acid sequences (like M-321 and M-63) are coded by similar nucleotide sequences that cross-hybridize. This is deduced from the finding that nearly complete saturation levels (95 %) were achieved with both M-321 and M-63 mRNAs, and that the M-321 cDNA appears to contain about half of the V-region. That is, the nucleic acid probe to one V-region may anneal and quantify the V-genes of members of the same subgroup.
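The estimate of the mismatched segment can be retraced numerically. The sketch below uses the saturation levels and cDNA mass quoted in the text; the mass per nucleotide is an assumption inferred from 280,000 daltons for 840 nucleotides, and the function name is hypothetical.

```python
CDNA_DALTONS = 280_000
DALTONS_PER_NT = 280_000 / 840  # assumption: inferred from the cDNA size


def mismatched_segment(sat_matched: float, sat_mismatched: float):
    """Return (fraction of cDNA, mass in daltons, length in bases) of the
    cDNA segment that fails to hybridize with the heterologous mRNA."""
    fraction = sat_matched - sat_mismatched
    daltons = fraction * CDNA_DALTONS
    return fraction, daltons, daltons / DALTONS_PER_NT


fraction, daltons, bases = mismatched_segment(0.96, 0.75)
print(f"{fraction:.0%} of cDNA, {daltons:,.0f} daltons, {bases:.0f} bases")

# Nucleotide mass coding for a V-region of 110 amino acids (3 bases each).
v_region_daltons = 110 * 3 * DALTONS_PER_NT
print(f"V-region share covered by the cDNA: {daltons / v_region_daltons:.2f}")
```

The result, a little over half of the V-region coding mass, is what makes arrangement (c) of Figure 6 the most consistent with the data.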

We believe that the cross hybridization analyses should be extended to L-chains with systematic increase in V-region variability (say 5, 10, 15, 20, etc. amino acid differences), and to V-regions of H-chains of different subgroups. These investigations will define the 'maximal' range of structural variability of V-genes compatible with efficient cross hybridization. Having this information and an estimate of the number of V-region amino acid sequences in a subgroup, it would be possible to make a more meaningful interpretation of the hybridization data for determining how many V-gene copies per subgroup are present in the genome.

Gene Dosage Determinations

The M-321 cDNA probe was hybridized with nuclear DNA prepared from M-321 myeloma tumors or mouse liver at different DNA/cDNA ratios. The results obtained with the DNA from both sources were essentially identical (Table IV, Figures 7 and 8). Gene dosage was calculated in two ways using the two parameters of molecular hybridization: Cot1/2 and saturation level. The Cot1/2 values decreased (i.e., an increase in the calculated gene number) with decreasing DNA/cDNA ratio. This finding by itself implies a few gene copies. When the number of genes in the genome is small, at low DNA/cDNA ratio the added cDNA probe increases significantly the concentration of the specific reactants (Ig genes in the present case) in the annealing mixture, and consequently the hybridization rate of these genes increases (i.e., a decrease in Cot1/2 and an apparent increase in the calculated gene number). The accelerated hybridization rate due to the added cDNA probe should regress gradually with increasing DNA/cDNA ratio until vast DNA excess is achieved. At DNA excess the change in specific reactant concentration induced by the added cDNA probe becomes negligible. As seen in Table IV, the gene dosage calculated from Cot1/2 values decreased from 20 to 4 on approaching conditions of vast DNA excess. If there were many copies of Ig genes, the added cDNA would hardly change the concentration of the specific Ig DNA strands in the nuclear DNA, and the hybridization rate would not depend on the change in the DNA/cDNA ratio. The calculation based on the extent of cDNA hybridized at saturation takes into account the effect of added cDNA on the amount of gene copies present in the nuclear DNA sample. Indeed, using this approach about two gene copies per haploid genome were determined at all DNA/cDNA ratios (Table IV). The melting profiles and Tm values of the M-321 cDNA hybrids formed with the myeloma DNA and liver DNA were the same (Figure 8), suggesting that very similar (if not identical) genes are scored in both tissues.

Figure 7. Enumeration of κ-type L-chain genes in mouse. Kinetics of hybridization of M-321 cDNA with nuclear DNA of M-321 myeloma tumor (panels headed Myeloma DNA) and liver (panels headed Liver DNA); abscissa, log Cot (mole × liter^-1 × sec); Cot1/2 values of 200 and 500 are marked on the panels. The DNA/cDNA ratio in the annealing reaction mixture was 2 × 10^6 in panels a and b; the ratio was 15 × 10^6 in panels c and d. Hybrids were assayed by hydroxyapatite chromatography (Schechter et al. 1976).

TABLE IV
Enumeration of the genes coding for the κ-type immunoglobulin light-chains in mouse

Nuclear DNA    Reactants                                   Found                                   Genes calculated from
source         DNA (mg)   cDNA (ng)   DNA/cDNA (× 10^6)    Cot1/2 (M × sec)   cDNA hybrid. at      Cot1/2*    Saturation
                                                                              saturation                      level**
M-321 tumor    1          1           1                    100                33 %                 20         3
               3          1.5         2                    200                42 %                 10         2.5
               15         1           15                   500                85 %                 4          2
Liver          3          1.5         2                    200                37 %                 10         2
               15         1           15                   500                80 %                 4          2

* Calculation based on Cot1/2 = 2 × 10^3 mole × liter^-1 × sec for reannealing of mouse unique sequences. This value was determined from the total DNA in the sample

** Calculations based on the following molecular weights: cDNA = 2.8 × 10^5; mouse haploid genome = 1.

Figure 8. Tm of cDNA·DNA hybrids (abscissa: temperature, °C; Tm = 86° is marked on the plot). The hybrids were formed by annealing M-321 cDNA with nuclear DNA of M-321 myeloma tumor and liver. Hybrids were obtained from reaction mixtures shown in panels a and b in Figure 7 that were annealed to Cot = 3 × 10^ mole × liter^-1 × sec. The melting profiles were determined by thermal elution hydroxyapatite chromatography (Schechter 1975a).
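The two ways of estimating gene dosage can be illustrated with the values of Table IV. This is only a sketch under stated assumptions, not the authors' actual calculation: the unique-sequence Cot1/2 of 2 × 10^3 is inferred from the table (Cot1/2 of 100 giving 20 genes); the haploid genome mass of 1.8 × 10^12 daltons is a commonly quoted value assumed here because the table footnote is truncated; and the saturation model (hybridized fraction f = g/(1 + g), with g the mass of gene sequences relative to cDNA) is one plausible reading of the text. All function names are hypothetical.

```python
COT_HALF_UNIQUE = 2e3   # assumption: inferred from Table IV
CDNA_MW = 2.8e5         # daltons, from the text
GENOME_MW = 1.8e12      # assumption: commonly quoted mouse haploid genome mass


def genes_from_cot(cot_half_observed: float) -> float:
    """Gene copies per haploid genome from the rate of hybridization."""
    return COT_HALF_UNIQUE / cot_half_observed


def genes_from_saturation(fraction_hybridized: float, dna_to_cdna: float) -> float:
    """Gene copies per haploid genome from the extent of hybridization.

    Assumed model: f = g / (1 + g), where g is the mass of gene sequences
    in the nuclear DNA relative to the mass of added cDNA.
    """
    g = fraction_hybridized / (1.0 - fraction_hybridized)
    gene_mass_per_copy = dna_to_cdna * CDNA_MW / GENOME_MW
    return g / gene_mass_per_copy


# (DNA/cDNA ratio, Cot1/2 found, fraction of cDNA hybridized) for tumor DNA
rows = [(1e6, 100, 0.33), (2e6, 200, 0.42), (15e6, 500, 0.85)]
for ratio, cot_half, frac in rows:
    by_rate = genes_from_cot(cot_half)
    by_extent = genes_from_saturation(frac, ratio)
    print(f"ratio {ratio:.0e}: {by_rate:.0f} genes by Cot1/2, "
          f"{by_extent:.1f} by saturation")
```

Under these assumptions the rate-based estimate falls from 20 to 4 as the DNA excess grows, while the saturation-based estimate stays near two to three copies, in line with the trend described in the text.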

The size of the cDNA and the cross hybridization experiments indicated that this cDNA probe should quantify κ-type C-genes, and that presumably it contains sufficient V-gene sequences to form stable hybrids (180 base pairs) with genes of the same subgroup. Accordingly, the results of molecular hybridizations with nuclear DNA show that there are few genes coding for the κ-type C-region (approx. two per haploid genome), and the number of V-region genes of the same subgroup may also be small (approx. two per haploid genome). Taken together, these results would support the somatic mutation model for the generation of antibody diversity, provided that the V-genes were clustered in defined and sufficiently large subgroups in the genome (Schechter et al. 1976). Available data, however, show that this requirement is not fulfilled in mice. The variability of mouse κ L-chains has shown no tendency to saturate and to cluster into subgroups. Out of 44 mouse κ V-region sequences only three L-chains (M-321, M-63 and T-124) constitute a well defined subgroup (McKean et al. 1973). Therefore, hybridization data of κ L-chain probes with the mouse genome do not rule out the germ line hypothesis. The human genome may be more suitable in helping to distinguish between the germ line and somatic mutation hypotheses. This is because the human genome appears to contain well defined subgroups of the V-regions of κ-, λ- and H-chains (Dayhoff 1972). The hybridization data with the mouse genome, however, strongly suggest that there is no amplification of the κ-type C-genes in a tumor that produces large amounts of the L-chain, because the same gene dosage was determined in liver and in myeloma cells. The amino acid sequences of mouse κ-chains indicate the existence of at least 33 germ line genes (McKean et al. 1973). Since the genome contains about two κ-type C-genes, there should be a mechanism for joining the genetic material coding for the V- and C-regions. The linkage of the presumed V and C genes (or mRNAs) probably occurs prior to release of the mRNA from the nucleus. This is based on the findings that L-chain mRNA molecules isolated from myeloma polysomes precipitated with antibodies specific either to the V-region or to the C-region program in vitro the synthesis of the complete L-chains (Schechter 1974).

ACKNOWLEDGEMENTS

We thank Dr. David Papermaster for stimulating discussions, Mrs. Ida Oren for help with sequencer analyses, Mrs. Etty Ziv, Miss Frida Kantor and Mrs. Zillah Cohen for excellent assistance. This research was supported by grant CA-20817 from the National Cancer Institute, U.S. Public Health Service, and by grant 806 from the United States-Israel Binational Science Foundation (BSF), Jerusalem, Israel. Dr. Zemell is a recipient of a fellowship from the Medical Research Council of Canada.

REFERENCES

Andersson, J., Buxbaum, J., Citronbaum, R., Douglas, S., Forni, L., Melchers, F., Pernis, B. & Stott, D. (1974) IgM-producing tumors in the BALB/c mouse: A model for B-cell maturation. J. Exp. Med. 140, 742.

Appella, E. (1971) Amino acid sequences of two mouse immunoglobulin lambda chains. Proc. Nat. Acad. Sci. U.S.A. 68, 590.

Askonas, B. A. & Williamson, A. R. (1967) Biosynthesis and assembly of immunoglobulin G. Cold Spring Harbor Symp. quant. Biol. 32, 223.

Berger, A. & Schechter, I. (1970) Mapping the active site of papain with the aid of peptide substrates and inhibitors. Phil. Trans. Roy. Soc. Lond. B. 257, 249.

Blobel, G. & Dobberstein, B. (1975) Transfer of proteins across membranes. I. Presence of proteolytically processed and unprocessed nascent immunoglobulin light chains on membrane-bound ribosomes of murine myeloma. J. Cell Biol. 67, 835.

Brownlee, G. G., Cartwright, E. M., Cowan, N. J., Jarvis, J. M. & Milstein, C. (1973) Purification and sequence of messenger RNA for immunoglobulin light chains. Nature New Biol. 244, 236.

Burstein, Y., Kantor, F. & Schechter, I. (1976) Partial amino acid sequence of the precursor of an immunoglobulin light chain containing NH2-terminal pyroglutamic acid. Proc. Nat. Acad. Sci. U.S.A. 73, 2604.

Burstein, Y. & Schechter, I. (1976) Amino acid sequence variability at the N-terminal extra piece of mouse immunoglobulin light chain precursors of the same and different subgroups. Biochem. J. 157, 145.


Burstein, Y. & Schechter, I. (1977a) Amino acid sequence of the NHj-terminal extrapiece segments of the precursors of mouse immunoglobulin ^^-type and x-typelight chains. Proc. Nat. Acad. Sci. U.S.A. 74, 716.

Burslein, Y. & Schechter, L (1977b) Glutamine as a precursor to N-terminal pyro-glutamic acid in mouse immunoglobulin /-type light chains. Amino acid sequencevariability at the N-terminal extra piece of P.-type light chain precursors. Biochem.J. (in press).

Burstein, Y., Zemell, R., Kantor, F. & Schechter, I. (1977a) Independent expression ofthe gene coding for the constant domain of immunoglobulin light chain: Evidencefrom sequence analyses of the precursor of the constant region polypeptide. Proc.Nat. Acad. Sci. U.S.A. 74 (in press).

Burstein, Y., Ziv, E. & Schechter, 1. {1977b) Partial amino acid sequence of the pre-cursor of mouse immunoglobulin heavy chain programmed by mRNA in vitro.Israel J. Med. Sci. Submitted for publication.

Cesari,LM. & Weigert, M. (1973) Mouse lambda-chain sequences. Proc. Nat. Acad.Sci. U.S.A. 70,2112.

Dayhoif, M. O. (1972) Atlas of protein sequence and structure. Vol. 5. National Bio-medical Research Foundation, Washington D.C.

Dreyer, W. J., Gray, W. R. & Hood, L. (1967) The genetic, molecular, and cellular basisof antibody formation: Some facts and a unifying hypothesis. Cold Spring HarborSymp. quant. Biol. 32, 353.

Dugan Schulenburg, E., Bradshaw, R. A., Simms, E. S. & Risen, H.N. (1973) Aminoacid sequence of the light chain of a mouse myeloma protein (MOPC-315). Bio-chemistry 12, 5400.

Farquhar, M. N. & McCarthy, B. J. (1973) Evolutionary stability of the histone genesof sea urchins. Biochemistry 12, 4113.

Francis, S. H., Leslei, R. G. Q., Hood, L. & Eisen, H.N. (1974) Amino acid sequence ofthe variable region of the heavy (alpha) chain of a mouse myeloma protein withanti-hapten activity. Proc. Nat. Acad. Sci. U.S.A. 71, 1123.

Frangione, B. (1976) A new immunoglobulin variant: ^3 heavy chain disease proteinCHI. Proc. Nat. Acad. Sci. U.S.A. 73, 1552.

Gally, J. A. & Edelman, G. M. (1970) Somatic translocation of antibody genes. Nature 227, 341.

Gray, W. R., Dreyer, W. J. & Hood, L. (1967) Mechanism of antibody synthesis: Size differences between mouse kappa chains. Science 155, 465.

Kacian, D. L. & Spiegelman, S. (1974) Use of micrococcal nuclease to monitor hybridization reactions with DNA. Anal. Biochem. 58, 534.

Kuehl, W. M., Kaplan, B. A., Scharff, M. D., Nau, M., Honjo, T. & Leder, P. (1975) Characterization of light chain and light chain constant region fragment mRNAs in MPC 11 mouse myeloma cells and variants. Cell 5, 139.

Kuehl, W. M. & Scharff, M. D. (1974) Synthesis of a carboxyl-terminal (constant region) fragment of the immunoglobulin light chain by a mouse myeloma cell line. J. Mol. Biol. 89, 409.

Mach, B., Faust, C. & Vassalli, P. (1973) Purification of 14S messenger RNA of immunoglobulin light chain that codes for a possible light chain precursor. Proc. Nat. Acad. Sci. U.S.A. 70, 451.

McIntire, K. R. & Rouse, A. M. (1970) Mouse immunoglobulin light chains: alternation of κ:λ ratio. Fed. Proc. 29, 704 Abs.

McKean, D., Potter, M. & Hood, L. (1973) Mouse immunoglobulin chains. Pattern of sequence variation among κ chains with limited sequence differences. Biochemistry 12, 760.

Milstein, C., Brownlee, G. G., Harrison, T. M. & Mathews, M. B. (1972) A possible precursor of immunoglobulin light chains. Nature New Biol. 239, 117.

Robertson, H. D. & Jeppesen, P. G. N. (1972) Extent of variation in three related bacteriophage RNA molecules. J. Mol. Biol. 68, 417.

Scharff, M. D., Shapiro, A. L. & Ginsberg, B. (1967) The synthesis, assembly and secretion of gamma globulin polypeptide chains by cells of a mouse plasma cell tumor. Cold Spring Harbor Symp. quant. Biol. 32, 235.

Schechter, I. (1973) Biologically and chemically pure mRNA coding for mouse immunoglobulin L-chain prepared with the aid of antibodies and immobilized oligothymidine. Proc. Nat. Acad. Sci. U.S.A. 70, 2256.

Schechter, I. (1974) Use of antibodies for the isolation of biologically pure messenger ribonucleic acid from fully functional eukaryotic cells. Biochemistry 13, 1875.

Schechter, I. (1975a) Region of immunoglobulin light-chain mRNA transcribed into complementary DNA by RNA-dependent DNA polymerase of avian myeloblastosis virus. Proc. Nat. Acad. Sci. U.S.A. 72, 2511.

Schechter, I. (1975b) Further characterization of the mRNA coding for immunoglobulin light chain. Biochem. Biophys. Res. Commun. 67, 228.

Schechter, I. & Burstein, Y. (1976a) Identification of N-terminal methionine in the precursor of immunoglobulin light chain. Initiation of translation of messenger ribonucleic acid in plants and animals. Biochem. J. 153, 543.

Schechter, I. & Burstein, Y. (1976b) Marked hydrophobicity of the NH2-terminal extra piece of immunoglobulin light chain precursors: Possible physiological functions of the extra piece. Proc. Nat. Acad. Sci. U.S.A. 73, 3273.

Schechter, I. & Burstein, Y. (1976c) Partial sequence of the precursors of immunoglobulin light-chains of different subgroups: Evidence that the immunoglobulin variable-region gene is larger than hitherto known. Biochem. Biophys. Res. Commun. 68, 489.

Schechter, I., Burstein, Y. & Spiegelman, S. (1976) Structure and function of immunoglobulin genes and immunoglobulin precursors. Ann. Immunol. (Inst. Pasteur) 127c, 421.

Schechter, I., McKean, D. J., Guyer, R. & Terry, W. (1975) Partial amino acid sequence of the precursor of immunoglobulin light-chain programmed by messenger RNA in vitro. Science 188, 160.

Schubert, D. & Cohn, M. (1970) Immunoglobulin biosynthesis. V. Light chain assembly. J. Mol. Biol. 53, 305.

Segrest, J. P., Jackson, R. L., Marchesi, V. T., Guyer, R. B. & Terry, W. (1972) Red cell membrane glycoprotein: Amino acid sequence of an intramembranous region. Biochem. Biophys. Res. Commun. 49, 964.

Segrest, J. P., Kahane, I., Jackson, R. L. & Marchesi, V. T. (1973) Major glycoprotein of the human erythrocyte membrane: Evidence for an amphipathic molecular structure. Arch. Biochem. Biophys. 155, 167.

Smith, G. P. (1973) Mouse immunoglobulin kappa chain MPC-11: extra amino terminal residues. Science 181, 941.

Smithies, O., Gibson, D., Fanning, E. M., Goodfliesh, R. M., Gilman, J. G. & Ballantyne, D. L. (1971) Quantitative procedures for use with the Edman-Begg Sequenator. Partial sequences of two unusual immunoglobulin light chains, Rzf and Sac. Biochemistry 10, 4912.

Spatz, L. & Strittmatter, P. (1971) A form of cytochrome b5 that contains an additional hydrophobic sequence of 40 amino acid residues. Proc. Nat. Acad. Sci. U.S.A. 68, 1042.

Svasti, J. & Milstein, C. (1972) The disulphide bridges of a mouse immunoglobulin G1 protein. Biochem. J. 126, 837.

Swan, D., Aviv, H. & Leder, P. (1972) Purification and properties of biologically active messenger RNA for a myeloma light chain. Proc. Nat. Acad. Sci. U.S.A. 69, 1967.

Tanford, C. (1962) Contribution of hydrophobic interactions to the stability of the globular conformation of proteins. J. Am. Chem. Soc. 84, 4240.

Tonegawa, S. & Baldi, I. (1973) Electrophoretically homogeneous myeloma light chain mRNA and its translation in vitro. Biochem. Biophys. Res. Commun. 51, 81.

Weigert, M. G., Cesari, I. M., Yonkovich, S. J. & Cohn, M. (1970) Variability in the lambda light chain sequences of mouse antibody. Nature 228, 1045.

Wigle, D. T. & Dixon, G. H. (1970) Transient incorporation of methionine at the N-terminus of protamine newly synthesized in trout testis cells. Nature 227, 676.

Wu, T. T. & Kabat, E. A. (1970) An analysis of the sequences of the variable regions of Bence Jones proteins and myeloma light chains and their implications for antibody complementarity. J. Exp. Med. 132, 211.

Yamamoto, S. & Lampen, J. O. (1976) Membrane penicillinase of Bacillus licheniformis 749/c: Sequence and possible repeated tetrapeptide structure of the phospholipopeptide region. Proc. Nat. Acad. Sci. U.S.A. 73, 1457.

Zemell, R., Burstein, Y. & Schechter, I. (1977) Identification of initiator methionine residue in precursors of immunoglobulins. Israel J. Med. Sci. Submitted for publication.
