flanking region detection

Upload: kunkun3287

Post on 04-Apr-2018


  • 7/30/2019 Flanking Region Detection


    Sequence of the cDNA and 5'-Flanking Region for Human Acid α-Glucosidase, Detection of an Intron in the 5' Untranslated Leader Sequence, Definition of 18-bp Polymorphisms, and Differences with Previous cDNA and Amino Acid Sequences

    FRANK MARTINIUK,* MARK MEHLER,† STEPHANIE TZALL,* GARY MEREDITH,* and ROCHELLE HIRSCHHORN*

    ABSTRACT

    Acid maltase or acid α-glucosidase (GAA) is a lysosomal enzyme that hydrolyzes glycogen to glucose and is deficient in glycogen storage disease type II. Previously, we isolated a partial cDNA (1.9 kb) for human GAA; we have now used this cDNA to isolate and determine sequence in longer cDNAs from four additional independent cDNA libraries. Primer extension studies indicated that the mRNA extended approximately 200 bp 5' of the cDNA sequence obtained. Therefore, we isolated a genomic fragment containing 5' cDNA sequences that overlapped the previous cDNA sequence and extended an additional 24 bp to an initiation codon within a Kozak consensus sequence. The sequence of the genomic clone revealed an intron-exon junction 32 bp 5' to the ATG, indicating that the 5' leader sequence was interrupted by an intron. The remaining 186 bp of 5' untranslated sequence was identified approximately 3 kb upstream. The promoter region upstream from the start site of transcription was GC rich and contained areas of homology to Sp1 binding sites but no identifiable CAAT or TATA box. The combined data gave a nucleotide sequence of 2,856 bp for the coding region from the ATG to a stop codon, predicting a protein of 952 amino acids. The 3' untranslated region contained 555 bp with a polyadenylation signal at 3,385 bp followed by 16 bp prior to a poly(A) tail. This sequence of the GAA coding region differs from that reported by Hoefsloot et al. (1988) in three areas that change a total of 42 amino acids. Direct determination of the amino acid sequence in one of these areas confirmed the nucleotide sequence reported here but also disagreed with the directly determined amino acid sequence reported by Hoefsloot et al. (1988). At two other areas, changes in base pairs predicted new restriction sites that were identified in cDNAs from several independent libraries. The amino acid changes in all three areas increased the homology to rabbit-human isomaltase. Therefore, we believe that our nucleotide sequence for GAA is more precise. We have also identified single base-pair polymorphisms at 18 sites for human GAA, some of which are not silent.

    INTRODUCTION

    Acid maltase or acid α-glucosidase (designated GAA on the human gene map) is a lysosomal enzyme that hydrolyzes glycogen to glucose (Brown and Brown, 1981). GAA is synthesized as an approximately 110-kD precursor protein and is then apparently processed along pathways shared by other lysosomal enzymes (Kornfeld, 1986). The post-translational processing includes glycosylation, phosphorylation of mannose residues, with subsequent targeting to the lysosomes, and proteolytic processing. Analysis of purified enzyme and pulse-chase experiments reveal molecular species of approximately 110, 95, 76, and 70 kD (Hasilik and Neufeld, 1980; Martiniuk et al., 1984).

    *New York University Medical Center, Department of Medicine, New York, NY 10016.
    †Albert Einstein College of Medicine, Department of Neurology, Bronx, NY 10461.


    Glycogen storage disease type II is an autosomal recessive disorder that results from a deficiency of acid α-glucosidase or acid maltase. Deficiency of the enzyme can result in three myopathic syndromes (Pompe, 1932; Hers, 1963; Courtecuisse et al., 1965; Engel et al., 1973; and reviewed in Hers et al., 1989). The first type is a fatal infantile form (Pompe's disease) characterized by generalized hypotonia with massive accumulation of glycogen in all tissues, including cardiac as well as skeletal muscle. The second type is a more variable juvenile form which is often fatal by the second decade of life. The adult-onset form presents later in life with progressive proximal limb weakness and glycogen storage limited to skeletal muscle. Cells from patients have been studied as to residual enzyme activity, GAA protein detected by antibody (CRM), and abnormalities in intracellular and post-translational processing including phosphorylation (Beratis et al., 1978; Hasilik and Neufeld, 1980; Brown and Brown, 1981; Kornfeld, 1986; LaBadie et al., 1985; Reuser et al., 1985, 1987). These investigations have revealed extensive heterogeneity of all parameters between and within the subtypes of glycogen storage disease type II.

    Previously, we isolated a partial cDNA (1.9 kb) for human GAA (Martiniuk et al., 1986), and, using the partial cDNA, found extensive genetic heterogeneity in patients with GAA deficiency as revealed by gross abnormalities of mRNA and DNA (Martiniuk et al., 1986 and unpublished; Hirschhorn et al., 1989). Thus, approximately one-half of infantile-onset patients (5 of 10) lack detectable levels of GAA mRNA while two infantile-onset patients have abnormal Sac I genomic DNA fragments. Two of four adult-onset patients show smaller-sized mRNA.

    We now report the cDNA sequence for the coding, the 3', and the 5' untranslated regions of human GAA as well as the 5' promoter region. The 5' untranslated cDNA is unusual in that it is interrupted by an approximately 3-kb intron. The 5'-flanking region is GC rich with an Sp1 binding site and lacks a classical CAAT or TATA box, similar to other housekeeping genes (Briggs et al., 1986; Kadonaga et al., 1987, 1988; Mitchell and Tjian, 1989). Recently, Hoefsloot et al. (1988) have published the cDNA sequence for human GAA and have identified significant homology with the rabbit sucrase-isomaltase complex (Hunziker et al., 1986) and human isomaltase (Green et al., 1987). The sequence we have determined for the coding region differs from that reported by Hoefsloot et al. (1988) at three areas. These differences result in (i) changes in 42 of the predicted amino acids, (ii) increased homology to the sucrase-isomaltase complex, and (iii) prediction of new restriction endonuclease sites that were found within several cDNAs. In addition, for the first area of difference, our independently determined amino acid sequence (and that of LaBadie, 1986) agrees with our cDNA sequence and disagrees with both the reported amino acid and cDNA sequence of Hoefsloot et al. (1988). Lastly, since the cDNA sequence was determined in independent clones from five cDNA libraries as well as a genomic library, we could define 18 single base-pair polymorphisms, not all of which were silent.

    MATERIALS AND METHODS

    Isolation of the coding region of GAA

    The previously isolated 1.9-kb cDNA (from a λgt11 liver cDNA library) (Martiniuk et al., 1986) was used as a labeled probe to screen the following λgt11 cDNA libraries in addition to the original liver cDNA library: (i) a human placenta (Clontech Inc., Palo Alto, CA), (ii) a human fibroblast, and (iii) a human lymphoid line; the latter two were constructed in this laboratory essentially as described by Gubler and Hoffman (1983). More 5' fragments from larger cDNAs were used to screen a λgt11 randomly primed human fibroblast cDNA library (Stratagene, La Jolla, CA). Thus, cDNAs were isolated from a total of five different cDNA libraries. The recombinant phage and DNA were isolated by standard procedures and subcloned into M13 and/or pUC18 or pUC19 for analysis by restriction endonuclease digestion and/or sequencing.

    The 5' region containing the initiation codon (ATG) was isolated utilizing a 5' cDNA fragment (bp 24-427, Fig. 1A) to screen an Eco RI (9-25 kb) human genomic lymphoid DNA library constructed in λ DASH (Stratagene). The recombinant phage DNA was isolated, digested with Sac I, electrophoresed in agarose gel, transferred to nitrocellulose, and hybridized to the 5' cDNA probe. Of the five different-sized fragments generated, only a 1.0-kb segment hybridized to the cDNA. This 1.0-kb Sac I fragment was subcloned into pUC19 and M13 for sequencing and analysis. In addition, the nucleotide sequence of the 5' untranslated area was identified using GAA-specific primers (based upon Hoefsloot et al., 1988) (Fig. 1B). The order of the fragments was determined by standard restriction digests and hybridization procedures.

    Sequence analysis and strategy

    Sequencing of the coding region of GAA was performed after subcloning various-sized cDNAs (and genomic fragments) and using the forward or reverse universal primer or GAA-specific primers by the Sanger dideoxy chain-termination method (Sanger et al., 1977) with Klenow fragment, AMV reverse transcriptase, and/or Sequenase. The strategy, size of clones, and partial restriction map are outlined in Fig. 2. Independent clones from six different libraries were sequenced. However, only clones from the placental, randomly primed fibroblast, and genomic libraries were sequenced in both directions (greater than 99%).

    Amino-terminal amino acid sequence analysis

    Human placental GAA was purified by ammonium sulfate precipitation, CM-Sephadex C50 elution, and Sephadex G100 affinity chromatography as described previously (Swallow et al., 1975; Martiniuk et al., 1984). The purified protein contains 110-, 76-, and 70-kD sized bands when analyzed by NaDodSO4-polyacrylamide gels (Martiniuk et al., 1984) that correspond to the molecular weight species


    A -32 GCCTGTAGGAGCTGTCCAGGCCATCTCCAACCATGGGAGTGAGGCACCCGCCCTGCTCCC 28MetGlyValArgHisProProCysSerH

    29 ACCGGCTCCTGGCCGTCTGCGCCCTCGTGTCCTTGGCAACCGCTGCACTCCTGGGGCACA 88isArgLeuLeuAlaValCysAlaLeuValSerLeuAlaThrAlaAlaLeuLeuGlyHisI

    89 TCCTACTCCATGATTTCCTGCTGGTTCCCCGAGAGCTGAGTGGCTCCTCCCCAGTCCTGG 148leLeuLeuHisAspPheLeuLeuValProArgGluLeuSerGlySerSerProValLeuG

    149

    AGGAGACTCACCCAGCTCACCAGCAGGGAGCCAGCAGACCAGGGCCCCGGGATGCCCAGG 208luGluThrHisProAlaHisGlnGlnGlyAlaSerArgProGlyProArgAspAlaGlnA2 09 CACACCCCGGCCGTCCCAGAGCAGTGCCCACACAGTGCGACGTCCCCCCCAACAGCCGCT 268

    laHisProGlyArgProArgAlaValProThrGlnCysAspValProProAsnSerArgPP

    26 9 TCGATTGCGCCCCTGACAAGGCCATCACCCAGGAACAGTGCGAGGCCCGCGGCTGCTGCT 3 28heAspCysAlaProAspLysAlalleThrGlnGluGlnCysGluAlaArgGlyCysCysT

    | 1329 ACATCCCTGCAAAGCAGGGGCTGCAGGGAGCCCAGATGGGGCAGCCCTGGTGCTTCTTCC 388yrlleProAlaLysGlnGlyLeuGlnGlyAlaGlnMetGlyGlnProTrpCysPhePheP

    3 89 CACCCAGCTACCCCAGCTACAAGCTGGAGAACCTGAGCTCCTCTGAAATGGGCTACACGG 4 48roProSerTyrProSerTyrLysLeuGluAsnLeuSerSerSerGluMetGlyTyrThrA

    449 CCACCCTGACCCGTACCACCCCCACCTTCTTCCCCAAGGACATCCTGACCCTGCGGCTGG 508laThrLeuThrArgThrThrProThrPhePheProLysAspIleLeuThrLeuArgLeuA

    50 9 ACGTGATGATGGAGACTGAGAACCGCCTCCACTTCACGATCAAAGATCCAGCTAACAGGC 568

    spValMetMetGluThrGluAsnArgLeuHisPheThrlleLysAspProAlaAsnArgAP569 GCTACGAGGTGCCCTTGGAGACCCCGCATGTCCACAGCCGGGCACCGTCCCCACTCTACA 6 28

    rgTyrGluValProLeuGluThrProHisValHisSerArgAlaProSerProLeuTyrSP P

    629 GCGTGGAGTTCTCCGAGGAGCCCTTCGGGGTGATCGTGCGCCGGCAGCTGGACGGCCGCG 6 88erValGluPheSerGluGluProPheGlyVallleValArgArgGlnLeuAspGlyArgV

    689 TGCTGCTGAACACGACGGTGGCGCCCCTGTTCTTTGCGGACCAGTTCCTTCAGCTGTCCA 74 8alLeuLeuAsnThrThrValAlaProLeuPhePheAlaAspGlnPheLeuGlnLeuSerT

    7 49 CCTCGCTGCCCTCGCAGTATATCACAGGCCTCGCCGAGCACCTCAGTCCCCTGATGCTCA 8 08hrSerLeuProSerGlnTyrlleThrGlyLeuAlaGluHisLeuSerProLeuMetLeuS

    8 09 GCACCAGCTGGACCAGGATCACCCTGTGGAACCGGGACCTTGCGCCCACGCCCGGTGCGA 868erThrSer/TrpThrArglleThrLeuTrpAsnArgAspLeuAlaProThrProGlyAlaA

    86 9 ACCTCTACGGGTCTCACCCTTTCTACCTGGCGCTGGAGGACGGCGGGTCGGCACACGGGG 928

    snLeuTyrGlySerHisProPheTyrLeuAlaLeuGluAspGlyGlySerAlaHisGlyV9 29 TGTTCCTGCTAAACAGCAATGCCATGGATGTGGTCCTGCAGCCGAGCCCTGCCCTTAGCT 988

    alPheLeuLeuAsnSerAsnAlaMetAspValValLeuGlnProSerProAlaLeuSerT

    989 GGAGGTCGACAGGTGGGATCCTGGATGTCTACATCTTCCTGGGCCCAGAGCCCAAGAGCG 1048rpArgSerThrGlyGlylleLeuAspValTyrllePheLeuGlyProGluProLysSerV

    10 49 TGGTGCAGCAGTACCTGGACGTTGTGGGATACCCGTTCATGCCGCCATACTGGGGCCTGG 1108alValGlnGlnTyrLeuAspValValGlyTyrProPheMetProProTyrTrpGlyLeuG

    P

    1109 GCTTCCACCTGTGCCGCTGGGGCTACTCCTCCACCGCTATCACCCGCCAGGTGGTGGAGA 1168lyPheHisLeuCysArgTrpGlyTyrSerSerThrAlalleThrArgGlnValValGluA

    PP1169 ACATGACCAGGGCCCACTTCCCCCTGGACGTCCAGTGGAACGACCTGGACTACATGGACT 12 28

    snMetThrArgAlaHisPheProLeuAspValGlnTrpAsnAspLeuAspTyrtletAspS 3 M *S

    1229 CCCGGAGGGACTTCACGTTCAACAAGGATGGCTTCCGGGACTTCCCGGCCATGGTGCAGG 1288

    erArgArgAspPheThrPheAsnLysAspGlyPheArgAspPheProAlaMetValGlnG1289 AGCTGCACCAGGGCGGCCGGCGCTACATGATGATCGTGGATCCTGCCATCAGCAGCTCGG 13 48

    luLeuHisGlnGlyGlyArgArgTyrHetMetlleValAspProAlalleSerSerSerG1349 GCCCTGCCGGGAGCTACAGGCCCTACGACGAGGGTCTGCGGAGGGGGGTTTTCATCACCA 14 08

    lyProAlaGlySerTyrArgProTyrAspGluGlyLeuArgArgGlyValPhelleThrA14 09 ACGAGACCGGCCAGCCGCTGATTGGGAAGGTATGGCCCGGGTCCACTGCCTTCCCCGACT 1468

    snGluThrGlyGlnProLeuIleGlyLysValTrpProGlySerThrAlaPheProAspP1469 TCACCAACCCCACAGCCCTGGCCTGGTGGGAGGACATGGTGGCTGAGTTCCATGACCAGG 15 28

    heThrAsnProThrAlaLeuAlaTrpTrpGluAspMetValAlaGluPheHisAspGlnVP

    15 29 TGCCCTTCGACGGCATGTGGATTGACATGAACGAGCCTTCCAACTTCATCAGGGGCTCTG 158 8alProPheAspGlyMetTrpIleAspMetAsnGluProSerAsnPhelleArgGlySerG


    158 9 AGGACGGCTGCCCCAACAATGAGCTGGAGAACCCACCCTACGTGCCTGGGGTGGTTGGGG 16 48luAspGlyCysProAsnAsnGluLeuGluAsnProProTyrValProGlyValValGlyG

    16 4 9 GGACCCTCCAGGCGGCCACCATCTGTGCCTCCAGCCACCAGTTTCTCTCCACACACTACA 17 08lyThrLeuGlnAlaAlaThrlleCysAlaSerSerHisGlnPheLeuSerThrHisTyrA

    P

    17 09 ACCTGCACAACCTCTACGGCCTGACCGAAGCCATCGCCTCCCACAGGGCGCTGGTGAAGG 17 68snLeuHisAsnLeuTyrGlyLeuThrGluAlalleAlaSerHisArgAlaLeuValLysA

    1769 CTCGGGGGACACGCCCATTTGTGATCTCCCGCTCGACCTTTGCTGGCCACGGCCGATACG 18 28

    laArgGlyThrArgProPheVallleSerArgSerThrPheAlaGlyHisGlyArgTyrA18 29 CCGGCCACTGGACGGGGGACGTGTGGAGCTCCTGGGAGCAGCTCGCCTCCTCCGTGCCAG 18 88

    laGlyHisTrpThrGlyAspValTrpSerSerTrpGluGlnLeuAlaSerSerValProG

    18 89 AAATCCTGCAGTTTAACCTGCTGGGGGTGCCTCTGGTCGGGGCCGACGTCTGCGGCTTCC 1948luIleLeuGlnPheAsnLeuLeuGlyValProLeuValGlyAlaAspValCysGlyPheL

    1949 TGGGCAACACCTCAGAGGAGCTGTGTGTGCGCTGGACCCAGCTGGGGGCCTTCTACCCCT 2008euGlyAsnThrSerGluGluLeuCysValArgTrpThrGlnLeuGlyAlaPheTyrProP

    2 0 09 TCATGCGGAACCACAACAGCCTGCTCAGTCTGCCCCAGGAGCCGTACAGCTTCAGCGAGC 2 0 68heMetArgAsnHisAsnSerLeuLeuSerLeuProGlnGluProTyrSerPheSerGluP

    2 06 9 CGGCCCAGCAGGCCATGAGGAAGGCCCTCACCCTGCGCTACGCACTCCTCCCCCACCTCT 2128roAlaGlnGlnAlaMetArgLysAlaLeuThrLeuArgTyrAlaLeuLeuProHisLeuT

    P

    2129 ACACACTGTTCCACCAGGCCCACGTCGCGGGGGAGACCGTGGCCCGGCCCCTCTTCCTGG 2188

    yrThrLeuPheHisGlnAlaHisValAlaGlyGluThrValAlaArgProLeuPheLeuG2189 AGTTCCCCAAGGACTCTAGCACCTGGACTGTGGACCACCAGCTCCTGTGGGGGGAGGCCC 2248

    luPheProLysAspSerSerThrTrpThrValAspHisGlnLeuLeuTrpGlyGluAlaL

    22 4 9 TGCTCATCACCCCAGTGCTCCAGGCCGGGAAGGCCGAAGTGACTGGCTACTTCCCCTTGG 2 3 08

    euLeuIleThrProValLeuGlnAlaGlyLysAlaGluValThrGlyTyrPheProLeuGP

    2 3 09 GCACATGGTACGACCTGCAGACGGTGCCAATAGAGGCCCTTGGCAGCCTCCCACCCCCAC 2 36 8

    lyThrTrpTyrAspLeuGlnThrValProIleGluAlaLeuGlySerLeuProProProP\ 6

    2 369 CTGCAGCTCCCCGTGAGCCAGCCATCCACAGCGAGGGGCAGTGGGTGACGCTGCCGGCCC 2 4 28roAlaAlaProArgGluProAlalleHisSerGluGlyGlnTrpValThrLeuProAlaP

    P2 4 29 CCCTGGACACCATCAACGTCCACCTCCGGGCTGGGTACATCATCCCCCTGCAGGGCCCTG 2 4 88

    roLeuAspThrlleAsnValHisLeuArgAlaGlyTyrllelleProLeuGlnGlyProG

    2 4 89 GCCTCACAACCACAGAGTCCCGCCAGCAGCCCATGGCCCTGGCTGTGGCCCTGACCAAGG 2 548

    lyLeuThrThrThrGluSerArgGlnGlnProMetAlaLeuAlaValAlaLeuThrLysGP

    2 5 4 9 GTGGGGAGGCCCGAGGGGAGCTGTTCTGGGACGATGGAGAGAGCCTGGAAGTGCTGGAGC 2 6 08

    lyGlyGluAlaArgGlyGluLeuPheTrpAspAspGlyGluSerLeuGluValLeuGluA

    26 09 GAGGGGCCTACACACAGGTCATCTTCCTGGCCAGGAATAACACGATCGTGAATGAGCTGG 2668

    rgGlyAlaTyrThrGlnValllePheLeuAlaArgAsnAsnThrlleValAsnGluLeuV

    266 9 TACGTGTGACCAGTGAGGGAGCTGGCCTGCAGCTGCAGAAGGTGACTGTCCTGGGCGTGG 27 28

    alArgValThrSerGluGlyAlaGlyLeuGlnLeuGlnLysValThrValLeuGlyValA

    27 2 9 CCACGGCGCCCCAGCAGGTCCTCTCCAACGGTGTCCCTGTCTCCAACTTCACCTACAGCC 2788

    laThrAlaProGlnGlnValLeuSerAsnGlyValProValSerAsnPheThrTyrSerP

    2 789 CCGACACCAAGGTCCTGGACATCTGTGTCTCGCTGTTGATGGGAGAGCAGTTTCTCGTCA 28 48

    roAspThrLysValLeuAspIleCysValSerLeuLeuMetGlyGluGlnPheLeuValS

    2 8 4 9 GCTGGTGTTAGCCGGGCGGAGTGTGTTAGTCTCTCCAGAGGGAGGCTGGTTCCCCAGGGA 2 9 0 8erTrpCysEnd

    29 0 9 AGCAGAGCCTGTGTGCGGGCAGCAGCTGTGTGCGGGCCTGGGGGTTGCATGTGTCACCTG 296 8P

    296 9 GAGCTGGGCACTAACCATTCCAAGCCGCCGCATCGCTTGTTTCCACCTCCTGGGCCGGGG 3 0 28P P

    3029 CTCTGGCCCCCAACGTGTCTAGGAGAGCTTTCTCCCTAGATCGCACTGTGGGCCGGGGCC 3 08 8

    3089 TGGAGGGCTGCTCTGTGTTAATAAGATTGTAAGGTTTGCCCTCCTCACCTGTTGCCGGCA 3148

    314 9 TGCGGGTAGTATTAGCCACCCCCCTCCATCTGTTCCCAGCACCGGAGAAGGGGGTGCTCA 3 2 0 8

    32 0 9 GGTGGAGGTGTGGGGTATGCACCTGAGCTCCTGCTTCGCGCCTGCTGCTCTGCCCCAACG 3268P

    32 6 9 CGACCGCTGCCCGGCTGCCCAGAGGGCTGGATGCCTGCCGGTCCCCGAGCAAGCCTGGGA 3 328

    3329 ACTCAGGAAAATTCACAGGACTTGGGAGATTCTAAATCTTAAGTGCAATTATTTTTAATA 3388

    3389 AAAGGGGCATTTGGAATCAAA 3409


    B CGTGCGGAG GTGAGCCGGG CCGGGGCTGC GGGGCTTCCC

    TGAGCGCGGG CCGGGTCGGT GGGGCGGTCG GCTGCCCGCG CGGCCTCT*CA-218

    GTGGGAAAGC TGAGGTTGTC GCCGGGGCCG CGGGTGGAGG TCGGGGATGA

    GGCAGCAGGT AGGACAGTGA CCTCGGTGAC GCGAAGGACC CCGGCCACCTP P

    CTAGGTTCTC CTCGTCCGCC CGTTGTTCAG CGAGGGAGGC TCTGGGCCTG

    CCGCAGCTGA CGGGGAAACT GAGGCACGGA GCGG*GTGAGA CACCTGACGT-33

    CTGCCCCGC

    FIG. 1. A. Nucleotide sequence of the combined analysis for human GAA. The numbered arrows indicate positions of difference from the Hoefsloot et al. (1988) GAA cDNA sequence. 1 = deletion of a G (Hoefsloot et al., number 598); 2 = addition of a C (Hoefsloot et al., number 608-609); 3 = addition of a G (Hoefsloot et al., number 1,455-1,456); 4 = addition of a C (Hoefsloot et al., number 1,481-1,482); 5 = addition of a G (Hoefsloot et al., number 1,498-1,499); 6 = deletion of a G (Hoefsloot et al., number 2,625); 7 = addition of a G (Hoefsloot et al., number 2,699-2,700); and 8 = addition of a T (Hoefsloot et al., number 3,599-3,600). The initiation codon (ATG), stop codon (TAG), and polyadenylation sites are underlined. The P above 16 base pairs indicates the location of the base pair polymorphisms listed in Table 2. The polymorphic base pair indicated is the most common base pair found at each site in Table 2. B. Nucleotide sequence of the 5' untranslated cDNA and flanking intron base pairs; one potential Sp1 binding site is underlined. The intron-exon junction is identified by an arrow. A P over two base pairs indicates the location of base pair polymorphisms listed in Table 2.

    [Figure 2: restriction map showing Sac I, Acc I, and Sma I sites on a 1.0-3.0 kb scale, with aligned clones of 0.45, 0.46, 1.8, 2.8, 1.6, 1.9, 2.8, and 3.2 kb.]

    FIG. 2. Sequencing strategy for human GAA clones from five independent human cDNA libraries and one genomic fragment. Clones were sequenced using the forward or reverse universal primers or GAA-specific primers. Arrows indicate the direction of sequencing. The 1.9-kb clone was the original clone isolated by antibody screening from a λgt11 human liver cDNA library (Martiniuk et al., 1986). The 1.6-, 1.8-, and 2.8-kb (more 5') clones were isolated from a human fibroblast cDNA library. Polymorphic base pair differences indicate that the 1.8- and 2.8-kb clones represent two different alleles. No differences were observed in the region of overlap between the 1.6 vs. the 1.8- and 2.9-kb clones, and thus the 1.6-kb clone restriction fragment length polymorphisms (RFLPs) are not included in Table 2. The 2.8-kb (more 3') clone was isolated from a human placental cDNA library. The 3.2-kb clone was isolated from a human lymphoid line library. The 0.46-kb clone was isolated from a human randomly primed fibroblast cDNA library. The 0.45-kb clone was isolated from a human lymphoid line genomic library. Sites for Sma I, Sac I, and Acc I used for subcloning are indicated.

    seen in pulse-chase experiments in fibroblasts (Hasilik and Neufeld, 1980).

    The 76-kD protein was separated by NaDodSO4-polyacrylamide gels and electroeluted using an ISCO electroeluter (ISCO, Inc., Lincoln, NE) in 0.01 M Tris, 0.5% NaDodSO4, pH 6.6. The sample was desalted on a Sephadex G25 column (Isolab, Inc., Akron, OH) in 50 mM ammonium acetate and precipitated with ethanol at -70°C overnight. The sample was centrifuged in a Brinkman microfuge for 30 min at 4°C and dried. The ethanol-precipitated protein was amino-terminal amino acid sequenced by Applied Biosystems (Foster City, CA).


    Restriction endonuclease analysis

    DNA was digested with the appropriate restriction endonuclease according to the manufacturer's instructions and electrophoresed in 0.8% agarose-ethidium bromide gels according to Maniatis et al. (1982).

    Primer extension

    Primer extension of human fibroblast mRNA was performed with a 5'-end 32P-labeled GAA-specific primer (bp 136-165, Fig. 1A) (DuPont-New England Nuclear 5'-end-labeling kit, Boston, MA) using the method of Ghosh et al. (1980). The labeled extended DNA was electrophoresed on a 6% polyacrylamide-urea sequencing gel and compared to a sequencing ladder to determine the size of the extended fragment.

    RESULTS

    We have previously isolated a partial cDNA clone (1.9 kb) for human GAA. To obtain longer clones that span the entire coding region including the initiation codon, we used the 1.9-kb cDNA to screen three additional oligo(dT)-primed human cDNA libraries (lymphoid, placenta, and fibroblast). Various-sized cDNA clones were obtained with different degrees of overlap at either the 3' or 5' ends (Fig. 2). The largest clone was approximately 3.2 kb. A 5'-end fragment of the 3.2-kb cDNA clone was used to screen a randomly primed fibroblast cDNA library and more 5' cDNA sequence was obtained. The cDNAs and restriction fragments of some of these cDNAs were subcloned into pUC18/19 or M13 and sequenced. The sequencing strategy and a partial restriction enzyme map are outlined in Fig. 2.

    The composite cDNA sequence contained 2,832 bp of an open reading frame but did not contain the initiation codon (ATG). The resulting combined nucleotide sequence of these clones, which began with bp 24, is presented in Fig. 1A.

    Analysis for 5' untranslated region

    To determine the size of the mRNA including the 5' untranslated region, we performed primer extension analysis using mRNA from normal human fibroblasts and a labeled GAA-specific primer (30-mer, bp 136-165, Fig. 1A). The size of the labeled extended DNA was determined on a sequencing gel as approximately 350 bp, which places the 5' end of the mRNA about 200 bp beyond the 5' end of the most 5' cDNA clone.
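The arithmetic behind this estimate can be sketched as follows. The coordinate convention (A of the ATG is bp 1, with no bp 0) is taken from Fig. 1A; the function names are hypothetical, and the 350-nt product length is the approximate gel estimate quoted above, so the result is only as precise as that measurement.

```python
def extension_start(primer_last_bp: int, product_len: int) -> int:
    """Coordinate of the 5'-most mRNA base covered by the extension product.

    The primer is antisense, so its 5' end pairs with `primer_last_bp` and
    extension runs toward the 5' end of the mRNA. Coordinates follow
    Fig. 1A: the A of the ATG is bp 1 and the base before it is bp -1.
    """
    first = primer_last_bp - product_len + 1
    # Step over the non-existent position 0 when the product crosses the ATG.
    return first if first >= 1 else first - 1


def bases_spanned(first_bp: int, last_bp: int) -> int:
    """Number of bases from first_bp to last_bp inclusive (no bp 0)."""
    n = last_bp - first_bp + 1
    if first_bp < 0 < last_bp:
        n -= 1
    return n


PRIMER_LAST_BP = 165    # primer covers bp 136-165
PRODUCT_LEN = 350       # approximate size read from the sequencing gel
CDNA_FIRST_BP = 24      # most 5' base present in the cDNA clones

mrna_5prime = extension_start(PRIMER_LAST_BP, PRODUCT_LEN)
missing = bases_spanned(mrna_5prime, CDNA_FIRST_BP - 1)
print(f"estimated mRNA 5' end: bp {mrna_5prime}")
print(f"bases 5' of the cDNA:  {missing}")
```

With these inputs the estimated 5' end falls at about bp -185, roughly 208 bases beyond the cDNA, which is in line with the "approximately 200 bp" stated in the text.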

    Isolation of the 5' cDNA and 5'-flanking region from genomic fragments, identification of an intron interrupting the 5' untranslated region, and partial characterization of the promoter region

    To obtain the 5' untranslated region and initiation codon of the cDNA, we constructed a genomic library of human Eco RI-digested DNA and screened this library with a 5' cDNA fragment (bp 24-427, Fig. 1A). Recombinant phage containing 20-kb inserts were isolated, digested with Sac I, and hybridized to the 5' cDNA fragment. A 1.0-kb Sac I fragment which hybridized to the 5' cDNA probe was subcloned and sequenced. The 3' Sac I site of the genomic fragment corresponded to the 3' Sac I site of the cDNA fragment and continued uninterrupted 5' to an initiation codon (ATG) which was included in a Kozak (1986) consensus sequence. Thirty-two base pairs 5' of the ATG, the sequence diverged from that reported by Hoefsloot et al. (1988) into the sequence of an intron-exon junction (TTCTTCTCCCGCAG|GC). The transcriptional start site reported by Hoefsloot et al. (1988) was identified approximately 3 kb upstream of this intron-exon junction by sequencing of the next 5' 2.6-kb Sac I fragment (Fig. 1B). The 5' region flanking the transcription start site contained several GC-rich regions with homology to the Sp1 binding sites in the promoter region of other "housekeeping" genes (Jones et al., 1988) but no obvious CAAT or TATA box. One of the regions was identical in sequence to the consensus sequence of the Sp1 binding site (Mitchell and Tjian, 1989) (Fig. 1B).
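As a rough illustration of what "GC rich" and "homology to Sp1 binding sites" mean in practice, the sketch below computes the GC fraction of the 5'-flanking sequence and scans both strands for a GC-box core. Two assumptions are worth stating loudly: the sequence is copied from Fig. 1B of this transcript up to the asterisk marking the start site, so it may carry extraction errors, and the motif `GGGCGG` is the commonly cited GC-box core, used here as a stand-in because the text does not spell out the consensus the authors matched. All function names are hypothetical.

```python
import re

# 5'-flanking sequence from Fig. 1B of this transcript, ending just before
# the asterisk that marks the transcription start site (may contain errors).
FLANK = (
    "CGTGCGGAG" "GTGAGCCGGG" "CCGGGGCTGC" "GGGGCTTCCC"
    "TGAGCGCGGG" "CCGGGTCGGT" "GGGGCGGTCG" "GCTGCCCGCG" "CGGCCTCT"
)

# Assumption: commonly cited GC-box core, not a consensus quoted in the paper.
GC_BOX_CORE = "GGGCGG"


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq: str, motif: str) -> list[tuple[int, str]]:
    """0-based start positions of motif on the + and - strands."""
    hits = [(m.start(), "+") for m in re.finditer(f"(?={motif})", seq)]
    rc = revcomp(motif)
    if rc != motif:
        hits += [(m.start(), "-") for m in re.finditer(f"(?={rc})", seq)]
    return sorted(hits)


hits = find_motif(FLANK, GC_BOX_CORE)
print(f"length: {len(FLANK)} bp, GC fraction: {gc_fraction(FLANK):.2f}")
for start, strand in hits:
    print(f"GC-box core on {strand} strand, {len(FLANK) - start} bp upstream of the start site")
```

On the sequence as transcribed here, the flank is 87 bp long with a GC fraction near 0.83, and the single core match begins 27 bp before the marked start site. These figures are in line with, but not an independent confirmation of, the values the authors report.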

    Therefore, we have sequenced the complete coding and 5' and 3' untranslated regions of GAA as well as an additional 87 bp of the 5'-flanking promoter region. There is a 5' 186-bp exon of untranslated sequence. The coding region begins with the ATG (designated as bp 1, Fig. 1A), extends to bp 2,856 followed by a stop codon (TAG), and codes for 952 amino acids. Greater than 99% of the coding sequence was sequenced in both directions using clones from five independent cDNA libraries and a genomic clone (Fig. 2). The 3' untranslated region contains an additional 555 bp with a poly(A) addition signal at bp 3,385 followed by 16 bp to a poly(A) tail.
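The coordinates in this summary can be cross-checked with a few lines of code. The sketch below uses the first row of Fig. 1A as transcribed here (bp -32 to 28) to locate the ATG and test the two strongest features of the Kozak context, a purine at -3 and a G at +4, then checks that the stated lengths are mutually consistent. The helper names are hypothetical and the row may carry extraction errors.

```python
# First row of Fig. 1A as it appears in this transcript: bp -32 to bp 28.
ROW = "GCCTGTAGGAGCTGTCCAGGCCATCTCCAACCATGGGAGTGAGGCACCCGCCCTGCTCCC"
ROW_START = -32


def index_of(bp: int, row_start: int = ROW_START) -> int:
    """Map a Fig. 1A coordinate to a 0-based index (there is no bp 0)."""
    if bp == 0:
        raise ValueError("Fig. 1A numbering has no bp 0")
    return bp - row_start - (1 if bp > 0 else 0)


def strong_kozak(row: str) -> bool:
    """True if the ATG at bp 1 has a purine at -3 and a G at +4."""
    atg = index_of(1)
    return (
        row[atg:atg + 3] == "ATG"
        and row[index_of(-3)] in "AG"
        and row[index_of(4)] == "G"
    )


CODING_BP = 2856             # ATG through the last sense codon
LEADER_EXON2_BP = 32         # 5' untranslated bases between the intron and the ATG
LEADER_EXON1_BP = 186        # upstream untranslated exon

print("ATG at bp 1:", ROW[index_of(1):index_of(1) + 3])
print("strong Kozak context:", strong_kozak(ROW))
print("codons:", CODING_BP // 3, "remainder:", CODING_BP % 3)
print("total 5' untranslated leader:", LEADER_EXON1_BP + LEADER_EXON2_BP, "bp")
```

The leader total of 218 bp matches the -218 label at the start site in Fig. 1B.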

    Comparison to previous GAA sequence

    Comparison of our sequence of the coding region to that reported by Hoefsloot et al. (1988) revealed three areas of deletion or addition of base pairs (Table 1 and Fig. 1A; A of the initiation codon designated as 1, the corresponding base pair numbers used by Hoefsloot et al., 1988, are indicated in Fig. 1A). We found a deletion of a G between bp 378 and 379 and an insertion of a C at bp 389. This different nucleotide sequence was found in all clones sequenced from three different libraries and changes the reading frame for three of the predicted amino acids from Val-Leu-Leu to Cys-Phe-Phe (Table 1). The amino acid sequence in this area was also determined by Hoefsloot et al. (1988) as part of the amino-terminal sequence of the 76-kD processed protein. To confirm that our cDNA sequence was correct and not an artifact of sequencing, we also determined the amino-terminal amino acid sequence of the corresponding 76-kD processed protein. The amino acid sequence we obtained in this area (_-Phe-Phe) is consistent with the amino acids predicted by the current cDNA


    Table 1. Amino Acid Homology of Human GAA, Human Isomaltase, Rabbit Isomaltase, and Rabbit Sucrase: Comparison at Areas of Differences in Published cDNAs and Predicted Amino Acid Sequence

    A. Area: base pairs 379-389 (codons 127-129)
        Previous GAA cDNA(a)    V L L
        Present GAA cDNA        C F F
        Human isomaltase(b)     C F F
        Rabbit isomaltase(c)    C F F
        Rabbit sucrase(c)       C Y F

    B. Area: base pairs 1,237-1,282 (codons 413-427)
        Previous GAA cDNA(a)    TSRSTRMAS-GLPGH
        Present GAA cDNA        DFTFNKDGFRDFPAM
        Human isomaltase(b)     DFTYDQVAFNGLPQF
        Rabbit isomaltase(c)    DFTYDRVAYNGLPDF
        Rabbit sucrase(c)       DFTIDE-NFRELPQF

    C. Area: base pairs 2,410-2,483 (codons 804-827)
        Previous GAA cDNA(a)    VGDAAGPPGHHQRPPPGWVHHPPA
        Present GAA cDNA        WVTLPAPLDTINVHLRAGYIIPLQ
        Rabbit isomaltase(c)    RVEMSLPADKIGLHLRGGYIIPIQ
        Rabbit sucrase(c)       FQDFNTPYPALNLHVRGGHIIPCQ

    (a) Hoefsloot et al. (1988). (b) Green et al. (1987). (c) Hunziker et al. (1986).

    sequence (Cys-Phe-Phe; Table 1). Additionally, the same amino acid sequence for this area was determined independently by LaBadie (1986). The amino acid sequence of Cys-Phe-Phe is identical to that of rabbit and human isomaltase in the homologous area (Table 1).

    The second area of difference involves an addition of three base pairs: a G at bp 1,237, a C at bp 1,264, and a G at bp 1,282 (Fig. 1A). These additions result in a change in 15 predicted amino acids (codons 413-427) and they increase the homology to the sucrase-isomaltase complex (Table 1). The extra G at bp 1,282 results in the generation of a new Nco I site. When DNA from cDNA clones from three different libraries was digested with Nco I, the appropriate fragments were seen in all three clones, indicating the presence of the predicted Nco I site (data not shown).
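For the first area of difference (bp 379-389), the effect of a paired single-base deletion and insertion on the reading frame can be shown with a short translation sketch. The present sequence is copied from Fig. 1A of this transcript (bp 376-393). The "earlier" reading is not taken from Hoefsloot et al. (1988); it is reconstructed here by reversing the two differences described in the text, which is an assumption. Function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate complete codons of seq, one-letter code, frame 0."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


PRESENT = "TGGTGCTTCTTCCCACCC"   # bp 376-393 in Fig. 1A (codons 126-131)
START_BP = 376


def reverse_differences(seq: str, start_bp: int) -> str:
    """Re-insert a G between bp 378 and 379 and drop the C at bp 389."""
    i_g = 379 - start_bp          # the G goes in front of bp 379
    i_c = 389 - start_bp          # index of the C reported as inserted
    if seq[i_c] != "C":
        raise ValueError("expected a C at bp 389")
    return seq[:i_g] + "G" + seq[i_g:i_c] + seq[i_c + 1:]


earlier = reverse_differences(PRESENT, START_BP)
print("present:", translate(PRESENT))   # W C F F P P
print("earlier:", translate(earlier))   # W V L L P P
```

Because one base is removed and one is added ten bases apart, only the three codons between them are read differently (Cys-Phe-Phe versus Val-Leu-Leu); the frame is the same on either side.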

    The third area of difference involves the deletion of a G between bp 2,409 and 2,410 and the addition of a G at bp 2,483 (Fig. 1A). These changes generated a different predicted amino acid sequence (24 amino acids, codons 804-827) which increases homology to the sucrase-isomaltase complex (Table 1). The addition of a G at bp 2,483 predicts a new Ban II site. Digestion with Ban II of DNA from clones from three different libraries generated the fragments predicted by the presence of the Ban II restriction site (data not shown).
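The restriction-site predictions for the second and third areas can be checked in the same spirit. The sketch below searches short windows of Fig. 1A (as transcribed here) for the Nco I (CCATGG) and Ban II (GRGCYC) recognition sequences, with and without the extra G. The versions lacking the G are reconstructed by deleting that base, not copied from the earlier publication, and the helper names are hypothetical.

```python
import re

# Recognition sequences; R = A or G, Y = C or T.
SITES = {"Nco I": "CCATGG", "Ban II": "G[AG]GC[TC]C"}

# (window from Fig. 1A of this transcript, coordinate of its first base, extra G)
WINDOWS = {
    "Nco I": ("CTTCCCGGCCATGGTGCAGG", 1269, 1282),
    "Ban II": ("CATCCCCCTGCAGGGCCCTG", 2469, 2483),
}


def has_site(seq: str, enzyme: str) -> bool:
    """True if the enzyme's recognition sequence occurs in seq."""
    return re.search(SITES[enzyme], seq) is not None


def without_base(seq: str, start_bp: int, bp: int, expected: str) -> str:
    """Return seq with the base at coordinate bp removed."""
    i = bp - start_bp
    if seq[i] != expected:
        raise ValueError(f"expected {expected} at bp {bp}, found {seq[i]}")
    return seq[:i] + seq[i + 1:]


for enzyme, (window, start_bp, extra_g) in WINDOWS.items():
    shorter = without_base(window, start_bp, extra_g, "G")
    print(f"{enzyme}: with G at bp {extra_g} -> {has_site(window, enzyme)}, "
          f"without it -> {has_site(shorter, enzyme)}")
```

In both windows the site is present only when the extra G is included, which is the basis for using a diagnostic digest to tell the two readings apart.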

    Lastly, in the 3' untranslated region, we found an extra T at bp 3,384 in clones from the cDNA libraries. Two additional base pair differences were found in the 5' untranslated area.

    Polymorphisms

    Comparison of the sequence of the cDNAs from the different libraries to each other and to that reported by Hoefsloot et al. (1988) revealed polymorphisms at 18 different base pairs. These 18 sites are listed in Table 2 with the alternative bases, corresponding amino acids, number of clones at each site, and hydrophobicity. Of the 12 polymorphic base pairs in the coding region, only two (bp 1,115 and 1,204) result in nonconservative amino acid changes. However, at these two sites, all four clones differed from the sequence reported by Hoefsloot et al. (1988).
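Whether a coding polymorphism is silent follows directly from the genetic code, as the sketch below shows for the 12 coding-region sites. The codon pairs are read from the flattened Table 2 in this transcript, so the pairing of positions with codons is an assumption, although the check that each substitution falls at the codon position implied by its base-pair number gives some reassurance. Names are hypothetical, and the conservative versus nonconservative call is left to the authors' own classification.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# bp position -> (codon A*, codon B), as read from Table 2 of this transcript.
CODING_SITES = {
    324: ("TGC", "TGT"), 596: ("CGT", "CAT"), 642: ("TCC", "TCT"),
    668: ("CAC", "CGC"), 1115: ("CTC", "CAC"), 1203: ("CAA", "CAG"),
    1204: ("CGG", "TGG"), 1581: ("AGA", "AGG"), 1727: ("GCC", "GGC"),
    2133: ("ACA", "ACG"), 2338: ("ATA", "GTA"), 2553: ("GGA", "GGG"),
}


def is_silent(codon_a: str, codon_b: str) -> bool:
    """True if both codons encode the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


def position_consistent(bp: int, codon_a: str, codon_b: str) -> bool:
    """True if the codons differ only at the position implied by bp (ATG = 1-3)."""
    diffs = [i for i in range(3) if codon_a[i] != codon_b[i]]
    return diffs == [(bp - 1) % 3]


silent = sorted(bp for bp, pair in CODING_SITES.items() if is_silent(*pair))
nonsilent = sorted(bp for bp, pair in CODING_SITES.items() if not is_silent(*pair))
print("silent:   ", silent)
print("nonsilent:", nonsilent)
print("all positions consistent:",
      all(position_consistent(bp, *pair) for bp, pair in CODING_SITES.items()))
```

With the pairs as read here, six sites are silent and six change the amino acid, and the latter set includes bp 1,115 and 1,204, the two the authors single out as nonconservative.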

    DISCUSSION

    To determine the nucleotide sequence of the coding region of human GAA, we have isolated and sequenced various-length GAA cDNAs and genomic segments of the gene. The cDNA contains a coding region of 2,856 bp from the ATG to a stop codon (TAG) with the ATG contained within a Kozak sequence (Kozak, 1986). The 3' untranslated region is 555 bp, with a polyadenylation signal at bp 3,385 followed by a poly(A) tail. Primer extension experiments indicated that the size of the 5' untranslated region was approximately 200 bp, consistent with that reported by Hoefsloot et al. (1988). We found that this 5' untranslated region was interrupted by an intron 32 bp 5'


    Table 2. Single Base Pair Polymorphisms for Human GAA(a)

    No.   Base     Sequence A*   Sequence B   Amino acid change   Type of change
    1.    -82      TGG           TGC          Noncoding
    2.    -79      GCC           GCG          Noncoding
    3.    324      TGC           TGT          Cys-Cys             Silent
    4.    596      CGT           CAT          Arg-His             Conservative
    5.    642      TCC           TCT          Ser-Ser             Silent
    6.    668      CAC           CGC          His-Arg             Conservative
    7.    1,115    CTC           CAC          Leu-His             Hydrophilic
    8.    1,203    CAA           CAG          Gln-Gln             Silent
    9.    1,204    CGG           TGG          Arg-Trp             Hydrophobic
    10.   1,581    AGA           AGG          Arg-Arg             Silent
    11.   1,727    GCC           GGC          Ala-Gly             Conservative
    12.   2,133    ACA           ACG          Thr-Thr             Silent
    13.   2,338    ATA           GTA          Ile-Val             Conservative
    14.   2,553    GGA           GGG          Gly-Gly             Silent
    15.   3,002    ATC           ATT          Noncoding
    16.   3,082    GCC           GCT          Noncoding
    17.   3,086    GGG           GGC          Noncoding
    18.   3,277    CTT           CTG          Noncoding

    [Number of clones observed with alternate bases (A*, B): values not recoverable from this transcript.]

    (a) The most common base pair found at each site is indicated in Fig. 1A with a P over the site.
    * Indicates base pair and predicted amino acid sequence reported by Hoefsloot et al. (1988) (Fig. 1A legend).

    of the ATG. The remaining 186 bp of the 5' untranslatedcDNA and the transcription start site was identified ap-proximately 3 kb upstream. Although interruption of the5' untranslated region by an intron is not common, it hasbeen observed for some of the oncogenes, ai-antitrypsin,and 0-actin (Kost et al., 1983; Long et al, 1984; Hoffmanet al, 1987). The promoter region was very GC rich (84%)and contained one sequence identical with the binding sitefor Apl, 27 bp upstream from the reported transcriptionstart site. Several additional areas of homology to the Splbinding site were also present, but no CAAT or TATA boxcould be identified. Thus, the promoter region appears tobe similar to that reported for several other housekeepinggenes (Jones et al, 1988).

    Recently, Hoefsloot et al. (1988) published the nucleotide sequence of a cDNA for human GAA that was 3,636 bp with a coding region of 2,853 bp. (They also reported that the previously published cloning of the gene for monkey GAA was spurious [Konings et al., 1984].) These authors noted that human GAA has approximately 40% homology to human isomaltase and rabbit isomaltase-sucrase when comparing either amino acids or nucleotides. Greater homology was shown by Barnes and Wynn (1988) using related amino acids (Dayhoff et al., 1979). Although the current sequence of the coding region for human GAA is almost identical to that reported by Hoefsloot et al. (1988), there are three major areas where either the deletion or addition of base pairs changes a total of 42 amino acids. In these areas, the predicted amino acid sequence changes for 3, 15, and 24 amino acids (codons 127-129, codons 413-427, and codons 804-827). In the first area, our independently determined amino acid sequence of the GAA 76-kD protein as well as that of LaBadie (1986) agrees with the amino acid sequence predicted by the current cDNA. Hoefsloot et al. (1988) also determined the amino-terminal amino acid sequence for the 76-kD protein, and their amino acid sequence matches the one predicted by their nucleotide sequence. The disagreement of the amino acid sequences could reflect rare normal variation in an unimportant domain of the protein or some other as yet undefined difference. However, it should be noted that the amino acid sequence reported here is identical to that found in isomaltase-sucrase. In the second and third areas of difference, nucleotide changes were confirmed by the presence of a predicted Nco I site and a predicted Ban II site in clones from three independent libraries. At all three areas of difference, the amino acid sequence reported here results in increased homology to the rabbit sucrase-isomaltase complex and human isomaltase (Hoefsloot et al., 1988). For these several reasons, we believe that the nucleotide sequence reported here for the cDNA for human GAA may be more precise than the reported sequence of Hoefsloot et al. (1988).
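    The restriction-site argument above can be sketched as a simple scan: if an added or deleted base pair creates a recognition sequence, the site appears in one version of the sequence and not in the other. This is a sketch under stated assumptions, not the procedure used in the study. The two short sequences are hypothetical and are not taken from the GAA cDNA; the recognition sequences assumed are CCATGG for Nco I and GRGCYC for Ban II (R = A or G, Y = C or T).

```python
import re

# IUPAC degenerate codes needed for the two recognition sequences below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}

# Assumed recognition sequences (5'->3'); both are palindromic, so one
# strand is enough to find every site.
SITES = {"NcoI": "CCATGG", "BanII": "GRGCYC"}


def site_positions(seq: str, recognition: str) -> list[int]:
    """Return 0-based start offsets of every match, including overlaps."""
    pattern = "".join(IUPAC[base] for base in recognition)
    # A lookahead lets overlapping matches be reported as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", seq.upper())]


def site_map(seq: str) -> dict[str, list[int]]:
    """Map each enzyme name to the offsets of its sites in seq."""
    return {enzyme: site_positions(seq, rec) for enzyme, rec in SITES.items()}


# Hypothetical sequences, NOT taken from the GAA cDNA. The second differs
# from the first by one inserted base and one substituted base.
reference = "AAGCCTGGTTGAGCAC"
alternative = "AAGCCATGGTTGAGCTC"

print(site_map(reference))
print(site_map(alternative))
```

    Comparing the two maps shows which sites are predicted only by the alternative sequence; digesting clones with the corresponding enzymes is then an independent check on the sequence.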

    One of the areas (codons 413-427) in which the predicted amino acid sequence is changed by the current cDNA sequence has been hypothesized to be a binding site for glycogen (based on identification of strongly epitopic areas in GAA which were not homologous to isomaltase). The homology to isomaltase in this area is increased by the sequence reported here and thus appears to eliminate the basis used for predicting that this area is a potential glycogen binding site.

    Finally, we have reported polymorphisms at 18 nucleotides by comparison of the sequence from clones from six independent libraries and that reported by Hoefsloot et al. (1988). This frequency of base pair differences (1 per 200 bp) is not unusual. Of the total of 18 base pair polymorphisms, 8 must be true polymorphisms rather than sequencing or transcription errors because the two alternative base pairs were found in cDNAs from more than one library. Of these eight confirmed polymorphisms, three predict conservative amino acid differences (bp 596, Arg-199 vs. His; bp 668, His-223 vs. Arg; and bp 2,338, Ile-780 vs. Val). (The Val-780 was detected in genomic DNA from an additional normal individual not included in Table 2.) Although arginine vs. histidine is considered to be a conservative difference, there are examples where this amino acid alteration does have significant effects. It will be of interest to determine whether either of the Arg vs. His polymorphisms correlates with the polymorphisms previously defined biochemically (Swallow et al., 1975; Nickel and McAlpine, 1982). The remaining five confirmed polymorphisms are either silent or in the noncoding region.

    Of the 10 base pair polymorphisms that were not confirmed by the above criteria, three predict amino acid changes (bp 1,115, Leu-372 vs. His; bp 1,204, Arg-402 vs. Trp; and bp 1,727, Ala-576 vs. Gly). Two of the predicted amino acid changes are not conservative (Leu vs. His and Arg vs. Trp), whereas the third is conservative (Ala vs. Gly). For the nonconservative changes, the amino acids reported here are similar in properties to the amino acids in homologous areas of sucrase-isomaltase. The histidine reported here at codon 372 and the glutamine present in human isomaltase and rabbit isomaltase and sucrase are hydrophilic, whereas the previously reported leucine (Hoefsloot et al., 1988) is hydrophobic. The tryptophan at codon 402 reported here and the valine in both isomaltases and the tyrosine in sucrase are hydrophobic, whereas the previously reported arginine (Hoefsloot et al., 1988) is hydrophilic. If either or both of the original base pairs reported should be confirmed as true polymorphisms, it would seem likely that these would result in detectable alterations in properties of the enzyme. The remaining seven putative silent base pair polymorphisms could be rare polymorphisms, or errors in transcription or sequencing of cDNA.
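    The relation between the base pair positions and the codon numbers quoted above can be checked with a short calculation. The sketch below assumes that position 1 is the A of the initiating ATG, which is consistent with the codon numbers given in the text but is an assumption here; the standard genetic code is used, and the function names are illustrative. The codon pairs are the ones listed in Table 2.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codon_number(bp: int) -> int:
    """Codon containing coding position bp (assumes bp 1 = A of the ATG)."""
    if bp < 1:
        raise ValueError("position lies upstream of the ATG (noncoding)")
    return (bp - 1) // 3 + 1


def describe(bp: int, codon_a: str, codon_b: str) -> tuple[int, str, str, str]:
    """Return (codon number, amino acid A, amino acid B, kind of change)."""
    # The two codons must differ only at the base that bp points to.
    changed = [i for i in range(3) if codon_a[i] != codon_b[i]]
    if changed != [(bp - 1) % 3]:
        raise ValueError("codons do not differ at the stated position")
    aa_a, aa_b = CODON_TABLE[codon_a], CODON_TABLE[codon_b]
    kind = "silent" if aa_a == aa_b else "replacement"
    return codon_number(bp), aa_a, aa_b, kind


# Selected coding entries from Table 2: (position, codon A, codon B).
rows = [
    (596, "CGT", "CAT"),
    (1203, "CAA", "CAG"),
    (1204, "CGG", "TGG"),
    (2338, "ATA", "GTA"),
]
for bp, codon_a, codon_b in rows:
    print(bp, describe(bp, codon_a, codon_b))
```

    Whether a replacement counts as conservative depends on the similarity scheme chosen (for example, related amino acids as in Dayhoff et al., 1979), so the sketch separates only silent changes from replacements.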

    ACKNOWLEDGMENTS

    This research was supported by the following grants: Muscular Dystrophy Association (F.M.), American Heart Association #870992 (F.M.), and National Institutes of Health grant R01 39669-01 (F.M.). M.M. is a recipient of a Teacher Investigator Development Award (NS00856) from the National Institute of Neurological and Communicative Disorders and Stroke. S.T. is a fellow of the Arthritis Foundation.

    We would like to thank Heather Fitzcharles, Caroline Hobart, and Amy Ellenbogen for their excellent technical assistance and Kathy Lawrence and Janet Samuel for preparation of the manuscript. We would also like to thank Dr. W. Konigsberg of Yale University for his expert evaluation of the amino acid sequence data.

    REFERENCES

    BARNES, A.K., and WYNN, C.B. (1988). Homology of lysosomal enzymes and related proteins: Prediction of posttranslational modification site including phosphorylation of mannose and potential epitopic and substrate binding sites in the alpha and beta subunits of hexosaminidase, alpha glucosidase, and rabbit and human isomaltase. Proteins: Structure, Function and Genetics 4, 182-189.

    BERATIS, N., LaBADIE, G.U., and HIRSCHHORN, K. (1978). Characterization of the molecular defect in infantile and adult acid alpha-glucosidase deficiency fibroblasts. Am. J. Hum. Genet. 62, 1264-1274.

    BRIGGS, M.R., KADONAGA, J.T., BELL, S.P., and TJIAN, R. (1986). Purification and biochemical characterization of the promoter-specific transcription factor, Sp1. Science 234, 47-52.

    BROWN, B.I., and BROWN, D.H. (1981). The subcellular distribution of enzymes in Type II glycogenosis and the occurrence of oligo alpha-1,4-glucan glucohydrolase in human tissues. Biochim. Biophys. Acta 110, 124-133.

    COURTECUISSE, V., ROYER, F., HABIB, R., MONNIER, C., and DENVOS, J. (1965). Glycogenose musculaire par deficit d'alpha 1-4 glucosidase simulant un dystrophie musculaire progressive. Arch. Franc. Pediat. 22, 1153-1164.

    DAYHOFF, M.O., SCHWARTZ, R.M., and ORCUTT, B.C. (1979). A model of evolutionary change in proteins. In Atlas of Protein Sequence and Structure, vol. 5. M.O. Dayhoff, ed. (National Biomedical Research Foundation, Washington, DC) pp. 345-352.

    ENGEL, A.G., GOMEZ, M.R., SEYBOLD, M.E., and LAMBERT, E.H. (1973). The spectrum and diagnosis of acid maltase deficiency. Neurology 23, 95-106.

    GHOSH, P.K., REDDY, V.B., PIATAK, M., LEBOWITZ, P., and WEISSMAN, S.M. (1980). Determination of RNA sequences by primer directed synthesis and sequencing of their cDNA transcripts. Methods Enzymol. 65, 580-595.

    GREEN, F., EDWARDS, Y., HAURI, H.P., HO, M.W., PINTO, M., and SWALLOW, D. (1987). Isolation of a cDNA for a human jejunal brush-border hydrolase, sucrase-isomaltase, and assignment of the gene to chromosome 3. Gene 57, 101-110.

    GUBLER, U., and HOFFMAN, B.J. (1983). A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.


    HASILIK, A., and NEUFELD, E.F. (1980). Biosynthesis of lysosomal enzymes in fibroblasts. J. Biol. Chem. 255, 4937-4945.

    HERS, H.G. (1963). Alpha-glucosidase deficiency in generalized glycogen storage disease (Pompe's Disease). Biochem. J. 86, 11-16.

    HERS, H., VANHOOF, F., and DEBARSY, T. (1989). Glycogen storage diseases. In The Metabolic Basis of Inherited Disease. C.R. Scriver, A.L. Beaudet, W. Sly, and P. Valle, eds. (McGraw-Hill, New York) pp. 425-452.

    HIRSCHHORN, R., MARTINIUK, F., TZALL, S., MEHLER, M., and PELLICER, A. (1989). The molecular basis of acid alpha glucosidase deficiency; Isolation of a cDNA and studies of patients. In Molecular Genetics of Neuromuscular Diseases. L.P. Rowland, D.S. Wood, E.A. Schon, and S. DiMauro, eds. (Oxford University Press, New York) pp. 248-257.

    HOEFSLOOT, L.H., HOOGEVEEN-WESTERVELD, M., KROOS, M., VAN BEEUMEN, J., REUSER, A.J.J., and OOSTRA, B.A. (1988). Primary structure and processing of lysosomal alpha glucosidase; homology with the intestinal sucrase-isomaltase complex. EMBO J. 7, 1697-1704.

    HOFFMAN, E.K., TRUSKO, S.P., FREEMAN, N., and GEORGE, D.L. (1987). Structural and functional characterization of the promoter region of the mouse c-Ki-ras gene. Mol. Cell. Biol. 7, 2592-2596.

    HOWELL, R.R., and WILLIAM, J.C. (1983). In The Metabolic Basis of Inherited Disease. J.O. Stanbury, J.B. Wyngaarden, and D.S. Fredrickson, eds. (McGraw-Hill, New York) pp. 142-166.

    HUNZIKER, W., SPIESS, M., SEMENZA, G., and LODISH, H.F. (1986). The sucrase-isomaltase complex: Primary structure, membrane orientation, and evolution of the stalked, intrinsic brush border protein. Cell 46, 227-234.

    JONES, N.C., RIGBY, P.W.J., and ZIFF, E.B. (1988). Trans-acting protein factors and the regulation of eukaryotic transcription: Lessons from studies on DNA tumor viruses. Genes Devel. 2, 267-281.

    KADONAGA, J.T., CARNER, K.R., MASIARZ, S.R., and TJIAN, R. (1987). Isolation of cDNA encoding transcription factor Sp1 and functional analysis of the DNA binding domain. Cell 51, 1079-1090.

    KONINGS, A., HUPKES, P., VERSTEEG, R., GROSVELD, G., REUSER, A., and GALJAARD, H. (1984). Cloning of a cDNA for the lysosomal alpha-glucosidase. Biochem. Biophys. Res. Commun. 119, 252-258.

    KORNFELD, S. (1986). Trafficking of lysosomal enzymes in normal and disease states. J. Clin. Invest. 77, 1-6.

    KOST, T.A., THEODORAKIS, N., and HUGHES, S.H. (1983). The nucleotide sequence of the chick cytoplasmic beta actin gene. Nucleic Acids Res. 11, 8287-8301.

    KOZAK, M. (1986). Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44, 283-292.

    LaBADIE, G.U. (1986). "Biochemical and immunologic studies of acid alpha glucosidase deficiency, a genetically heterogeneous, inherited neuromuscular disease." (Ph.D. Thesis, City University of New York, Mt. Sinai Hospital).

    LaBADIE, G.U., HARRIS, H., BERATIS, N.G., and HIRSCHHORN, K. (1985). Monoclonal antibodies to acid alpha glucosidase; further evidence for genetic heterogeneity in Pompe's disease. Am. J. Hum. Genet. 37, A12 (abstract).

    LONG, G.L., CHANDRA, T., WOO, S.L.C., DAVIE, E.W., and KURACHI, K. (1984). Complete sequence of the cDNA for human alpha 1-antitrypsin and the gene for the S variant. Biochemistry 23, 4828-4837.

    MANIATIS, T., FRITSCH, E.F., and SAMBROOK, J. (1982). Molecular Cloning: A Laboratory Manual. (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY).

    MARTINIUK, F., HONIG, J., and HIRSCHHORN, R. (1984). Further studies of the structure of human placental acid alpha glucosidase. Arch. Biochem. Biophys. 231, 454-460.

    MARTINIUK, F., MEHLER, M., PELLICER, A., TZALL, S., LaBADIE, G.U., ELLENBOGEN, A., and HIRSCHHORN, R. (1986). Isolation of a cDNA for human acid alpha glucosidase and detection of genetic heterogeneity for mRNA in three alpha glucosidase deficient patients. Proc. Natl. Acad. Sci. USA 83, 9641-9644.

    MITCHELL, P.J., and TJIAN, R. (1989). Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science 245, 371-378.

    NICKEL, B.E., and McALPINE, P.J. (1982). Extension of human acid alpha glucosidase polymorphism by isoelectric focusing in polyacrylamide gel. Ann. Hum. Genet. 46, 97-103.

    POMPE, J.C. (1932). Over idiopatische hypertrophie van het hart. Ned. Tijdschr. Geneeskd. 76, 304-311.

    REUSER, A.J.J., KROOS, M., OUDE ELFERINK, R.P.J., and TAGER, J.M. (1985). Defects in synthesis, phosphorylation, and maturation of acid alpha glucosidase in glycogenosis type II. J. Biol. Chem. 260, 8336-8341.

    REUSER, A.J.J., KROOS, M., WILLEMSEN, R., SWALLOW, D., TAGER, J.M., and GALJAARD, H. (1987). Clinical diversity in glycogenosis type II. J. Clin. Invest. 79, 1689-1699.

    SANGER, F., NICKLEN, S., and COULSON, A.R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5468.

    SWALLOW, D.M., CORNEY, G., HARRIS, H., and HIRSCHHORN, R. (1975). Acid alpha glucosidase: A new polymorphism in man demonstrable by "affinity" electrophoresis. Ann. Hum. Genet. 38, 391-406.

    Address reprint requests to:
    Dr. Frank Martiniuk
    New York University Medical Center
    Department of Medicine
    550 First Avenue
    New York, NY 10016

    Received for publication July 11, 1989, and in revised form November 14, 1989.