complete nucleotide sequence ofthe nucleoprotein …jvi.asm.org/content/47/3/642.full.pdf ·...

7
JOURNAL OF VIROLOGY, Sept. 1983, p. 642-648 0022-538X/83/090642-07$02.00/0 Copyright C 1983, American Society for Microbiology Vol. 47, No. 3 Complete Nucleotide Sequence of the Nucleoprotein Gene of Influenza B Virus DAVID R. LONDO, ALAN R. DAVIS, AND DEBI P. NAYAK* Jonsson Comprehensive Cancer Center, and Department of Microbiology and Immunology, School of Medicine, University of California, Los Angeles, California 90024 Received 6 April 1983/Accepted 3 June 1983 A DNA copy of influenza B/Singapore/222/79 viral RNA segment 5, containing the gene coding for the nucleoprotein (NP), has been cloned in Escherichia coli plasmid pBR322, and its nucleotide sequence has been determined. The influenza B NP gene contains 1,839 nucleotides and codes for a protein of 560 amino acids with a molecular weight of 61,593. Comparison of the influenza B NP amino acid sequence with that of influenza A NP (A/PR/8/34) reveals 37% direct homology in the aligned regions, indicating a common ancestor. However, influenza B NP has an additional 50 amino acids at its N-terminal end. As is the case with influenza A NP, influenza B NP is a basic protein, with its charged residues relatively evenly distributed rather than clustered. The structural homology suggests functional similarity between the NP of influenza A and B viruses. Influenza viruses are classified into type A, type B, or type C viruses based on the lack of serological cross-reactivity between their inter- nal components, particularly nucleoprotein (NP) and matrix (M) proteins (39). All virus isolates belonging to a single type possess the cross- reactive M and NP proteins, although their surface antigens, hemagglutinin (HA) and neur- aminidase (NA), may vary drastically. In addi- tion, there are major differences in the epidemi- ology and host range among these three types. For example, only type A viruses, although species specific, can infect and produce disease both in humans and in various animal species in nature. Type B and type C viruses, on the other hand, are primarily restricted to humans (see reference 21), although occasionally they have been isolated from animal hosts (41). Type A viruses are known to undergo both antigenic shift and drift, whereas B and C viruses are subject only to antigenic drift. Furthermore, the antigenic variation observed among type A vi- ruses is more pronounced than that observed in type B viruses, which in turn undergo more variation than type C viruses (20, 21). Addition- ally, genetic exchange yielding stable viruses of mixed genotypes has not been observed to occur between viruses of different types (14). The molecular basis of these biological differ- ences between the three types remains unde- fined. Clearly, an understanding of the structure of the genes as well as the structure and function of the gene products of all three types of viruses will be required before their differences in epide- miology and species specificity can be under- stood. Recently, cDNA cloning and DNA se- quencing have permitted the determination of complete nucleotide sequences and predicted amino acid sequences of protein products of all eight RNA segments from one or more type A strains (see reference 34). Similarly, sequences of the HA, NA, M, and nonstructural (NS) genes of type B viruses have been determined (3, 4, 16, 28, 35). However, little is known about the structure or function of the influenza B NP (B NP), although the primary structure of the NP gene of two influenza A strains has been determined (12, 33, 36). In both A and B types, NP appears to have a multifunctional role. First, it is the major com- ponent of the helical ribonucleoprotein (RNP) core (24), and it has been suggested that each influenza A NP (A NP) molecule may bind to ca. 20 bases (b) of RNA (5). Second, the RNP complexes in association with the polymerase proteins are active in transcription (2, 26). Tem- perature-sensitive mutants of influenza A virus defective in the NP gene fail to synthesize tnRNA or virion RNA (vRNA) in infected cells at nonpermissive temperatures (15, 30, 31). Also, monoclonal antibodies to A NP inhibit viral transcription in vitro, suggesting that NP may be involved in the process of transcription (40). Furthermore, NP may be important in binding the RNP complex to M protein and thus, may have an important role in the process of budding and virion assembly (23). Evidence of intracistronic complementation between tem- perature-sensitive mutants defective in the NP gene also supports its multifunctional character 642 on September 28, 2018 by guest http://jvi.asm.org/ Downloaded from

Upload: phungnguyet

Post on 28-Sep-2018

229 views

Category:

Documents


0 download

TRANSCRIPT

JOURNAL OF VIROLOGY, Sept. 1983, p. 642-6480022-538X/83/090642-07$02.00/0Copyright C 1983, American Society for Microbiology

Vol. 47, No. 3

Complete Nucleotide Sequence of the Nucleoprotein Gene ofInfluenza B Virus

DAVID R. LONDO, ALAN R. DAVIS, AND DEBI P. NAYAK*

Jonsson Comprehensive Cancer Center, and Department of Microbiology and Immunology, School ofMedicine, University of California, Los Angeles, California 90024

Received 6 April 1983/Accepted 3 June 1983

A DNA copy of influenza B/Singapore/222/79 viral RNA segment 5, containingthe gene coding for the nucleoprotein (NP), has been cloned in Escherichia coliplasmid pBR322, and its nucleotide sequence has been determined. The influenzaB NP gene contains 1,839 nucleotides and codes for a protein of 560 amino acidswith a molecular weight of 61,593. Comparison of the influenza B NP amino acidsequence with that of influenza A NP (A/PR/8/34) reveals 37% direct homology inthe aligned regions, indicating a common ancestor. However, influenza B NP hasan additional 50 amino acids at its N-terminal end. As is the case with influenza ANP, influenza B NP is a basic protein, with its charged residues relatively evenlydistributed rather than clustered. The structural homology suggests functionalsimilarity between the NP of influenza A and B viruses.

Influenza viruses are classified into type A,type B, or type C viruses based on the lack ofserological cross-reactivity between their inter-nal components, particularly nucleoprotein (NP)and matrix (M) proteins (39). All virus isolatesbelonging to a single type possess the cross-reactive M and NP proteins, although theirsurface antigens, hemagglutinin (HA) and neur-aminidase (NA), may vary drastically. In addi-tion, there are major differences in the epidemi-ology and host range among these three types.For example, only type A viruses, althoughspecies specific, can infect and produce diseaseboth in humans and in various animal species innature. Type B and type C viruses, on the otherhand, are primarily restricted to humans (seereference 21), although occasionally they havebeen isolated from animal hosts (41). Type Aviruses are known to undergo both antigenicshift and drift, whereas B and C viruses aresubject only to antigenic drift. Furthermore, theantigenic variation observed among type A vi-ruses is more pronounced than that observed intype B viruses, which in turn undergo morevariation than type C viruses (20, 21). Addition-ally, genetic exchange yielding stable viruses ofmixed genotypes has not been observed to occurbetween viruses of different types (14).The molecular basis of these biological differ-

ences between the three types remains unde-fined. Clearly, an understanding of the structureof the genes as well as the structure and functionof the gene products of all three types of viruseswill be required before their differences in epide-miology and species specificity can be under-

stood. Recently, cDNA cloning and DNA se-quencing have permitted the determination ofcomplete nucleotide sequences and predictedamino acid sequences of protein products of alleight RNA segments from one or more type Astrains (see reference 34). Similarly, sequencesof the HA, NA, M, and nonstructural (NS)genes of type B viruses have been determined(3, 4, 16, 28, 35). However, little is known aboutthe structure or function of the influenza B NP(B NP), although the primary structure of theNP gene of two influenza A strains has beendetermined (12, 33, 36).

In both A and B types, NP appears to have amultifunctional role. First, it is the major com-ponent of the helical ribonucleoprotein (RNP)core (24), and it has been suggested that eachinfluenza A NP (A NP) molecule may bind to ca.20 bases (b) of RNA (5). Second, the RNPcomplexes in association with the polymeraseproteins are active in transcription (2, 26). Tem-perature-sensitive mutants of influenza A virusdefective in the NP gene fail to synthesizetnRNA or virion RNA (vRNA) in infected cellsat nonpermissive temperatures (15, 30, 31).Also, monoclonal antibodies to A NP inhibitviral transcription in vitro, suggesting that NPmay be involved in the process of transcription(40). Furthermore, NP may be important inbinding the RNP complex to M protein and thus,may have an important role in the process ofbudding and virion assembly (23). Evidence ofintracistronic complementation between tem-perature-sensitive mutants defective in the NPgene also supports its multifunctional character

642

on Septem

ber 28, 2018 by guesthttp://jvi.asm

.org/D

ownloaded from

NOTES 643

B/Singapore/222/79 InsertpBR 322 l

nuclootides

+ strand S' -- strand 3'-

Alu I -'--BglII -Dde I -EcoR I -Hinf I -Hph I -RsaI -Sam 96 1-Taq I -

pBR 322

100 200 3~O 460 a00 600 700 900 900 1000 1160 1200 1300 M600150 60 1700 18 o 1K

~~~~~~~~~~~3'S'

I I o-

-4 0-

I I- o

FIG. 1. Endonuclease restriction enzyme cleavage map and strategy used for sequencing of the cloned B/SingNP DNA. The insert DNA was characterized by multiple restriction endonuclease digestions, mapping the sitesrelative to one another. DNA sequencing was carried out by using 5' singly end-labeled fragments by themethods of Maxam and Gilbert (18, 19). Approximately 85 to 90% of both strands were sequenced, and allrestriction sites used in sequencing were sequenced through an intact DNA segment cut at a different position.The sequence at the 5' end of the viral RNA segment was determined by using a primer extension procedure (11).Total influenza B vRNA was hybridized to a 5' singly end-labeled DNA fragment of clone B1-28. This primer wasextended by reverse transcriptase, and the resulting cDNA was purified and sequenced. The solid line along thetop shows the cloned insert, with the broken line representing sequences from pBR322. Listed along the left sideare the restriction endonuclease sites which were used for labeling with T4 polynucleotide kinase, as indicated byvertical bars, and sequenced in the direction of the arrows. Both strands are also represented showing theiroverlapping sequenced regions. The asterisk shows the HinF I to Taql fragment used in the primer extension.

(31). In this report, we present the completenucleotide sequence of the NP gene and thepredicted amino acid sequence of the NP poly-peptide of a type B influenza virus (B/Singa-pore/222/79 [B/Sing]) and a comparison of thesequence homology and diversity with type ANP.

Virus from the strain B/Sing was obtainedfrom Alan Kendal, Centers for Disease Control,and grown in 10-day-old embryonated chickeneggs. Virus-specific genomic RNA was isolatedfrom purified virions, enriched for specific RNAsegments by fractionation in sucrose velocitygradients, and used for cDNA cloning (7).The genes of influenza B/Sing were cloned

into the PstI restriction site of pBR322 by proce-dures previously described (7, 13). Briefly, thevRNA enriched in the 4th and 5th largest seg-ments was reverse transcribed by using a syn-thetic dodecanucleotide primer (5'-dAGCA-GAAGCAGA-3') complementary to thecommon 3' ends of influenza B vRNA segments(9). Full length cDNA copies were isolated on1.4% alkaline agarose gels and converted intodouble-stranded DNA by using the foldbackloop at the 3' end as a self-primer. The double-stranded DNA fragments were treated with S1nuclease, size fractionated on neutral agarosegels, and cloned into the PstI site of pBR322with deoxyguanine:deoxycytosine tailing. Esch-

erichia coli 294 was transformed with the recom-binant plasmids and screened for tetracyclineresistance and ampicillin sensitivity.Cloned DNAs were grouped by insert size and

restriction enzyme analyses (7). Groups wereidentified by hybridization to specific vRNAsegments. Clone B1-28, containing an insert ofca. 1.8 kilobases (kb), was identified as a cloneof the B NP gene on the following basis. (i) Ithybridized exclusively to segment 4 vRNAwhich has been shown to code for NP (22); (ii)segment 5 vRNA, on the other hand, known tocode for the HA gene (22, 32) hybridized exclu-sively to DNA of clones B1-61 and B2-6 and notto the B1-28 DNA; and (iii) neither B1-28, B1-61,nor B2-6 hybridized to the vRNAs of segments1, 2, 3 (polymerase genes), or 6, known to codefor the NA gene (22, 32). Additionally, thenucleotide sequence of B1-61 insert DNA (un-published data) is over 90% homologous to thereported sequence of the HA gene of influenzaB/Lee/40 (16).A partial restriction endonuclease cleavage

map of the cloned B NP gene was determinedexperimentally and is shown in Fig. 1. Thisserved as the basis for preparing 5' end-labeledfragments used in DNA sequencing.The complete nucleotide sequence of the

mRNA sense strand of the B NP gene is shownin Fig. 2. The first four nucleotides from the

VOL. 47, 1983

l- 7-'00IIil

44 I

on Septem

ber 28, 2018 by guesthttp://jvi.asm

.org/D

ownloaded from

644 NOTES J. VIROL.

B/Sirigapore/222/79 40 FW+ STRAND 5' AGCAGAAGC.A(if)(.;(.,A'I'I'ITC.'T'+T(il(',Aof'+T'+I'(:(,f')GT'AC'f.'AACAAAAAA('T(;A(iAAI+(',AAA f)'T(,+ W(". AAC A'H3 GAI All' GAC

EI/S,ingi,ipore/222/79 N-Terminus

A/PR/8/34

100 140 160ATC AAC.; ACT OGA ACA A'T'T GA( AAA ACA C`CA GAA GAA ATA A'Tl- TCT GGA ACU' o(+;I+ (36G (3CA A('(: A6A (To oT(- Alf', AC3,A CCA SCA ACC C1T

-110 20 0

180 200 220 240 260GCC` CCA CCA AGC AAC AAA CGA ACC CGG AAC CCA 'TCC CCIG GAA AGA GC'A ACC ACA AGC AG'Y' GAA OCT GAI (31C WiG AAA A.CC` C'.AA AAG.

Pro-61,jr-A S U. 1- S C? T- C3 lu A L a A,, r, V i- 1, G) 1,3 A r,:I,-. L. .:is Th r GI r'.

40 tlo 60H F, t A I a-S F:. r G I r. 6 I i L.,4s A r -4 r T- r G I r G I r. M e t 61.-.j-- Thr-AsF-61-

280 300 20 340AAA CAG ACC' C'1CG ACA GAG ATA AAG AAG AGC GITC TAC AAT A'TG GTA GI'G AAA CJS G(iT GAA TTC TAC AAC CAG A ril (3 ATC; C`,TC` AAA GC-1 [36ALus 611's Thr-Pro hr-61i.i-Ile Lws--L,4s c-r- Ttq r -Asr, Val-Va1--L-.iS--Le1_1 4 Glil- Phe-T4r- Asrs e Met-Val-U.4s-Ala-61.,;

70 80 90A r,4 G I n Asn--Ala Thr-Glu-Ile AT'A-Ala 11 a Gl,=j-L-js- I 1 e G 1 .:t 6 1 .-.j I 1 e G 1,_4 Ar-I Phe-Tvr Ile Gl -Me Cvs--Thr--Gl-.j-L.e1j-Lvs

20 30 40

360 380 400 420 440CTC AAC GA'T GAC ATO GAG AGA AAC CTA ATC CAA AAT GCA CAT GCT GTG GAA AGA ATT CI'A TTG GCI' GCC ACT GAI' GAC AAG AAA ACT GAA

Asri-Asp Met Ar-l-Asn e -Ile-Gln-A5n Ala-His-Ala-Val -j- r I le-Leki Ala Thr AsF:-Lvs-L.vs-Thr-Ghj100 110 11-0Ser r- T v r Ct 1,4 A r!A L-e-i-I le-Glr-i-Asr-, Ser-Leu-Thr-I le- Met-Val Ser Phe (31--i-Ar-4-Arg50 60

460 480 500 520TTC CAG AAG AAA AAG AAT GCC' AGA GAT GI'C AAA GAA GGG AAA 6AA 6AA Al A GAC CAC AAC AAA ACA GCiA GGC AC`C, I"Tl' TAC, AACi ATG GTAPhe--Gln--Lvs--L.4s-L,ds A 1 a A rA A s, F, V a I L -i s G I u 6 I .-s L,..$s 1; I -.f G'L i,j T I His-Am-, 4 s --T t-i r G 1 .-.j G I .:i l'ti r Ph e 9I, L.:ls-Met

I _30 140 150Pr c) L., vs, --is-Thr G 1 v G 1,,; PT, o I 1 e T,:; T- A r .A-) T- I

80 90

540 560 580 600 0AGA 6AI' GA'T' AAA ACC A'TC TAC rTC AGC CCI' ATA AGA Al'l' ACC TTT I"TA AAA GAA GAG G116 AAA ACA ATC, TAC AAA AC(.. AC`C ATG GGG

J--Gl,1_1V,1 f) T, I 1-i T,Ar-A,.-Asp.-Asp-L.vs-Thr--Ile--'T'vr Phe-Ser-Pro.-Ile-Ar.-i I'fi r P tie L e U 'LI - M e t160 170 18(

Leu-Tvr-Asp- L.,,-- D. I I e -1 Y, A T- G 1 r, A 1;.., A i:, i., Atio 1`0

640 660 680 700AGT GAT GGC TTC: AOT UG-A CIA AA-T CAC ATA ATG ATT_6SG C CAG ATG AAC GAT GTC, TGT TTC` C'AA AGA TCA AAG GCA CTA AAA A6A

Glt.4--Phe-Ser Blv-Leu Asri Ile Glv Glr-,-Ars. se:+ 1. 9 s 1 a-l- euSer r Bln Met n- sp V a I fi Phe L.,3190 200

Asp A 1 z-i ---T ti r A 1 a Thr Met T rr- Asr, -Leu- Ala-Thr-T-ir- 'T h A T, .-_l A 1. a L, Val.130 140 150

.7-)o 740 760 780 0(C)GTT GGA CT-T GAC CC'T TCA 'I'TA ATC AGT ACC TTT GCA GGA AGC ACA CTC CCC' AGA AGA T.CA GGT 6,':A ACT G(3 F 6TT GCO A TC` A AA C; G,A GG TVal t- eu r Ser.-Leu-Ile r Thr-Phe-Ala :Ser-Thr-[+.c-i.j-F'ro--Ar-q--Arg-Spr---Glt-!-Ala I'h r If Val Tle is- (1 1 .-t

220 230 240Thr Met Ar-i-Met-Cvs, r Leu Me t Ci 1. ril G I v S e r Th r -L.e-_.r-+, Y. o A r 9 A rg-Se Y- A 1 a Ala 6 I m A la V a 1. va I

170 180

820 840 860 880GOA ACT TTA GTG GCA GCC ATT TTT T 67A 6CA ATG 6CA GGG CJA TTG AGA GAC ATC AAA GCC' AAG

r la eG

v r Ala-Met-Ala G 1 .:i L e,, L, -I A Alzi--Lvs19-Th L. eu A Ala Ile 9 Ptie P UfA T,,jj s Ile250 -_ 0

lw-Thr Met- Met u Leki-Val Met. e L.,j s G 1,3 I 1 e Asn r g As r-i Phe-TrP -[A rj!j G I 9 G I -+j A s ri G 1 w A T, -J I'h T, A r .-l190 200 210

900 920 940 960ACG GCC TAT GAA AAG A'T'T CI'T CTG AAT CTG AAA AAC AAA TGC 'TCT GCG CCC CAA CAA AAr, GCI C'i A GTT GA'T CAA GTG_ AI'C GGA AGT AGAThr Ala-Tvr-Olli Lvs- I 1. e-Leu-Leu-Asn u- Vs Asri C-js-Ser-Ala-Pro-61n Gln-Lvs-Ala Le'.1-Val- Asr,--Glri--Va 1. e G I tl PTI_

280 290 30(Ile Ala-Tvr-Oloi Arg-Met-Cvs-Asn-Ile .qs G1,j v s Phe-GIn-Thr-Ala-Ala Gln-L+-.i-_,--A1a Met Met-Al: r- rg

2 230 240

980 1000 101.10 1.040 1060AAT CCA GOG ATT GCA 6AC: ATA GAA GAC CTA ACC CT6 CT1 GC-T CGA AGT A'16 GTC G'TT GT'T AGG CCC TCT G1'(3 GCG AGC AAA GTG GTG CAsn r Ile Asp-Ile 61-i-Asp-Leu- Ttir- Le-.1 P.-J-1AIa-Ar.1 Ser Met-Val-Val--k)PI Tfl Prc). Sc-r-Val .Ala-ser Val-Val

310 3.1( 330Asp ro-. Asn a Glu-Phe Glu.-Asp-Lou-Thr F't-ic- 'A 61..j Ser..-Val A Ta. H i s Ser-Cjs

250 260 270

FIG. 2

synthetic dodecanucleotide primer used to re- obtained from the insert of clone Bl-28, whichverse transcribe the virus-specific RNA were represents the complete coding region for themissing from the cloned DNA, presumably lost NP polypeptide. The sequence from nucleotidesduring manipulation of the transcribed DNA. 1,737 to 1,839 was obtained by the primer exten-The sequence from nucleotides 5 to 1,756 was sion of a DNA fragment (nucleotides 1,664 to

on Septem

ber 28, 2018 by guesthttp://jvi.asm

.org/D

ownloaded from

(CCC Al'A AGC(i( TT TAT CT OA1 ('1C (iCto , ...C wiT!1;;(1iTPi it(iII

AFee -CVsIieI-TT C)1' I. Fr. 1

I1160 19TAT AAT ATG GCA ACA LI'T OTT I'CC AlA 'T"A OVA ATIi (CGA (AC GAT1 ((0 6,.A 10 Oi CIt1 '0C0 IIIIIIITvrt-A-rn-Met-Ala-ThrFePo- Va1 Se lie LeiA.I l Met-G.a-vO', c'c O-AI'L to3--eOiL f77

L-eu-Cilr.---Asrto--Ser--Glri-Val-TyeSe Leu-Ile Ar Fie -Asr.Ot e t Alz-Hi,37'0' III-I-s-I

1240 1260 1280 l10(GCC TAT GAO GAC CTA AGA GTT TTG TCT GCA TTA ACA GGC ACA CAA TTC AAG CCIl AoA (CA C(A TIA AA(C f i .VI TI CT .1aT-rGli Ap-Lu-Arl-Val-Leu-Ser Ala-Leu-Th- b-Tr eCit-Tj-he A. ci",I AI a hAleTorClo-Ase--Let ~~~~~~40041 l

Al Fhe ii1i AsF,Ie-leeAr4-Vall-leu.-See Phe-Ilie-Lvs Cl-Th ro-Vt-e (eI TiI liii-i(i 111--1340 350

1440 1360 18)I0GCA AAG GAG0 COAt GT6 GAA GGA ATG GOG GCA GCT CTG ATG TCC ATC 006 CTC CAG 111' 10G3 GCT' CCA AIL Oil iii iCiI 1-00'r ill tI -Ale-lovsf Glri-Valf 6}ioMe Glv-Ala-Ala-Leu.j-Met-Ser-Ilre-L.o-s-Letj-Glr.--FPhe Itr---la Pro-c-t AilT

c-r-Ai.~rtCl Asn-Metf ~jThr e Glui-Ser-Ser--Ther-Leu--CitG--L.. j-Atl-A-ei--Or- -Tvt LT OeI, Ile-A~-. TiArolSe. Gl 'I370 3480 9

1420 1440 1460 1480 CGTA GCT 660 GAC GCA CGC TCT GGC COO ATA ACT TGC AliC CCA C'TC TTT GCA GTA COO 0(0 CI'T OTT 0C1 FI1G00 AA01'I(10ii 'L"iVal-C1bo-Glo-Ase-lu-Gl-Ci-Seerfffl7-Gr.TIle-SerC-i%--Ser FeVal Ihe Alia (al7F1 lli-OlG ci II4604/LAsrt-G-ln-Glnrg-Al-Oa--Ser-Ala Clo--Clit-Ile-Ser Ile-ClriFeehThPTe Sc rlQ'1 rlittj As-1, ', )11 I

400 410 42

1520 1540 1160 1580lACA ATG CTG TCA ATG OAT OTT G00 COO CC!' GOT OCA CAT GTC 000 CGO OAT C(TA CTC 000 A10 ATG 0i'J CAC TCAttAll TO oc TCAreg-Met-Leu-Ser-Met ri~Il l-l-r s-l a Atr --Atr -SCMi lO)I ,-- ;t

490 500 1Al-l-h-h-l nTrGi-l-r Tr-e e-r-trG--l-l-r Me GMtIt-.ISee ---- - PI I:c

430 44045

1600 1620 1640 166011OAT G60 001 GCT TTC OTT 066 006 000 016 TTT COO ATA TCA GAC 000 AOL 000 0CC OAT CCC GCI I 00G A1 ITCA ,TI -,01-6((.ci-1 ~OiAsen-Glw-Asri-Ala helie Oi Lws-Lvs-Mlet eGlr-Ile- r- speLos-Asnt-Los-Thr-Asn el Vai-G-j01 i- Fri-1ti-- I 11, tH

5I20 530 ".

Gl-s-a-e lpGwAgGvVl GuLe r l-LsAaAaSr I1e_ iaI-f'T. C?Ai0.- 1- I460 470 4i-

1700 1720 1740 16 7

CCC OAT TTC TTC TTT GGG AGGOGAC ACA GCA GAC COT TOT GOT GAC CTC GOT TOT TAAAGCAACAAAATAGACACTATGACTGTGATT0TTIlTCA0T'ALGT1TPeo-Asn-Phe-Phe GF~ly-Arg The O-Gui Asp-Tye sp~Ase-Leu-O-se-Tve

II1550111 II 0~~~~~~~~60Gluj-Gly-See-Tye+rtj Phe-Oly s Ole-See 4AspJ Aset

490 498

1820 1839

GCAATGTGGGTGTTTACTCTTATGAATAATATAAAAACGCTG3TTGTTTCTAC- 3'

FIG. 2-ContinutedFIG. 2. Complete nucleotide sequence of the NP gene (plus-sense strand) and the deduced amino acid

sequence of the NP polypeptide of influenza B/Sing. The first 12 nucleotides represent the synthetic primer usedto reverse transcribe the virus-specific RNA. Nucleotides are numbered from the 5t end of the plus-strand. Alsoshown in alignment with B NP is the amino acid sequence of A/PR/8/34 NP (36). Boxes are drawn to showregions of direct amino acid homology. Dashes indicate where deletions were placed to bring the two sequencesinto maximum alignment. The precise location of substitution, addition, or deletion of nucleotides is not known,therefore, amino acid deletions were placed in regions only to satisfy the criteria of maximum amino acidhomology. Both amino acid sequences are numbered starting from their N-terminal end.

1,732) labeled at the 5t end. The B NP gene adenine residues at position 1,819 to 1,823 prob-segment contains 1,839 nucleotides and has an ably represents the polyadenylation site for theopen reading frame from the first initiation co- virus mRNAs as has been reported for influenzadon (AUG) at position 60 to 62 to a termination A viruses (25). The B NP gene has a calculatedcodon (UAA) at position 1,740 to 1,742. This base composition of 35.2% uridine, 22.3% ade-open reading frame of 1,680 nucleotides could nine, 22.9% cytosine and 19.6% guanine, reflect-code for a polypeptide of 560 amino acids, ing the fact that adenine is more commonly usedleaving 97 and 59 noncoding nucleotides at the 3t in the third codon positions than guanine. Theand 5t ends, respectively, of the plus-sense frequency of the cytosine-guanine dinucleotideDNA strand. There are no other open reading as well as the use of cytosine-guanine-containingframes in either strand which could code for codons is low, as is commonly seen in othermore than 125 amino acids. A tract of five virus and eukaryotic genes (29).

VOL. 47, 1983 NOTES 645

on Septem

ber 28, 2018 by guesthttp://jvi.asm

.org/D

ownloaded from

646 NOTES

TABLE 1. Amino acid composition of A and B NP

Aminoacids

AlaArgAsnAspCysGlnGluGlyHislieLeuLysMetPheProSerThrTrpTyrVal

Total

B/Sing (%)

46 (8.2)31 (5.5)27 (4.8)34 (6.1)5 (0.9)18 (3.2)29 (5.2)45 (8.0)5 (0.9)

38 (6.8)38 (6.8)50 (8.9)27 (4.8)22 (3.9)25 (4.5)38 (6.8)35 (6.3)1 (0.2)

13 (2.3)33 (5.9)

560 (100)

Frequency (%)

A/PR/8/34 (%)"

39 (7.8)49 (9.8)25 (5.0)23 (4.6)6 (1.2)

21 (4.2)36 (7.2)41 (8.2)6 (1.2)

26 (5.2)32 (6.4)21 (4.2)25 (5.0)18 (3.6)17 (3.4)39 (7.8)29 (5.8)6 (1.2)15 (3.0)24 (4.8)

498 (100)

Avgprotein (%)a

(8.6)(4.9)(4.3)(5.5)(2.9)(6.0)(3.9)(8.4)(2.0)(4.5)(7.4)(6.6)(1.7)(3.6)(5.2)(7.0)(6.1)(1.3)(3.4)(6.6)(100)

"Amino acid compositions of A/PR/8/34 NP and anaverage protein were obtained from Winter and Fields(36) and Dayhoff et al. (8), respectively.

The B/Sing NP gene is 274 nucleotides longerthan the A NP gene (1,839 versus 1,565; seereference 36). These extra nucleotides are pre-sent in both the 5' noncoding (97 b versus 45 b)and the 3' noncoding (59 b versus 26 b) regionsas well as in the coding region (1,680 b versus1,494 b) of the B NP gene. The sequence datashow that RNA segments of influenza B virus,excluding the three unstudied polymerase RNAsegments, are all larger than the correspondingRNA segments of influenza A virus. However,

TABLE 2. Amino acid homology between regionsof B NP and A/PR/8 NP"

Amino acid region % Homology toof B NP A/PR/8 NP

51-150............................ 31.5151-250............................ 45.8251-350............................ 46.0351-450............................ 38.0451-550............................ 25.8

a The sequence homology was determined for every100 amino acids beginning with amino acid residue 51(Fig. 2). Since amino acids corresponding to the N-terminal 50 amino acids of B/Sing NP are not presentin A/PR/8 NP, they are not included in the calculation.Also unused were amino acids corresponding to areasof addition or deletion created to obtain optimumalignment of the two amino acid sequences.

since these extra sequences are present in mostcases in both coding and noncoding regions,their significance remains unclear. These dataalso show that the segment 4 RNA of B/Singencodes the NP gene and, therefore, agree withthe gene assignment of three other influenza Bstrains, namely B/Lee/40, B/Mass/1/71, andB/Md/59 (24). Furthermore, our data also showthat the B/Sing NP gene (1,839 nucleotides) issmaller than the B/Lee HA gene (1,882 nucleo-tides) as was predicted by the migration patternof glyoxalated RNAs (22).The predicted B/Sing NP polypeptide is 560

amino acids in length (Fig. 2), with a calculatedMr of 61,593, which is slightly less than the66,000 estimated in gels (22). Of the 560 aminoacids of B NP, 86 residues are basic (Arg, 31;Lys, 50; His, 5) and 63 residues are acidic (Asp,34; Glu, 29; Table 1). B NP is a basic proteinwith a net charge of +20.5 at pH 7.0, assuming acharge of + 1 for each arginine and lysine, -1 foreach glutamic acid and aspartic acid, and +0.5for each histidine, and, therefore, possesses agreater net positive charge than A NP, with acharge of +14 (36). The amino acid composition(Table 1) shows that B NP is remarkably rich inlysine and methionine but poor in histidine,cysteine, and tryptophan when compared withthat of a statistically average protein (8). Whencompared with A/PR/8/34 NP (36), the B NPpolypeptide is 62 amino acids longer than A NP(Fig. 2). Seven regions of deletion or addition orboth of amino acids were required for optimumalignment of the two sequences, leaving influen-za B NP with an extra three residues at its C-terminal end and 50 residues at its N-terminalend. The amino acid composition (Table 1)shows that both A and B NP are rich in basicamino acids and in methionine but low in cyste-ine as expected for soluble globular proteins.Only one of the cysteine residues appears con-served between B NP (amino acid 389) and ANP (amino acid 333), supporting the finding ofSelimova et al. that disulfide bonds are notimportant in the formation of the secondarystructure of NP (27). This is in sharp contrast tothe conservation of cysteine residues betweeninfluenza A and B viruses in the HA and NAproteins (A. R. Davis, T. J. Bos, and D. P.Nayak, Proc. Natl. Acad. Sci., in press; 16, 28).The distribution and composition of basic aminoacids are different between A and B NP. Forexample, B NP is rich in lysine and low inarginine, whereas the reverse is true for A NP.Both A and B NPs are low in histidine, and B NPcontains only one tryptophan residue. It remainsto be seen whether this divergence in the compo-sition of basic amino acids between B and A NPhas any structural or functional significance orprovides the basis for any differences in biologi-

J. VIROL.

on Septem

ber 28, 2018 by guesthttp://jvi.asm

.org/D

ownloaded from

NOTES 647

cal specificity. As with A NP, B NP does notpossess large clusters of charged residues. Rath-er, these are evenly distributed along the lengthof the polypeptide, suggesting multiple contactpoints between each NP molecule and the viralRNA. Since B NP has a greater net positivecharge than A NP, it may bind more tightly tothe negatively charged vRNA. Relative nucleasesensitivity of B RNP versus A RNP may deter-mine whether the greater positive charge in BNP provides an increased protection of viralRNA segments by B NP.Comparison of the amino acid sequence ho-

mology between B and A NP shows that someparts of the sequence are remarkably conserved,whereas other areas are quite divergent. Theconservation of sequence is most striking in thecentral region of the protein (amino acid 150 to350) which contains nearly 50% direct homology(Table 2). This region also contains a direct 10amino acid residue homology (amino acid 230 to239; B NP) between A and B NPs. This homolo-gy is remarkable because such a large stretch ofhomology is not present even in the highlyconserved hydrophobic region at the N terminusof HA 2 of A and B HAs (16, 37). This regionthen may provide some critical structure for thefunction of both A and B NP, such as a nucle-ation point in RNP formation or a critical con-tact with RNA or with M or polymerase (P)proteins. The entire aligned region (amino acids51 to 557) contains 37% direct homology(184/492 residues). Since 43% of the differences(133/308 residues) are of conservative nature,i.e., basic, acidic, polar, or nonpolar amino acidreplaced by similar residue, the combined struc-tural homology between A and B NP is 64%.The most remarkable difference in the primarystructure of A and B NP is the presence of 50additional amino acids at the N terminus (Fig.2), making it unique in that respect among theinfluenza virus proteins, except for influenza BNS1, which is 51 amino acids longer than influ-enza A NS1 (1, 38). Whether these extra se-quences are involved in providing functionaltype specificity remains to be seen.The proteins coded by the five genes (HA,

NP, NA, M, and NS) of influenza B virus thathave been completely sequenced show that in-fluenza B proteins possess structural featuressimilar to those of influenza A viral proteins,attesting for their functional similarity. Amongthese viral proteins, NP shows the greatestamino acid homology between the types. How-ever, diversity in the sequence between influen-za A and B proteins is far greater than thatobserved among the corresponding proteins ofinfluenza A virus strains, suggesting that influ-enza A and B viruses have undergone similar butindependent evolutionary pathways and that

none of these five RNA segments have beenrecently acquired by reassortment between in-fluenza A and B viruses. Since stable recombi-nants between A and B viruses have not beenobserved even under laboratory conditions (14),it is likely that these viral proteins are function-ally incompatible with the genes and gene prod-ucts of another type. Furthermore, since enve-lope proteins can form pseudotypes withdivergent viruses (6), the functional type speci-ficity is likely to be provided by the internalcomponents such as polymerases and NPs, etc.Since active influenza proteins, including NPs,now can be expressed from cloned cDNAs ineukaryotic cells (17), the function of these pro-teins as well as chimeric proteins (constructedby joining two or more DNAs) can now be testedby intertype complementation and should pro-vide insight into their functional specificity andincompatibility with each other.

We thank N. Sivasubramanian for his helpful discussion andadvice.

This paper was supported by grant PCM 7823220 from theNational Science Foundation, by Public Health Service grantRO1 A116348 from the National Institute of Allergy andInfectious Diseases, and by a biological research support grantBRSG RR 05304 from the National Institutes of Health.

LITERATURE CITED

1. Baez, M., R. Taussig, J. J. Zazra, J. F. Young, P. L.Palese, A. Reisfeld, and A. M. Skalka. 1980. Completenucleotide sequence of the influenza A/PR/8/34 virus NSgene and comparison with the NS genes of the A/U-dorn/72 and A/FPV/Rostock/34 strains. Nucleic AcidsRes. 8:5845-5859.

2. Bishop, D. H. L., P. Roy, W. J. Bean, Jr., and R. W.Simpson. 1971. Transcription of the influenza ribonucleicacid genome by a virion polymerase. III. Completeness ofthe transcription process. J. Virol. 10:689-697.

3. Briedis, D. J., and R. A. Lamb. 1982. Influenza B virusgenome: sequences and structural organization of RNAsegment 8 and the mRNAs coding for the NS, and NS,proteins. J. Virol. 42:186-193.

4. Briedis, D. J., R. A. Lamb, and P. W. Choppin. 1982.Sequence of RNA segment 7 of the influenza B virusgenome: partial amino acid homology between the mem-brane proteins (M,) of influenza A and B viruses andconservation of a second open reading frame. Virology116:581-588.

5. Choppin, P. W., and R. W. Compans. 1975. The structureof influenza virus, p. 15-51. In E. D. Kilbourne (ed.), Theinfluenza viruses and influenza. Academic Press, Inc.,New York.

6. Compans, R. W., M. G. Roth, and F. V. Alonso. 1981.Directional transport of viral glycoproteins in polarizedepithelial cells, p. 213-231. In D. P. Nayak (ed.), Geneticvariation among influenza viruses. Academic Press, Inc.,New York.

7. Davis, A. R., A. L. Hiti, and D. P. Nayak. 1980. Construc-tion and characterization of a bacterial clone containingthe hemagglutinin gene of the WSN strain (HON1) ofinfluenza virus. Gene 10:205-218.

8. Dayhoff, M. O., L. T. Hunt, and S. Hurst-Calderone. 1978.Composition of proteins, p. 363-373. In M. 0. 0. Dayhoff(ed.), Atlas of protein sequence and structure, vol. 5,suppl. 3. National Biomedical Research Foundation,Washington, D.C.

VOL. 47, 1983

on Septem

ber 28, 2018 by guesthttp://jvi.asm

.org/D

ownloaded from

648 NOTES

9. Desselberger, U., V. R. Racaniello, J. J. Zazra, and P.Palese. 1980. The 3' and 5'-terminal sequences of influen-za A, B and C virus RNA segments are highly conservedand show partial inverted complementarity. Gene 8:315-328.

10. Fields, S., and G. Winter. 1982. Nucleotide sequences ofinfluenza virus segments 1 and 3 reveal mosaic structureof a small viral RNA segment. Cell 28:303-313.

11. Hiti, A. L., A. R. Davis, and D. P. Nayak. 1981. Completesequence analysis shows that the hemagglutinins of theHO and H2 subtypes of human influenza virus are closelyrelated. Virology 111:113-124.

12. Huddleston, J. A., and G. G. Brownlee. 1982. The se-quence of the nucleoproteins gene of human influenza Avirus, strain A/NT/60/68. Nucleic Acids Res. 10:1029-1038.

13. Kaptein, J. S., and D. P. Nayak. 1982. Complete nucleo-tide sequence of the polymerase 3 gene of human influen-za virus A/WSN/33. J. Virol. 42:55-63.

14. Kilbourne, E. D. 1975. The influenza viruses and influen-za-an introduction, p. 1-13. In E. D. Kilbourne (ed.),The influenza viruses and influenza. Academic Press,Inc., New York.

15. Krug, R. M., M. Ueda, and P. Palese. 1975. Temperature-sensitive mutants of influenza WSN virus defective invirus-specific RNA synthesis. J. Virol. 16:790-796.

16. Krystal, M., R. M. Elliott, E. W. Benz, Jr., J. F. Young,and P. Palese. 1982. Evolution of influenza A and Bviruses: conservation of structural features in the hemag-glutinin genes. Proc. Natl. Acad. Sci. U.S.A. 79:4800-4804.

17. Lin, B. C., and C. J. Lai. 1983. The influenza virusnucleoprotein synthesized from cloned DNA in a simianvirus 40 vector is detected in the nucleus. J. Virol. 45:434-438.

18. Maxam, A. M., and W. Gilbert. 1977. A new method forsequencing DNA. Proc. Natl. Acad. Sci. U.S.A. 74:560-564.

19. Maxam, A. M., and W. Gilbert. 1980. Sequencing end-labelled DNA with base-specific chemical cleavages.Methods Enzymol. 65:499-560.

20. Palese, P., R. M. Elliott, M. Baez, J. J. Zazra, and J. F.Young. 1981. Genome diversity among influenza A, B,and C viruses and genetic structure of RNA 7 and RNA 8of influenza A viruses, p. 127-140. In D. P. Nayak (ed.),Genetic variation among influenza viruses. AcademicPress, Inc., London.

21. Palese, P., and J. F. Young. 1982. Variation of influenzaA, B, and C viruses. Science 215:1468-1474.

22. Racaniello, V. R., and P. Palese. 1979. Influenza B virusgenome: assignment of viral polypeptides to RNA seg-ments. J. Virol. 29:361-373.

23. Rees, P. J., and N. J. Dimmock. 1981. Electrophoreticseparation of influenza virus ribonucleoproteins. J. Gen.Virol. 53:125-132.

24. Rees, P. J., and N. J. Dimmock. 1981. Electrophoreticanalysis of influenza virus ribonucleoproteins from puri-fied virus and infected cells, p. 341-344. In D. H. L.Bishop and R. W. Compans (ed.), The replication ofnegative strand viruses. Proceedings of the InternationalSymposium of Negative Strand Viruses. Elsevier/North-Holland, Amsterdam.

25. Robertson, J. S., M. Schubert, and R. A. Lazzarini. 1981.Polyadenylation sites for influenza virus mRNA. J. Virol.38:157-163.

26. Rochovansky, 0. M. 1976. RNA synthesis by ribonucleo-protein-polymerase complexes isolated from influenzavirus. Virology 73:327-338.

27. Selimova, L. M., V. M. Zaides, and V. M. Zhdanov. 1982.Disulfide bonding in influenza virus proteins as revealedby polyacrylamide gel electrophoresis. J. Virol. 44:450-457.

28. Shaw, M. W., R. A. Lamb, B. W. Erickson, D. J. Briedis,and P. W. Choppin. 1982. Complete nucleotide sequenceof the neuraminidase gene of influenza B virus. Proc.NatI. Acad. Sci. U.S.A. 79:6817-6821.

29. Subak-Sharpe, J. H. 1967. Base doublet frequence pat-terns in the nucleic acid and evolution of viruses. Br.Med. Bull. 23:161-168.

30. Thierry, F., and 0. Danos. 1982. Use of specific singlestranded DNA probes cloned in M13 to study the RNAsynthesis of four temperature-sensitive mutants of HK/68influenza virus. Nucleic Acids Res. 10:2925-2938.

31. Thierry, F., S. B. Spring, and R. M. Chanock. 1980.Localization of the Ts defect in two ts mutants of influen-za A virus: evidence for the occurrence of intracistroniccomplementation between ts mutants of influenza A viruscoding for the neuraminidase and nucleoprotein polypep-tides. Virology 101:484-492.

32. Ueda, M., K. Tobita, A. Sugiura, and C. Enomoto. 1978.Identification of hemagglutinin and neuraminidase genesof influenza B virus. J. Virol. 25:685-686.

33. Van Rompuy, L., W. Min Jou, D. Huylebroeck, R. Devos,and W. Fiers. 1981. Complete nucleotide sequence of thenucleoprotein gene from the human influenza strainAIPR/8/34 (HON1). Eur. J. Biochem. 116:347-353.

34. Webster, R. G., W. G. Laver, G. M. Air, and G. C.Schild. 1982. Molecular mechanisms of variation in influ-enza viruses. Nature (London) 296:115-121.

35. Winter, G., and S. Fields. 1980. Cloning of influenzacDNA into M13: the sequence of the RNA segmentencoding the A/PR/8/34 matrix protein. Nucleic AcidsRes. 8:1965-1974.

36. Winter, G., and S. Fields. 1981. The structure of the geneencoding the nucleoprotein of human influenza virusA/PR18/34. Virology 114:423-428.

37. Winter, G., S. Fields, and G. G. Brownlee. 1981. Nucleo-tide sequence of the hemagglutinin gene of a humaninfluenza virus Hi subtype. Nature (London) 292:72-75.

38. Winter, G., S. Fields, M. J. Gait, and G. G. Brownlee.1981. The use of synthetic oligodeoxynucleotide primersin cloning and sequencing segment 8 of influenza virus(A/PR/8/34). Nucleic Acids Res. 9:239-244.

39. World Health Organization. 1971. Report: a revised sys-tem of nomenclature for influenza viruses. Bull. W.H.O.45:119-124.

40. van Wyke, K. L., W. J. Bean, Jr., and R. G. Webster.1981. Monoclonal antibodies to the influenza A virusnucleoprotein affecting RNA transcription. J. Virol.39:313-317.

41. Yuanji, G., J. Fengen, W. Ping, W. Min, and Z. Jiming.1983. Isolation of influenza C virus from pigs and experi-mental infection of pigs with influenza C virus. J. Gen.Virol. 64:177-182.

J. VIROL.

on Septem

ber 28, 2018 by guesthttp://jvi.asm

.org/D

ownloaded from