

Available online at www.sciencedirect.com

www.elsevier.com/locate/ympev

Molecular Phylogenetics and Evolution 47 (2008) 1100–1110

Origin of the 1918 Spanish influenza virus: A comparative genomic analysis

Geoff Vana, Kristi M. Westover *

Department of Biology, Winthrop University, Rock Hill, SC 29733, USA

Received 13 September 2007; revised 5 February 2008; accepted 6 February 2008
Available online 14 February 2008

Abstract

To test the avian-origin hypothesis of the 1918 Spanish influenza virus we surveyed influenza sequences from a broad taxonomic distribution and collected 65 full-length genomes representing avian, human and "classic" swine H1N1 lineages in addition to numerous other swine (H1N2, H3N1, and H3N2), human (H2N2, H3N2, and H5N1), and avian (H1N1, H4N6, H5N1, H6N1, H6N6, H6N8, H7N3, H8N4, H9N2, and H13N2) subtypes. Amino acids from all eight segments were concatenated, aligned, and used for phylogenetic analyses. In addition, the genes of the polymerase complex (PB1, PB2, and PA) were analyzed individually. All of our results showed the Brevig-Mission/1918 strain in a position basal to the rest of the clade containing human H1N1s and were consistent with a reassortment hypothesis for the origin of the 1918 virus. Our genome phylogeny further indicates a sister relationship with the "classic" swine H1N1 lineage. The individual PB1, PB2, and PA phylogenies were consistent with reassortment/recombination hypotheses for these genes. These results demonstrate the importance of using a complete-genome approach for addressing the avian-origin hypothesis and predicting the emergence of new pandemic influenza strains.
© 2008 Elsevier Inc. All rights reserved.

Keywords: Avian-origin hypothesis; 1918 Spanish influenza; Genomic analysis; H5N1; Reassortment

1. Introduction

The next influenza outbreak and subsequent pandemic in humans will likely arise from a relatively new strain of avian influenza, H5N1, first seen in humans in 1997 (de Jong et al., 1997; Claas et al., 1998; Shortridge et al., 1998). What similarities are there between the deadly 1918 Spanish flu strain and those H5N1 strains emerging in human populations today? Are we right in assuming the consequences of an H5N1 pandemic will be similarly devastating? If hypotheses regarding the avian origin of the 1918 strain are true, the answer is most certainly yes. Even in the absence of support for the avian-origin hypothesis, given the high mortality rates in humans infected with H5N1 (see Tam, 2002) the world is likely to be caught unprepared.

1055-7903/$ - see front matter © 2008 Elsevier Inc. All rights reserved.

doi:10.1016/j.ympev.2008.02.003

* Corresponding author. Fax: +1 803 323 3448.
E-mail address: [email protected] (K.M. Westover).

Influenza A viruses, single-stranded, negative-sense RNA viruses of the family Orthomyxoviridae, co-circulate in humans in yearly epidemics, and antigenically novel strains emerge sporadically as pandemic viruses (Cox and Subbarao, 2000). The eight-segmented (segments 1–8) genome is housed in an enveloped virion (Noda et al., 2006). Segments 1, 3, 4, 5, and 6 each encode a single protein, the polymerase basic 2 (PB2), polymerase acidic (PA), hemagglutinin (HA), nucleoprotein (NP), and neuraminidase (NA), respectively. Combinations of the 15 HA and 9 NA subtypes define viral strain and function in the human immune response. The major function of NA is to remove sialic acid from the progeny HA and NA virus particles, intercellular glycoproteins, and host-cell receptors, facilitating release from infected cells (Rogers and Paulson, 1983; Matrosovich et al., 1999; Plotkin and Dushoff, 2003). Segments 2, 7, and 8 each encode two proteins. PB1 and PB1-F2 are coded for by segment 2; matrix (M1) and M2 by segment 7; and nonstructural 1 (NS1) and NS2 by segment 8. The NS1 protein has been shown to counteract the effects of interferon and NS2 is associated with the M1 protein in the virus particles (Zhou et al., 2006). The shortest sequence, PB1-F2, plays a role in cell death induced by the virus (Taubenberger et al., 2005). The 11 proteins encoded by the genome include the two surface glycoproteins, M1 and M2, which are important targets for infection-induced antibodies. The M2 protein is an ion channel functioning in the first and last stages of infection. The matrix protein (M1) constitutes the protein layer beneath the lipid envelope. The nucleoprotein (NP) coats the RNA particles and the polymerase proteins (PB1, PB2, and PA) are used during replication of the virus (Voyles, 2002).

Influenza A has caused three pandemics in the last century: the initial threat of "Spanish Flu" in 1918 (H1N1), the re-emerging "Asian Flu" in 1957 (H2N2), and the recombinant "Hong Kong Flu" in 1968 (H3N2) (Holmes et al., 2005). Influenza A viral strains easily circulate in animals such as pigs, waterfowl and humans (Bender et al., 1999). Influenza A, the result of co-circulating H1N1 and H3N2 strains, is considered one of the ten leading causes of death in the United States (Heron and Smith, 2007), resulting in an average of 610,660 life-years lost and direct medical costs of $10.4 billion in 2003 alone (Molinari et al., 2007). Due to seasonal antigenic shift creating new subtypes and various point mutations, new pandemic strains or highly virulent strains are a constant concern. The 1918 pandemic flu strain is thought to be the most virulent strain in history, infecting approximately one-third of the world's population at that time (Taubenberger and Morens, 2006).

The origin of the 1918 influenza virus and its relationship with the highly pathogenic H5N1 strain, originally isolated from geese in Guangdong Province, China in 1996 (Chen et al., 2006), is a persistently debated topic (see Morens and Fauci, 2007). Recently, Taubenberger et al. (2005) proposed that the 1918 strain was not a reassortment virus, but was avian in origin based on a phylogenetic analysis of the polymerase protein sequences. Because migratory birds serve as a reservoir for recombinant influenza strains including the highly pathogenic avian influenza virus (HPAI), H5N1 is predicted to be the next pandemic threat. Recently, the interpretation of Taubenberger's analysis has been called into question (Gibbs and Gibbs, 2006; Antonovics et al., 2006). Gibbs and Gibbs (2006) argue that Taubenberger's results instead support a mammalian origin with reassortment. Antonovics et al. (2006) further assert that instead of an avian origin Taubenberger's nucleotide phylogenies support a human and "classic" swine influenza clade containing the 1918 strain. Their own phylogenetic analysis of PB1 amino acid residues supports this conclusion (Antonovics et al., 2006).

We tested the avian-origin hypothesis using a phylogenetic analysis of full-length influenza A genomes including representatives from avian, human and "classic" swine H1N1 lineages in addition to numerous other swine (H1N2, H3N1, and H3N2), human (H2N2, H3N2, and H5N1), and avian (H1N1, H4N6, H5N1, H6N1, H6N6, H6N8, H7N3, H8N4, H9N2, and H13N2) subtypes. Amino acids from all eight segments were concatenated, aligned, and used for phylogenetic analyses. This approach greatly increased the number of informative sites (1766 of 3836 variable; 1345 parsimony-informative) for the analysis. If the 1918 influenza strain occupies a basal position with respect to other human H1N1 lineages and is sister to avian strains there may still be several scenarios to consider, including avian-strain infection of a mammal several years before the 1918 pandemic or direct transmission of an avian strain to human populations before the appearance of the 1918 strain. If the 1918 virus evolved in mammals prior to the pandemic, our phylogenetic analysis will be consistent with Gibbs and Gibbs (2006) and Antonovics et al. (2006) with a clade containing the 1918 strain and other human and mammal sequences. In addition, because PB2, PB1, and PA have been shown to evolve more slowly than other proteins in human H3 lineages (Webster et al., 1992), to play an important role in host-specificity (Subbarao et al., 1993; Naffakh et al., 2000), and to differ from avian consensus sequences at only a small number of amino acid positions (Taubenberger et al., 2005), we conducted a phylogenetic analysis of these segments individually to further test the avian-origin hypothesis.
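The site counts quoted above (3836 aligned positions, 1766 variable, 1345 parsimony-informative) come from the authors' own alignment and software. As a minimal sketch of how such counts can be obtained, assuming each segment alignment is held as a dictionary mapping isolate name to an aligned amino acid string, the Python below concatenates segments, drops every column that contains a gap, and tallies variable and parsimony-informative columns. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter


def concatenate(segments):
    """Join per-segment alignments (dict: isolate -> aligned string) end to end."""
    names = list(segments[0])
    return {name: "".join(seg[name] for seg in segments) for name in names}


def drop_gapped_columns(aln):
    """Complete deletion: keep only columns with no gap in any sequence."""
    seqs = list(aln.values())
    keep = [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]
    return {name: "".join(seq[i] for i in keep) for name, seq in aln.items()}


def site_counts(aln):
    """Return (total, variable, parsimony_informative) column counts."""
    seqs = list(aln.values())
    variable = informative = 0
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            variable += 1
            # Informative: at least two residues, each present in >= 2 sequences.
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return len(seqs[0]), variable, informative


# Toy data (hypothetical, two short "segments" for four isolates).
seg1 = {"a": "MKT-", "b": "MKSA", "c": "MRSA", "d": "MRTA"}
seg2 = {"a": "LLV", "b": "LIV", "c": "LLV", "d": "LLV"}

genome = drop_gapped_columns(concatenate([seg1, seg2]))
print(site_counts(genome))
```

Concatenation only adds columns, so the number of informative sites for the genome is the sum over segments, which is the motivation the text gives for the whole-genome approach.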

2. Methods

We conducted a phylogenetic analysis of 65 complete influenza A genomes representing a broad taxonomic distribution (Table 1). The sequences were identified using the Influenza Sequence Database (ISD; Macken et al., 2001) and included representatives of avian, human, and "classic" swine H1N1 lineages. In addition, we collected numerous other swine (H1N2, H3N1, and H3N2), human (H2N2, H3N2, and H5N1), and avian (H1N1, H4N6, H5N1, H6N1, H6N6, H6N8, H7N3, H8N4, H9N2, and H13N2) subtypes. Only those with full-length, unambiguous nucleotide sequences for the neuraminidase (NA), hemagglutinin (HA), polymerase basic 1 (PB1), polymerase basic 2 (PB2), nucleoprotein (NP), matrix (M), polymerase acidic (PA), and nonstructural (NS) segments were included in the analysis. We were unable to include two classic swine H1N1 sequences (swine/Iowa/30 and swine/31) included in Taubenberger's analysis (2005) because of incomplete information for all eight segments, but instead included the Tennessee/10/1977 and Ontario/1/1981 representatives, likely descendants of the 1930 strains (see review in Olsen, 2002). The complete data set is available upon request.

The nucleotide sequences were collected and translated. Amino acid sequences were aligned by the CLUSTALW program (Thompson et al., 1994), and in the phylogenetic analysis we excluded all sites at which gaps were postulated in any sequence. Phylogenies were based on concatenated amino acid sequence alignments and were conducted using the following methods: (1) the maximum parsimony (MP) method (Felsenstein, 1985a), implemented in the PAUP* program (Swofford, 2002); the quartet maximum likelihood


Table 1. Sequences used in the phylogenetic study of influenza A virus genomes

Subtype and isolate Protein(s) and accession numbers (reference)

H1N1 (human)

Brevig Mission/1/1918 Neuraminidase (NA): Reid et al. (2000), Hemagglutinin (HA): Taubenberger et al. (1997) (from A/South Carolina/1/18), Polymerase PB1 (PB1): DQ208310 (Taubenberger et al., 2005), Polymerase PB2 (PB2): DQ208309 (Taubenberger et al., 2005), Nucleoprotein (NP): AY44935 (Reid et al., 2004), Matrix (M): AY130766 (Reid et al., 2002), Polymerase PA (PA): DQ208311 (Taubenberger et al., 2005), and Nonstructural (NS): AF333238 (Basler et al., 2001)

Cam/46 NA: CY009598, HA: CY009596, PB1: CY009602, PB2: CY009603, NP: CY009599, M: CY009597, PA: CY009601, and NS: CY009600 (Spiro et al., unpublished)

Canterbury/42/2001 NA: CY010782, HA: CY010780, PB1: CY010786, PB2: CY010787, NP: CY010783, M: CY010781, PA: CY010785, and NS: CY010784 (Ghedin et al., unpublished)

Charlottesville/31/95 NA: AF398869, HA: AF398878, PB1: AF398871, PB2: AF398866, NP: AF398867, M: AF398876, PA: AF398863, and NS: AF398877 (Nedyalkova et al., 2002)

Fort Worth/50 NA: CY009334, HA: CY009332, PB1: CY009338, PB2: CY009339, NP: CY009335, M: CY009333, PA: CY009337, and NS: CY009336 (Spiro et al., unpublished)

Malaysia/54 NA: CY009342, HA: CY009340, PB1: CY009346, PB2: CY009347, NP: CY009343, M: CY009341, PA: CY009345, and NS: CY009344 (Spiro et al., unpublished)

Melbourne/35 NA: CY009326, HA: CY009324, PB1: CY009330, PB2: CY009331, NP: CY009327, M: CY009325, PA: CY009329, and NS: CY009328 (Spiro et al., unpublished)

New Caledonia/20/1999 NA: DQ508859, HA: DQ508857, PB1: DQ508855, PB2: DQ508854, NP: DQ508858, M: DQ508860, PA: DQ508856, and NS: DQ508861 (Mbawuike et al., unpublished)

Puerto Rico/8/34 NA: NC002018 (Blok and Air, 1982), HA: NC002017 and PB1: J02151 (Winter et al., 1981), PB2: V00603 (Fields and Winter, 1982), NP: J02147 (Van Rompuy et al., 1981), M: V01099 (Winter and Fields, 1980), PA: V01106 (Fields and Winter, 1982), NS: NC002020 (Hall and Air, 1981; Winter and Fields, 1982)

New York/235/2001 NA: CY010854, HA: CY010852, PB1: CY010858, PB2: CY010859, NP: CY010855, M: CY010853, PA: CY010857, and NS: CY010856 (Ghedin et al., unpublished)

New York/230/2003 NA: CY002626, HA: CY002624, PB1: CY002630, PB2: CY002631, NP: CY002627, M: CY002625, PA: CY002629, and NS: CY002628 (Ghedin et al., unpublished)

Taiwan/01/1986 NA: DQ508875, HA: DQ508873, PB1: DQ508871, PB2: DQ508870, NP: DQ508874, M: DQ508876, PA: DQ508872, and NS: DQ508877 (Mbawuike et al., unpublished)

Texas/36/1991 NA: DQ508891, HA: DQ508889, PB1: DQ508887, PB2: DQ508886, NP: DQ508890, M: DQ508892, PA: DQ508888, and NS: DQ508893 (Mbawuike et al., unpublished)

USSR/90/1977 NA: DQ508899, HA: DQ508897, PB1: DQ508895, PB2: DQ508894, NP: DQ508898, M: DQ508900, PA: DQ508896, and NS: DQ508901 (Mbawuike et al., unpublished)

Weiss/43 NA: CY009454, HA: CY009452, PB1: CY009458, PB2: CY009459, NP: CY009455, M: CY009453, PA: CY009457, and NS: CY009456 (Spiro et al., unpublished)

Wilson-Smith/1933 NA: DQ508907, HA: DQ508905, PB1: DQ508903, PB2: DQ508902, NP: DQ508906, M: DQ508908, PA: DQ508904, and NS: DQ508909 (Mbawuike et al., unpublished)

H1N1 (Swine)

Ontario/1/1981 NA: CY022980, HA: CY022978, PB1: CY022984, PB2: CY022985, NP: CY022981, M: CY022979, PA: CY022983, and NS: CY022982 (Ghedin et al., unpublished)

Tennessee/10/1977 NA: CY022271, HA: CY022269, PB1: CY022275, PB2: CY022276, NP: CY022272, M: CY022270, PA: CY022274, and NS: CY022273 (Ghedin et al., unpublished)

H1N1 (Avian)

Pintail Duck/Alberta/238/1979 NA: CY004484, HA: CY004482, PB1: CY04488, PB2: CY004489, NP: CY004485, M: CY004483, PA: CY004487, and NS: CY004486 (Obenauer et al., 2006)

H1N2 (Swine)

Korea/CY02/02 NA: AY129157, HA: AY129156, PB1: AY129162, PB2: AY129163, NP: AY129159, M: AY129158, PA: AY129161, and NS: AY129160 (Choi et al., unpublished)

Minnesota/55551/00 NA: AF455694, HA: AF455678, PB1: AF455726, PB2: AF455734, NP: AF455702, M: AF455686, PA: AF455718, and NS: AF455710 (Karasin et al., 2002)

North Carolina/93523/01 NA: AF455693, HA: AF455677, PB1: AF455725, PB2: AF455733, NP: AF455701, M: AF455685, PA: AF455717, and NS: AF455709 (Karasin et al., unpublished)

Ohio/891/01 NA: AF455691, HA: AF455675, PB1: AF455723, PB2: AF455731, NP: AF455699, M: AF455683, PA: AF455715, and NS: AF455707 (Karasin et al., 2002)

H2N2 (human)

Japan/305/57 NA: DQ508843, HA: DQ508841, PB1: DQ508839, PB2: DQ508838, NP: DQ508842, M: DQ508844, PA: DQ508840, and NS: DQ508845 (Mbawuike et al., unpublished)

Pittsburgh/2/65 NA: AY209926, HA: AY209973, PB1: AY210018, PB2: AY209944, NP: AY210087, M: AY210049, PA: AY209999, and NS: AY210175 (Lindstrom et al., 2004)

Taiwan/1964 NA: DQ508883, HA: DQ508881, PB1: DQ508879, PB2: DQ508878, NP: DQ508882, M: DQ508884, PA: DQ508880, and NS: DQ508885 (Mbawuike et al., unpublished)


H3N1 (Swine)

Minnesota/00395/2004 NA: DQ145538, HA: DQ145537, PB1: DQ145544, PB2: DQ145540, NP: DQ145541, M: DQ145542, PA: DQ145539, and NS: DQ145543 (Ma et al., 2006)

H3N2 (human)

Ashburton/280/2004 NA: CY002956, HA: CY002954, PB1: CY002960, PB2: CY002961, NP: CY002957, M: CY002955, PA: CY002959, and NS: CY002958 (Ghedin et al., unpublished)

Bangkok/01/1979 NA: DQ508827, HA: DQ508825, PB1: DQ508823, PB2: DQ508822, NP: DQ508826, M: DQ508828, PA: DQ508824, and NS: DQ508829 (Mbawuike et al., unpublished)

Bay of Plenty/332/2004 NA: CY007301, HA: CY007299, PB1: CY007305, PB2: CY007306, NP: CY007306, M: CY007300, PA: CY007304, and NS: CY007303 (Ghedin et al., unpublished)

Beijing/353/1989 NA: DQ508835, HA: DQ508833, PB1: DQ508831, PB2: DQ508830, NP: DQ508834, M: DQ508836, PA: DQ508832, and NS: DQ508837 (Mbawuike et al., unpublished)

Canterbury/01/2005 NA: CY007797, HA: CY007795, PB1: CY0073801, PB2: CY007802, NP: CY007798, M: CY007796, PA: CY007900, and NS: CY007799 (Ghedin et al., unpublished)

Christchurch/90/2004 NA: CY002940, HA: CY002938, PB1: CY002944, PB2: CY002945, NP: CY002941, M: CY002939, PA: CY002943, and NS: CY002942 (Ghedin et al., unpublished)

England/72 NA: CY009358, HA: CY009356, PB1: CY009362, PB2: CY009363, NP: CY009359, M: CY009357, PA: CY009361, and NS: CY009360 (Spiro et al., unpublished)

Hong Kong/1/68 NA: AF348184, HA: AF348176, PB1: AF348172, PB2: AF348170, NP: AF348180, M: AF348188, PA: AF348174, and NS: AF348198 (Brown et al., 2001)

Memphis/1/77 NA: CY006733, HA: CY006731, PB1: CY006737, PB2: CY006738, NP: CY006734, M: CY006732, PA: CY006736, and NS: CY006735 (Ghedin et al., unpublished)

New York/623/1995 NA: CY010814, HA: CY010812, PB1: CY010818, PB2: CY010819, NP: CY010815, M: CY010813, PA: CY010817, and NS: CY010816 (Ghedin et al., unpublished)

New York/392/2004 NA: NC007368, HA: NC007366, PB1: CY002070, PB2: NC007373, NP: CY002067, M: CY002065, PA: CY002069, and NS: NC007370 (Ghedin et al., unpublished)

Panama/2007/1999 NA: DQ508867, HA: DQ508865, PB1: DQ508863, PB2: DQ508862, NP: DQ508866, M: DQ508868, PA: DQ508864, and NS: DQ508869 (Mbawuike et al., unpublished)

Port Chalmers/73 NA: CY009350, HA: CY009348, PB1: CY009354, PB2: CY009355, NP: CY009351, M: CY009349, PA: CY009353, and NS: CY009352 (Spiro et al., unpublished)

Udorn/307/1972 NA: DQ508931, HA: DQ508929, PB1: DQ508927, and PB2: DQ508926 (Mbawuike et al., unpublished), NP: M14922 (Buckler-White and Murphy, 1986), M: DQ508932, PA: DQ508928, and NS: DQ508933 (Mbawuike et al., unpublished)

H3N2 (Swine)

Colorado/1/77 NA: CY009302, HA: CY009300, PB1: CY009306, PB2: CY009307, NP: CY009303, M: CY009301, PA: CY009305, and NS: CY009304 (Ghedin et al., unpublished)

H4N6 (Swine)

Ontario/1911-1/99 NA: AF285887, HA: AF285885, PB1: AF285891, PB2: AF285892, NP: AF285888, M: AF285886, PA: AF285890, and NS: AF285889 (Karasin et al., 2000a, 2000b)

H4N6 (Avian)

Ruddy Turnstone/NJ/47/1985 NA: CY004819, HA: CY005958, PB1: CY004823, PB2: CY004824, NP: CY004820, M: CY004818, PA: CY004822, and NS: CY004821 (Obenauer et al., 2006)

H5N1 (Avian)

Bar-headed Goose/Qinghai/61/05 NA: DQ095658, HA: DQ095618, PB1: DQ095738, PB2: DQ095758, NP: DQ95678, M: DQ095638, PA: DQ095718, and NS: DQ095698 (Chen et al., 2005)

Cygnus/Italy/742/2006 NA: CY017037, HA: CY017035, PB1: CY017041, PB2: CY017042, NP: CY017038, M: CY017036, PA: CY017040, and NS: CY017039 (Lee et al., unpublished)

Duck/China/E319-2/03 NA: AY518363, HA: AY518362, PB1: AY518366, PB2: AY518367, NP: AY518364, M: AY518361, PA: AY518365, and NS: AY518360 (Lee et al., unpublished)

Goose/Guangdong/1/96 NA: NC007361, HA: NC007362, PB1: NC007358, PB2: AF144300, NP: AF144303, M: AF144306, PA: AF144302, and NS: AF144307 (Xu et al., 1999)

H5N1 (human)

Hong Kong/483/97 NA: AF084273 (Hiromoto et al., 2000), HA: AF046097 (Suarez et al., 1998), PB1: AF084265, PB2: AF084262, NP: AF084277, M: AF084283, and PA: AF084269 (Hiromoto et al., 2000), NS: AF256180 (Shaw et al., 2002)

Thailand/2(SP-33)/2004 NA: AY577315, HA: AY555153, PB1: AY627897, PB2: AY627898, NP: AY627895, M: AY627893, PA: AY627896, and NS: AY627894 (Puthavathana et al., 2005)

Vietnam/CL01/2004 NA: DQ493068, HA: DQ497719, PB1: DQ493418, PB2: DQ492894, NP: DQ493156, M: DQ492980, PA: DQ493332, and NS: DQ493244 (Smith et al., 2006)


H6N1 (Avian)

Chicken/Taiwan.0705/99 NA: DQ376696, HA: DQ376624, PB1: DQ376803, PB2: DQ376876, NP: DQ376732, M: DQ376659, PA: DQ376803, and NS: DQ376768 (Lee et al., unpublished)

Chukka/Hong Kong/FY295/00 NA: AJ410563, PB1: AJ410508, PB2: AJ410501, NP: AJ410554, M: AJ410574, PA: AJ410517, and NS: AJ410583 (Chin et al., 2002)

Pheasant/Hong Kong/FY294/00 NA: AJ427310, HA: AJ427308, PB1: AJ427306, PB2: AJ427305, NP: AJ427309, M: AJ427311, PA: AJ427307, and NS: AJ427312 (Chin and Stockridge, unpublished)

Teal/Hong Kong/W312/97 NA: AF250481, HA: AF250479, PB1: AF250477, PB2: AF250476, NP: AF250480, M: AF250482, PA: AF250478, and NS: AF250483 (Hoffmann et al., 2000)

Quail/Hong Kong/1721-20/99 NA: AJ410558, HA: AJ410520, PB1: AJ410503, PB2: AJ410496, NP: AJ410549, M: AJ410569, PA: AJ410512, and NS: AJ410578 (Chin et al., 2002)

Quail/Hong Kong/SF550/00 NA: AJ410560, HA: AJ410522, PB1: AJ410505, PB2: AJ410498, NP: AJ410551, M: AJ410571, PA: AJ410514, and NS: AJ410580 (Chin et al., 2002)

H6N6 (Avian)

Turkey/Minnesota/957/1980 NA: CY014766, HA: CY014764, PB1: CY014770, PB2: CY014771, NP: CY014767, M: CY014765, PA: CY014769, and NS: CY014768 (Obenauer et al., 2006)

H6N8 (Avian)

Pintail/ALB/628/1979 NA: CY004116, HA: CY004114, PB1: CY004120, PB2: CY004121, NP: CY004117, M: CY004115, PA: CY004119, and NS: CY004118 (Obenauer et al., 2006)

H7N3 (Avian)

Ruddy Turnstone/NJ/65/1985 NA: CY004407, HA: CY005928, PB1: CY004411, PB2: CY004412, NP: CY004408, M: CY004406, PA: CY004410, and NS: CY004409 (Obenauer et al., 2006)

H7N7 (Avian)

Chicken/Germany/R28/03 NA: AJ620349, HA: AJ620350, PB1: AJ620348, PB2: AJ620347, NP: AJ620352, M: AJ619676, PA: AJ619677, and NS: AJ618678 (Ahnlan et al., unpublished)

H8N4 (Avian)

Pintail/ALB/114/1979 NA: CY004989, HA: CY005971, PB1: CY004993, PB2: CY004994, NP: CY004990, M: CY004988, PA: CY004809, and NS: CY004808 (Obenauer et al., 2006)

H9N2 (Avian)

Chicken/Shanghai/F/98 NA: AY253754, HA: AY743216, PB1: AY253751, PB2: AY253750, NP: AY253753, M: AY253755, PA: AY253752, and NS: AY253756 (Lu et al., unpublished)

Parakeet/Narita/92A/98 NA: AB049164, HA: AB049160, PB1: AB049156, PB2: AB049144, NP: AB049162, M: AB049166, PA: AB049158, and NS: AB049168 (Mase et al., 2001)

H13N6 (Avian)

Gull/Maryland/704/1977 NA: CY014696, HA: CY014694, PB1: CY014700, PB2: CY014701, NP: CY014697, M: CY014695, PA: CY014699, and NS: CY014698 (Obenauer et al., 2006)


method (QLM), implemented in the Puzzle 5.2 program (Strimmer and von Haeseler, 1996); the minimum evolution (ME) method (Rzhetsky and Nei, 1987), implemented in the MEGA 3.1 program (Kumar et al., 2004); and the neighbor-joining (NJ) method (Saitou and Nei, 1987) using uncorrected p and Poisson distances, implemented in the MEGA 3.1 program (Kumar et al., 2004). The ME trees were based on the uncorrected amino acid distance, and the hypothesis that each internal branch length was equal to zero was tested by the internal branch test; the standard error of the branch length was estimated by bootstrapping (2000 replicates) (Nei and Kumar, 2000). The standard error test has an advantage over traditional bootstrapping (Felsenstein, 1985b) in that it is not sensitive to the number of sequences included in the analysis (Sitnikova et al., 1995). The NJ trees were based on uncorrected amino acid distances (p), Poisson-corrected distances, and gamma-corrected distances. Gamma (γ) was estimated using the maximum likelihood method and the JTT model of substitution (Jones et al., 1992) implemented in Puzzle 5.2 (Strimmer and von Haeseler, 1996). Gamma ranged from 0.18 for the entire genome to 0.36 for PA.
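The three amino acid distances named above can be illustrated with their standard textbook forms: the uncorrected p-distance (proportion of differing sites), the Poisson correction d = -ln(1 - p), and the gamma correction d = a[(1 - p)^(-1/a) - 1], where a is the gamma shape parameter. The Python below is a sketch under those assumptions only; it is not the MEGA 3.1 implementation, and the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing residues, skipping positions gapped in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def poisson_distance(p):
    """Poisson-corrected distance: d = -ln(1 - p)."""
    return -math.log(1.0 - p)


def gamma_distance(p, shape):
    """Gamma-corrected distance: d = a * ((1 - p) ** (-1 / a) - 1)."""
    return shape * ((1.0 - p) ** (-1.0 / shape) - 1.0)


# Hypothetical aligned fragments, for illustration only.
p = p_distance("MKTLLV", "MRSLLV")
print(p, poisson_distance(p), gamma_distance(p, 0.36))
```

Both corrections exceed p for any p > 0, and the gamma correction grows as the shape parameter shrinks, so small shape values such as those reported here (0.18 to 0.36) stretch the larger distances the most.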

3. Results

3.1. Concatenated segments representing entire genome

All phylogenetic methods yielded similar results; therefore only NJ trees using uncorrected p are shown in the following. Our genome phylogeny was consistent with the hypothesis that the 1918 influenza virus represents a reassortment virus (Fig. 1). The Brevig-Mission/1918 strain occupied a basal position in the clade containing human H1N1 lineages including the Melbourne/1935, Wilson-Smith/1933, and Puerto Rico/8/1934 H1N1 strains and this relationship was supported by a significant bootstrap (bs = 99, Fig. 1). We found support for this topology using all tree-making methods (bs = 100 MP; bs = 62 ML; ibt = 99 ME; bs = 99 with Poisson-corrected NJ; bs = 99 with gamma-corrected NJ). The H1N1 human lineages were most closely related to the "classic" swine H1N1 strains (bs = 84) and clustered with the pintail/ALB/238/1979 H1N1 strain (bs = 97, Fig. 1). This group was sister to a clade containing avian H6N1 and H5N1 influenza viruses (bs = 83, Fig. 1). The swine/Minnesota/00395/2004 H3N1 sequence was basal to all of the H1N1, H6N1, and H5N1 representatives (bs = 97, Fig. 1).

Fig. 1. Phylogenetic tree of influenza A genomes based on concatenated sequences of individual protein-coding genes (NA, HA, PB1, PB2, NP, M, PA, and NS). Geographic location, isolate, and dates are shown. See Table 1 for complete list of accession numbers. The tree was constructed using the NJ method (number of bootstraps = 1000) based on uncorrected p-distance for 3836 aligned amino acid residue positions, 1766 of which were variable.

3.2. Segments PB1, PB2, and PA

The phylogenies generated using individual segments do not support an avian origin of the 1918 Spanish influenza. The Brevig-Mission/1918 strain was found to be basal to the human H1N1 lineage when the polymerase segments were considered individually (Fig. 2A–C). This was most significantly supported for the PA segment (bs = 88, Fig. 2C). The "classic" swine H1N1 representatives were basal to all sequences in these phylogenies (Fig. 2A–C).

The H5N1 lineage, including avian and human representatives, generally clustered together in all three individual segment phylogenies (PB1 bs = 95, PB2 bs = NS, PA bs = 98) and was sister to other avian representatives including H6N1 sequences, with the exception of the PB2 segment (Fig. 2A–C). In the PB2 segment, the H5N1 lineage was more closely related to swine H1N2 and H3N1 representatives (bs = 74, Fig. 2B). Although not significant, one avian H5N1 PB1 sequence, goose/Guangdong/1/96, was sister to the large clade containing swine H1, as well as human H2 and H3 representatives (Fig. 2A). One additional difference in the topology of the PB1 phylogeny compared to the others was that the human H2/H3 clade was separate from the human H1N1 lineage and was sister to the avian clade containing H5N1 and H6N1 representatives (Fig. 2A). In the PB2 and PA phylogenies, the human H1, H2, and H3 lineages were found together, significantly supported (bs = 88) in the PA tree (Fig. 2B and C).
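The bootstrap percentages (bs) reported in this section come from resampling alignment columns with replacement and rebuilding the tree for each replicate. The Python below is a deliberately simplified stand-in, not the procedure used by MEGA or PAUP*: instead of rebuilding a tree, it asks how often a query sequence's nearest neighbour by p-distance is a given isolate across replicates. All names and the toy alignment are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_columns(aln, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n = len(next(iter(aln.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    return {name: "".join(seq[i] for i in cols) for name, seq in aln.items()}


def nearest_neighbor_support(aln, query, expected, replicates=1000, seed=0):
    """Percentage of replicates in which `expected` is the closest sequence to `query`.

    Ties are resolved in favour of the sequence listed first in `aln`.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_columns(aln, rng)
        others = [name for name in rep if name != query]
        nearest = min(others, key=lambda name: p_distance(rep[query], rep[name]))
        hits += nearest == expected
    return 100.0 * hits / replicates


# Toy alignment (hypothetical): "q" is identical to "x" and differs from "y" everywhere.
toy = {"q": "AAAA", "x": "AAAA", "y": "CCCC"}
print(nearest_neighbor_support(toy, "q", "x", replicates=200))
```

A real analysis would rebuild the NJ, ME, or MP tree on every replicate and count how often each clade reappears; the column resampling step is the part this sketch shares with that procedure.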

4. Discussion

Pandemic viruses emerged three times in the twentieth century, beginning with the 1918 Spanish influenza (H1N1 subtype strain), then the 1957 Asian influenza (H2N2 subtype strain), and finally the 1968 Hong Kong influenza (H3N2 subtype strain) (Webster et al., 1992; Cox and Subbarao, 2000). The alarming implication that the 1918 virus emerged from a human-adapted avian influenza virus presents the scientific community with an obligation to rigorously examine the hypotheses concerning the origin of pandemic flu strains. Sampling from a broad taxonomic distribution and using concatenated amino acids from all eight segments for full-length influenza genomes greatly increased the number of sites for the phylogenetic analyses directed at testing the avian origin of the 1918 Spanish influenza. Furthermore, because very few amino acid positions of the polymerase complex consistently distinguish the 1918 and other human flu viruses from the avian consensus (Taubenberger et al., 2005), a phylogenetic study of these regions individually should provide insight into relationships between human H1N1 and avian H5N1 lineages. Differences in amino acid residues among flu viruses for these protein-coding regions likely reflect response to host-species selective pressure given their ability to interact with host factors (Fodor and Brownlee, 2002).

Fig. 2A. Phylogenetic tree of influenza PB1 using the NJ method based on uncorrected p-distance for 697 (176 variable) aligned amino acid residue positions. Numbers on the branches represent bootstrap support values. Geographic location, isolate, and dates are shown. See Table 1 for complete list of accession numbers.

Fig. 2B. Phylogenetic tree of influenza PB2 using the NJ method based on uncorrected p-distance for 758 (202 variable) aligned amino acid residue positions. Numbers on the branches represent bootstrap support values. Geographic location, isolate, and dates are shown. See Table 1 for complete list of accession numbers.

Fig. 2C. Phylogenetic tree of influenza PA using the NJ method based on uncorrected p-distance for 709 (210 variable) aligned amino acid residue positions. Numbers on the branches represent bootstrap support values. Geographic location, isolate, and dates are shown. See Table 1 for complete list of accession numbers.
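The captions of Fig. 2A–C refer to uncorrected p-distances, that is, the proportion of aligned amino acid positions at which two sequences differ. A minimal sketch of that calculation is given below. The strain names and residues are hypothetical, and skipping any position with a gap in either sequence (pairwise deletion) is an assumption made here for illustration; the gap treatment actually used for the figures is not restated in this section.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance: differing sites divided by compared sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: ignore positions where either sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def distance_matrix(alignment):
    """All pairwise p-distances, keyed by (name_1, name_2)."""
    names = sorted(alignment)
    return {(x, y): p_distance(alignment[x], alignment[y])
            for x in names for y in names}

# Hypothetical toy alignment (made-up residues, not sequences from Table 1).
toy = {
    "strain_1": "MKTLLVAG",
    "strain_2": "MKTILVAG",
    "strain_3": "MRTI-VSG",
}

dm = distance_matrix(toy)
for (x, y), d in sorted(dm.items()):
    if x < y:
        print(f"{x} vs {y}: p = {d:.3f}")
```

A matrix of this form is the input that a distance method such as neighbor-joining clusters into a tree.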

All of our results showed the Brevig-Mission/1918 strain in a position basal to the rest of the clade containing human H1N1s and were consistent with a reassortment hypothesis for the origin of the 1918 virus. Our genome phylogeny further indicated a sister relationship with the “classic” swine H1N1 lineage. The PA, PB1, and PB2 phylogenies of Taubenberger et al. (2005) themselves indicate well-supported relationships between the Brevig-Mission/1918 virus and classic swine H1N1s. Antonovics et al. (2006) found a similar pattern using PB1 and PB1-F2 amino acid sequences, suggesting that the 1918 strain was significantly related to other mammalian flu strains and not basal to those lineages. However, because influenza viruses are continually evolving by mechanisms of antigenic shift and drift (Webster et al., 1992), human-adapted H5N1 subtype strains still represent a dangerous public health risk, and the mechanisms by which pandemic influenzas emerge require further elucidation. The differences in evolutionary placement of strains between the individual PA, PB1, PB2, and complete genome phylogenies emphasize the need for increasing the power of analysis.

PB1 genes of human H1N1, H2 and H3 viruses comprise different sub-lineages following the introduction of avian PB1 genes in 1957 and 1968 (Kawaoka et al., 1989). Our PB1 phylogeny supported these results and indicated a separation of these lineages, with the human H2 and H3 group sister to a large clade containing avian H5N1s and H6N1s. Additionally, although the relationship was not significant, one avian H5N1 PB1 sequence was sister to the group of human H2 and H3 representatives, consistent with the acquisition of avian PB1 genes (Kawaoka et al., 1989).

Reassortant (swine/human) H1N2 viruses in swine were first isolated in 1978 (Sugimura et al., 1980), and molecular characterization indicated that the NA gene was of human H3N2 origin and that the polymerase complex was avian in origin (Brown et al., 1998). The individual gene trees for the PB1, PB2, and PA genes and the genome phylogeny were consistent with this hypothesis. However, a clade containing swine H1N2, avian H9N2, human H2N2, and human H3N2 lineages was only significantly supported in the genome treatment.

Establishing the genetic basis for interspecies transmission of influenza viruses is paramount for predicting the potential human danger associated with new pandemic strains. The claim that the 1918 influenza virus was derived from an avian source (Taubenberger et al., 2005) has likely distorted perceptions of the public health risk associated with emergent avian H5N1 strains in Southeast Asia (de Jong et al., 1997; Claas et al., 1998; Shortridge et al., 1998). Until now, the avian-origin hypothesis of the 1918 influenza virus had not been tested using a “whole” genome approach, which improves the statistical power by increasing the number of informative sites for phylogenetic analysis. All of our results support a reassortment hypothesis for the 1918 influenza virus, consistent with both Gibbs and Gibbs (2006) and Antonovics et al. (2006). While the individual topologies of the PB1, PB2, and PA phylogenies differ from each other and in some cases from the entire genome tree, the trees are consistent with reassortment/recombination hypotheses for these individual genes and, more importantly from our perspective, inconsistent with an avian origin of the 1918 influenza virus. Until the immediate ancestor(s) are identified, there may be no definitive way to test, for example, whether an unknown avian strain infected a mammal, where it may subsequently have evolved before transmission to humans in 1918. Even if that were true, we would have expected to see a different phylogenetic relationship between swine lineages and the 1918 strain in our current study.
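The gain in sites from a whole-genome treatment can be illustrated with a short sketch that concatenates per-segment amino acid alignments and counts variable columns. This is an illustration under stated assumptions rather than the procedure used for the genome tree: the segment alignments, the strain labels `s1` to `s4`, and the residues are hypothetical, only three segments are shown for brevity, and keeping only strains present in every segment is a choice made here.

```python
def concatenate(per_segment, order):
    """Join per-segment alignments into one sequence per strain.

    Assumption: only strains present in every listed segment are kept.
    """
    shared = set.intersection(*(set(per_segment[seg]) for seg in order))
    return {strain: "".join(per_segment[seg][strain] for seg in order)
            for strain in sorted(shared)}

def variable_sites(alignment):
    """Count alignment columns that show more than one residue."""
    return sum(len(set(column)) > 1 for column in zip(*alignment.values()))

# Hypothetical toy alignments (made-up residues, not sequences from Table 1).
per_segment = {
    "PB2": {"s1": "AGSD", "s2": "AGSD", "s3": "AGAD"},
    "PB1": {"s1": "MKTL", "s2": "MKTI", "s3": "MRTL"},
    "PA":  {"s1": "LLVE", "s2": "LIVE", "s3": "LLVE", "s4": "LLVQ"},
}

genome = concatenate(per_segment, ["PB2", "PB1", "PA"])
n_sites = len(next(iter(genome.values())))
print(f"{len(genome)} strains, {n_sites} sites, "
      f"{variable_sites(genome)} variable")
```

Each segment on its own contributes only one or two variable columns in this toy case, whereas the concatenated alignment pools them, which is the sense in which concatenation increases the number of informative sites.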

In spite of a non-avian origin of the 1918 influenza virus, continued molecular characterization of influenza genomes is necessary as more sequences become available, due to the possibility that modern society may still face devastating losses if avian H5N1 strains become better adapted for human–human transmission.

Acknowledgments

This publication was made possible in part by NIH Grant No. P20 RR-016461 from the National Center for Research Resources. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH. This research was also supported by a Winthrop University Research Council Grant to K.M.L. (2005–2006). We would also like to acknowledge the Winthrop University Department of Biology for support of this study as well as Kristen Ledbetter and Jessica Cooke for laboratory assistance.

References

Antonovics, J., Hood, M.E., Baker, C.H., 2006. Molecular virology: was the 1918 flu avian in origin? Nature 440, E9; discussion E9–10.

Basler, C.F., Reid, A.H., Dybing, J.K., Janczewski, T.A., Fanning, T.G., Zheng, H., Salvatore, M., Perdue, M.L., Swayne, D.E., Garcia-Sastre, A., Palese, P., Taubenberger, J.K., 2001. Sequence of the 1918 pandemic influenza virus nonstructural gene (NS) segment and characterization of recombinant viruses bearing the 1918 NS genes. Proc. Natl. Acad. Sci. USA 98, 2746–2751.

Bender, C., Hall, H., Huang, J., Klimov, A., Cox, N., Hay, A., Gregory, V., Cameron, K., Lim, W., Subbarao, K., 1999. Characterization of the surface proteins of influenza A (H5N1) viruses isolated from humans in 1997–1998. Virology 254, 115–123.

Blok, J., Air, G.M., 1982. Block deletions in the neuraminidase genes from some influenza A viruses of the N1 subtype. Virology 118, 229–234.

Brown, E.G., Liu, H., Kit, L.C., Baird, S., Nesrallah, M., 2001. Pattern of mutation in the genome of influenza A virus on adaptation to increased virulence in the mouse lung: identification of functional themes. Proc. Natl. Acad. Sci. USA 98, 6883–6888.

Brown, I.H., Harris, P.A., McCauley, J.W., Alexander, D.J., 1998. Multiple genetic reassortment of avian and human influenza A viruses in European pigs, resulting in the emergence of an H1N2 virus of novel genotype. J. Gen. Virol. 79, 2947–2955.

Buckler-White, A.J., Murphy, B.R., 1986. Nucleotide sequence analysis of the nucleoprotein gene of an avian and a human influenza virus strain identifies two classes of nucleoproteins. Virology 155, 345–355.

Chen, H., Smith, G.J., Zhang, S.Y., Qin, K., Wang, J., Li, K.S., Webster, R.G., Peiris, J.S., Guan, Y., 2005. Avian flu: H5N1 virus outbreak in migratory waterfowl. Nature 436, 191–192.

Chen, H., Smith, G.J., Li, K.S., Wang, J., Fan, X.H., Rayner, J.M., Vijaykrishna, D., Zhang, J.X., Zhang, L.J., Guo, C.T., Cheung, C.L., Xu, K.M., Duan, L., Huang, K., Qin, K., Leung, Y.H., Wu, W.L., Lu, H.R., Chen, Y., Xia, N.S., Naipospos, T.S., Yuen, K.Y., Hassan, S.S., Bahri, Nguyen, T.D., Webster, R.G., Peiris, J.S., Guan, Y., 2006. Establishment of multiple sublineages of H5N1 influenza virus in Asia: implications for pandemic control. Proc. Natl. Acad. Sci. USA 103, 2845–2850.

Chin, P.S., Hoffmann, E., Webby, R., Webster, R.G., Guan, Y., Peiris, M., Shortridge, K.F., 2002. Molecular evolution of H6 influenza viruses from poultry in Southeastern China: prevalence of H6N1 influenza viruses possessing seven A/Hong Kong/156/97 (H5N1)-like genes in poultry. J. Virol. 76, 507–516.

Claas, E.C., Osterhaus, A.D., Vanbeek, R., de Jong, J.C., Rimmelzwaan, G.F., 1998. Human influenza A H5N1 virus related to a highly pathogenic avian influenza virus. Lancet 351, 472–477.

Cox, N.J., Subbarao, K., 2000. Global epidemiology of influenza: past and present. Annu. Rev. Med. 51, 407–421.

de Jong, J.C., Claas, E.C.J., Osterhaus, A.D.M.E., Webster, R.G., Lim, W.L., 1997. A pandemic warning? Nature 389, 554.

Fields, S., Winter, G., 1982. Nucleotide sequences of influenza virus segments 1 and 3 reveal mosaic structure of a small viral RNA segment. Cell 28, 303–313.

Felsenstein, J., 1985a. Phylogenies and the comparative method. Am. Nat. 125, 1–15.

Felsenstein, J., 1985b. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.

Fodor, E., Brownlee, G.G., 2002. In: Potter, C.W. (Ed.), Influenza. Elsevier, Amsterdam, pp. 1–29.

Gibbs, M.J., Gibbs, A.J., 2006. Molecular virology: was the 1918 pandemic caused by a bird flu? Nature 440, E8; discussion E9–10.

Hall, R.M., Air, G.M., 1981. Variation in nucleotide sequences coding for the N-terminal regions of the matrix and nonstructural proteins of influenza A viruses. J. Virol. 38, 1–7.

Heron, M.P., Smith, B.L., 2007. Deaths: leading causes for 2003. Natl. Vital Stat. Rep. 55 (10), 1–93.

Hiromoto, Y., Yamazaki, Y., Fukushima, T., Saito, T., Lindstrom, S.E., Omoe, K., Nerome, R., Lim, W., Sugita, S., Nerome, K., 2000. Evolutionary characterization of the six internal genes of H5N1 human influenza A virus. J. Gen. Virol. 81, 1293–1303.

Hoffmann, E., Stech, J., Leneva, I., Krauss, S., Scholtissek, C., Chin, P.S., Peiris, M., Shortridge, K.F., Webster, R.G., 2000. Characterization of the influenza A virus gene pool in avian species in southern China: was H6N1 a derivative or a precursor of H5N1? J. Virol. 74, 6309–6315.

Holmes, E.C., Ghedin, E., Miller, N., Taylor, J., Bao, Y., St. George, K., Grenfell, B.T., Salzberg, S.L., Fraser, C.M., Lipman, D.J., Taubenberger, J.K., 2005. Whole-genome analysis of human influenza A virus reveals multiple persistent lineages and reassortment among recent H3N2 viruses. PLoS Biol. 3 (9), e300.

Jones, D.T., Taylor, W.R., Thornton, J.M., 1992. The rapid generation of mutation matrix data from protein sequences. Comput. Appl. Biosci. 8, 275–282.

Karasin, A.I., Brown, I.H., Carman, S., Olsen, C.W., 2000a. Isolation and characterization of H4N6 avian influenza viruses from pigs with pneumonia in Canada. J. Virol. 74, 9322–9327.

Karasin, A.I., Landgraf, J., Swenson, S., Erickson, G., Goyal, S., Woodruff, M., Scherba, G., Anderson, G., Olsen, C.W., 2002. Genetic characterization of H1N2 influenza A viruses isolated from pigs throughout the United States. J. Clin. Microbiol. 40, 1073–1079.

Karasin, A.I., Olsen, C.W., Anderson, G.A., 2000b. Genetic characterization of an H1N2 influenza virus isolated from a pig in Indiana. J. Clin. Microbiol. 38, 2453–2456.

Kawaoka, Y., Krauss, S., Webster, R.G., 1989. Avian to human transmission of the PB1 gene of Influenza A viruses. J. Virol. 63, 4603–4608.

Kumar, S., Tamura, K., Nei, M., 2004. MEGA3: Integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5, 150–163.

Lindstrom, S.E., Cox, N.J., Klimov, A., 2004. Genetic analysis of human H2N2 and early H3N2 influenza viruses, 1957–1972: evidence for genetic divergence and multiple reassortment events. Virology 328, 101–109.

Ma, W., Gramer, M., Rossow, K., Yoon, K.J., 2006. Isolation and genetic characterization of new reassortment H3N1 swine influenza virus from pigs in the Midwestern United States. J. Virol. 80, 5092–5096.

Macken, C., Lu, H., Goodman, J., Boykin, L., 2001. The value of a database in surveillance and vaccine selection. In: Osterhaus, A.D.M.E., Cox, N., Hampson, A.W. (Eds.), Options for the Control of Influenza IV. Elsevier Science, Amsterdam, pp. 103–106.

Mase, M., Imada, T., Sanada, Y., Etoh, M., Sanada, N., Tsukamoto, K., Kawaoka, Y., Yamaguchi, S., 2001. Imported parakeets harbor H9N2 influenza A viruses that are genetically closely related to those transmitted to humans in Hong Kong. J. Virol. 75, 3490–3494.

Matrosovich, M., Zhou, N., Kawaoka, Y., Webster, R., 1999. The surface glycoproteins of H5 influenza viruses isolated from humans, chickens, and wild birds have distinguishable properties. J. Virol. 73, 1146–1155.

Molinari, N.A.M., Ortega-Sanchez, I.R., Messonnier, M.L., Thompson, W.W., Wortley, P.M., Weintraub, E., Bridges, C.B., 2007. The annual impact of seasonal influenza in the US: measuring disease burden and costs. Vaccine 25, 5086–5096.

Morens, D.M., Fauci, A.S., 2007. The 1918 influenza pandemic: insights for the 21st century. J. Infect. Dis. 195, 1018–1028.

Naffakh, N., Massin, P., Escriou, N., Crescenzo-Chaigne, B., van der Werf, S., 2000. Genetic analysis of the compatibility between polymerase proteins from human and avian strains of influenza A viruses. J. Gen. Virol. 81, 1283–1291.

Nedyalkova, M.S., Hayden, F.G., Webster, R.G., Gubareva, L.V., 2002. Accumulation of defective neuraminidase (NA) genes by influenza A viruses in the presence of NA inhibitors as a marker of reduced dependence on NA. J. Infect. Dis. 185, 591–598.

Nei, M., Kumar, S., 2000. Molecular Evolution and Phylogenetics. Oxford University Press, New York.

Noda, T., Sagara, H., Yen, A., Takada, A., Kida, H., Cheng, R.H., Kawaoka, Y., 2006. Architecture of ribonucleoprotein complexes in influenza A virus particles. Nature 439, 490–492.

Obenauer, J.C., Denson, J., Mehta, P.K., Su, X., Mukatira, S., Finkelstein, D.B., Xu, X., Wang, J., Ma, J., Fan, Y., Rakestraw, K.M., Webster, R.G., Hoffmann, E., Krauss, S., Zheng, Z., Zhang, Z., Naeve, C.W., 2006. Large-scale sequence analysis of avian influenza isolates. Science 311, 1576–1580.

Olsen, C.W., 2002. The emergence of novel swine influenza viruses in North America. Virus Res. 85, 199–210.

Plotkin, J.B., Dushoff, J., 2003. Codon bias and frequency-dependent selection on the hemagglutinin epitopes of influenza A virus. Proc. Natl. Acad. Sci. USA 100, 7152–7157.

Puthavathana, P., Auewarakul, P., Charoenying, P.C., Sangsiriwut, K., Pooruk, P., Boonnak, K., Khanyok, R., Thawachsupa, P., Kijphati, R., Sawanpanyalert, P., 2005. Molecular characterization of the complete genome of human influenza H5N1 isolates from Thailand. J. Gen. Virol. 86, 423–433.

Reid, A.H., Fanning, T.G., Janczewski, T.A., Taubenberger, J.K., 2000. Characterization of the 1918 ‘Spanish’ influenza virus neuraminidase gene. Proc. Natl. Acad. Sci. USA 97, 6785–6790.

Reid, A.H., Fanning, T.G., Janczewski, T.A., McCall, S., Taubenberger, J.K., 2002. Characterization of the 1918 ‘Spanish’ influenza virus matrix gene segment. J. Virol. 76, 10717–10723.

Reid, A.H., Fanning, T.G., Janczewski, T.A., Lourens, R.M., Taubenberger, J.K., 2004. Novel origin of the 1918 pandemic influenza virus nucleoprotein gene. J. Virol. 78, 12462–12470.

Rogers, G.N., Paulson, J.C., 1983. Receptor determinants of human and animal influenza virus isolates: differences in receptor specificity of the H3 hemagglutinin based on species of origin. Virology 127, 361–373.

Rzhetsky, A., Nei, M., 1987. A simple method for estimating and testing minimum-evolution trees. J. Mol. Biol. 35, 367–375.

Saitou, N., Nei, M., 1987. The neighbor-joining method: a new method for constructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.


Shaw, M., Cooper, L., Xu, X., Thompson, W., Krauss, S., Guan, Y., Zhou, N., Klimov, A., Cox, N., Webster, R., Lim, W., Shortridge, K., Subbarao, K., 2002. Molecular changes associated with the transmission of avian influenza A H5N1 and H9N2 viruses to humans. J. Med. Virol. 66, 107–114.

Shortridge, K.F., Zhou, N.N., Guan, Y., Gao, P., Ito, T., Kawaoka, Y., Kodihalli, S., Krauss, S., Markwell, D., Gopal Murti, K., Norwood, M., Senne, D., Sims, L., Takada, A., Webster, R.G., 1998. Characterization of avian H5N1 influenza virus from poultry in Hong Kong. Virology 252, 331–342.

Sitnikova, T., Rzhetsky, A., Nei, M., 1995. Interior-branch and bootstrap tests of phylogenetic trees. Mol. Biol. Evol. 12, 319–333.

Smith, G.J., Naipospos, T.S., Nguyen, T.D., de Jong, M.D., Vijaykrishna, D., Usman, T.B., Hassan, S.S., Nguyen, T.V., Dao, T.V., Bui, N.A., Leung, Y.H., Cheung, C.L., Rayner, J.M., Zhang, J.X., Zhang, L.J., Poon, L.L., Li, K.S., Nguyen, V.C., Hien, T.T., Farrar, J., Webster, R.G., Chen, H., Peiris, J.S., Guan, Y., 2006. Evolution and adaptation of H5N1 influenza virus in avian and human hosts in Indonesia and Vietnam. Virology 350, 258–268.

Strimmer, K., von Haeseler, A., 1996. Quartet puzzling: a quartet maximum-likelihood method for reconstructing tree topologies. Mol. Biol. Evol. 13, 964–969.

Suarez, D.L., Perdue, M.L., Cox, N., Rowe, T., Bender, C., Huang, J., Swayne, D.E., 1998. Comparisons of highly virulent H5N1 influenza A viruses isolated from humans and chickens from Hong Kong. J. Virol. 72, 6678–6688.

Subbarao, E.K., London, W., Murphy, B.R., 1993. A single amino acid in the PB2 gene of influenza A virus is a determinant of host range. J. Virol. 67, 1761–1764.

Sugimura, T., Yonemochi, H., Ogawa, T., Tanaka, Y., Kumagai, T., 1980. Isolation of a recombinant influenza virus (Hsw 1 N2) from swine in Japan. Arch. Virol. 66, 271–274.

Swofford, D.L., 2002. PAUP*. Phylogenetic Analysis Using Parsimony (*and other methods). Sinauer Associates, Sunderland, MA.

Tam, J.S., 2002. Influenza A (H5N1) in Hong Kong: an overview. Vaccine 20 (Suppl. 2), S77–S81.

Taubenberger, J.K., Reid, A.H., Krafft, A.E., Bijwaard, K.E., Fanning, T.G., 1997. Initial genetic characterization of the 1918 ‘Spanish’ influenza virus. Science 275, 1793–1796.

Taubenberger, J.K., Morens, D.M., 2006. 1918 Influenza: the mother of all pandemics. Emerg. Infect. Dis. 12, 15–22.

Taubenberger, J.K., Reid, A.H., Lourens, R.M., Wang, R., Jin, G., Fanning, T.G., 2005. Characterization of the 1918 influenza virus polymerase genes. Nature 437, 889–893.

Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.

Van Rompuy, L., Min Jou, W., Huylebroeck, D., Devos, R., Fiers, W., 1981. Complete nucleotide sequence of the nucleoprotein gene from the human influenza strain A/PR/8/34 (HON1). Eur. J. Biochem. 116, 347–353.

Voyles, B.J., 2002. The Biology of Viruses, second ed. McGraw-Hill, New York, USA, pp. 147–149, 338–341.

Webster, R.G., Bean, W.J., Gorman, O.T., Chambers, T.M., Kawaoka, Y., 1992. Evolution and ecology of influenza A viruses. Microbiol. Rev. 56, 152–179.

Winter, G., Fields, S., 1980. Cloning of influenza cDNA into M13: the sequence of the RNA segment encoding the A/PR/8/34 matrix protein. Nucleic Acids Res. 8, 1965–1974.

Winter, G., Fields, S., Brownlee, G.G., 1981. Nucleotide sequence of the haemagglutinin gene of a human influenza virus H1 subtype. Nature 292, 72–75.

Winter, G., Fields, S., 1982. Nucleotide sequence of human influenza A/PR/8/34 segment 2. Nucleic Acids Res. 10, 2135–2143.

Xu, X., Subbarao, K., Cox, N.J., Guo, Y., 1999. Genetic characterization of the pathogenic influenza A/Goose/Guangdong/1/96 (H5N1) virus: similarity of its hemagglutinin gene to those of H5N1 viruses from the 1997 outbreaks in Hong Kong. Virology 261, 15–19.

Zhou, H., Jin, M., Chen, H., Huang, Q., Yu, Z., 2006. Genome-sequence analysis of the pathogenic H5N1 avian influenza A virus isolated in China in 2004. Virus Genes 32, 85–95.