
The Pennsylvania State University

The Graduate School

Department of Food Science

VIRULENCE GENE AND CRISPR MULTILOCUS SEQUENCE TYPING

SCHEME FOR SUBTYPING THE MAJOR SEROVARS

OF SALMONELLA ENTERICA SUBSPECIES ENTERICA

A Thesis in

Food Science

by

Fenyun Liu

© 2010 Fenyun Liu

Submitted in Partial Fulfillment

of the Requirements

for the Degree of

Master of Science

December 2010

The thesis of Fenyun Liu was reviewed and approved* by the following:

Stephen J. Knabel

Professor of Food Science

Thesis Co-Advisor

Edward G. Dudley

Assistant Professor of Food Science

Thesis Co-Advisor

Bhushan M. Jayarao

Professor of Veterinary Science

Rodolphe Barrangou

Adjunct Professor of Food Science

John D. Floros

Professor of Food Science

Head of the Department of Food Science

*Signatures are on file in the Graduate School

ABSTRACT

Salmonella enterica subsp. enterica is the leading cause of bacterial foodborne disease in the

United States. Molecular subtyping methods are powerful tools for tracking the farm-to-fork

spread of foodborne pathogens during outbreaks. In order to develop a novel multilocus

sequence typing (MLST) scheme for subtyping the most prevalent serovars of Salmonella, the

virulence genes fimH and sseL and Clustered Regularly Interspaced Short Palindromic Repeat

(CRISPR) regions were sequenced from 171 clinical isolates from serovars Typhimurium,

Enteritidis, Newport, Heidelberg, Javiana, I 4, [5], 12: i: -, Montevideo, Muenchen and Saintpaul.

Another 63 environmental isolates and 70 poultry isolates of S. Enteritidis from poultry industries

in PA were also analyzed. The MLST scheme using only virulence genes was insufficient to

separate all unrelated outbreak clones. However, the addition of CRISPR sequences dramatically

improved the discriminatory power of this MLST method. Moreover, the present MLST scheme

provided better discrimination of S. Enteritidis strains than PFGE. Cluster analyses revealed that the

current MLST scheme is highly congruent with serotyping and epidemiological data. For the

analyses with S. Enteritidis isolates, the current MLST scheme identified three persistent and

predominant sequence types circulating among humans in the U.S. and poultry and hen house

environments in PA. It also identified an environment-specific sequence type. Moreover, cluster

analysis based on fimH and sseL identified three epidemic clones and one outbreak clone of S.

Enteritidis. In conclusion, the novel MLST scheme described in the present study accurately

differentiated outbreak clones of the major serovars of Salmonella, and therefore may be an

excellent tool for subtyping this important foodborne pathogen during outbreak investigations.

Furthermore, the MLST scheme may provide information about the ecological origin of S.

Enteritidis isolates, potentially identifying strains that differ in virulence capacity.

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

LIST OF ABBREVIATIONS AND DEFINITIONS

ACKNOWLEDGEMENTS

Chapter 1 Statement of the problem

Chapter 2 Literature review

2.1 Salmonellosis
2.1.1 Salmonella
2.1.2 Salmonella taxonomy and serotyping
2.1.3 Evolution of pathogenicity
2.1.4 Salmonella reservoirs
2.1.5 Salmonella association with foods
2.1.6 Most common Salmonella serovars associated with human illnesses

2.2 Subtyping of Salmonella
2.2.1 Important definitions and performance criteria of subtyping methods
2.2.2 Salmonella subtyping methods during epidemiologic investigations
2.2.2.1 Phenotypic methods
2.2.2.1.1 Serotyping
2.2.2.1.2 Phage typing
2.2.2.1.3 Multilocus enzyme electrophoresis (MLEE)
2.2.2.2 Genotypic methods
2.2.2.2.1 DNA-fragment-pattern-based methods
2.2.2.2.1.1 Pulsed-Field Gel Electrophoresis (PFGE)
2.2.2.2.1.2 Amplified Fragment Length Polymorphism (AFLP)
2.2.2.2.1.3 Multiple Loci Variable number tandem repeat Analysis (MLVA)
2.2.2.2.2 DNA-sequence-based methods
2.2.2.2.2.1 Multilocus Sequence Typing (MLST)
2.2.2.2.2.2 Multi-Virulence-Locus Sequence Typing (MVLST)
2.2.2.2.2.3 Single Nucleotide Polymorphism (SNP) analysis

2.3 Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)
2.3.1 CRISPR in Salmonella

2.4 Conclusions
2.5 References

Chapter 3 Novel virulence gene and CRISPR multilocus sequence typing scheme for subtyping the major serovars of Salmonella enterica subspecies enterica

3.1 Abstract
3.2 Introduction
3.3 Materials and methods
3.4 Results
3.5 Discussion
3.6 Acknowledgements
3.7 References

Chapter 4 Characterization of clinical, poultry and environmental Salmonella Enteritidis isolates using multilocus sequence typing based on virulence genes and CRISPRs

4.1 Abstract
4.2 Introduction
4.3 Materials and methods
4.4 Results
4.5 Discussion
4.6 Acknowledgements
4.7 References

Chapter 5 Conclusions and future research

5.1 Conclusions
5.2 Future research

APPENDIX Supplemental materials

LIST OF FIGURES

Figure 2.1 Model for the three-phase evolution of pathogenicity in Salmonella enterica subspecies enterica. The phylogenetic tree is not drawn to scale (7).

Figure 2.2 Schematic view of the two CRISPR systems in Salmonella Typhimurium LT2.

Figure 3.1. Schematic view of the two CRISPR systems in Salmonella Typhimurium LT2.

Figure 3.2. (a) Cluster diagram based on only fimH and sseL. (b) Cluster diagram based on fimH, sseL and CRISPRs (combined allele of CRISPR1 and CRISPR2).

Figure 4.1. Potential routes of transmission of S. Enteritidis contamination throughout the egg food system.

Figure 4.2. Schematic view of the two CRISPR systems in Salmonella Enteritidis strain P125109.

Figure 4.3. Frequency of the five predominant sequence types (E ST1, 3, 4, 8 and 10) in clinical, poultry and environmental isolates.

Figure 4.4. Cluster diagram based on only fimH and sseL for all 27 sequence types.

Figure 4.5. Cluster diagram based on virulence genes and CRISPRs for all 27 sequence types.

Figure 4.6. Graphic representation of spacer arrangements in CRISPR1 and CRISPR2 of the 27 S. Enteritidis sequence types.

Figure S1. Graphic representation of spacer arrangements in CRISPR1 and CRISPR2.

LIST OF TABLES

Table 2.1 Top ten most frequently reported serovars from human sources in 2005

Table 2.2 Top ten most frequently reported serovars from human sources in 2006

Table 3.1. Top nine most frequently reported serovars from human sources in 2005 which were analyzed in the present study

Table 3.2. Outbreak information, PFGE profile and MLST results for the 171 isolates analyzed in the present study

Table 3.3. Size, function and nucleotide location of the four markers targeted in the present study

Table 3.4. Primers used to amplify and sequence the four MLST markers

Table 3.5. Number of isolates, allelic types and sequence types in each serovar

Table 3.6. Allelic polymorphisms and nucleotide substitutions in the nucleotide sequences of fimH and sseL

Table 3.7. Analysis of CRISPR repeat sequences

Table 3.8. Analysis of CRISPR spacers in different serovars

Table 3.9. Comparison of epidemiologic concordance between PFGE and MLST based on virulence genes and CRISPRs for the selected strains analyzed in the present study

Table 4.1. Sources, sample types and isolation information for the 167 S. Enteritidis isolates analyzed in the present study

Table 4.2. Primers used to amplify and sequence the four MLST markers

Table S1. Primers used to amplify and sequence other virulence genes

Table S2. Source, isolate information and MLST results for the 167 isolates analyzed in the present study

LIST OF ABBREVIATIONS AND DEFINITIONS

ADL Animal Diagnostic Lab

AFLP Amplified Fragment Length Polymorphism

bp Base Pair

C Cytosine

CDC Centers for Disease Control and Prevention

°C Degree Celsius

Clone† A group of isolates deriving from a common ancestor as part of a direct

chain of replication and transmission from host to host or from the

environment to host.

CRISPR Clustered Regularly Interspaced Short Palindromic Repeats

D Discriminatory Power

DNA Deoxyribonucleic Acid

dNTP Deoxyribonucleotide Triphosphate

DR Direct Repeat

E Epidemiological Concordance

EC Epidemic Clone

ml milliliter

G Guanine

MLEE Multi-Locus Enzyme Electrophoresis

MLST Multilocus Sequence Typing

MLVA Multiple-Locus Variable-number tandem repeat Analysis

MVLST Multi-Virulence-Locus Sequence Typing

NCBI National Center for Biotechnology Information

PCR Polymerase Chain Reaction

PFGE Pulsed-Field Gel Electrophoresis

PEQAP Pennsylvania Egg Quality Assurance Program

RNA Ribonucleic Acid

rRNA Ribosomal Ribonucleic Acid

SNP Single Nucleotide Polymorphism

Strain† Isolate(s) that exhibit distinct phenotypic and/or genotypic characteristics

from other isolates of the same species

ST Sequence Type

T Thymine

USDA United States Department of Agriculture

μl microliter

WGS Whole Genome Shotgun

† Clone and strain were defined previously by Struelens et al. (101).

ACKNOWLEDGEMENTS

I thank my parents, Zijian Liu and Guixiang Liu, who supported and encouraged me to study

in the US. I am also grateful for the support of my sister, Fenni Liu.

I would like to give my sincere thanks to my advisors, Dr. Stephen J. Knabel and Dr.

Edward G. Dudley. I learned from them not only how to do research but also how to lead my life.

I feel so grateful for the working experience with them. I also thank my committee members, Dr.

Rodolphe Barrangou, and Dr. Bhushan M. Jayarao for their guidance and encouragement.

Additionally, I thank Dr. Kariyawasam, Dr. Gerner-Smidt and Dr. Ribot for their help with the

research.

I thank my labmates, Jia Wen, Mei Lok, Gabari, Michelle, Carrie, and Mat for their help

and encouragement. I also want to give special thanks to Dr. Bindhu Verghese for her guidance

and help with my research. Furthermore, I want to thank all the faculty, graduate students and

staff in the Department of Food Science for their support.

Finally, I thank USDA and the Department of Food Science for supporting my research.

Chapter 1

Statement of the problem

Salmonella is one of the most common foodborne bacteria worldwide. In the United

States alone, there have been approximately 1.4 million cases of salmonellosis each year since 1996,

resulting in a heavy burden on public health and the economy. In order to develop effective

intervention strategies to control salmonellosis during outbreaks, it is critical to rapidly and

accurately track the farm-to-fork spread of Salmonella. Molecular subtyping methods are

powerful tools for investigating the transmission of Salmonella by characterizing specific

outbreak clones. Serotyping has been one of the major subtyping methods employed during

outbreaks to provide baseline information about the serovar involved. There are approximately

2,500 different serovars of Salmonella; however, the top ten serovars caused approximately 60%

of all outbreak cases. Each of those top serovars is known to cause numerous outbreaks, each of

which is typically caused by a specific outbreak clone. Therefore, molecular subtyping methods,

which are generally more discriminatory than serotyping, are needed to further distinguish

different strains of a particular serovar. Pulsed-field gel electrophoresis (PFGE) is currently

CDC’s “gold standard” approach for subtyping Salmonella. However, PFGE sometimes lacks

discriminatory power and epidemiologic concordance for typing clonal serovars, such as S.

Enteritidis and S. Montevideo. Many studies have been conducted to develop alternative

subtyping methods, one of which is multilocus sequence typing (MLST). Previous MLST

schemes for Salmonella focused mainly on discriminatory power; however, none of the previous

MLST studies examined the epidemiologic concordance of the MLST schemes or attempted to

distinguish strains within highly clonal Salmonella serovars, such as S. Enteritidis and S.

Montevideo. Moreover, knowledge of the epidemiology of S. Enteritidis is hindered by

its clonal nature. Therefore, the main purpose of the present study was to enhance the

molecular epidemiology of Salmonella by developing an MLST scheme that has both high

discriminatory power and high epidemiologic concordance for subtyping the major serovars of

Salmonella.
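
The evaluation outlined above can be illustrated with a minimal Python sketch. It assigns allele numbers to distinct marker sequences, assigns a sequence type (ST) to each distinct combination of alleles, and computes Simpson's index of diversity in the form given by Hunter and Gaston, a widely used measure of discriminatory power (D). The isolate names, toy sequences and function names are hypothetical, and the use of this particular index is an assumption made for illustration rather than a description of the calculations performed in the present study.

```python
from collections import Counter


def assign_types(profiles):
    """Number each distinct profile (any hashable value) in order of first appearance."""
    numbering = {}
    return [numbering.setdefault(p, len(numbering) + 1) for p in profiles]


def discriminatory_power(types):
    """Simpson's index of diversity: D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    n = len(types)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(types).values()
    return 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Hypothetical isolates: (fimH sequence, sseL sequence, CRISPR spacer arrangement)
isolates = {
    "iso1": ("ACGT", "GGCA", ("sp1", "sp2", "sp3")),
    "iso2": ("ACGT", "GGCA", ("sp1", "sp3")),
    "iso3": ("ACGT", "GGCA", ("sp1", "sp2", "sp3")),
    "iso4": ("ACGA", "GGCA", ("sp1", "sp2")),
}

# Allele numbers for each marker, one entry per isolate
fimH = assign_types([v[0] for v in isolates.values()])
sseL = assign_types([v[1] for v in isolates.values()])
crispr = assign_types([v[2] for v in isolates.values()])

# Sequence types from virulence genes only, and from virulence genes plus CRISPRs
st_virulence = assign_types(list(zip(fimH, sseL)))
st_combined = assign_types(list(zip(fimH, sseL, crispr)))

d_virulence = discriminatory_power(st_virulence)
d_combined = discriminatory_power(st_combined)
print(f"D (fimH + sseL): {d_virulence:.3f}")
print(f"D (fimH + sseL + CRISPR): {d_combined:.3f}")
```

In this toy data set the CRISPR marker splits isolates that share identical virulence gene alleles, so D is higher for the combined scheme. Epidemiologic concordance is not captured by this index and would have to be assessed separately against outbreak information.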

Chapter 2

Literature review

2.1 Salmonellosis

Salmonella infections (salmonellosis) include three forms of disease: gastroenteritis,

bacteremia and typhoid fever. After ingestion of Salmonella into the gastrointestinal system,

gastroenteritis can develop, which is characterized by symptoms such as abdominal pain, nausea,

vomiting and diarrhea. More severe manifestations of salmonellosis, such as bacteremia and

typhoid fever can develop after the invasion of Salmonella into the bloodstream. Common

symptoms of bacteremia are fever, focal infections, sepsis and meningitis. Typhoid fever is a

deadly systemic infection for humans caused by S. Typhi.

The incidence of typhoid fever has declined in the U.S. to approximately 400 cases

annually (33). On the other hand, infections due to nontyphoidal Salmonella (mainly

gastroenteritis) have increased dramatically during the last 3 to 4 decades (29, 53). The increased

number of infections from nontyphoidal Salmonella may result from modern intensified farming

and food production methods and global trade. Increased spread of Salmonella may also be

promoted by the acquisition of genes for antibiotic resistance (102), and in the case of S.

Enteritidis, genes permitting colonization of chicken ovaries (49).

Globally, it is estimated that there are 93.8 million cases of gastroenteritis due to

Salmonella annually, out of which 80.3 million (86%) cases are foodborne (76). In the United

States, salmonellosis is the leading cause of foodborne bacterial disease, with approximately 1.4

million human cases each year, resulting in 17,000 hospitalizations, 585 deaths (28, 116) and a

cost of 2.6 billion dollars due to loss of work, medical care and loss of life (112). Therefore, it is

imperative to study the origins, transmission and epidemiology of this pathogen in order to

control and prevent diseases in the future.

2.1.1 Salmonella

Salmonella is one of the best-known and most frequent foodborne bacterial pathogens throughout

the world (76). Salmonella is a genus of rod-shaped, Gram-negative, non-spore-forming,

facultatively anaerobic, motile bacteria belonging to the family Enterobacteriaceae.

2.1.2 Salmonella taxonomy and serotyping

The genus Salmonella comprises two species: S. enterica and S. bongori. The

species S. bongori is rarely associated with human disease. The species S. enterica has six

subspecies: enterica, salamae, arizonae, diarizonae, houtenae and indica (63, 107). S. enterica

subspecies enterica is responsible for 99% of the human cases of salmonellosis, so it is of greatest

clinical importance (2).

Salmonella subspecies are further differentiated based on serotyping. Serotyping

distinguishes Salmonella immunologically based upon O antigens (lipopolysaccharide) and H

antigens (peritrichous flagella). There are more than 2,500 recognized S. enterica serovars, each

with a unique combination of O and H antigens (54). Prior to 2000, serovars were sometimes

used as species names (16). For example, the original S. typhimurium is now referred to as S.

enterica subspecies enterica serovar Typhimurium or simply S. Typhimurium. The latter

nomenclature is used more commonly in publications and public health surveillance programs

such as those administered by the Centers for Disease Control and Prevention (CDC).
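
An antigenic formula such as I 4, [5], 12: i: - can be read as three colon-separated fields: the O antigens, the phase 1 H antigens and the phase 2 H antigens, with a dash marking a phase for which no antigen is expressed. The following minimal Python sketch illustrates that reading. The function name and the layout of the returned dictionary are hypothetical, and the parser handles only the simple three-field form used in this thesis.

```python
def parse_antigenic_formula(formula):
    """Split an 'O : H phase 1 : H phase 2' antigenic formula into its factors."""
    # Exactly three colon-separated fields are expected; anything else raises ValueError.
    o_part, h1_part, h2_part = (field.strip() for field in formula.split(":"))

    # An optional leading token such as "I" designates the subspecies.
    subspecies = None
    head, _, rest = o_part.partition(" ")
    if head.isalpha():
        subspecies, o_part = head, rest

    def factors(field):
        # A dash means that no antigen is expressed for that phase.
        if field == "-":
            return []
        return [f.strip() for f in field.split(",") if f.strip()]

    return {
        "subspecies": subspecies,
        "O": factors(o_part),
        "H1": factors(h1_part),
        "H2": factors(h2_part),
    }


parsed = parse_antigenic_formula("I 4, [5], 12: i: -")
print(parsed)
```

Two isolates belong to the same serovar under this representation when their O, H1 and H2 entries match, which is why serotyping alone cannot separate different outbreak clones within one serovar.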

2.1.3 Evolution of pathogenicity

S. enterica subspecies enterica has been proposed to have evolved in three main steps (Fig. 2.1) (7). The

first step involved acquisition of Salmonella pathogenicity island 1 (SPI1) which contributed to

the divergence of Salmonella from E. coli and other related organisms. SPI1 is a 40 kb DNA

region present in both S. enterica and S. bongori (78). It encodes a type III secretion system

(T3SS) required for the intestinal phase of infection and promotes inflammation, the invasion of

intestinal epithelial cells, and secretion of intestinal fluid (117).

The second step of evolution was hypothesized to be the acquisition of a second

pathogenicity island SPI2 in the species S. enterica but not in S. bongori (Fig. 2.1) (7). SPI2

encodes another T3SS and various effector proteins that are required for survival and replication

inside host cells during systemic infection (86, 97). For example, one of the many SPI2 effector

proteins, SseL, is involved in macrophage killing, thus promoting survival inside the host (95).

Due to the presence of SPI2, S. enterica has increased capacity for systemic spread and is thus

more virulent than S. bongori, which does not contain SPI2.

Finally, the host range of S. enterica subspecies enterica expanded to warm-blooded

animals, including humans (Fig. 2.1) (7). In contrast, the other five S. enterica subspecies and S.

bongori are mainly associated with cold-blooded animals. The expansion of host range to warm-

blooded animals requires that bacteria recognize the new hosts for the first step of infection.

Recognition and attachment to the host involves adherence and colonization factors called

adhesins. For example, fimbrial adhesin encoded by the gene fimH allows Salmonella to

recognize and adhere to different receptors on host cells (66, 99). Genetic changes of this gene

by point mutation or recombination might allow the subspecies enterica to recognize new

receptors in new hosts, thus helping to expand its host range. After recognition and attachment,

other processes allowing the subspecies enterica to infect warm-blooded animals may include the

ability to survive the immune system and proliferate inside host cells (7). It is not clear which

genetic changes accounted for these processes during adaptation to new hosts because adaptation

to a new animal host is a complex process that probably involves a large number of genes.

In summary, acquisition of SPI1 separated the genus Salmonella from other related

organisms like E. coli. Then, acquisition of SPI2 separated the genus Salmonella into two distinct

lineages, S. bongori and S. enterica. Finally, the lineage of S. enterica branched into several

distinct phylogenetic groups. This latter phase of evolution was characterized by host range

expansion of the subspecies enterica to warm-blooded animals, including humans. Through all

these evolutionary steps, Salmonella enterica subspecies enterica (hereafter referred to as

Salmonella) became a highly successful human and animal pathogen.

Figure 2.1 Model for the three-phase evolution of pathogenicity in Salmonella enterica

subspecies enterica. The phylogenetic tree is not drawn to scale (7).

2.1.4 Salmonella reservoirs

Salmonella is mostly transmitted through the fecal-oral route. Salmonellosis occurs when

humans consume foods or water contaminated by animal and human feces containing Salmonella

during food handling or harvesting. Therefore, foods, including animal foods that are not

thoroughly cooked and contaminated uncooked vegetables and fruits, serve as the main

transmission vehicle for Salmonella (116).

Generally speaking, transmission of Salmonella starts from its reservoirs, which are

defined as any person, animal, plant, soil or substance (or combination of these) in which a

microorganism normally lives and grows (67). Salmonella serovars have adapted to live in a

variety of hosts. Many wild animals, such as gorillas (10), rhinoceros (68), lizards (88), reptiles

and snakes (9) harbor Salmonella. More importantly, food animals including chickens, turkeys,

cattle, swine and sheep have also been found to frequently carry Salmonella.

Different serovars have different reservoirs and modes of pathogenesis. For example, S.

Typhi, which causes the deadly disease typhoid fever, is a strict human pathogen. Some other

serovars, such as S. Gallinarum in chickens, S. Choleraesuis in swine and S. Dublin in cattle, are

known to be associated mainly with one animal, but rarely cause disease in humans. In contrast,

other serovars like S. Typhimurium have adapted to a broad host range, including wild and

domestic animals and humans. Moreover, different animals have different predominant serovars

associated with them. Predominant serovars associated with poultry, cattle and swine will be

reviewed here in brief because those animals are the primary vectors for transmitting Salmonella

to humans and are the main focus of this study.

The most prevalent and important reservoirs for Salmonella are poultry (23). The most

common poultry-associated serovars, Enteritidis in eggs and Typhimurium in poultry, accounted

for 33.3% of the total human foodborne diseases in the U.S. (20). The five most common

serovars associated with broilers are Kentucky, Heidelberg, Enteritidis, Typhimurium and I 4, [5],

12: i: - (113). They represent 81% of all Salmonella isolates from broilers. Similarly, serovars

Hadar, Heidelberg, Reading, Schwarzengrund, and Saintpaul account for 68% of all Salmonella

isolates from turkeys (113).

Cattle are also frequently found to harbor Salmonella. They can carry many different

serovars of Salmonella, with Montevideo, Anatum, Muenster, Newport and Mbandaka being the most

common, accounting for 47% of Salmonella isolates from cattle (114).

As for swine, another important reservoir for Salmonella, the 5 most frequent serovars

are Derby, Typhimurium, Infantis, Anatum and Saintpaul. These 5 serovars comprise 60% of all

isolates from swine (114).

It is noteworthy that most of these serovars found predominantly in food animals are the

same serovars that are frequently associated with human diseases. Given this fact, it is of great

importance to control and monitor levels of the most common serovars in animals and

subsequently prevent their transmission to humans.

2.1.5 Salmonella association with foods

Another important vehicle for transmitting Salmonella to humans is produce. Salmonella

can cycle through the food chain and the environment in soil, water, manure, and insects.

Therefore, contamination of produce can occur in various ways throughout the food system.

Like predominant serovars in animals, there are also predominant produce-associated serovars,

which include Enteritidis, Newport, Poona, Typhimurium, Braenderup, Javiana, Montevideo and

Muenchen (60). The overlap between serovars most commonly associated with animals and

those associated with produce suggests contamination of produce during growing or harvesting

processes directly or indirectly by animals containing Salmonella. Moreover, evidence is

accumulating that enteric bacteria have the ability to grow and persist on and in plants, such as

tomatoes, radish sprouts, bean sprouts, barley, and lettuce (15, 47, 62).

Contamination and persistence of Salmonella on produce promote the transmission of

this pathogen to humans. Salmonella outbreaks associated with fresh produce have increased in

the U.S. in recent years (98). Many kinds of produce have been linked to Salmonella outbreaks,

such as tomatoes, sprouts, melons, cantaloupe, lettuce, peppers and mangos (98). Produce causes

the highest number of human illnesses and the second-highest number of outbreaks among various

food vehicles in the U.S. (3). For example, the largest Salmonella outbreak to date occurred in

2008 and was caused by consumption of Jalapeño and Serrano peppers that were contaminated

with S. Saintpaul (22).

Besides foods of animal origin and produce, there has been an increase in Salmonella

outbreaks caused by new food vehicles, such as salami, peanut butter, Veggie Booty, pot pies, and

dry cereals. For instance, in 2010, Italian-style salami and its ingredients (red and black peppers

containing S. Montevideo) caused a multistate outbreak which infected 252 people from 44 states

(27). As a result, approximately 1,378,754 pounds of Italian sausage products were recalled by

Daniele International, Inc. (27). Another recent outbreak caused by a new food vehicle is the

2008-2009 peanut butter outbreak, which infected 714 people from 46 states and caused 6 deaths

(24). As a result, more than 2,100 peanut-containing products were recalled by over 200

companies.

Outbreaks due to those new food vehicles were not expected because they are, to varying

degrees, processed foods that do not provide conditions permitting the growth of Salmonella. For

example, peanut butter is a dry food with a water activity (aw) below the minimum level for growth (0.94).

Moreover, Salmonella can be inhibited or killed by heat, acid, high salt concentration, etc. during

food manufacturing processes (38). Persistence of Salmonella in processed foods might be due to

1) high levels of Salmonella in food ingredients; 2) inadequate sanitary practices; and 3) the

ubiquity of Salmonella in animals, produce and the environment.

2.1.6 Most common Salmonella serovars associated with human illnesses

Although there are over 2,500 Salmonella serovars, only a handful of Salmonella

serovars caused most human illnesses (Tables 2.1 and 2.2) (20, 21).

Table 2.1 Top ten most frequently reported serovars from human sources in 2005

Rank   Serovar             No. of laboratory-confirmed cases   % of total cases
1      Typhimurium         6982                                19.3
2      Enteritidis         6730                                18.6
3      Newport             3295                                9.1
4      Heidelberg          1903                                5.3
5      Javiana             1324                                3.7
6      I 4, [5], 12: i: -  822                                 2.3
7      Montevideo          809                                 2.2
8      Muenchen            733                                 2.0
9      Saintpaul           683                                 1.9
10     Braenderup          603                                 1.7
Total                                                          66

Laboratory-confirmed cases include both outbreak cases and sporadic cases.

Source: 2005 Salmonella annual review (20).
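
The total row of Table 2.1 can be checked by summing the percentage column. The short Python sketch below does so using the values transcribed from the table and also reports the combined share of the two leading serovars. The variable names are hypothetical, and the result depends on the rounded percentages given in the table.

```python
# Values transcribed from Table 2.1: serovar -> (laboratory-confirmed cases, % of total cases)
table_2_1 = {
    "Typhimurium": (6982, 19.3),
    "Enteritidis": (6730, 18.6),
    "Newport": (3295, 9.1),
    "Heidelberg": (1903, 5.3),
    "Javiana": (1324, 3.7),
    "I 4,[5],12:i:-": (822, 2.3),
    "Montevideo": (809, 2.2),
    "Muenchen": (733, 2.0),
    "Saintpaul": (683, 1.9),
    "Braenderup": (603, 1.7),
}

# Sum the case counts and the percentage column for the ten listed serovars
top_ten_cases = sum(cases for cases, _ in table_2_1.values())
top_ten_share = round(sum(pct for _, pct in table_2_1.values()), 1)

# Combined share of the two most frequently reported serovars
top_two_share = round(table_2_1["Typhimurium"][1] + table_2_1["Enteritidis"][1], 1)

print(f"Top ten serovars: {top_ten_cases} cases, {top_ten_share}% of reported cases")
print(f"Typhimurium and Enteritidis together: {top_two_share}%")
```

The summed percentage agrees with the rounded total of 66 given in the table.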

Table 2.2 Top ten most frequently reported serovars from human sources in 2006

Rank   Serovar             No. of laboratory-confirmed cases   % of total cases
3      Newport             3373                                8.3
5      Javiana             1433                                3.5
6      I 4, [5], 12: i: -  1200                                3.0
8      Muenchen            753                                 1.9
9      Oranienburg         719                                 1.8
10     Mississippi         604                                 1.5
Total                                                          60

Laboratory-confirmed cases include both outbreak cases and sporadic cases.

Source: 2006 Salmonella annual review (21).

Compared to all the other serovars of Salmonella, S. Typhimurium caused the highest

number of human illnesses and was associated with a broad range of foods (Table 2.3). As

mentioned before, S. Typhimurium has adapted to various hosts, including birds, amphibians, and

all food animals, especially poultry, cattle and swine. Not only can S. Typhimurium reside in so

many animals, but it can also be found in them at high frequency (114). The ubiquity and

relatively high numbers of S. Typhimurium might explain why it has caused so many outbreaks

via so many kinds of foods (Table 2.3).

The second most common serovar is S. Enteritidis, which caused nearly as many human

cases as S. Typhimurium (Tables 2.1 and 2.2). The major food vehicles for S. Enteritidis are shell

eggs, as 80% of the S. Enteritidis outbreaks were egg-associated (89). S. Enteritidis contaminates

eggs either through horizontal transmission, by which eggs are externally contaminated by feces

containing S. Enteritidis (36), or by vertical transmission, where the inside of the eggs is

contaminated by infected ovaries before the laying of the egg (50, 87). Vertical transmission is

believed to be the more important route because eggs contaminated by vertical transmission

produce a new generation of infected broilers or layers after hatching (50, 57, 79). In order to

control S. Enteritidis in poultry, one of the interventions employed in the U.S. is egg quality

assurance programs on farms. These voluntary programs involve acquisition of S. Enteritidis free

chicks, control of pests (including rodents and flies), use of S. Enteritidis-free feeds, and routine

microbiologic testing for S. Enteritidis in the farm environment (14).

The third most commonly reported serovar causing salmonellosis is S. Newport (Tables

2.1 and 2.2). S. Newport can be detected in many food animals, but is most frequently isolated

from cattle (113). S. Newport has been implicated in many outbreaks via a variety of food

vehicles, such as beef, chicken, pork, tomatoes, cantaloupes, melons, avocadoes and guacamole

(23). In 2010, S. Newport caused a multistate outbreak due to contaminated alfalfa sprouts, in

which 35 people became ill (26). Cases of illness caused by S. Newport have increased in recent

years, which might be due to the emerging multidrug-resistant S. Newport isolates (19).

The fourth most common serovar is S. Heidelberg (Tables 2.1 and 2.2). It is often

isolated from commercial broilers and ground chicken (113). As a result, poultry and eggs have

been identified as the major food vehicles for this serovar (32). The largest outbreak caused by S.

Heidelberg occurred in 2007, when 802 people became infected via contaminated hummus (Table

2.3).

Following S. Heidelberg, S. Javiana caused the fifth most human infections (Tables 2.1

and 2.2). Unlike other serovars, S. Javiana is rarely isolated from poultry, cattle or swine (113).

The major reservoirs for S. Javiana were considered to be amphibians, as direct contact with

amphibians has been associated with outbreaks. Tomatoes contaminated with amphibian feces were

identified as the main food vehicles for S. Javiana (34). For example, tomatoes were identified

as the food source of S. Javiana in a multistate outbreak in 2002, which resulted in 159 cases

(Table 2.3).

The sixth most common serovar, I 4, [5], 12: i :-, is a variant of S. Typhimurium that is

antigenically similar to S. Typhimurium but lacks the second-phase flagellar antigens (39). It is

also one of the most commonly identified serovars in broilers and ground chicken (113). I 4, [5],

12: i :- contaminated pot pies caused a multistate outbreak in 2007 (Table 2.3).

S. Montevideo is the next most commonly reported serovar. S. Montevideo is frequently

isolated from cattle and ground beef (113). Food vehicles of S. Montevideo include beef, turkey,

pork and sprouts (22). The most recent outbreak caused by S. Montevideo occurred in 2010 due

to contaminated Italian-style meats (27).

The eighth most common serovar is S. Muenchen. S. Muenchen can be detected in swine,

cattle, chicken etc. It has been associated with outbreaks due to multiple food vehicles, such as

chicken, sprouts, tomato, and cantaloupe (22). In 1999, a multistate outbreak was caused by S.

Muenchen in orange juice, which infected 398 people.

S. Saintpaul ranked as the ninth most common serovar in 2005, but dropped to eleventh in

2006 (20, 21). However, its ranking might have risen higher since then, because it caused the

largest Salmonella outbreak in 2008 due to contaminated peppers. S. Saintpaul is frequently

isolated from swine and has caused outbreaks due to foods like sprouts, tomatoes, mangoes,

orange juice, turkey etc.

The importance of the above top serovars is reflected by the high number of

salmonellosis cases they cause. Their success as human pathogens might be largely due to

adaptation to food animals. For example, 4 of the top 8 serovars are frequently found in poultry,

namely Typhimurium, Enteritidis, Heidelberg and I 4, [5], 12: i :-. Two other serovars, Newport

and Montevideo, are mainly found in cattle.

Table 2.3 Salmonella outbreaks caused by the top 8 serovars in the United States from 1998 to 2010

Year Serovar Ill Hospitalizations Deaths Food vehicle

2008 Typhimurium 530 116 8 peanut butter

2001 Typhimurium 404 0 4 unidentified

2006 Typhimurium 199 39 0 deli meat

2006 Typhimurium 192 24 0 tomato

2005 Typhimurium 162 0 sauces; fajita

2006 Typhimurium 161 7 0 chicken

1998 Typhimurium 134 10 0 multiple foods

2002 Typhimurium 132 0 0 unidentified

2002 Typhimurium 116 4 0 milk

1999 Typhimurium 112 3 0 clover sprouts

2002 Typhimurium 107 6 0 milk

2007 Typhimurium 87 8 0 Veggie Booty

2007 Typhimurium 76 4 0 lettuce; spinach

2003 Typhimurium 67 2 0 eggs

2007 Typhimurium 66 3 0 pork

2003 Typhimurium 59 2 0 beef

2005 Typhimurium 57 8 0 cake

2003 Typhimurium 56 11 0 ground beef

1998 Typhimurium 50 1 0 smoked fish

2003 Typhimurium 50 7 0 queso fresco

2002 Enteritidis 700 3 0 salsa

2005 Enteritidis 304 56 1 turkey

1999 Enteritidis 256 0 0 ice cream

2001 Enteritidis 231 34 0 egg-based sauce

2002 Enteritidis 196 24 0 cake

2005 Enteritidis 126 15 0 cantaloupe

2006 Enteritidis 113 23 0 oil; chicken

2001 Enteritidis 113 0 0 eggs

2000 Enteritidis 106 14 0 macaroni cheese

2007 Enteritidis 106 14 0 chicken

2003 Enteritidis 104 12 0 crab cakes

2001 Enteritidis 92 7 0 eggs

2002 Enteritidis 90 2 0 beef; pork

2000 Enteritidis 88 orange juice

1999 Enteritidis 82 3 0 honeydew melon

2002 Newport 510 tomato

2006 Newport 115 8 0 tomato

2004 Newport 100 5 0 milk

2004 Newport 97 lettuce

2000 Newport 96 6 0 pico de gallo

1999 Newport 79 mango

2006 Newport 77 2 0 turkey

2003 Newport 68 13 2 honeydew melon

2007 Newport 67 5 0 pork

2007 Newport 65 11 0 tomato

2004 Newport 49 8 0 turkey and gravy

2002 Newport 47 12 1 ground beef

2007 Newport 46 tomato; avocado

2007 Heidelberg 802 29 0 hummus

2003 Heidelberg 517 chicken

2002 Heidelberg 239 22 0 beef

1998 Heidelberg 200 4 0 cake

2002 Heidelberg 104 22 macaroni cheese

2007 Heidelberg 79 mashed potato

2004 Heidelberg 78 2 0 turkey

2005 Heidelberg 75 5 0 sandwich; vanilla cake

2003 Heidelberg 65 14 0 Swiss cheese

2003 Heidelberg 57 7 0 eggs; pancakes

2000 Heidelberg 56 3 0 macaroni salad

1999 Heidelberg 41 chicken

2003 Javiana 227 9 0 fajita, chicken

2002 Javiana 159 3 0 tomato

2004 Javiana 60 1 0 beans

2000 Javiana 44 8 0 bread; chicken

2007 I 4,[5],12:i :- 401 108 3 pot pie

2010 Montevideo 252 Italian-style meats

2006 Montevideo 72 19 0 sandwich, beef

2002 Montevideo 55 6 0 beef

1999 Muenchen 398 orange juice

1999 Muenchen 61 6 0 alfalfa sprouts

2003 Muenchen 58 15 cantaloupe

2002 Muenchen 57 3 0 pasta salad

2005 Saintpaul; Typhimurium 157 orange juice

2008 Saintpaul 1442 286 2 peppers

2009 Saintpaul 235 alfalfa sprouts

Source: CDC foodborne outbreak database (23).

2.2 Subtyping of Salmonella

In order to control Salmonella outbreaks, it is important to trace back the sources and

identify the routes by which Salmonella are transmitted to foods. However, trace-back

investigation of outbreaks can be hindered due to the complexity of the food chain and the

limitations of traditional epidemiologic investigations. The limitations of traditional

epidemiologic investigations include: 1) only a limited number of cases are reported; 2) people

often do not recall the foods that were eaten before disease onset; 3) cases are often spread out in

time and space; and 4) investigations can be hindered if the food source is not listed on the

investigation questionnaire (60).

Based on the reasons above, another trace-back method called subtyping is carried out

along with traditional epidemiologic investigations. Subtyping characterizes bacteria at the strain

level (101). By characterizing the outbreak-related strains and separating them from non-related

strains, subtyping can play an essential role in investigating Salmonella outbreaks.

Besides tracking pathogens in epidemiologic investigations, the other use of subtyping

methods is to study the population structure, evolution and diversity of bacteria on a long-term

scale. For example, one subtyping method called multilocus enzyme electrophoresis (MLEE) has

been used to study the genetic diversity of Salmonella populations (8). Studies like this can

provide insight into the evolutionary history and emergence of Salmonella serovars. However,

the focus of this review is on the short-term epidemiologic applications of subtyping methods.

2.2.1 Important definitions and performance criteria of subtyping methods

Before considering the epidemiology of Salmonella, it is important to first clarify the

definitions for outbreak, epidemic, strain, epidemic clone (EC), and outbreak clone (OC) used

frequently in epidemiologic studies. These definitions were previously compiled by Chen and

Knabel (30). Outbreak is an acute appearance of a cluster of an illness that occurs in numbers in

excess of what is expected for that time and place. Epidemic is defined as one or more outbreaks

that spread widely over a long period of time. Strain is defined as isolates that have distinct

phenotypic and genotypic characteristics from other isolates from the same species. Epidemic

clone is a strain or group of strains descended asexually from a single ancestral cell (source strain)

that is involved in one epidemic, and can often include several outbreaks. Outbreak clone is a

strain or group of strains descended asexually from a single ancestral cell (source strain) that is

involved in one outbreak (30).

To evaluate and compare different subtyping schemes, there are several performance

criteria, which include typeability, reproducibility, discriminatory power and epidemiologic

concordance. Typeability is the capability of a method to generate an interpretable result for each

strain typed. For example, strains that do not have plasmids cannot be typed by plasmid profiles.

Reproducibility is the ability of a subtyping method to generate the same result each time the

sample is tested. Discriminatory power is the ability of a subtyping method to differentiate

between unrelated epidemic or outbreak clones. Epidemiologic concordance is the capacity of a

typing method to correctly cluster epidemic and outbreak clones, and separate them from clones

that are not epidemiologically related (101). Many studies of subtyping methods focused on the

discriminatory power of the subtyping system. On the other hand, few studies have examined the

epidemiologic concordance of a particular subtyping method. The reason for the lack of studies

examining epidemiologic concordance might be that most studies did not utilize well-defined

strains from multiple outbreaks.
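
Discriminatory power is commonly expressed as Simpson's index of diversity, the probability that two isolates drawn at random from the test collection are assigned to different subtypes. The following minimal sketch assumes that index, which this review does not itself specify, and uses made-up subtype labels purely for illustration.

```python
from collections import Counter


def simpsons_index(subtypes):
    """Probability that two isolates drawn at random (without replacement)
    are assigned to different subtypes."""
    n = len(subtypes)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(subtypes)
    # Number of ordered pairs of isolates that share a subtype.
    same = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same / (n * (n - 1))


# Hypothetical subtype assignments for eight isolates (illustrative only).
example = ["A", "A", "B", "C", "C", "C", "D", "E"]
print(round(simpsons_index(example), 3))
```

A value of 1.0 means every isolate received a distinct subtype, and a value of 0.0 means all isolates were indistinguishable. The index depends on the strain collection as well as on the method, which is one reason the choice of strains matters.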

The choice of strain collection is critical when developing and evaluating a new

subtyping system for outbreak investigations. As mentioned before, an ideal strain collection

should include well-defined strains from multiple common-source outbreaks in order to assess

both discriminatory power and epidemiologic concordance. A good subtyping system should

separate strains from different outbreaks, but not separate strains within the same

outbreak/outbreak clone.

2.2.2 Salmonella subtyping methods during epidemiologic investigations

Subtyping methods can be either phenotypic or genotypic approaches. Phenotypic

methods include screening for antibiotic resistance, bacteriophage susceptibility and surface

antigens, such as the H and O antigens. Genotypic methods differentiate strains based on

differences in genome sequence and/or structure. Major phenotypic and genotypic subtyping

methods available for Salmonella will be briefly discussed here with the primary focus on

genotypic methods.

2.2.2.1 Phenotypic methods

Before the advent of genotypic methods, many phenotypic methods were widely used for

typing Salmonella strains. Common phenotypic methods for Salmonella include serotyping,

phage typing and MLEE. In general, although phenotypic methods provide useful information

about the strains, they often lack enough discriminatory power.

2.2.2.1.1 Serotyping

As mentioned in the taxonomy section, serotyping distinguishes Salmonella based on

immunological classification of the H and O antigens (54). Serotyping is one of the most

important phenotypic methods for Salmonella, which provides baseline information before other

typing methods can be carried out to further separate strains in a particular serovar. Serotyping is

very useful because the serovar name often points to the specific reservoir and mode of

pathogenesis. However, serotyping alone is not suitable for molecular epidemiology, because

individual serovars are responsible for multiple outbreaks (20, 21). As a result, other subtyping

methods with more resolution need to be carried out after serotyping.

2.2.2.1.2 Phage typing

Phage typing utilizes the selective capacity of individual bacteriophage to infect bacterial

cells. During phage typing, a panel of bacteriophages is used to infect bacteria and phage types

are assigned according to the patterns of lysis. Phage typing has been shown to be a good

indicator for pandemic clones of Salmonella. For instance, S. Enteritidis phage type (PT) 4 is the

most common PT in Europe, while PT8 is the most common PT in the U.S. Another example is

S. Typhimurium definitive type 104 (DT104), which is typically resistant to a number of

antibiotics and has had a major impact on global health (106). However, phage typing sometimes

suffers from low typeability in that many strains are resistant to all typing phages (1). Moreover,

it requires maintenance of the typing phage stocks and specially trained personnel (45).

2.2.2.1.3 Multilocus enzyme electrophoresis (MLEE)

MLEE differentiates strains based on the relative electrophoretic mobility of cellular

enzymes. The variation in amino acid sequences of the enzymes from different strains results in

differences in electrostatic charges. This leads to different migrations of the enzymes in an

electric field. By comparing the electrophoretic profiles, genetic relatedness of strains can then

be determined. MLEE has been carried out to analyze the population structure of Salmonella

serovars and the relatedness of strains within a serovar (8). Population studies by MLEE

subtyping revealed that while many serovars have similar electrophoretic types (ETs) that form a

single cluster, other serovars like S. Newport have divergent ETs clustered distantly in MLEE

trees. Using MLEE to determine phylogenetic relationships of bacteria is generally accepted.

However, MLEE has been replaced by a more reproducible and portable method called

multilocus sequence typing (MLST), which looks directly at DNA sequences of several genes

(75). MLST will be introduced later as one of the genotypic methods.

2.2.2.2 Genotypic methods

Genotypic methods target genetic differences between different strains of bacteria.

Generally speaking, genotypic methods have better reproducibility and greater discriminatory

power than phenotypic methods. Because of these advantages, genotypic methods are often

carried out after serotyping during Salmonella outbreak investigations. Two categories of

genotypic methods, DNA-fragment-pattern-based methods and DNA-sequence-based methods,

will be discussed.

2.2.2.2.1 DNA-fragment-pattern-based methods

Three DNA-fragment-pattern-based subtyping methods have been extensively studied for

subtyping Salmonella: pulsed-field gel electrophoresis (PFGE), amplified fragment

length polymorphism (AFLP) and multiple loci variable number tandem repeat analysis (MLVA).

2.2.2.2.1.1 Pulsed-Field Gel Electrophoresis (PFGE)

PFGE is currently the "gold standard" method for subtyping Salmonella and is used by

public health surveillance systems such as the PulseNet program of CDC. During PFGE

procedures, bacterial cells are first immobilized in agarose plugs to avoid mechanical shearing of

the long genomic DNA. Cells in agarose plugs are then lysed and genomic DNA is digested by a

rare-cutting restriction endonuclease. Next, agarose plugs containing digested genomic DNA are

put into wells of an agarose gel. The agarose gel is then subjected to an electric field whose

orientation is periodically changing. This pulsed electrical field can resolve large DNA fragments

that could not be separated by a constant unidirectional electrical field. The standardized PFGE

protocol for Salmonella uses two restriction endonucleases, XbaI and BlnI, in separate reactions

(40).
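
The digestion step can be illustrated in silico. The sketch below is an idealization under stated assumptions: it treats the DNA as a short linear string (a real Salmonella chromosome is circular and yields far larger fragments), it uses the known recognition sequences of XbaI (TCTAGA) and BlnI (CCTAGG), and the toy sequence and the function name are hypothetical.

```python
def digest_linear(sequence, site, cut_offset=1):
    """Cut a linear DNA string at every occurrence of `site` and return the
    fragment lengths, largest first. `cut_offset` is the cut position
    within the site (1 = after the first base)."""
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted((end - begin for begin, end in zip(bounds, bounds[1:])),
                  reverse=True)


XBAI_SITE = "TCTAGA"  # XbaI recognition sequence
BLNI_SITE = "CCTAGG"  # BlnI recognition sequence

# Toy 33-bp sequence (not a real genome) with two XbaI sites and one BlnI site.
toy = "AAAA" + XBAI_SITE + "GGGGGGGG" + BLNI_SITE + "TT" + XBAI_SITE + "C"

print(digest_linear(toy, XBAI_SITE))  # fragment lengths after XbaI digestion
print(digest_linear(toy, BLNI_SITE))  # fragment lengths after BlnI digestion
```

The sorted list of fragment lengths plays the role of a banding pattern. Because the two enzymes cut at different sites, they yield two independent patterns for the same isolate.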

PFGE has been used in detection, investigation and control of numerous outbreaks and is

generally very successful (51). The main advantage of PFGE is its comparatively high

discriminatory power for subtyping most serovars of Salmonella. However, PFGE lacks

discriminatory power for clonal serovars like Enteritidis (25, 120) and Montevideo (27), or clonal

phage types like S. Typhimurium DT104 (51). This is reflected by low PFGE pattern diversity

for those serovars and clonal phage types in the PulseNet database (51). In the cases of such low

discriminatory power, outbreak clones cannot be separated from sporadic isolates and other non-

outbreak related isolates, which can hinder epidemiologic detection and investigation. For

example, during the recent Italian-style meat outbreak, the outbreak clone of S. Montevideo had

the most common PFGE pattern in PulseNet database, which made it difficult to detect the

outbreak (27).

Besides low discriminatory power for clonal serovars, another limitation of PFGE is the

ambiguous interpretation of banding patterns. Banding patterns can change due to insertions,

deletions and point mutations. For instance, a single nucleotide mutation that creates or removes a

restriction site might change up to 3 fragments in the PFGE banding pattern. Because of this difficulty, interpretation of PFGE

banding patterns has been proposed to follow several guidelines: 1) strains showing no fragment

differences with the outbreak strain are part of the outbreak; 2) strains showing 1 fragment

difference with the outbreak strain are probably part of the outbreak; 3) strains showing 2-3

fragment differences with the outbreak strain are possibly part of the outbreak; 4) strains showing

more than 3-fragment differences with the outbreak strain are not part of the outbreak (105).

More recommendations for interpretation of PFGE patterns have been published recently. The

recommendations include taking into account the quality of the PFGE gel, the diversity of the

organism and the temporal and geographical information during analysis of PFGE patterns (40).

Although those suggestions helped standardize the interpretation of PFGE patterns, these

recommendations are still not completely objective.
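
The four fragment-difference guidelines listed above can be written as a simple decision rule. This is a minimal sketch that assumes band sizes are compared as exact values, which real gels do not allow; the function names and band sizes are hypothetical.

```python
from collections import Counter


def count_fragment_differences(pattern_a, pattern_b):
    """Number of bands present in one pattern but absent from the other."""
    a, b = Counter(pattern_a), Counter(pattern_b)
    return sum(((a - b) + (b - a)).values())


def interpret(n_differences):
    """Map a fragment-difference count to the four categories in the text."""
    if n_differences < 0:
        raise ValueError("difference count cannot be negative")
    if n_differences == 0:
        return "part of the outbreak"
    if n_differences == 1:
        return "probably part of the outbreak"
    if n_differences <= 3:
        return "possibly part of the outbreak"
    return "not part of the outbreak"


# Hypothetical band sizes in kb: a new restriction site splits the 300-kb
# band of the outbreak strain into 250-kb and 50-kb bands in the isolate.
outbreak_strain = [700, 450, 300, 120]
isolate = [700, 450, 250, 120, 50]

n = count_fragment_differences(outbreak_strain, isolate)
print(n, interpret(n))
```

In this example one band is lost and two are gained, so a single genetic event accounts for three fragment differences. The rule covers only the fragment counts. Gel quality, organism diversity and the temporal and geographical context still require judgment.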

Another drawback of PFGE is low reproducibility if the standardized protocol is not

strictly followed. As a result, subsequent comparison of PFGE banding patterns cannot be carried

out, especially when comparing PFGE patterns between different laboratories. To overcome this

limitation, PulseNet implemented an extensive quality assurance system (51). This system

requires laboratories to obtain PFGE gel preparation and gel analysis certification and participate

in the annual proficiency testing program. All these steps help ensure comparability and

reproducibility, but at the same time they require personnel specially trained under the quality

assurance system.

To sum up, although it is the current "gold standard" subtyping method, PFGE suffers

from several drawbacks which limit its performance for subtyping Salmonella.

2.2.2.2.1.2 Amplified Fragment Length Polymorphism (AFLP)

AFLP is a method that employs both restriction digestion and polymerase chain reaction

(PCR) techniques. In AFLP, genomic DNA is digested with one or more restriction enzymes.

The ends of the digested DNA fragments are then ligated to adaptors that are complementary to

the restriction sites. The digested and ligated DNA fragments are then selectively amplified using

PCR primers targeting the adaptor sequences. PCR primers typically contain one to three

additional nucleotides on their 3’-end to reduce the number of amplified fragments to a

manageable number. PCR products are then subjected to electrophoresis and characteristic

banding patterns are then produced.
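
The effect of the selective nucleotides can be estimated with simple arithmetic. The sketch below assumes that the four bases are equally frequent and independent, so that each selective nucleotide retains about one quarter of the fragments; the starting fragment count and the function name are hypothetical.

```python
def expected_amplified_fragments(n_fragments, selective_bases_per_primer):
    """Expected number of fragments amplified when each primer carries the
    given number of selective 3' nucleotides (idealized: equal, independent
    base frequencies, so each selective base retains one quarter)."""
    total_selective = sum(selective_bases_per_primer)
    return n_fragments / (4 ** total_selective)


# Hypothetical pool of 40,960 adaptor-ligated fragments.
for bases in [(0, 0), (1, 1), (2, 2), (3, 3)]:
    print(bases, expected_amplified_fragments(40960, bases))
```

Real genomes deviate from equal base frequencies, so the number of selective nucleotides is normally tuned empirically.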

AFLP is a relatively simple and fast approach. The discriminatory power of AFLP is

equal to that of PFGE for subtyping S. Typhimurium (73, 103), but higher than that of PFGE for

subtyping S. Enteritidis (52) and other serovars (109). However, its discriminatory power has

been reported to be insufficient to separate all epidemiologically unrelated S. Typhimurium

strains (92).

As with PFGE, the inter-laboratory reproducibility of AFLP is problematic, and

AFLP results from different laboratories are difficult to compare (48). Variability in the

AFLP profile can be generated by minor changes in the amplification conditions. Therefore,

replicates of the sample could be identified as different strains (45). To enhance reproducibility,

PCR should be performed under highly stringent conditions (84) and gel electrophoresis should

be standardized.

2.2.2.2.1.3 Multiple Loci Variable number tandem repeat Analysis (MLVA)

MLVA targets tandem repeats of short DNA sequences in bacterial genomes. The

difference in the number of repeated DNA motifs is employed to differentiate strains. In a

MLVA assay, a number of well-selected and characterized loci are amplified by PCR using

primers targeting the flanking regions of the repeated loci. PCR products are then separated and

the number of repeat units at each locus can be measured according to the size of the PCR

products. Differences in the number of repeats in each locus are used to distinguish different

strains.
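
The conversion from PCR product size to repeat number can be sketched as follows. The locus names, flanking lengths and repeat unit lengths are hypothetical, and the result is rounded to the nearest integer because fragment sizing is assumed to be slightly imprecise.

```python
def repeat_count(amplicon_bp, flank_bp, repeat_unit_bp):
    """Estimate the number of tandem repeats at one locus from the PCR
    product size, the combined length of the flanking regions (including
    primers) and the length of one repeat unit."""
    variable_bp = amplicon_bp - flank_bp
    if variable_bp < 0:
        raise ValueError("amplicon is shorter than the flanking regions")
    return round(variable_bp / repeat_unit_bp)


# Hypothetical loci: name -> (flanking length in bp, repeat unit length in bp)
LOCI = {"locus1": (145, 6), "locus2": (250, 12)}


def mlva_profile(amplicon_sizes):
    """Return the MLVA profile as a tuple of repeat counts, one per locus."""
    return tuple(repeat_count(amplicon_sizes[name], flank, unit)
                 for name, (flank, unit) in LOCI.items())


strain_a = {"locus1": 187, "locus2": 310}
strain_b = {"locus1": 193, "locus2": 310}
print(mlva_profile(strain_a), mlva_profile(strain_b))
```

Two strains are considered different by MLVA when their profiles differ at one or more loci, as strain_a and strain_b do at locus1.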

Since this method is based on PCR, MLVA has the advantage of being easy to perform

and rapid. Moreover, MLVA yields discrete and unambiguous data, reported as the number of

repeat units at each locus. Comparison of MLVA profiles between laboratories can be made with

a simple nomenclature recently proposed (70). The discriminatory power of MLVA was reported

to be higher than PFGE and AFLP for subtyping S. Typhimurium (72, 108) and higher than

PFGE for S. Enteritidis (11, 93). However, in some circumstances, strains that have the same

MLVA type were separated by PFGE profiles (13). This indicates that strains of the same MLVA

type might not be closely related.

However, the reproducibility of MLVA is a potential problem. The instability of MLVA

alleles has been observed for subtyping S. Newport and S. Typhimurium (18, 35). Replicates of

the same strains have been shown to have different numbers of repeat units at a specific locus (35).

The instability of the MLVA loci is probably due to DNA polymerase slippage during genome

replication (110). This instability might make interpretation difficult when strains have slightly

different MLVA types.

To conclude, by providing improved discriminatory power and having a short turnaround

time, MLVA can be used as a complementary method to PFGE in epidemiologic investigations of

Salmonella. MLVA has been used successfully along with other subtyping methods in outbreak

investigations to track Salmonella (12, 83, 85). However, MLVA also suffers from some

drawbacks and thus it has not been widely used for this purpose.

2.2.2.2.2 DNA-sequence-based methods

DNA-sequence-based methods differentiate strains by the detection of polymorphic DNA

sequences. Multilocus sequence typing (MLST) and single nucleotide polymorphism (SNP)

analysis are both DNA-sequence-based methods and will be briefly reviewed here.

2.2.2.2.2.1 Multilocus Sequence Typing (MLST)

MLST discriminates among bacterial strains by comparing nucleotide sequences of

several DNA loci in bacterial chromosomes. For each locus in the MLST scheme, every new

allele is assigned a unique number in order of discovery and is designated an allelic type. The

collective allelic types make up the allelic profile or sequence type, which may also be assigned a

unique and arbitrary number. For example, in the MLST database (www.mlst.net), which is based on the

seven loci aroC, dnaN, hemD, hisD, purE, sucA and thrA, one of the strains has

an allelic profile of (1, 1, 2, 1, 1, 1, 9) across the seven genes and was assigned sequence

type 3 (80). The collective allelic types and sequence types are compared among bacterial strains

and then cluster analysis can be carried out.
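
The numbering scheme described above can be sketched in a few lines. This is an illustrative toy, not the implementation behind any MLST database: the class name is hypothetical, only two of the seven loci are used, and the sequences are invented four-base strings rather than real alleles.

```python
class MlstScheme:
    """Assign allele numbers and sequence types in order of discovery."""

    def __init__(self, loci):
        self.loci = tuple(loci)
        # For each locus: allele sequence -> allele number
        self.alleles = {locus: {} for locus in self.loci}
        # Allelic profile (tuple of allele numbers) -> sequence type number
        self.sequence_types = {}

    def type_isolate(self, sequences):
        """Return (allelic profile, sequence type) for one isolate."""
        profile = []
        for locus in self.loci:
            known = self.alleles[locus]
            seq = sequences[locus].upper()
            if seq not in known:
                known[seq] = len(known) + 1  # next unused allele number
            profile.append(known[seq])
        profile = tuple(profile)
        if profile not in self.sequence_types:
            self.sequence_types[profile] = len(self.sequence_types) + 1
        return profile, self.sequence_types[profile]


scheme = MlstScheme(["aroC", "dnaN"])
# Invented toy sequences, not real alleles.
isolate1 = scheme.type_isolate({"aroC": "ATGA", "dnaN": "CCGT"})
isolate2 = scheme.type_isolate({"aroC": "ATGA", "dnaN": "CCGA"})
isolate3 = scheme.type_isolate({"aroC": "ATGA", "dnaN": "CCGT"})
print(isolate1, isolate2, isolate3)
```

Isolates with identical sequences at every locus receive the same sequence type, while a single nucleotide difference at any locus creates a new allele and therefore a new sequence type.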

Compared to PFGE, MLST is a less labor-intensive method and involves common

techniques including primer design, PCR amplification and DNA sequencing. Furthermore,

DNA sequences provide discrete, unambiguous, highly informative, highly portable and

reproducible data. Many MLST data sets are available over the internet (www.mlst.net) so that a

uniform nomenclature is ensured and comparison of results among laboratories can be conducted

rapidly. The application of MLST is promoted due to the increased speed and reduced cost of

nucleotide sequencing and improved internet database and tools (74). These advantages make

MLST an attractive subtyping approach.

MLST schemes originally targeted housekeeping genes, which are genes required for

fundamental metabolic functions and are found within all members of a given species (75). For

example, 7 housekeeping genes were targeted in the first MLST scheme for Neisseria

meningitidis (75). Housekeeping genes are excellent genetic markers for studying the population

structure, long-term evolution and diversity of bacteria. A good overview of Salmonella diversity

and evolution is provided by the internet-based MLST data. Based upon MLST data, Salmonella,

especially S. enterica subspecies enterica, is highly clonal (69). Moreover, the data suggest that

many serovars including Typhimurium, Enteritidis, Newport and Saintpaul may have more than

one origin (69). However, MLST schemes based on housekeeping genes for typing Salmonella

usually have much lower discriminatory power than that of PFGE (43, 61, 109). The results of

those studies suggested that housekeeping genes do not provide sufficient resolution to

distinguish closely related strains. Therefore, MLST schemes based on housekeeping genes are

not suitable for outbreak investigations.

To conclude, MLST possesses many attractive advantages. It is an excellent tool for

global phylogenetic studies. However, housekeeping genes selected in previous MLST studies

lacked sequence variation and thus were ineffective for subtyping Salmonella for epidemiologic

purposes. To track strains of this important pathogen during outbreaks, genetic markers that give

sufficient DNA sequence variations need to be identified.

2.2.2.2.2.2 Multi-Virulence-Locus Sequence Typing (MVLST)

Besides housekeeping genes, virulence genes, which are responsible for pathogenesis,

have been selected as genetic markers for MLST schemes. MLST schemes that only target

virulence genes have been referred to as multi-virulence-locus sequence typing (MVLST) (31,

119). Unlike housekeeping genes, virulence genes are commonly under positive selection (41).

As a result, DNA sequences of virulence genes tend to be more variable than those of housekeeping genes

and thus are able to provide increased discrimination. It is also speculated that virulence genes

can provide high epidemiologic concordance because they are responsible for causing diseases

and thus outbreaks. For example, six virulence genes were targeted in an MVLST scheme for

subtyping Listeria monocytogenes, which showed very high discriminatory power (0.99) and

perfect epidemiologic concordance (1.0) (31).

No MVLST scheme has yet been developed for subtyping Salmonella. However, MLST

based on both virulence genes and housekeeping genes has been published for typing Salmonella

enterica subspecies enterica serovars, which targeted flagellin genes fliC and fljB along with two

housekeeping genes, gyrB and atpD (104). This study included several strains from all

subspecies and 22 of the more prevalent Salmonella enterica subspecies enterica serovars

in an attempt to develop a DNA-based assay for serotype identification. However, the use of this

MLST scheme to further characterize strains below the serovar level was not tested. Another MLST scheme

based on both virulence genes and housekeeping genes has been developed for subtyping S.

Typhimurium and showed high discriminatory power (0.98), which was slightly higher than that

of PFGE (0.96) (46). In that MLST scheme, three virulence genes were included together with

the 16S rRNA gene and three housekeeping genes. One of the virulence genes in that MLST

scheme is hilA which regulates transcription of invasion proteins (4). The other two virulence

genes, pefB and fimH, encode different fimbriae and both mediate adherence to host cells (6, 66).

Although this MLST scheme seems to have adequate discriminatory power for subtyping S.

Typhimurium, its capacity to discriminate strains from more clonal serovars such as S. Enteritidis

has not yet been tested. Currently, there is no published MLST study for differentiating strains

within S. Enteritidis. In the SNP database of NCBI (National Center for Biotechnology

Information), two strains of S. Enteritidis were compared side by side to examine their SNPs (56).

Nearly all virulence genes were identical between the two, suggesting that MVLST might not be

discriminatory enough for differentiating strains of S. Enteritidis.

In summary, although MVLST has higher discriminatory power than MLST using

housekeeping genes, it may not provide enough discrimination for clonal serovars like Enteritidis.

In order to develop an MLST scheme for outbreak investigations, additional genetic markers with

even higher sequence variability need to be identified.

2.2.2.2.2.3 Single Nucleotide Polymorphism (SNP) analysis

SNP analysis differentiates strains by nucleotide substitutions at specific sites in the

bacterial genome. SNP analysis often involves three steps: 1) Select SNP sites that are variable to

provide discrimination among strains; 2) Determine the nucleotide bases at the selected sites of

different strains; and 3) Compare the SNPs among strains. Selection of the SNP sites is often

based on previous knowledge of specific polymorphic genes (42, 71) or comparative genomic

studies (118). To determine the nucleotide base (adenine, guanine, cytosine, or thymine) at a

defined SNP site, multiple methods can be used, such as pyrosequencing or real-time PCR (82, 91,

111).
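
The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the strain names and sequences are invented, the sequences are assumed to be already aligned, and the bases are read from full sequences, whereas in practice they would be determined by an assay such as pyrosequencing or real-time PCR.

```python
def select_variable_sites(aligned):
    """Step 1: keep alignment positions at which the strains differ."""
    sequences = list(aligned.values())
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i in range(length) if len({s[i] for s in sequences}) > 1]


def snp_profile(sequence, sites):
    """Step 2: read the base present at each selected site."""
    return "".join(sequence[i] for i in sites)


def snp_distance(profile_a, profile_b):
    """Step 3: count the sites at which two profiles differ."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Hypothetical aligned gene fragments (illustrative only).
aligned = {
    "strain_A": "ATGGCTAACGT",
    "strain_B": "ATGGCTAACGT",
    "strain_C": "ATGACTAACTT",
}

sites = select_variable_sites(aligned)
profiles = {name: snp_profile(seq, sites) for name, seq in aligned.items()}
print(sites, profiles)
```

Strains with identical profiles (here strain_A and strain_B) cannot be separated by the chosen sites, which illustrates why the choice of SNP loci determines the discrimination achieved.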

Because SNP analysis interrogates only pre-selected polymorphic sites rather than sequencing whole gene fragments, it has the potential to be

more rapid and cost efficient than MLST. However, there are very few SNP analysis studies for

subtyping Salmonella. SNP analysis targeting genes associated with quinolone resistance has

been used to study the antibiotic resistance of Salmonella (42, 71). Another SNP analysis study

targeted SNPs in flagella antigens in order to develop a SNP typing method to replace serotyping

(82). No SNP typing methods have been developed for differentiating Salmonella strains for

outbreak investigations. The reason might be that the SNP loci of Salmonella that could provide

the desirable discrimination have not been identified.

In conclusion, although SNP analysis has the potential to be rapid, cost efficient and

high-throughput, the lack of information about SNP sites suitable for subtyping Salmonella makes

it difficult to develop a SNP typing protocol for epidemiologic purposes.

2.3 Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)

Since virulence genes alone might not provide enough discrimination for subtyping

clonal Salmonella serovars, additional genetic elements that are evolving faster than virulence

genes are needed. CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats) are among

the fastest evolving genetic elements in bacterial genomes (100). CRISPRs are regions of direct

repeats (DRs) and spacers in the chromosomes of archaea and bacteria, including S. enterica (Fig.

2.2) (65, 100). DRs are 21-47 bp long, separated by spacers of similar size (Fig. 2.2). Sequences

of DRs are generally conserved, although the repeat at one end of the CRISPR is not fully

conserved and is thus called a degenerate direct repeat (Fig. 2.2). In contrast, the sequences

of spacers are highly variable. It was recently demonstrated that CRISPR spacers

are derived from phages or plasmids, which when inserted into the CRISPR of a bacterial cell

help protect that cell from subsequent infection by those same phages and plasmids (5). CRISPR

is generally flanked at one end by a common leader sequence of 200-350 bp, which is believed to

act as a promoter to transcribe CRISPR into small RNAs (77). Immediately upstream of the

CRISPR are cas genes, which encode CRISPR-associated (Cas) proteins that carry functional domains of nucleases,

helicases, polymerases and polynucleotide-binding proteins (58). Some Cas proteins can

recognize foreign DNA invading the bacteria, and then integrate a new repeat-spacer unit into

CRISPR at the leader end. Therefore, when the same exogenous nucleic acid invades again,

the crRNAs (CRISPR RNAs) transcribed from the CRISPR can recognize the foreign nucleic acid and

guide the Cas proteins to degrade it (17, 59, 65). In this way, CRISPR

along with Cas proteins can block foreign sequences, such as sequences of phages and plasmids.
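
The repeat-spacer organization described above can be illustrated with a short Python sketch that splits a CRISPR array into its spacers. It is a simplification under stated assumptions: the repeat and spacer sequences are invented and much shorter than real ones (DRs are 21-47 bp), and repeats are matched exactly, so a degenerate terminal repeat would not be recognized without approximate matching.

```python
def extract_spacers(array_seq, direct_repeat):
    """Split a CRISPR array into spacers using exact matches to the direct repeat."""
    parts = array_seq.split(direct_repeat)
    # Sequence before the first repeat and after the last repeat is flanking
    # sequence, not a spacer, so only the interior pieces are kept.
    return [p for p in parts[1:-1] if p]


# Hypothetical toy sequences (illustrative only).
direct_repeat = "CGGTTTAT"
array_seq = (
    "AAAA"                      # flanking sequence at the leader end
    + direct_repeat + "ACGTACGT"  # repeat + spacer 1
    + direct_repeat + "TTGCAAGC"  # repeat + spacer 2
    + direct_repeat             # terminal repeat
    + "GGGG"                    # flanking sequence at the distal end
)

spacers = extract_spacers(array_seq, direct_repeat)
print(spacers)
```

Because the repeats are conserved while the spacers differ, the ordered list of spacers is the part of the array that carries strain-level information.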

Figure 2.2 Schematic view of the two CRISPR systems in Salmonella Typhimurium LT2.

Direct repeats and spacers are represented by black diamonds and white rectangles, respectively.

The degenerate direct repeats are represented by white diamonds. Numbers of direct repeats and

spacers are represented by the numbers of diamonds and white rectangles, respectively. L stands

for leader sequence. cas genes are in grey while other core flanking genes (ygcF, iap and ptps)

are in white. The graph is not drawn to scale.

As a bacterial immune system against phages and plasmids, CRISPRs evolve rapidly and

adaptively (115). As mentioned above, new spacers can be added when foreign DNA invades

the bacterium. Besides addition of new spacers, deletion of spacers is also frequently observed (37,

90). However, the mechanism of spacer deletion is not clear. The addition of new spacers

and the deletion of one or several spacers make CRISPR one of the most variable DNA loci in

bacteria and generate a high degree of polymorphism among strains (90).
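
How spacer addition and deletion generate polymorphism can be sketched by comparing the ordered spacer lists of two strains. The Python example below is a minimal illustration; the spacer labels and the ancestral array are hypothetical, and spacers are listed from the leader end, where new units are added.

```python
def compare_spacer_arrays(array_a, array_b):
    """Report spacers shared by two arrays and spacers unique to each."""
    set_a, set_b = set(array_a), set(array_b)
    return {
        "shared": [s for s in array_a if s in set_b],
        "only_a": [s for s in array_a if s not in set_b],
        "only_b": [s for s in array_b if s not in set_a],
    }


# Hypothetical arrays derived from an ancestor carrying s3, s2, s1.
strain_x = ["s4", "s3", "s2", "s1"]  # gained s4 at the leader end
strain_y = ["s3", "s1"]              # lost the internal spacer s2

diff = compare_spacer_arrays(strain_x, strain_y)
print(diff)
```

Two strains that share a recent ancestor therefore differ by only a few spacers, while unrelated strains tend to share few or none.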

CRISPRs have been used for subtyping Mycobacterium tuberculosis; this subtyping

method is called spacer oligotyping, or spoligotyping (55). In this method, PCR is carried out

using primers designed according to the sequence of the DR so that each spacer can be amplified.

The PCR products are then hybridized to a membrane containing probes for specific spacers. The

hybridization patterns showing the presence or absence of spacers are then compared among

strains. Spoligotyping is now the standard method for subtyping M. tuberculosis for outbreak

investigations. It has also been used for subtyping Corynebacterium diphtheriae (81). Other than

spoligotyping, CRISPR sequence analysis has also been used for other bacteria, such as Yersinia

pestis (90), Streptococcus (64), and Campylobacter jejuni (96). As for Salmonella, although

CRISPRs have the potential to be excellent markers for separating Salmonella strains, they have

not been widely used for subtyping purposes.
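
The spoligotyping readout described above, in which each strain is scored for the presence or absence of a fixed panel of spacers, can be represented as a binary pattern. The following Python sketch is illustrative only; the probe names and the hybridization results are hypothetical.

```python
def spoligotype(detected_spacers, probe_panel):
    """Return a presence/absence string with one character per probe."""
    detected = set(detected_spacers)
    return "".join("1" if probe in detected else "0" for probe in probe_panel)


# Hypothetical membrane with five spacer probes, in fixed order.
probe_panel = ["sp1", "sp2", "sp3", "sp4", "sp5"]

isolate_1 = spoligotype(["sp1", "sp2", "sp4"], probe_panel)
isolate_2 = spoligotype(["sp4", "sp2", "sp1"], probe_panel)
isolate_3 = spoligotype(["sp1", "sp5"], probe_panel)

print(isolate_1, isolate_2, isolate_3)
```

Isolates with identical patterns are assigned to the same spoligotype. Unlike CRISPR sequence analysis, this readout detects only spacers for which a probe exists, so newly acquired spacers go unnoticed.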

2.3.1 CRISPR in Salmonella

A bacterial genome can carry more than one CRISPR locus. Two CRISPR loci are found in all

Salmonella serovars in the CRISPRs database (http://crispr.u-psud.fr/crispr/). CRISPR direct

repeats in Salmonella are 27-31 bp long. Salmonella CRISPRs are highly polymorphic, even

among strains belonging to the same serovar. Therefore, CRISPRs might serve as good markers

for subtyping Salmonella during epidemiologic investigations.

2.4 Conclusions

Salmonella is the leading cause of foodborne bacterial disease in the U.S. Most human

illnesses are caused by a handful of serovars, such as Typhimurium, Enteritidis, Newport,

Heidelberg, I 4, [5], 12; i: -, Montevideo, Muenchen and Saintpaul. Salmonella can reside in

many wild and domestic animals and can spread from numerous reservoirs to contaminate

numerous kinds of foods, which makes it especially challenging to track this pathogen during

outbreaks. Therefore, to reduce outbreaks caused by the most common serovars of Salmonella, it

is critical to employ a subtyping method that can accurately identify its sources and pathways of

transmission. Many subtyping methods have been developed for differentiating Salmonella

strains, such as PFGE, AFLP and MLVA. Each method has its own advantages and drawbacks.

PFGE is currently the "gold standard" method for outbreak investigations. However, PFGE

produces ambiguous data that are hard to interpret and, more importantly, PFGE often lacks

discriminatory power for subtyping clonal serovars such as Enteritidis. In contrast, MLST

generates highly informative and discrete data consisting of nucleotide sequences that can be

easily interpreted and rapidly compared on internet databases. Previous MLST schemes targeting

housekeeping genes were not very successful largely due to low discriminatory power associated

with conserved housekeeping genes. Unlike housekeeping genes, virulence genes can provide

important information about the pathogenesis of strains and improve the discriminatory power of

MLST. However, the discriminatory power of virulence genes may still not be enough for

subtyping clonal serovars of Salmonella. CRISPRs are among the fastest evolving genetic

elements and could be incorporated into an MLST scheme to provide increased discrimination. In

order to develop an MLST scheme for outbreak investigation, virulence genes and CRISPRs were

targeted in the present study to subtype the top 10 serovars of Salmonella. This MLST scheme

was hypothesized to provide high discriminatory power and epidemiologic concordance for

subtyping Salmonella for epidemiologic purposes.
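
The way in which adding CRISPR loci to virulence genes can increase discrimination may be sketched as follows. This Python example is a simplified illustration and not the analysis pipeline of the present study: the isolate names and sequences are invented, CRISPR arrays are abbreviated to spacer labels, and alleles and sequence types (STs) are simply numbered in order of first appearance.

```python
def assign_alleles(isolates, loci):
    """Give each distinct sequence at a locus an allele number; return profiles."""
    allele_ids = {locus: {} for locus in loci}
    profiles = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in loci:
            known = allele_ids[locus]
            # A new sequence receives the next unused allele number.
            profile.append(known.setdefault(seqs[locus], len(known) + 1))
        profiles[name] = tuple(profile)
    return profiles


def assign_sequence_types(profiles):
    """Give each distinct allele profile a sequence type (ST) number."""
    st_ids = {}
    return {name: st_ids.setdefault(p, len(st_ids) + 1) for name, p in profiles.items()}


# Hypothetical isolates with identical virulence genes but different CRISPR1 arrays.
isolates = {
    "iso1": {"fimH": "ACGT", "sseL": "TTGA", "CRISPR1": "s3-s2-s1", "CRISPR2": "t2-t1"},
    "iso2": {"fimH": "ACGT", "sseL": "TTGA", "CRISPR1": "s4-s3-s2-s1", "CRISPR2": "t2-t1"},
    "iso3": {"fimH": "ACGT", "sseL": "TTGA", "CRISPR1": "s3-s2-s1", "CRISPR2": "t2-t1"},
}

virulence_only = assign_sequence_types(assign_alleles(isolates, ["fimH", "sseL"]))
with_crispr = assign_sequence_types(
    assign_alleles(isolates, ["fimH", "sseL", "CRISPR1", "CRISPR2"])
)
print(virulence_only, with_crispr)
```

In this toy case the two virulence genes place all three isolates in one ST, whereas including the CRISPR loci separates iso2 from the others.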

2.5 References

1. Amavisit, P., P. F. Markham, D. Lightfoot, K. G. Whithear, and G. F. Browning. 2001.

Molecular epidemiology of Salmonella Heidelberg in an equine hospital. Vet. Microbiol.

80:85-98.

2. Anjum, M. F., C. Marooney, M. Fookes, S. Baker, G. Dougan, A. Ivens, and M. J.

Woodward. 2005. Identification of core and variable components of the Salmonella

enterica subspecies I genome by microarray. Infect. Immun. 73:7894-7905.

3. Anonymous. 2006. Outbreak alert! Closing the gaps in our federal food safety net.

http://www.cspinet.org/new/pdf/outbreakalert2004.pdf

4. Bajaj, V., R. L. Lucas, C. Hwang, and C. A. Lee. 1996. Co-ordinate regulation of

Salmonella Typhimurium invasion genes by environmental and regulatory factors is mediated

by control of hilA expression. Mol. Microbiol. 22:703-714.

5. Barrangou, R., C. Fremaux, H. Deveau, M. Richards, P. Boyaval, S. Moineau, D. A.

Romero, and P. Horvath. 2007. CRISPR provides acquired resistance against viruses in

prokaryotes. Science. 315:1709-1712.

6. Baumler, A., R. Tsolis, F. Bowe, J. Kusters, S. Hoffmann, and F. Heffron. 1996. The pef

fimbrial operon of Salmonella Typhimurium mediates adhesion to murine small intestine and

is necessary for fluid accumulation in the infant mouse. Infect. Immun. 64:61-68.

7. Baumler, A. J., R. M. Tsolis, T. A. Ficht, and L. G. Adams. 1998. Evolution of host

adaptation in Salmonella enterica. Infect. Immun. 66:4579-4587.

8. Beltran, P., J. M. Musser, R. Helmuth, J. J. Farmer, W. M. Frerichs, I. K. Wachsmuth,

K. Ferris, A. C. McWhorter, J. G. Wells, and A. Cravioto. 1988. Toward a population

genetic analysis of Salmonella: genetic diversity and relationships among strains of serotypes

S. Choleraesuis, S. Derby, S. Dublin, S. Enteritidis, S. Heidelberg, S. Infantis, S. Newport, and

S. Typhimurium. Proc. Natl. Acad. Sci. U.S.A. 85:7753-7757.

9. Bemis, D. A., L. M. Grupka, S. Liamthong, D. W. Fooland, J. M. Sykesiv, and E. C.

Ramsay. 2007. Clonal relatedness of Salmonella isolates associated with captive and wild-

caught rattlesnakes. Vet. Microbiol. 300-307.

10. Benirschke, K., and F. D. Adams. 1980. Gorilla diseases and causes of death. J. Reprod.

Fertil. 28:139-148.

11. Beranek, A., C. Mikula, P. Rabold, D. Arnhold, C. Berghold, I. Lederer, F. Allerberger,

and C. Kornschober. 2009. Multiple-locus variable-number tandem repeat analysis for

subtyping of Salmonella enterica subsp. enterica serovar Enteritidis. Int. J. Med. Microbiol.

299:43-51.

12. Best, E. L., M. D. Hampton, S. Ethelberg, E. Liebana, F. A. Clifton-Hadley, and E. J.

Threlfall. 2009. Drug-resistant Salmonella Typhimurium DT 120: use of PFGE and MLVA

in a putative international outbreak investigation. Microb. Drug Resist. 15:133-138.

13. Boxrud, D., K. Pederson-Gulrud, J. Wotton, C. Medus, E. Lyszkowicz, J. Besser, and J.

M. Bartkus. 2007. Comparison of multiple-locus variable-number tandem repeat analysis,

pulsed-field gel electrophoresis, and phage typing for subtype analysis of Salmonella enterica

serotype Enteritidis. J. Clin. Microbiol. 45:536-543.

14. Braden, C. R. 2006. Salmonella enterica serotype Enteritidis and eggs: a national epidemic

in the United States. Clin. Infect. Dis. 43:512-517.

15. Brandl, M. T. 2006. Fitness of human enteric pathogens on plants and implications for food

safety. Annu. Rev. Phytopathol. 44:367-392.

16. Brenner, F. W., R. G. Villar, F. J. Angulo, R. Tauxe, and B. Swaminathan. 2000.

Salmonella Nomenclature. J. Clin. Microbiol. 38:2465-2467.

17. Brouns, S. J. J., M. M. Jore, M. Lundgren, E. R. Westra, R. J. H. Slijkhuis, A. P. L.

Snijders, M. J. Dickman, K. S. Makarova, E. V. Koonin, and J. van der Oost. 2008.

Small CRISPR RNAs guide antiviral defense in prokaryotes. Science. 321:960-964.

18. Call, D., L. Orfe, M. Davis, S. Lafrentz, and M. Kang. 2008. Impact of compounding error

on strategies for subtyping pathogenic bacteria. Foodborne Pathog. Dis. 5:505-516.

19. CDC. 2002. Outbreak of multidrug-resistant Salmonella Newport--United States, January-

April 2002. JAMA. 288:951-953.

20. CDC. 2006. Salmonella Annual Summary 2005.

http://www.cdc.gov/ncidod/dbmd/phlisdata/salmtab/2006/SalmonellaAnnualSummary2006.pdf


http://www.cdc.gov/ncidod/dbmd/phlisdata/salmtab/2006/SalmonellaAnnualSummary2006.pdf

22. CDC. 2008. Outbreak of Salmonella serotype Saintpaul infections associated with multiple

raw produce items --- United States, 2008. MMWR Morb Mortal Wkly Rep. 57:929-934.

23. CDC. 2008. OutbreakNet Foodborne Outbreak Online Database.

http://wwwn.cdc.gov/foodborneoutbreaks/

24. CDC. 2009. Multistate outbreak of Salmonella infections associated with peanut butter and

peanut butter--containing products --- United States, 2008--2009. MMWR Morb Mortal

Wkly Rep. 58:1-6.

25. CDC. 2010. Investigation update: multistate outbreak of human Salmonella Enteritidis

infections associated with shell eggs. http://www.cdc.gov/salmonella/enteritidis/

26. CDC. 2010. Investigation announcement: multistate outbreak of human Salmonella Newport

infections linked to raw alfalfa sprouts. http://www.cdc.gov/salmonella/newport/index.html

27. CDC. 2010. Investigation update: multistate outbreak of human Salmonella Montevideo

infections. http://www.cdc.gov/salmonella/montevideo/index.html

28. CDC. 2010. Preliminary FoodNet data on the incidence of infection with pathogens

transmitted commonly through food --- 10 States, 2009. MMWR Morb Mortal Wkly Rep.

58:333-337.

29. Chalker, R. B., and M. J. Blaser. 1988. A review of human salmonellosis: III. magnitude of

Salmonella infection in the United States. Rev. Infect. Dis. 10:111-124.

30. Chen, Y., and S. Knabel. 2008. Strain typing, p. 203-239. In D. Liu (ed.), Handbook of

Listeria monocytogenes.

31. Chen, Y., W. Zhang, and S. J. Knabel. 2007. Multi-virulence-locus sequence typing

identifies single nucleotide polymorphisms which differentiate epidemic clones and outbreak

strains of Listeria monocytogenes. J. Clin. Microbiol. 45:835-846.

32. Chittick, P., A. Sulka, R. V. Tauxe, and A. M. Fry. 2006. A summary of national reports of

foodborne outbreaks of Salmonella Heidelberg infections in the United States. J. Food Prot.

69:1150-1153.

33. Christie, A. B. 1987. Infectious diseases: epidemiologic and clinical practice. Churchill

Livingstone, New York.

34. Clarkson, L. S., M. Tobin-D’Angelo, C. Shuler, S. Hanna, J. Benson, and A. Voetsch.

2010. Sporadic Salmonella enterica serotype Javiana infections in Georgia and Tennessee: a

hypothesis-generating study. Epidemiol. Infect. 138:340-346.

35. Davis, M. A., K. N. K. Baker, D. R. Call, L. D. Warnick, Y. Soyer, M. Wiedmann, Y.

Grohn, P. L. McDonough, D. D. Hancock, and T. E. Besser. 2009. Multilocus variable-

number tandem-repeat method for typing Salmonella enterica serovar Newport. J. Clin.

Microbiol. 47:1934-1938.

36. De Reu, K., K. Grijspeerdt, W. Messens, M. Heyndrickx, M. Uyttendaele, J. Debevere,

and L. Herman. 2006. Eggshell factors influencing eggshell penetration and whole egg

contamination by different bacteria, including Salmonella Enteritidis. Int. J. Food Microbiol.

112:253-260.

37. Deveau, H., R. Barrangou, J. E. Garneau, J. Labonte, C. Fremaux, P. Boyaval, D. A.

Romero, P. Horvath, and S. Moineau. 2007. Phage response to CRISPR-encoded resistance

in Streptococcus thermophilus. J. Bacteriol. 190:1390-1400.

38. Doyle, M. P., and L. R. Beuchat eds. 2007. Food microbiology: fundamentals and frontiers.

ASM Press, Washington, DC.

39. Echeita, M. A., S. Herrera, and M. A. Usera. 2001. Atypical, fljB-negative Salmonella

enterica subsp. enterica strain of serovar 4,5,12:i:- appears to be a monophasic variant of

serovar Typhimurium. J. Clin. Microbiol. 39:2981-2983.

40. Efrain, M. R., M. A. Fair, R. Gautom, D. N. Cameron, S. B. Hunter, B. Swaminathan,

and T. J. Barrett. 2006. Standardization of pulsed-field gel electrophoresis protocols for the

subtyping of Escherichia coli O157:H7, Salmonella, and Shigella for PulseNet. Foodborne

Pathog. Dis. 3:59-67.

41. Endo, T., K. Ikeo, and T. Gojobori. 1996. Large-scale search for genes on which positive

selection may operate. Mol. Biol. Evol. 13:685-690.

42. Esaki, H., K. Noda, N. Otsuki, A. Kojima, T. Asai, Y. Tamura, and T. Takahashi. 2004.

Rapid detection of quinolone-resistant Salmonella by real time SNP genotyping. J. Microbiol.

Methods. 58:131-134.

43. Fakhr, M. K., L. K. Nolan, and C. M. Logue. 2005. Multilocus sequence typing lacks the

discriminatory ability of pulsed-field gel electrophoresis for typing Salmonella enterica


44. Fakhr, M. K., J. S. Sherwood, J. Thorsness, and C. M. Logue. 2006. Molecular

characterization and antibiotic resistance profiling of Salmonella isolated from retail turkey

meat products. Foodborne Pathog. Dis. 3:366-374.

45. Foley, S. L., S. Zhao, and R. D. Walker. 2007. Comparison of molecular typing methods

for the differentiation of Salmonella foodborne pathogens. Foodborne Pathog. Dis. 4:253-276.

46. Foley, S. L., D. G. White, P. F. McDermott, R. D. Walker, B. Rhodes, P. J. Fedorka-

Cray, S. Simjee, and S. Zhao. 2006. Comparison of subtyping methods for differentiating

Salmonella enterica serovar Typhimurium isolates obtained from food animal sources. J. Clin.

Microbiol. 44:3569-3577.

47. Franz, E., and A. H. C. van Bruggen. 2008. Ecology of E. coli O157:H7 and Salmonella

enterica in the primary vegetable production chain. Crit. Rev. Microbiol. 34:143-161.

48. Fry, N. K., B. Afshar, P. Visca, D. Jonas, J. Duncan, E. Nebuloso, A. Underwood, and T.

G. Harrison. 2005. Assessment of fluorescent amplified fragment length polymorphism

analysis for epidemiological genotyping of Legionella pneumophila serogroup 1. Clin.

Microbiol. Infect. 11:704-712.

49. Gantois, I., R. Ducatelle, F. Pasmans, F. Haesebrouck, R. Gast, T. J. Humphrey, and F.

Van Immerseel. 2009. Mechanisms of egg contamination by Salmonella Enteritidis. FEMS

Microbiology Reviews. 33:718-738.

50. Gast, R. K., and C. W. Beard. 1990. Production of Salmonella Enteritidis-contaminated

eggs by experimentally infected hens. Avian Dis. 34:438-446.

51. Gerner-Smidt, P., K. Hise, J. Kincaid, S. Hunter, S. Rolando, E. Hyytiä-Trees, E. M.

Ribot, B. Swaminathan, and Pulsenet Taskforce. 2006. Pulsenet USA: a five-year update.

Foodborne Pathog. Dis. 3:9-19.

52. Giammanco, G. M., C. Mammina, C. Romani, I. Luzzi, A. M. Dionisi, and A. Nastasi.

2007. Evaluation of a modified single-enzyme amplified fragment length polymorphism (SE-

AFLP) technique for subtyping Salmonella enterica serotype Enteritidis. Res. Microbiol.

158:10-17.

53. Gomez, T. M., Y. Motarjemi, S. Miyagawa, F. K. Käferstein, and K. Stöhr. 1997.

Foodborne salmonellosis. World Health Stat Q. 50:81-89.

54. Grimont, P. A. D., and F. X. Weill. 2007. Antigenic formulae of the Salmonella serovars,

9th ed. WHO Collaborating Centre for Reference and Research on Salmonella. Institut

Pasteur, Paris, France.

55. Groenen, P. M., A. E. Bunschoten, D. van Soolingen, and J. D. van Embden. 1993. Nature of

DNA polymorphism in the direct repeat cluster of Mycobacterium tuberculosis; application

for strain differentiation by a novel typing method. Mol. Microbiol. 10:1057-1065.

56. Guard, J. 2010. Evolutionary trends in two strains of Salmonella enterica subsp. I serovar

Enteritidis PT13a that vary in virulence potential.

http://www.ncbi.nlm.nih.gov/genomes/static/Salmonella_SNPS.html

57. Guard-Petter, J. 2001. The chicken, the egg and Salmonella Enteritidis. Environ. Microbiol.

3:421-430.

58. Haft, D. H., J. Selengut, E. F. Mongodin, and K. Nelson. 2005. A Guild of 45 CRISPR-

associated (Cas) protein families and multiple CRISPR/cas subtypes exist in prokaryotic

genomes. PLoS Comput Biol. 1:e60.

59. Hale, C. R., P. Zhao, S. Olson, M. O. Duff, B. R. Graveley, L. Wells, R. M. Terns, and M.

P. Terns. 2009. RNA-guided RNA cleavage by a CRISPR RNA-cas protein complex. Cell.

139:945-956.

60. Hanning, I. B., J. D. Nutt, and S. C. Ricke. 2009. Salmonellosis outbreaks in the United

States due to fresh produce: sources and potential intervention measures. Foodborne Pathog.

Dis. 6:635-648.

61. Harbottle, H., D. G. White, P. F. McDermott, R. D. Walker, and S. Zhao. 2006.

Comparison of multilocus sequence typing, pulsed-field gel electrophoresis, and

antimicrobial susceptibility typing for characterization of Salmonella enterica serotype

Newport isolates. J. Clin. Microbiol.

62. Harris, L. J., J. N. Farber, L. R. Beuchat, M. E. Parish, T. V. Suslow, E. H. Garrett, and

F. F. Busta. 2003. Outbreaks associated with fresh produce: incidence, growth, and survival

of pathogens in fresh and fresh-cut produce. Compr Rev Food Sci Food Saf. 2:78-141.

63. Heyndrickx, M., F. Pasmans, R. Ducatelle, A. Decostere, and F. Haesebrouck. 2005.

Recent changes in Salmonella nomenclature: The need for clarification. Vet. J. 170:275-277.

64. Hoe, N., K. Nakashima, D. Grigsby, X. Pan, S. J. Dou, S. Naidich, M. Garcia, E. Kahn,

D. Bergmire-Sweat, and J. M. Musser. 1999. Rapid molecular genetic subtyping of

serotype M1 group A Streptococcus strains. Emerg Infect Dis. 5:254-263.

65. Horvath, P., and R. Barrangou. 2010. CRISPR/cas, the immune system of bacteria and

archaea. Science. 327:167-170.

66. Humphries, A. D., S. M. Townsend, R. A. Kingsley, T. L. Nicholson, R. M. Tsolis, and A.

J. Baumler. 2001. Role of fimbriae as antigens and intestinal colonization factors of

Salmonella serovars. FEMS Microbiology Letters. 201:121-125.

67. Jay, J. M. 2005. Modern Food Microbiology. Springer Science, New York.

68. Kenny, D. E. 1999. Salmonella spp. survey of captive rhinoceroses in U.S. zoological

institutions and private ranches. J. Zoo Wildl. Med. 30:383-388.

69. Lan, R., P. R. Reeves, and S. Octavia. 2009. Population structure, origins and evolution of

major Salmonella enterica clones. Infect. Genet. Evol. 9:996-1005.

70. Larsson, J. T., M. Torpdahl, R. F. Petersen, G. Sørensen, B. A. Lindstedt, and E. M.

Nielsen. 2009. Development of a new nomenclature for Salmonella Typhimurium multilocus

variable number of tandem repeats analysis (MLVA). Euro Surveill. 14:19174-19174.

71. Levy, D. D., B. Sharma, and T. A. Cebula. 2004. Single-nucleotide polymorphism mutation

spectra and resistance to quinolones in Salmonella enterica serovar Enteritidis with a mutator

phenotype. Antimicrob. Agents Chemother. 48:2355-2363.

72. Lindstedt, B., E. Heir, E. Gjernes, and G. Kapperud. 2003. DNA fingerprinting of

Salmonella enterica subsp. enterica serovar Typhimurium with emphasis on phage type

DT104 based on variable number of tandem repeat loci. J. Clin. Microbiol. 41:1469-1479.

73. Lindstedt, B., E. Heir, T. Vardund, and G. Kapperud. 2000. Fluorescent amplified-

fragment length polymorphism genotyping of Salmonella enterica subsp. enterica serovars

and comparison with pulsed-field gel electrophoresis typing. J. Clin. Microbiol. 38:1623-

1627.

74. Maiden, M. C. J. 2006. Multilocus sequence typing of bacteria. Annu. Rev. Microbiol.

60:561-588.

75. Maiden, M. C. J., J. A. Bygraves, E. Feil, G. Morelli, J. E. Russell, R. Urwin, Q. Zhang,

J. Zhou, K. Zurth, D. A. Caugant, I. M. Feavers, M. Achtman, and B. G. Spratt. 1998.

Multilocus sequence typing: a portable approach to the identification of clones within

populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. U.S.A. 95:3140-3145.

76. Majowicz, S., J. Musto, E. Scallan, F. Angulo, M. Kirk, S. O’Brien, T. Jones, A. Fazil,

and R. Hoekstra. 2010. Food safety: The global burden of nontyphoidal Salmonella

gastroenteritis. Clin. Infect. Dis. 50:882-889.

77. Marraffini, L. A., and E. J. Sontheimer. 2010. Self versus non-self discrimination during

CRISPR RNA-directed immunity. Nature. 463:568-571.

78. Mills, D. M., V. Bajaj, and C. A. Lee. 1995. A 40 kb chromosomal fragment encoding

Salmonella typhimurium invasion genes is absent from the corresponding region of the

Escherichia coli K-12 chromosome. Mol. Microbiol. 15:749-759.

79. Miyamoto, T., E. Baba, T. Tanaka, K. Sasai, T. Fukata, and A. Arakawa. 1997.

Salmonella Enteritidis contamination of eggs from hens inoculated by vaginal, cloacal, and

intravenous routes. Avian Dis. 41:296-303.

80. MLST. 2010. http://mlst.ucc.ie/

81. Mokrousov, I., E. Limeschenko, A. Vyazovaya, and O. Narvskaya. 2007.

Corynebacterium diphtheriae spoligotyping based on combined use of two CRISPR loci.

Biotechnol J. 2:901-906.

82. Mortimer, C., T. Peters, S. Gharbia, J. Logan, and C. Arnold. 2004. Towards the

development of a DNA-sequence based approach to serotyping of Salmonella enterica. BMC

Microbiol. 4:31.

83. Much, P., J. Pichler, S. Kasper, H. Lassnig, C. Kornschober, A. Buchner, C. König, and

F. Allerberger. 2009. A foodborne outbreak of Salmonella Enteritidis phage type 6 in

Austria, 2008. Wien. Klin. Wochenschr. 121:132-136.

84. Mueller, U. G., and L. L. Wolfenbarger. 2008. AFLP genotyping and fingerprinting.

Trends Ecol Evol. 14:389-394.

85. Munnoch, S., K. Ward, S. Sheridan, G. Fitzsimmons, C. Shadbolt, J. Piispanen, Q.

Wang, T. Ward, T. Worgan, C. Oxenford, J. Musto, J. McAnulty, and D. Durrheim.

2009. A multi-state outbreak of Salmonella Saintpaul in Australia associated with cantaloupe

consumption. Epidemiol Infect. 137:367-374.

86. Ochman, H., F. C. Soncini, F. Solomon, and E. A. Groisman. 1996. Identification of a

pathogenicity island required for Salmonella survival in host cells. Proc. Natl. Acad. Sci.

U.S.A. 93:7800-7804.

87. Okamura, M., Y. Kamijima, T. Miyamoto, H. Tani, K. Sasai, and E. Baba. 2001.

Differences among six Salmonella serovars in abilities to colonize reproductive organs and to

contaminate eggs in laying hens. Avian Dis. 45:61-69.

88. Pasmans, F., A. Martel, F. Boyen, D. Vandekerchove, I. Wybo, F. V. Immerseel, M.

Heyndrickx, J. M. Collard, R. Ducatelle, and F. Haesebrouck. 2005. Characterization of

Salmonella isolates from captive lizards. Vet. Microbiol. 110:285-291.

89. Patrick, M. E., P. M. Adcock, T. M. Gomez, S. F. Altekruse, B. H. Holland, R. V. Tauxe,

and D. L. Swerdlow. 2004. Salmonella Enteritidis infections, United States, 1985-1999.

Emerging Infect. Dis. 10:1-7.

90. Pourcel, C., G. Salvignol, and G. Vergnaud. 2005. CRISPR elements in Yersinia pestis

acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional

tools for evolutionary studies. Microbiology. 151:653-663.

91. Roos, A., P. Dieltjes, R. H. A. M. Vossen, M. R. Daha, and P. de Knijff. 2006. Detection

of three single nucleotide polymorphisms in the gene encoding mannose-binding lectin in a

single pyrosequencing reaction. J. Immunol. Methods. 309:108-114.

92. Ross, I. L., and M. W. Heuzenroeder. 2005. Use of AFLP and PFGE to discriminate

between Salmonella enterica serovar Typhimurium DT126 isolates from separate food-

related outbreaks in Australia. Epidemiol. Infect. 133:635-644.

93. Ross, I. L., and M. W. Heuzenroeder. 2009. A comparison of two PCR-based typing

methods with pulsed-field gel electrophoresis in Salmonella enterica serovar Enteritidis. Int. J.

Med. Microbiol. 299:410-420.

94. Rychlik, I., D. Gregorova, and H. Hradecka. 2006. Distribution and function of plasmids

in Salmonella enterica. Vet. Microbiol. 112:1-10.

95. Rytkönen, A., J. Poh, J. Garmendia, C. Boyle, A. Thompson, M. Liu, P. Freemont, J. C.

D. Hinton, and D. W. Holden. 2007. SseL, a Salmonella deubiquitinase required for

macrophage killing and virulence. Proc. Natl. Acad. Sci. U.S.A. 104:3502-3507.

96. Schouls, L. M., S. Reulen, B. Duim, J. A. Wagenaar, R. J. L. Willems, K. E. Dingle, F. M.

Colles, and J. D. A. Van Embden. 2003. Comparative genotyping of Campylobacter jejuni

by amplified fragment length polymorphism, multilocus sequence typing, and short repeat

sequencing: strain diversity, host range, and recombination. J. Clin. Microbiol. 41:15-26.

97. Shea, J. E., M. Hensel, C. Gleeson, and D. W. Holden. 1996. Identification of a virulence

locus encoding a second type III secretion system in Salmonella Typhimurium. Proc. Natl.

Acad. Sci. U.S.A. 93:2593-2597.

98. Sivapalasingam, S., C. R. Friedman, L. Cohen, and R. V. Tauxe. 2004. Fresh produce: a

growing cause of outbreaks of foodborne illness in the United States, 1973 through 1997. J.

Food Prot. 67:2342-2353.

99. Sokurenko, E. V., H. S. Courtney, D. E. Ohman, P. Klemm, and D. L. Hasty. 1994.

FimH family of type 1 fimbrial adhesins: functional heterogeneity due to minor sequence

variations among fimH genes. J. Bacteriol. 176:748-755.

100. Sorek, R., V. Kunin, and P. Hugenholtz. 2008. CRISPR — a widespread system that

provides acquired resistance against phages in bacteria and archaea. Nat. Rev. Microbiol.

6:181-186.

101. Struelens, M. J. 1996. Consensus guidelines for appropriate use and evaluation of microbial

epidemiologic typing systems. Clin. Microbiol. Infect. 2:2-11.

102. Swartz, M. 2002. Human diseases caused by foodborne pathogens of animal origin. Clin.

Infect. Dis. 34:S111-S122.

103. Tamada, Y., Y. Nakaoka, K. Nishimori, A. Doi, T. Kumaki, N. Uemura, K. Tanaka, S.

Makino, T. Sameshima, M. Akiba, M. Nakazawa, and I. Uchida. 2001. Molecular typing

and epidemiological study of Salmonella enterica serotype Typhimurium isolates from cattle

by fluorescent amplified-fragment length polymorphism fingerprinting and pulsed-field gel

electrophoresis. J. Clin. Microbiol. 39:1057-1066.

104. Tankouo-Sandjong, B., A. Sessitsch, E. Liebana, C. Kornschober, F. Allerberger, H.

Hächler, and L. Bodrossy. 2006. MLST-v, multilocus sequence typing based on virulence

genes, for molecular typing of Salmonella enterica subsp. enterica serovars. J. Microbiol.

Methods. 69:23-36.

105. Tenover, F. C., R. D. Arbeit, R. V. Goering, P. A. Mickelsen, B. E. Murray, D. H.

Persing, and B. Swaminathan. 1995. Interpreting chromosomal DNA restriction patterns

produced by pulsed-field gel electrophoresis: criteria for bacterial strain typing. J Clin

Microbiol. 33:2233-2239.

106. Threlfall, E. J. 2000. Epidemic Salmonella Typhimurium DT 104--a truly international

multiresistant clone. J. Antimicrob. Chemother. 46:7-10.

107. Tindall, B. J., P. A. D. Grimont, G. M. Garrity, and J. P. Euzeby. 2005. Nomenclature

and taxonomy of the genus Salmonella. Int. J. Syst. Evol. Microbiol. 55:521-524.

108. Torpdahl, M., G. Sørensen, B. Lindstedt, and E. M. Nielsen. 2007. Tandem repeat

analysis for surveillance of human Salmonella Typhimurium infections. Emerg Infect Dis.

13:388-395.

109. Torpdahl, M., M. N. Skov, D. Sandvang, and D. L. Baggesen. 2005. Genotypic

characterization of Salmonella by multilocus sequence typing, pulsed-field gel

electrophoresis and amplified fragment length polymorphism. J. Microbiol. Methods. 63:173-

184.

110. Torres-Cruz, J., and M. W. van der Woude. 2003. Slipped-strand mispairing can function

as a phase variation mechanism in Escherichia coli. J. Bacteriol. 185:6990-6994.

111. Tyagi, S., D. P. Bratu, and F. R. Kramer. 1998. Multicolor molecular beacons for allele

discrimination. Nat. Biotechnol. 16:49-53.

112. USDA-ERS. 2009. Foodborne illness cost calculator: Salmonella.

http://www.ers.usda.gov/Data/FoodborneIllness/salm_Intro.asp

113. USDA-FSIS. 2006. Serotypes profile of Salmonella isolates from meat and poultry products

January 1998 through December 2005.

http://www.fsis.usda.gov/Science/Serotypes_Profile_Salmonella_Isolates/index.asp

114. USDA-FSIS. 2007. Progress report on Salmonella testing of raw meat and poultry products,

1998-2006. http://www.fsis.usda.gov/science/progress_report_salmonella_testing/index.asp

115. Vale, P. F., and T. J. Little. 2010. CRISPR-mediated phage resistance and the ghost of

coevolution past. Proc. R. Soc. Lond., B, Biol. Sci. 277:2097-2103.

116. Voetsch, A., T. Van Gilder, F. Angulo, M. Farley, S. Shallow, R. Marcus, P. Cieslak, V.

Deneen, and R. Tauxe. 2004. FoodNet estimate of the burden of illness caused by

nontyphoidal Salmonella infections in the United States. Clin. Infect. Dis. 38:S127-S134.

117. Wallis, T. S., and E. E. Galyov. 2000. Molecular basis of Salmonella-induced Enteritis. Mol.

Microbiol. 36:997-1005.

118. Zhang, W., W. Qi, T. J. Albert, A. S. Motiwala, D. Alland, E. K. Hyytia-Trees, E. M.

Ribot, P. I. Fields, T. S. Whittam, and B. Swaminathan. 2006. Probing genomic diversity

and evolution of Escherichia coli O157 by single nucleotide polymorphisms. Genome Res.

16:757-767.

119. Zhang, W., B. M. Jayarao, and S. J. Knabel. 2004. Multi-virulence-locus sequence typing

of Listeria monocytogenes. Appl. Environ. Microbiol. 70:913-920.

120. Zheng, J., C. E. Keys, S. Zhao, J. Meng, and E. W. Brown. 2007. Enhanced subtyping

scheme for Salmonella Enteritidis. Emerg. Infect. Dis. 13:1932-1935.

Chapter 3 Novel virulence gene and CRISPR multilocus sequence typing

scheme for subtyping the major serovars of Salmonella enterica subspecies

enterica

Fenyun Liu¹, Rodolphe Barrangou², Peter Gerner-Smidt³, Efrain Ribot³, Stephen Knabel¹, and

Edward Dudley¹*

¹Department of Food Science, The Pennsylvania State University, University Park, Pennsylvania 16802;

²Danisco USA Incorporation, 3329 Agriculture Drive, Madison, Wisconsin 53716;

³Centers for Disease Control and Prevention, Atlanta, Georgia 30333

*Corresponding author. Mailing address: 326 Food Science Building, The Pennsylvania State

University, University Park, PA 16802, US. Phone: 814-867-0439. Email: [email protected]

3.1 Abstract

Salmonella enterica subsp. enterica is the leading cause of bacterial foodborne disease in

the United States. Molecular subtyping methods are powerful tools for tracking the farm-to-fork

spread of foodborne pathogens during outbreaks. In order to develop a novel multilocus

sequence typing (MLST) scheme for subtyping the major serovars of S. enterica subspecies

enterica, the virulence genes sseL and fimH and Clustered Regularly Interspaced Short

Palindromic Repeat (CRISPR) regions were sequenced from 171 clinical isolates from serovars

Typhimurium, Enteritidis, Newport, Heidelberg, Javiana, I 4, [5], 12: i :-, Montevideo, Muenchen

and Saintpaul. The MLST scheme using only virulence genes identified epidemic clones, but was

insufficient to separate outbreak clones. However, the addition of CRISPR sequences

dramatically improved discriminatory power of this MLST method by accurately differentiating

individual outbreak clones. Moreover, the present MLST scheme provided better discrimination

of S. Enteritidis strains than PFGE. Cluster analyses also revealed the current MLST scheme is

highly congruent with serotyping. In conclusion, the novel MLST scheme described in the

present study accurately differentiated outbreak clones of the major serovars of Salmonella, and

therefore may be an excellent method for subtyping this important foodborne pathogen during

outbreak investigations.

3.2 Introduction

Salmonella enterica subsp. enterica (Salmonella) is the leading cause of bacterial

foodborne disease in the United States, with approximately 1.4 million human cases each year

since 1996, resulting in an estimated 17,000 hospitalizations, more than 500 deaths (10, 53) and a

cost of 2.6 billion dollars (51). The nine most common serovars, S. Typhimurium, S. Enteritidis,

S. Newport, S. Heidelberg, S. Javiana, S. I 4, [5], 12: i :-, S. Montevideo, S. Muenchen and S.

Saintpaul, were responsible for more than 60% of human illnesses based on the Centers for

Disease Control and Prevention’s (CDC’s) annual summaries of 2005 and 2006 (Table 3.1) (4, 5).

Salmonella has been isolated from a broad range of foods, including raw animal foods (poultry,

eggs, pork, beef, mutton and seafood), produce (sprouts, lettuce, spinach, tomatoes and peppers)

and various processed foods including Italian-style salami, peanut butter, veggie booty and dry

cereal (7). Widespread distribution of these foods makes tracking the transmission of Salmonella

difficult during outbreak investigations. Outbreak investigations can also be hindered if the food

source is not listed on the investigation questionnaire (27). For example, for the 2008 Salmonella

outbreak due to consumption of Jalapeño and Serrano peppers, initial questionnaires did not ask if

peppers were recently eaten (6). In order to define the routes of transmission of Salmonella

within the food system, molecular subtyping methods have been employed to distinguish

outbreak clones from non-related clones (18).

Serotyping is one of the most common subtyping methods for Salmonella.

Serotyping distinguishes Salmonella based on immunological classification of the H and O

antigens (21) and is typically the first subtyping method utilized during an outbreak. However,

serotyping alone cannot distinguish outbreak clones of Salmonella.

Several nucleic acid-based molecular subtyping methods have been used to subtype

Salmonella, including amplified fragment length polymorphism (AFLP) (20, 36, 39, 45, 49),

multiple-locus variable-number tandem repeat analysis (MLVA) (2, 34, 35, 40), and pulsed-field

gel electrophoresis (PFGE) (13). PFGE is currently considered the "gold standard" method for

subtyping foodborne pathogens and is the subtyping method used by PulseNet, the molecular

surveillance network in the U.S. and throughout the world to investigate foodborne illnesses and

outbreaks (19). The main advantage of PFGE is its high discriminatory power (i.e. ability to

separate unrelated strains) for subtyping foodborne pathogens, including many of the major

serovars of Salmonella (31). However, PFGE lacks discriminatory power for highly clonal

serovars of Salmonella, such as S. Enteritidis (19, 54) and S. Montevideo (9), or highly clonal

phage types like S. Typhimurium DT104 (19). The multistate S. Enteritidis outbreak associated

with shell eggs in 2010 was caused by the most common PFGE-XbaI pattern (JEGX01.0004) for

S. Enteritidis in the PulseNet database (8). A similar scenario was also observed recently during

the 2010 Italian-style salami outbreak, when the outbreak clone of S. Montevideo had the most

common PFGE pattern in the PulseNet database (9). Besides inadequate discriminatory power,

PFGE sometimes produces ambiguous data that are hard to compare and interpret between

different laboratories. To enhance comparability and interpretation, a standardized PFGE

protocol and an extensive quality assurance system were established by CDC (13, 19).

Compared to PFGE, multilocus sequence typing (MLST), which targets nucleotide

sequence differences of several DNA loci, has the potential to be a less labor-intensive method.

Moreover, DNA sequence data are discrete, unambiguous, highly informative, portable and

reproducible. Although MLST is an attractive subtyping approach, no satisfactory MLST scheme

has yet been developed for subtyping Salmonella during outbreak investigations. MLST schemes

targeting housekeeping genes have been developed; however, these schemes usually have much

lower discriminatory power than that of PFGE (16, 28, 33, 49). This suggests that housekeeping

genes do not provide sufficient resolution for investigating Salmonella outbreaks.

In order to increase discriminatory power, virulence genes have been included in MLST

schemes for subtyping Salmonella (17). Unlike housekeeping genes, virulence genes are

commonly under positive, diversifying selection (15). As a result, DNA sequences of virulence

genes tend to be more variable than housekeeping genes, and thus able to provide increased

discrimination (11, 17). Virulence genes have also been shown to provide high epidemiologic

concordance (i.e. able to group related strains together). For example, six virulence genes were

targeted in an MVLST (multi-virulence-locus sequence typing) scheme for subtyping Listeria

monocytogenes, which showed very high discriminatory power (0.99) and perfect epidemiologic

concordance (1.0) (11). Tankouo-Sandjong et al. (47) developed an MLST scheme based on both

virulence genes and housekeeping genes to identify serovars of Salmonella. This scheme targeted

the virulence genes fliC and fljB along with two housekeeping genes, gyrB and atpD (47).

However, the use of this MLST scheme to further characterize strains below serovar level was not

tested. In another MLST study, virulence genes and housekeeping genes showed high

discriminatory power (0.98) for subtyping S. Typhimurium, which was slightly higher than that of

PFGE (0.96) (17). In that MLST scheme three virulence genes, hilA, pefB and fimH, were

included together with the 16S rRNA gene and three housekeeping genes. Although this MLST

scheme appeared to have adequate discriminatory power for subtyping S. Typhimurium, its

capacity to discriminate strains from more clonal serovars such as S. Enteritidis was not tested.

Comparative genomic analysis (25) suggested that virulence genes alone are not discriminatory

enough for differentiating outbreak clones of S. Enteritidis. Therefore, additional genome targets

with greater sequence diversity than virulence genes are needed in order to create an effective

MLST scheme for Salmonella.
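
Discriminatory power values such as those quoted above (for example, 0.98 for MLST versus 0.96 for PFGE) are commonly reported as Simpson's index of diversity in the form given by Hunter and Gaston, D = 1 - [1/(N(N-1))] Σ n_j(n_j-1), where N is the number of unrelated isolates and n_j is the number of isolates assigned to type j. The cited studies are assumed, not confirmed, to have used this index. A minimal sketch follows; the function name and the toy labels are illustrative only and are not data from any study.

```python
from collections import Counter


def simpson_diversity(type_labels):
    """Hunter-Gaston form of Simpson's index of diversity.

    type_labels: one subtype label per unrelated isolate.
    Returns the probability that two isolates drawn at random
    (without replacement) are assigned to different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels).values()
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical labels for six unrelated isolates (assumption, not study data).
toy_labels = ["A", "A", "B", "C", "D", "E"]
print(round(simpson_diversity(toy_labels), 3))
```

A method that assigns every isolate to its own type gives D = 1, and a method that cannot separate any isolates gives D = 0.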

One of the fastest evolving genetic elements in bacterial genomes is CRISPRs (Clustered

Regularly Interspaced Short Palindromic Repeats) (43). CRISPRs have been identified within the

genomes of many archaeal and bacterial species, including Salmonella (30, 43, 50). CRISPRs

encode tandem sequences containing 21-47 bp direct repeats (DRs) separated by spacers of

similar size (Fig. 3.1). Spacers are derived from foreign nucleic acids such as phage or plasmids

and can protect bacteria from subsequent infection by homologous phage and plasmids (1).

Many CRISPR loci are flanked at the 3’ end by an AT-rich leader sequence and CRISPR-

associated (Cas) genes (Fig. 3.1) (1, 3, 26). As a bacterial immune system against foreign DNA,

CRISPRs must evolve rapidly to adapt to different phage pools (52). Besides addition of new

spacers, deletion of spacers is also frequently observed (12, 38). Because of the high

polymorphism of CRISPRs, they have been successfully used to subtype Mycobacterium tuberculosis during

outbreak investigations (24). CRISPR sequence analysis has also been used to characterize a number of other

bacteria, including Yersinia pestis (38), serotype M1 group A Streptococcus strains (29), and

Campylobacter jejuni (42).
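
The repeat-spacer layout described above can be illustrated by splitting an array on its direct repeat. This is a minimal sketch that assumes perfectly conserved repeats; the repeat and spacer sequences are made up and shortened for readability (real direct repeats are 21-47 bp), and the function name is hypothetical.

```python
def extract_spacers(array_seq, repeat):
    """Return the spacers lying between consecutive copies of `repeat`.

    Sequence before the first repeat and after the last repeat is
    treated as flanking DNA and discarded.
    """
    parts = array_seq.split(repeat)
    return parts[1:-1]


# Toy example (assumption): a 5-nt "repeat" and two 6-nt "spacers".
REPEAT = "GTTCC"
array_a = "AAA" + REPEAT + "ACGTAC" + REPEAT + "TTGACA" + REPEAT + "CC"
# The same array after loss of one spacer and one repeat.
array_b = "AAA" + REPEAT + "TTGACA" + REPEAT + "CC"

print(extract_spacers(array_a, REPEAT))
print(extract_spacers(array_b, REPEAT))
```

Comparing the two spacer lists shows how the gain or loss of a single spacer produces a different CRISPR allele even when the remaining spacers are identical.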

Two CRISPR loci are found in all Salmonella serovars in the CRISPR database

(http://crispr.u-psud.fr/crispr/) (22, 50). Generally, the two CRISPR loci have different numbers

of repeats/spacers and different sets of spacers. There have been no reports of CRISPRs being

used as markers in an MLST scheme for subtyping Salmonella. Therefore, the purpose of the

present study was to investigate whether MLST based on both virulence genes and CRISPRs can

accurately differentiate outbreak clones of the major serovars of Salmonella.

3.3 Materials and methods

Bacterial isolates and DNA extraction. All 171 Salmonella isolates used in this study

(Table 3.2) were from culture collections maintained by the Centers for Disease Control and

Prevention (CDC) in Atlanta, GA, USA. This set of isolates represents the 9 serovars most

commonly associated with human disease and includes isolates involved in multiple outbreaks,

with 2 to 3 isolates per outbreak. In some cases, isolates obtained from the same outbreak which

had different PFGE patterns (i.e., poor epidemiologic concordance by PFGE) were deliberately

included. All isolates were previously analyzed by serotyping and most isolates were analyzed

by PFGE at CDC. Bacterial isolates were stored at -80°C in 20% glycerol. When needed,

isolates were grown overnight in Tryptic Soy Broth (TSB) (Difco Laboratories, Becton Dickinson,

Sparks, MD) at 37°C. For all isolates, DNA was extracted using the UltraClean Microbial DNA

extraction kit (Mo Bio Laboratories, Solana Beach, CA) and stored at -20°C before use.

Selection of virulence genes. Two virulence genes (fimH and sseL) and two Clustered

Regularly Interspaced Short Palindromic Repeats regions (CRISPR1 and CRISPR2) were

selected as markers for MLST. The lengths and functions of these MLST markers are listed in

Table 3.3. Additionally, 12 other virulence genes (hilA, fimH2, pipB, sopE, sseF, sseJ, siiA, sifB,

stdA, fimA, bcfC and phoQ) (Table S1) were initially investigated, but were excluded from the

MLST scheme due to inadequate sequence variation.

PCR amplification. Primers were designed using Primer 3.0

(http://frodo.wi.mit.edu/primer3/) and are listed in Tables 3.4 and S1. Primers for CRISPR1 were

designed based upon consensus alignments of the published S. Typhimurium LT2 (accession

number AE006468) and S. Newport str. SL254 genomes (accession number CP001113), and the

S. Javiana str. GA_MM04042433 (accession number ABEH00000000) whole genome shotgun

sequence (Table 3.4). Primers for the other three markers were designed based on the published S.

Typhimurium LT2 genome. PCR amplifications were performed using a Taq PCR Master Mix

Kit (Qiagen Inc., Valencia, CA) and a Mastercycler PCR thermocycler (Eppendorf Scientific,

Hamburg, Germany). Each 25 µl PCR reaction contained 12.5 µl Taq PCR 2× master mix,

9.5 µl PCR-grade water, 1.0 µl DNA template, 1.0 µl forward primer (final concentration, 0.4 µM)

and 1.0 µl reverse primer (final concentration, 0.4 µM). A single set of PCR cycling conditions was

used to amplify each of the four markers separately (initial denaturation at 94°C for 10 min; 28

cycles of 94°C for 1 min, 55°C for 1 min and 72°C for 1 min; final extension at 72°C for 10 min).
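
The reaction composition and cycling program above can be checked with a few lines of arithmetic. The sketch below confirms that the listed volumes sum to 25 µl, back-calculates the primer stock concentration implied by a 0.4 µM final concentration, and totals the programmed hold times. The stock concentration is an inference from the stated numbers rather than a value given in the text, and the names used are illustrative.

```python
# Volumes in microlitres, as listed in the text for one reaction.
REACTION_UL = {
    "2x Taq PCR master mix": 12.5,
    "PCR-grade water": 9.5,
    "DNA template": 1.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
}


def implied_stock_um(final_um, primer_ul, total_ul):
    """Stock concentration from C_stock * V_primer = C_final * V_total."""
    return final_um * total_ul / primer_ul


def programmed_minutes(initial=10, cycles=28, steps_per_cycle=(1, 1, 1), final=10):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return initial + cycles * sum(steps_per_cycle) + final


total_ul = sum(REACTION_UL.values())
print(total_ul)
print(implied_stock_um(0.4, REACTION_UL["forward primer"], total_ul))
print(programmed_minutes())
```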

DNA sequencing. After PCR, products for sequencing were treated with 1/20 volume of

shrimp alkaline phosphatase (1 U/µl, USB Corp. Cleveland, OH) and 1/20 volume of exonuclease

I (10 U/µl, USB Corp). The mixture was then incubated at 37°C for 45 min to degrade remaining

primers and unincorporated dNTPs. After that, the mixture was incubated at 80°C for 15 min to

inactivate the added enzymes. PCR products were sent to the Genomics Core Facility at the

Pennsylvania State University for sequencing using the ABI Data 3730XL DNA Analyzer. In

order to obtain complete DNA sequences of fimH and sseL, two more primers targeting the

internal regions of these two genes were used together with the forward and reverse primers

(Table 3.4). Both DNA strands of the amplicons were sequenced.

Sequence analysis and sequence type assignment. For fimH and sseL, sequences were

aligned and single nucleotide polymorphisms (SNPs) were identified using MEGA 4.0 (46). For

CRISPR1 and CRISPR2, analyses of the spacer arrangements were performed using

CRISPRcompar (23) and spacers were visualized as described by Deveau et al. (12). Different

allelic types (ATs) (sequences with at least one-nucleotide difference or one-spacer difference in

the case of CRISPRs) were assigned arbitrary numbers. For each isolate, the combination of the four alleles (fimH, sseL,

CRISPR1 and CRISPR2) determined its allelic profile, and each unique allelic profile was

designated as a unique sequence type (ST).
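
The assignment rule above can be sketched as follows. This is an illustration under stated assumptions, not the workflow used in the study: virulence gene alleles are represented as sequence strings, CRISPR alleles as tuples of spacer identifiers, and all isolate names, sequences and spacer labels are made up.

```python
MARKERS = ("fimH", "sseL", "CRISPR1", "CRISPR2")


def assign_types(items):
    """Number distinct items 1, 2, 3, ... in order of first appearance."""
    numbering = {}
    return [numbering.setdefault(item, len(numbering) + 1) for item in items]


def sequence_types(isolates):
    """Map isolate name -> (allelic profile, sequence type number).

    isolates: dict of name -> dict of marker -> hashable allele
    (a sequence string, or a tuple of spacers for a CRISPR locus).
    """
    names = list(isolates)
    # One column of allelic type numbers per marker.
    at_columns = [assign_types([isolates[n][m] for n in names]) for m in MARKERS]
    profiles = list(zip(*at_columns))
    sts = assign_types(profiles)
    return {n: (p, st) for n, p, st in zip(names, profiles, sts)}


# Hypothetical isolates (assumption): iso2 differs by one SNP in fimH,
# iso3 lacks one CRISPR1 spacer, iso4 is identical to iso1.
toy = {
    "iso1": {"fimH": "ACGT", "sseL": "GGCA", "CRISPR1": ("s1", "s2"), "CRISPR2": ("t1",)},
    "iso2": {"fimH": "ACGA", "sseL": "GGCA", "CRISPR1": ("s1", "s2"), "CRISPR2": ("t1",)},
    "iso3": {"fimH": "ACGT", "sseL": "GGCA", "CRISPR1": ("s1",), "CRISPR2": ("t1",)},
    "iso4": {"fimH": "ACGT", "sseL": "GGCA", "CRISPR1": ("s1", "s2"), "CRISPR2": ("t1",)},
}
result = sequence_types(toy)
for name, (profile, st) in result.items():
    print(name, profile, "ST%d" % st)
```

Because numbering follows order of first appearance, the numbers themselves are arbitrary, as in the text; only equality of profiles matters.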

Calculation of epidemiologic concordance (E). Epidemiologic concordance (E) was

calculated using the equation developed by the European Study Group on Epidemiologic Markers

(44).

Cluster analysis. Cluster analyses were performed based on allelic profile data and

results were visualized using the tree drawing tool on PubMLST (www.pubmlst.org). CRISPR1

and CRISPR2 were combined into one allele for a more accurate cluster analysis, because

CRISPR1 and CRISPR2 might be spatially linked (50).
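
The effect of treating the two CRISPR loci as one allele can be illustrated with a simple allelic mismatch count. The internals of the PubMLST tree drawing tool are not described here, so the distance below is an assumption chosen for illustration; the function names and profiles are hypothetical.

```python
def combine_crispr(profile):
    """(fimH, sseL, CRISPR1, CRISPR2) -> (fimH, sseL, (CRISPR1, CRISPR2))."""
    fimh, ssel, crispr1, crispr2 = profile
    return (fimh, ssel, (crispr1, crispr2))


def allelic_mismatches(profile_a, profile_b):
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Two hypothetical profiles that differ at both CRISPR loci only.
p = (1, 1, 1, 1)
q = (1, 1, 2, 3)

print(allelic_mismatches(p, q))
print(allelic_mismatches(combine_crispr(p), combine_crispr(q)))
```

With four separate loci the two profiles differ at two loci; with the combined CRISPR allele they differ at one, so changes in two possibly linked loci are not counted twice.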

Nucleotide sequence accession number. DNA sequences of the four genetic MLST

markers were deposited in GenBank under accession numbers HQ329797 to HQ329931.

3.4 Results

Results of MVLST. We began this study by sequencing 14 virulence genes (fimH, sseL,

hilA, fimH2, pipB, sopE, sseF, sseJ, siiA, sifB, stdA, fimA, bcfC and phoQ) from 20 S.

Typhimurium, 15 S. Newport, and 15 S. Enteritidis isolates. Two virulence genes, fimH and sseL,

were found to provide discrimination equal to the combined discrimination of all 14 virulence

genes (data not shown); therefore, the other 12 virulence genes were excluded from the rest of the

study. fimH and sseL were sequenced from the remaining isolates, and the total number of allelic

types was about the same for fimH (17 allelic types) and sseL (16 allelic types) (Table 3.5).

Across all isolates, fimH had 48 polymorphic sites (4.76% of its sites) and sseL had 69 (7.23%)

(Table 3.6). Within individual serovars, the percentage of polymorphic sites in fimH ranged from 0% to 1.79%. For

sseL, the percentage of polymorphic sites ranged from 0% to 3.88%. For both fimH and sseL,

less polymorphism was observed for serovars Typhimurium, Enteritidis, Heidelberg, Javiana and

I 4, [5], 12: i :-, compared to serovars Newport, Montevideo, Muenchen and Saintpaul (Table 3.6).

Sequences of sseL were especially conserved in serovars Typhimurium, Heidelberg, Javiana and I

4, [5], 12: i :-, with no SNPs observed within each serovar. For all serovars, a total of 39

polymorphic sites in sseL were nonsynonymous, and 13 polymorphic sites in fimH were

nonsynonymous (Table 3.6).
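
The polymorphic-site counts and percentages reported above can be reproduced in principle from an alignment by counting columns that contain more than one nucleotide. The sketch below uses a made-up three-sequence alignment, not study data, and the function name is illustrative. The last line back-calculates the aligned lengths implied by the reported counts and percentages, which is an inference rather than a value stated in the text.

```python
def polymorphic_sites(aligned):
    """Return 0-based column indices holding more than one nucleotide.

    aligned: list of equal-length, already aligned sequences.
    """
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    return [i for i, column in enumerate(zip(*aligned)) if len(set(column)) > 1]


# Toy alignment (assumption): 8 columns, 2 of them variable.
toy_alignment = ["ATGCATGC", "ATGCATGA", "ATGTATGC"]
sites = polymorphic_sites(toy_alignment)
percent = 100 * len(sites) / len(toy_alignment[0])
print(sites, percent)

# Aligned lengths implied by 48 sites at 4.76% and 69 sites at 7.23%.
print(round(48 / 0.0476), round(69 / 0.0723))
```

The reported figures imply aligned lengths of roughly 1,008 bp for fimH and 954 bp for sseL.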

Addition of CRISPR1 and CRISPR2 in the MLST scheme. Since the discrimination

provided by virulence genes was limited (separation to outbreak level was not achieved), addition

of CRISPR1 and CRISPR2 into the MLST scheme was investigated. The numbers of allelic types

for CRISPR1 (49 allelic types) and CRISPR2 (53 allelic types) were significantly greater than

those for virulence genes (Table 3.5). In total, there were 69 sequence types based on both

virulence genes and CRISPRs for all 171 isolates (Table 3.5). An equal number of allelic types

was observed in both CRISPR1 and CRISPR2 for serovars Javiana and Montevideo (Table 3.5).

However, for serovars Typhimurium, Enteritidis, Newport, Heidelberg and Saintpaul, CRISPR2

contained more allelic types than CRISPR1. In contrast, for serovar Muenchen, CRISPR1

contained more allelic types than CRISPR2 (Table 3.5).

Repeat sequences of the two CRISPRs were generally conserved as shown by the typical

repeat in Table 3.7. However, SNPs were sometimes observed in the repeat sequences in both

CRISPRs and we define these as "repeat variants" (Table 3.7). The repeat variant of CRISPR1

had one SNP at the first nucleotide, which is A instead of C (Table 3.7). Terminal repeat

sequences, which are located furthest from the leader sequence (Fig. 3.1), had more SNPs than the

repeat variants when compared to the typical repeat sequence (Table 3.7).

The total numbers of unique spacers in CRISPR1 and CRISPR2 for all 171 isolates

analyzed were 166 and 182, respectively (Table 3.8). The number of spacers in CRISPR1 ranged

from 3 to 24, while the number of spacers in CRISPR2 ranged from 2 to 25 (Table 3.8

and Fig. S1). CRISPR2 had more spacers than CRISPR1 for all serovars except S. Muenchen, in

which CRISPR1 had more spacers than CRISPR2 (Table 3.8 and Fig. S1). The number of

spacers also varied between different serovars. For example, the average number of spacers in

CRISPR2 of S. Muenchen was 2.5, while the average number of spacers in CRISPR2 of S.

Typhimurium was 19.6 (Table 3.8).

Cluster analyses. Cluster diagrams based on allelic profiles were constructed using only

the two virulence genes (Fig. 3.2a) and using the virulence genes combined with CRISPRs (Fig.

3.2b). Again, significantly greater separation was provided by the addition of

CRISPR1 and CRISPR2 for all serovars, compared to the separation provided by virulence

genes alone. MLST results showed high congruence with serotypes of Salmonella. On both

cluster trees, different serovars occupied distinct branches, except serovars Typhimurium and I 4,

[5], 12: i :- , which were clustered together. Also, three singletons Mvo ST3, Mcn ST12 and S

ST4 were observed on both cluster trees (Fig. 3.2a and b).

Comparison of MLST with PFGE. Compared to PFGE, the addition of CRISPRs into

the present MLST scheme provided greater discrimination of outbreak clones of S. Enteritidis

(Tables 3.2 and 3.5). Most isolates of S. Enteritidis (25 out of 34) had one of two XbaI/BlnI PFGE

profiles, (JEGX01.0005, JEGA26.0004) or (JEGX01.0004, JEGA26.0002) (Table 3.2). Isolates

SE1, SE2, SE23, SE18, SE17, SE20, SE32 and SE33 had the same PFGE profile (JEGX01.0005,

JEGA26.0004), but had two MLST sequence types (E ST1 and E ST9) (Table 3.2). Also, the

isolates with PFGE profile (JEGX01.0004, JEGA26.0002), namely SE6, SE7, SE8, SE9, SE15,

SE16, SE19, SE30, SE12, SE13, SE14, SE26, SE31, SE28, SE29, SE24 and SE34, were further

separated into five sequence types (E ST3, E ST4, E ST6, E ST7, E ST8) by MLST (Table 3.2).

However, in the case of some serovars (S. Newport, S. Typhimurium, S. I 4, [5], 12: i :-, S.

Montevideo) PFGE provided greater separation than MLST for strains associated with different

outbreaks. For example, PFGE separated S. I 4, [5], 12: i :- isolates (ST1, ST2 and ST3) of the

turkey potpie outbreak (cluster 0706PAJPX-1c) from isolates (ST14 and ST15) of cluster

0607INjpx-1c, while these isolates could not be distinguished by MLST (Table 3.2). This was

also the case for S. Typhimurium isolates from the Noble Farm raw milk outbreak and outbreak

cluster 0309ORJPX-1c (Table 3.2). Another example when PFGE was more discriminatory than

MLST was the raw chicken outbreak (cluster 0807AZJIX-1c) and the salami/pepper outbreak

(cluster 0908ORJIX-1) of S. Montevideo (Table 3.2).
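
The two-way comparison described above amounts to grouping isolates by one typing result and listing the distinct results of the other method within each group. The sketch below does this for a handful of rows copied from Table 3.2; the helper name and the tuple layout are illustrative choices, and only a subset of the table is included.

```python
from collections import defaultdict

# (isolate, PFGE XbaI, PFGE BlnI, MLST ST), copied from Table 3.2.
ROWS = [
    ("SE1", "JEGX01.0005", "JEGA26.0004", "E ST1"),
    ("SE2", "JEGX01.0005", "JEGA26.0004", "E ST1"),
    ("SE23", "JEGX01.0005", "JEGA26.0004", "E ST1"),
    ("SE18", "JEGX01.0005", "JEGA26.0004", "E ST1"),
    ("SE17", "JEGX01.0005", "JEGA26.0004", "E ST9"),
    ("SE20", "JEGX01.0005", "JEGA26.0004", "E ST9"),
    ("SE32", "JEGX01.0005", "JEGA26.0004", "E ST9"),
    ("SE33", "JEGX01.0005", "JEGA26.0004", "E ST9"),
    ("SMvo8", "JIXX01.0126", "JIXA26.0012", "Mvo ST3"),
    ("SMvo11", "JIXX01.0011", "JIXA26.0012", "Mvo ST3"),
]


def group_values(rows, key_of, value_of):
    """Collect the distinct values seen for each key, sorted for stable output."""
    groups = defaultdict(set)
    for row in rows:
        groups[key_of(row)].add(value_of(row))
    return {key: sorted(values) for key, values in groups.items()}


# Sequence types found within each PFGE profile, and the reverse.
sts_by_pfge = group_values(ROWS, lambda r: (r[1], r[2]), lambda r: r[3])
pfge_by_st = group_values(ROWS, lambda r: r[3], lambda r: (r[1], r[2]))

print(sts_by_pfge[("JEGX01.0005", "JEGA26.0004")])
print(pfge_by_st["Mvo ST3"])
```

A PFGE profile that maps to more than one ST is a case where MLST is more discriminatory, and an ST that maps to more than one PFGE profile is a case where PFGE is more discriminatory.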

For S. Heidelberg, the most accurate outbreak identification was achieved by combining

MLST and PFGE. MLST provided separation for the cruise ship outbreak (cluster 0607NYJF6-

1c) and religious camp outbreak (cluster 0607PAJF6-1c), and for the hummus outbreak (cluster

JF6X01.0032) and outbreak cluster 0702TNJF6-1c, which could not be distinguished by PFGE

(Table 3.2). However, PFGE separated the cruise ship outbreak from the hummus outbreak,

which were indistinguishable by MLST (Table 3.2).

For S. Saintpaul, both methods allowed accurate separation and identification of all

outbreaks due to this serovar (Table 3.2).

Comparison of MLST results with epidemiologic data. Isolates with the same cluster

code had identical MLST sequence types for serovars Typhimurium, Newport, I 4, [5], 12:

i :-, Saintpaul and Montevideo (Table 3.2). MLST sequence types were also the same among

isolates with the same cluster code for S. Enteritidis, except for clusters 0505GAJEG-1c and

0612MEJEG-1c, and for S. Heidelberg, except for cluster 0704AZJPX-1c. Isolates SE10 and SE11, which

have the same cluster code (0505GAJEG-1c), had different sequence types and also different

PFGE patterns (Table 3.2). Three isolates of S. Enteritidis (SE12, SE13 and SE14) and three

isolates of S. Heidelberg (SH18, SH19 and SH20), which have the cluster codes

0612MEJEG-1c and 0704AZJPX-1c, respectively, also had different sequence types (Table 3.2).

For S. Muenchen, almost all isolates within each of the six cluster codes had different sequence

types (Table 3.2). We could not perform a similar analysis with S. Javiana, because it did not

contain any isolates with a cluster code identified by PulseNet.

Epidemiologic concordance of MLST. Values of epidemiologic concordance of MLST

and PFGE for each serovar are listed in Table 3.9. No value could be calculated for serovar

Javiana, because it did not contain any isolates with a cluster code identified by PulseNet.

Epidemiologic concordance values were

calculated based on isolates from well-defined outbreaks (isolates with cluster codes), so sporadic

isolates and isolates without cluster codes were excluded from epidemiologic concordance

calculations. Values of epidemiologic concordance were biased against PFGE, because isolates

from outbreaks with variations in PFGE patterns were deliberately included in this study, which

reduced the epidemiologic concordance of PFGE. For instance, isolates ST6, ST7 and ST8 were

all from the 2008 peanut butter outbreak, but each of them had a distinct PFGE pattern (Table

3.2). Generally speaking, MLST showed high epidemiologic concordance for subtyping all

serovars included in this study, except for S. Muenchen (epidemiologic concordance = 0.39)

(Table 3.9). MLST showed higher epidemiologic concordance than PFGE for serovars

Enteritidis, Typhimurium, Newport and Montevideo, equal epidemiologic concordance for

serovar Saintpaul, but lower epidemiologic concordance for serovars Heidelberg and Muenchen

(Table 3.9).

Table 3.1. Top nine most frequently reported serovars from human sources in 2005 which were

analyzed in the present study

Rank Serovar No. of laboratory-confirmed cases¹ % of total cases



3 Newport 3295 9.1


5 Javiana 1324 3.7

6 I 4, [5], 12: i :- 822 2.3


8 Muenchen 733 2

9 Saintpaul 683 1.9

total 64.4

¹Laboratory-confirmed cases include both outbreak cases and sporadic cases.

Data reproduced from CDC’s Salmonella annual review

(http://www.cdc.gov/ncidod/dbmd/phlisdata/salmtab/2006/SalmonellaAnnualSummary2006.pdf).

Table 3.2. Outbreak information, PFGE profile and MLST results for the 171 isolates analyzed in

the present study

CDC Code¹ Source State Food vehicle Cluster PFGE XbaI PFGE BlnI MLST ST²

ST29 Water filter UT Frog 0909MAJPX-1 JPXX01.0177 JPXA26.0459 T ST1

ST30 Human /Stool MD Frog 0909MAJPX-1 JPXX01.0177 JPXA26.0459 T ST1

ST31 Human /Stool OH Frog 0909MAJPX-1 JPXX01.0177 JPXA26.0459 T ST1

ST4 Human /Stool CO Water 0803COJPX-1c JPXX01.0002 JPXA26.0002 T ST2

ST5 Water CO Water 0803COJPX-1c JPXX01.0002 JPXA26.0002 T ST2

ST6 Human /Stool OH peanut butter 0811MLJPX-1c JPXX01.0459 JPXA26.0462 T ST3

ST7 Human /Stool OH peanut butter 0811MLJPX-1c JPXX01.1825 JPXA26.0462 T ST3

ST8 Food/peanut butter MN peanut butter 0811SDCJPX-1c JPXX01.1818 JPXA26.0462 T ST3

ST9 Stool MA Raw milk Noble Farm outbreak JPXX01.0083 JPXA26.0019 T ST4

ST10 Raw milk MA Raw milk Noble Farm outbreak JPXX01.0083 JPXA26.0019 T ST4

ST17 NA OR NA 0309ORJPX-1c JPXX01.0981 JPXA26.0174 T ST4

ST18 NA OR NA 0309ORJPX-1c JPXX01.0981 JPXA26.0174 T ST4

ST11 Stool NM NA Santa Fe JPXX01.0003 JPXA26.0007 T ST5



ST26 Human /Stool OR Snake / mouse 0908ORJPX-1 JPXX01.0003 JPXA26.0003 T ST5

ST27 Human /Stool OR Snake / mouse 0908ORJPX-1 JPXX01.0003 JPXA26.0003 T ST5

ST28 Animal OR Snake / mouse 0908ORJPX-1 JPXX01.0003 JPXA26.0003 T ST5

ST39 Human /Stool VA Sporadic Sporadic JPXX01.0003 JPXA26.0042 T ST5

ST16 Stool MA Veggie booty 0704WIWWS-c JPXX01.1037 JPXA26.0333 T ST6

ST19 Stool VT Veggie booty 0704WIWWS-1c JPXX01.1037 JPXA26.0333 T ST6

ST20 Stool VT Veggie booty 0704WIWWS-1c JPXX01.1037 JPXA26.0333 T ST6

ST32 Human /Stool AR Day care 0602ARJPX-2c JPXX01.0010 JPXA26.0233 T ST7



ST40 Human /Stool NY Sporadic Sporadic JPXX01.0003 JPXA26.0042 T ST8

SE1 Human/stool MN Stuffed chicken 0603MNJEG-1c JEGX01.0005 JEGA26.0004 E ST1

SE2 Human/stool MN Stuffed chicken 0603MNJEG-1c JEGX01.0005 JEGA26.0004 E ST1

SE23 Human/stool MN NA 0603MNJEG-1c JEGX01.0005 JEGA26.0004 E ST1

SE18 Human/Stool MN NA 0803MNJEG-1 JEGX01.0005 JEGA26.0004 E ST1

SE3 Environment CA Almonds Almonds 2001 JEGX01.0012 NA E ST2

SE4 Food/raw almonds CA Almonds Almonds 2001 JEGX01.0012 NA E ST2

SE5 Environment CA Almonds Almonds 2001 JEGX01.0012 NA E ST2

SE21 Environment NA NA Almonds 2001 JEGX01.0013 NA E ST2

SE25 Environment NA Prison Almonds 2001 JEGX01.0013 NA E ST2

SE6 Human/stool ME NA 0612MEJEG-1c JEGX01.0004 JEGA26.0002 E ST3


SE26 Human/stool CO NA NA JEGX01.0004 JEGA26.0002 E ST3

SE31 Human/stool CO NA NA JEGX01.0004 JEGA26.0002 E ST3

SE24 Human/Stool WV NA NA JEGX01.0004 JEGA26.0002 E ST3

SE8 Human/Stool PA Egg 0801PAJEG-1 JEGX01.0004 JEGA26.0002 E ST4

SE9 Human/Stool PA Egg 0801PAJEG-1 JEGX01.0004 JEGA26.0002 E ST4

SE15 Human/Stool PA NA 0801PAJEG-1 JEGX01.0004 JEGA26.0002 E ST4

SE34 Human/Stool CT NA NA JEGX01.0004 JEGA26.0002 E ST4

SE11 Human/stool GA Hospital eggs 0505GAJEG-1c JEGX01.0018 JEGA26.0005 E ST4

SE10 Human/stool GA Hospital eggs 0505GAJEG-1c JEGX01.0034 JEGA26.0005 E ST5

SE12 NA ME NA 0612MEJEG-1c JEGX01.0004 JEGA26.0002 E ST6



SE16 Human/stool GA NA 0506GAJEG-1c JEGX01.0004 JEGA26.0002 E ST8

SE19 Human/stool GA NA 0506GAJEG-1c JEGX01.0004 JEGA26.0002 E ST8

SE30 Human/stool GA Prison 0506GAJEG-1c JEGX01.0004 JEGA26.0002 E ST8

SE22 Human/stool OR NA 0509ORJEG-1c JEGX01.0004 JEGA26.0025 E ST8

SE27 Human/stool OR NA 0509ORJEG-1c JEGX01.0004 JEGA26.0025 E ST8

SE28 Human SC NA 0504SCJEG-1c JEGX01.0004 JEGA26.0002 E ST8

SE29 Human/stool ID NA 0504CAOCJEG-1c JEGX01.0004 JEGA26.0002 E ST8

SE17 NA OH Frozen chicken Outbreak 2005-28-076 JEGX01.0005 JEGA26.0004 E ST9

SE20 NA OH NA Outbreak 2005-28-076 JEGX01.0005 JEGA26.0004 E ST9

SE32 Human/Stool MI NA 0708MIJEG-1c JEGX01.0005 JEGA26.0004 E ST9

SE33 Human/Stool MI NA 0708MIJEG-1c JEGX01.0005 JEGA26.0004 E ST9

SN1 NA IL NA NA JJPX01.0014 NA N ST1

SN2 NA IL NA NA JJPX01.0014 NA N ST1

SN3 NA NA NA 0509NHJJP-1c. JJPX01.0061 JJPA26.0021 N ST2

SN4 NA NA NA 0509NHJJP-1c. JJPX01.0061 JJPA26.0021 N ST2

SN5 NA NA NA NA JJPX01.0001 NA N ST3

SN6 NA NA NA NA JJPX01.0001 NA N ST3

SN7 Human/Stool CA NA 0710CAJJP-1c JJPX01.0422 JJPA26.0196 N ST4

SN8 Human/Stool CA NA 0710CAJJP-1c JJPX01.0422 JJPA26.0196 N ST4

SN11 Human/Stool SD NA 0712SDJJP-1c JJPX01.0654 JJPA26.0208 N ST4

SN12 Human/Stool SD NA 0712SDJJP-1c JJPX01.0654 JJPA26.0208 N ST4

SN9 Human/Stool AZ NA 0802AZJJP-1c JJPX01.0696 JJPA26.0212 N ST5

SN10 Human/Stool AZ NA 0802AZJJP-1c JJPX01.0438 JJPA26.0212 N ST5

SN13 Human/Stool GA NA 0711GAJJP-1c JJPX01.1319 JJPA26.0542 N ST6



SH1 Human DE cruise ship 0607NYJF6-1c JF6X01.0022 NA H ST1

SH2 Human NY cruise ship 0607NYJF6-1c JF6X01.0022 NA H ST1

SH3 Human NY cruise ship 0607NYJF6-1c JF6X01.0022 NA H ST1

SH8 Human IL hummus 0707ILJF6-1c JF6X01.0032 JF6A26.0076 H ST1




SH16 NA NA Sporadic Sporadic JF6X01.0122 NA H ST1


SH18 Human NA NA 0704AZJPX-1c JF6X01.0022 NA H ST1

SH4 Human PA a religious camp 0607PAJF6-1c JF6X01.0022 NA H ST2





SH12 Human TN NA 0702TNJF6-1c JF6X01.0032 JF6A26.0076 H ST3

SH13 Human TN NA 0702TNJF6-1c JF6X01.0032 JF6A26.0076 H ST3




SJ1 NA AL NA NA JGGX01.0012 NA J ST1

SJ5 NA AR NA NA JGGX01.0012 NA J ST1

SJ13 NA LA NA NA NA NA J ST1

SJ15 NA outbreak NA JGGX01.0036 JGGA26.0017 J ST1

SJ2 NA TX NA NA JGGX01.0213 NA J ST2



SJ4 NA TX NA NA NA NA J ST4



SJ7 NA TX NA NA JGGX01.1525 NA J ST6

SJ10 NA HU NA NA NA NA J ST7

SJ11 NA MD NA NA JGGX01.0362 NA J ST8

SJ12 NA IL NA NA JGGX01.1352 NA J ST9

SJ14 NA NV NA NA NA NA J ST10

ST1⁵ Stool CA Turkey potpie 0706PAJPX-1c JPXX01.0206 JPXA26.0180 I ST1

ST2⁵ Stool GA Turkey potpie 0706PAJPX-1c JPXX01.0206 JPXA26.0180 I ST1

ST3⁵ Food/Turkey potpie WI Turkey potpie 0706PAJPX-1c JPXX01.0206 JPXA26.0180 I ST1

ST14⁵ Stool IN NA⁴ 0607INjpx-1c JPXX01.0621 JPXA26.0160 I ST1

ST15⁵ Stool IN NA 0607INjpx-1c JPXX01.0621 JPXA26.0160 I ST1

ST21⁵ Human /Stool OH Snake 0806OHJPX-1c JPXX01.1596 JPXA26.0491 I ST2



ST24⁵ Food/Egg wash ME Egg 0404PAJPX-1c JPXX01.0621 JPXA26.0057 I ST3

ST25⁵ NA VT Egg 0404PAJPX-1c JPXX01.0621 JPXA26.0057 I ST3

ST35⁵ Human /Stool OH Sporadic Sporadic JPXX01.0621 JPXA26.0055 I ST4

ST36⁵ Human /Stool MA Sporadic Sporadic JPXX01.1212 JPXA26.0108 I ST4

ST37⁵ Human /Stool MO Sporadic Sporadic JPXX01.0206 JPXA26.0380 I ST4

SMvo1 Blood TX NA NA NA NA Mvo ST1

SMvo2 Stool TX NA NA NA NA Mvo ST2

SMvo7 Human MD NA NA JIXX01.0524 NA Mvo ST3

SMvo3 Human/Rectal swab AZ Raw chicken 0807AZJIX-1c JIXX01.1014 NA Mvo ST3

SMvo8 Human/Stool AZ Raw chicken 0807AZJIX-1c JIXX01.0126 JIXA26.0012 Mvo ST3

SMvo9 Human/Swab AZ Raw chicken 0807AZJIX-1c JIXX01.0126 JIXA26.0012 Mvo ST3

SMvo10 Human/Stool AZ Raw chicken 0807AZJIX-1c JIXX01.0126 JIXA26.0012 Mvo ST3

SMvo11 Human/Stool UT Salami/pepper 0908ORJIX-1 JIXX01.0011 JIXA26.0012 Mvo ST3

SMvo12 Human/Urine OR Salami/pepper 0908ORJIX-1 JIXX01.0011 JIXA26.0012 Mvo ST3

SMvo13 Human/Stool AZ Salami/pepper 0908ORJIX-1 JIXX01.0011 NA Mvo ST3

SMvo15 Human/Stool TN Salami/pepper 0908ORJIX-1 JIXX01.0011 NA Mvo ST3

SMvo14 NA AZ NA NA NA NA Mvo ST3

SMvo4 NA TX NA NA JIXX01.0388 NA Mvo ST4

SMvo5 Human/Stool TX NA NA JIXX01.0875 NA Mvo ST5

SMvo6 Human/Stool TN NA NA JIXX01.1005 NA Mvo ST6

SMcn1 NA TX NA outbreak JJPX01.0014 NA Mcn ST1

SMcn2 NA NYC NA outbreak JJPX01.0014 NA Mcn ST2

SMcn3 Human/Stool LA NA 0509NHJJP-1c. JJPX01.0061 JJPA26.0021 Mcn ST3

SMcn4 NA TX NA 0509NHJJP-1c. JJPX01.0061 JJPA26.0021 Mcn ST4

SMcn5 NA TX NA NA NA NA Mcn ST5

SMcn6 Human/Stool TX NA NA NA NA Mcn ST6

SMcn7 NA TX NA 0710CAJJP-1c JJPX01.0422 JJPA26.0196 Mcn ST7

SMcn8 NA TX NA 0710CAJJP-1c JJPX01.0422 JJPA26.0196 Mcn ST8

SMcn9 Human/Stool TX NA 0802AZJJP-1c JJPX01.0696 JJPA26.0212 Mcn ST9

SMcn10 NA TX NA 0802AZJJP-1c JJPX01.0438 JJPA26.0212 Mcn ST10

SMcn11 Human/Stool TX NA 0712SDJJP-1c JJPX01.0654 JJPA26.0208 Mcn ST11

SMcn12 Human MD NA 0712SDJJP-1c JJPX01.0654 JJPA26.0208 Mcn ST12

SMcn13 NA OR Orange Juice 0711GAJJP-1c JJPX01.1319 JJPA26.0542 Mcn ST13

SMcn15 NA WA Orange Juice 0711GAJJP-1c JJPX01.1319 JJPA26.0542 Mcn ST13

SMcn14 NA WA Orange Juice 0711GAJJP-1c JJPX01.1319 JJPA26.0542 Mcn ST14

SS10 Human MA NA 0806MAJN6-1c JN6X01.0034 JN6A26.0038 S ST1



SS6 Human NE sprouts 0902NEJN6-1 JN6X01.0072 NA S ST2




SS1 NA MN jalapeños 0805NMJN6-1c JN6X01.0048 JN6A26.0019 S ST3

SS2 Human TX jalapeños 0805NMJN6-1c JN6X01.0048 JN6A26.0019 S ST3

SS3 Human NM jalapeños 0805NMJN6-1c JN6X01.0048 JN6A26.0019 S ST3

SS4 Human AZ jalapeños 0805NMJN6-1c JN6X01.0048 JN6A26.0019 S ST3

SS18 NA NE Sporadic Sporadic JN6X01.0622 NA S ST3

SS19 NA TX Sporadic Sporadic JN6X01.0067 JN6A26.0001 S ST3

SS16 NA CA Sporadic Sporadic JN6A26.0026 NA S ST4

SS13 NA CA NA 0807LACJN6-1c JN6X01.0021 JN6A26.0019 S ST5

SS14 NA CA NA 0807LACJN6-1c JN6X01.0021 JN6A26.0019 S ST5

SS15 NA MD Sporadic Sporadic JN6X01.0170 NA S ST5

SS20 NA NV Sporadic Sporadic JN6X01.0623 JN6A26.0047 S ST6

¹ ST: S. Typhimurium (ST 29-31 are isolates of S. Typhimurium var Copenhagen). SE: S. Enteritidis. SN: S. Newport. SH: S. Heidelberg. SJ: S. Javiana. SI: S. I 4,[5],12:i:-. SMvo: S. Montevideo. SMcn: S. Muenchen. SS: S. Saintpaul.

² ST: sequence type. T: S. Typhimurium. E: S. Enteritidis. N: S. Newport. H: S. Heidelberg. J: S. Javiana. Mvo: S. Montevideo. Mcn: S. Muenchen. S: S. Saintpaul. For instance, T ST1 stands for Typhimurium sequence type 1.

⁴ NA: Not available.

⁵ ST1-3, 14-15, 21-25 and 35-37 are isolates of S. I 4,[5],12:i:-.


Table 3.3. Size, function and nucleotide location of the four markers targeted in the present study

Marker    Size (bp)   Function                              Nucleotide location in S. Typhimurium LT2
fimH      1008        Host-cell-specific recognition        28,425 - 29,432
sseL      954         Inflammation and macrophage killing   2,394,795 - 2,395,748
CRISPR1   122-854¹    Defense against phage                 3,076,611 - 3,077,006
CRISPR2   183-1525¹   Defense against phage                 3,094,279 - 3,096,260

¹ Length of CRISPRs varied because the number of repeats/spacers changed among the different strains analyzed.
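The fixed sizes of fimH and sseL in Table 3.3 follow from the inclusive LT2 coordinates (end minus start plus one). The short Python sketch below only illustrates that arithmetic; the dictionary and function names are hypothetical and not part of the original study. The CRISPR spans it prints describe the LT2 loci only, because CRISPR length varies among strains.

```python
# Inclusive nucleotide coordinates in S. Typhimurium LT2, copied from Table 3.3.
MARKERS = {
    "fimH": (28_425, 29_432),
    "sseL": (2_394_795, 2_395_748),
    "CRISPR1": (3_076_611, 3_077_006),
    "CRISPR2": (3_094_279, 3_096_260),
}


def span_length(start: int, end: int) -> int:
    """Length in bp of an inclusive coordinate range."""
    return end - start + 1


for name, (start, end) in MARKERS.items():
    print(name, span_length(start, end))
```

For fimH and sseL the computed spans equal the sizes listed in the table (1008 and 954 bp).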


Table 3.4. Primers used to amplify and sequence the four MLST markers

Marker    Primer sequence (5'-3')   Note
fimH      CGTCGTCATAAAAGGAAAAA      Forward primer for both amplification and sequencing
          GAACAAAACACAACCAATAGC     Reverse primer for both amplification and sequencing
          CTCGCCAGACAATGTTTACT      Reverse primer for sequencing internal region
          CATTCACTTCGCAGTTTTG       Forward primer for sequencing internal region
sseL      AGGAAACAGAGCAAAATGAA      Forward primer for both amplification and sequencing
          TAAATTCTTCGCAGAGCATC      Reverse primer for both amplification and sequencing
          GGAGTTGAAAATCTTTGGTG      Reverse primer for sequencing internal region
          TTTACCGAGAGAAAAGGTGA      Forward primer for sequencing internal region
CRISPR1   GATGTAGTGCGGATAATGCT      Forward primer for both amplification and sequencing
          GGTTTCTTTTCTTCCTGTTG      ¹ Reverse primer for both amplification and sequencing
          GATGATATGGCAACAGGTTT      ¹ Reverse primer for both amplification and sequencing
          TATTGACTGCGATGAGATGA      ² Reverse primer for both amplification and sequencing
CRISPR2   ACCAGCCATTACTGGTACAC      Forward primer for both amplification and sequencing
          ATTGTTGCGATTATGTTGGT      Reverse primer for both amplification and sequencing

¹ The 2 reverse primers (reverse 1 and reverse 2) of CRISPR1 were added together with forward primer to amplify CRISPR1 in all serovars except S. Javiana.

² Reverse primer for SJ (S. Javiana) was needed for amplification and sequencing of CRISPR1 in S. Javiana isolates.


Table 3.5. Number of isolates, allelic types and sequence types in each serovar

                   No. of     No. of allelic types                No. of     No. of PFGE
Serovar            isolates   fimH   sseL   CRISPR1   CRISPR2     MLST STs   patterns
Typhimurium        26         3      1      7         8           8          13
Enteritidis        34         2      3      2         6           9          7
Newport            15         3      4      4         6           6          8
Heidelberg         20         2      1      1         5           6          5
Javiana            15         3      1      10        10          10         8
I 4,[5],12:i:-     13         1      1      1         4           4          7
Montevideo         15         2      2      6         6           6          7
Muenchen           15         2      2      14        2           14         7
Saintpaul          18         2      2      5         6           6          10
Total              171        17¹    16     49        53          69         72

¹ Total number of allelic types for fimH does not equal the sum of allelic types in each serovar, because the same allelic type was sometimes present in more than one serovar.


Table 3.6. Allelic polymorphisms and nucleotide substitutions in the nucleotide sequences of fimH and sseL

                          No. of        % of          No. of          No. of
                          polymorphic   polymorphic   synonymous      nonsynonymous
Gene   Serovar            sites         sites         substitutions   substitutions
fimH   Typhimurium        2             0.2           1               1
       Enteritidis        1             0.1           0               1
       Newport            10            0.99          6               4
       Heidelberg         1             0.1           1               0
       Javiana            2             0.2           0               2
       I 4,[5],12:i:-     0             0             0               0
       Montevideo         13            1.29          10              3
       Muenchen           16            1.59          13              3
       Saintpaul          18            1.79          14              4
       Total              48            4.76          35              13
sseL   Typhimurium        0             0             0               0
       Enteritidis        2             0.21          1               1
       Newport            18            1.89          8               10
       Heidelberg         0             0             0               0
       Javiana            0             0             0               0
       I 4,[5],12:i:-     0             0             0               0
       Montevideo         10            1.05          4               6
       Muenchen           6             0.63          3               3
       Saintpaul          37            3.88          15              22
       Total              69            7.23          30              39
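The percentages in Table 3.6 are consistent with dividing the number of polymorphic sites by the gene length given in Table 3.3 (1008 bp for fimH, 954 bp for sseL). The Python sketch below reproduces a few rows under that assumption; the function and variable names are illustrative only.

```python
# Gene lengths in bp, taken from Table 3.3.
GENE_LENGTH_BP = {"fimH": 1008, "sseL": 954}


def percent_polymorphic(n_sites: int, gene: str) -> float:
    """Polymorphic sites as a percentage of gene length, rounded to two decimals."""
    return round(100 * n_sites / GENE_LENGTH_BP[gene], 2)


# (gene, serovar, polymorphic sites, synonymous, nonsynonymous), copied from Table 3.6.
ROWS = [
    ("fimH", "Newport", 10, 6, 4),
    ("fimH", "Saintpaul", 18, 14, 4),
    ("fimH", "Total", 48, 35, 13),
    ("sseL", "Saintpaul", 37, 15, 22),
    ("sseL", "Total", 69, 30, 39),
]

for gene, serovar, sites, syn, nonsyn in ROWS:
    # In these rows every polymorphic site is counted as synonymous or nonsynonymous.
    assert sites == syn + nonsyn
    print(gene, serovar, percent_polymorphic(sites, gene))
```

Under this assumption the computed values match the tabulated percentages for the rows shown (for example 4.76 for the fimH total and 7.23 for the sseL total).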


Table 3.7. Analysis of CRISPR repeat sequences

CRISPR    Type               Repeat sequences (5'-3')¹
CRISPR1   Typical repeat     CGGTTTATCCCCGCTGGCGCGGGGAACAC
          Repeat variant     AGGTTTATCCCCGCTGGCGCGGGGAACAC
          Terminal repeats   GTGTTTATCCCCGCTGACGCGGGGAACAC
                             GTGTTTATCCCCGCTGGCGCGGGGAACAT
CRISPR2   Typical repeat     Same as the typical repeat in CRISPR1
          Repeat variants    GGGTTTATCCCCGCTGGCGCGGGGAACAC
                             CAGTTTATCCCCGCTGGCGCGGGGAACAC
                             CGGTTTATCCCCGCTGACGCGGGGAACAT
                             CGGTTTATCCCCGCTAGCGCGGGGAACAC
                             CGGTTTATCCCCGCTGACGCGGGGAACAC
                             TGGTTTATCCCCGCTGGCGCGGGGAACAC
                             CGGTTTATCCCCGCTGGCACGGGGAACAC
                             CGATTTATCCCTGCTGGCGCGGGGAACAC
                             CGGTTTATCCCTGCTGGCGCGGGGAACAC
          Terminal repeats   ACGGCTATCCTTGTTGGCGCGGGGAACAC
                             CGGTTTATCCCCGCTGCGCGGGGAACACT

¹ Underscored nucleotides are SNPs, compared to the typical repeat.
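The SNPs referred to in the footnote of Table 3.7 can be located by comparing each repeat with the typical repeat position by position. The Python sketch below assumes equal-length sequences and simple substitutions; the function name is hypothetical, and a repeat that differs by an insertion or deletion would need a sequence alignment instead.

```python
TYPICAL_REPEAT = "CGGTTTATCCCCGCTGGCGCGGGGAACAC"  # typical repeat, Table 3.7


def snp_positions(variant: str, reference: str = TYPICAL_REPEAT):
    """Return (1-based position, reference base, variant base) for each mismatch."""
    if len(variant) != len(reference):
        raise ValueError("position-wise comparison needs sequences of equal length")
    return [
        (pos, ref_base, var_base)
        for pos, (ref_base, var_base) in enumerate(zip(reference, variant), start=1)
        if ref_base != var_base
    ]


# CRISPR1 repeat variant and first CRISPR1 terminal repeat from Table 3.7.
print(snp_positions("AGGTTTATCCCCGCTGGCGCGGGGAACAC"))
print(snp_positions("GTGTTTATCCCCGCTGACGCGGGGAACAC"))
```

With these inputs the CRISPR1 repeat variant differs from the typical repeat at position 1 only, and the first terminal repeat differs at positions 1, 2 and 17.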


Table 3.8. Analysis of CRISPR spacers in different serovars

                   No. of unique spacers   Avg no. of spacers ± SD¹   Minimum no. of spacers   Maximum no. of spacers
Serovar            CRISPR1    CRISPR2      CRISPR1      CRISPR2       CRISPR1    CRISPR2       CRISPR1    CRISPR2
Typhimurium        26         34           11.4±4.0     19.6±6.8      3          4             14         25
Enteritidis        9          10           8.5±0.6      8.8±1.6       8          7             9          10
Newport            31         43           11.3±4.9     16.3±3.4      4          10            14         19
Heidelberg         10         18           10.0±0.0     12.6±2.7      10         10            10         17
Javiana            9          16           6.4±2.0      9.4±4.0       4          2             9          14
I 4,[5],12:i:-     13         23           13±0         24±1          13         23            13         25
Montevideo         38         40           13.2±5.6     17.7±3.0      9          14            24         22
Muenchen           34         5            12.8±5.0     2.5±0.7       6          2             20         3
Saintpaul          35         33           12.2±1.3     16.5±5.6      11         7             14         23
Total²             166        182          10.8±4.5     14.4±6.4

¹ SD: standard deviation.

² Number of total unique spacers does not equal the sum of unique spacers in each serovar, because a unique spacer was sometimes present in more than one serovar.
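Spacer counts such as those in Table 3.8 depend on splitting each CRISPR array into repeats and the spacers between them. The Python sketch below shows one possible way to do this, assuming that repeats are 29 nt long and differ from the typical repeat of Table 3.7 by at most a few substitutions. It is not the procedure used in the study; the function names, the mismatch threshold and the toy spacers are hypothetical.

```python
TYPICAL_REPEAT = "CGGTTTATCCCCGCTGGCGCGGGGAACAC"  # typical repeat, Table 3.7


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_repeats(array_seq: str, repeat: str = TYPICAL_REPEAT, max_mismatches: int = 3):
    """Scan left to right and return start indices of non-overlapping repeat matches."""
    hits, i, n = [], 0, len(repeat)
    while i + n <= len(array_seq):
        if hamming(array_seq[i:i + n], repeat) <= max_mismatches:
            hits.append(i)
            i += n  # skip past the matched repeat
        else:
            i += 1
    return hits


def extract_spacers(array_seq: str, repeat: str = TYPICAL_REPEAT, max_mismatches: int = 3):
    """Return the sequences lying between consecutive repeats."""
    hits = find_repeats(array_seq, repeat, max_mismatches)
    n = len(repeat)
    return [array_seq[a + n:b] for a, b in zip(hits, hits[1:])]


# Toy array: three repeats from Table 3.7 separated by two made-up placeholder
# spacers (not real Salmonella spacers).
spacer_1 = "AT" * 16
spacer_2 = "TTAA" * 8
toy_array = (
    TYPICAL_REPEAT + spacer_1
    + "AGGTTTATCCCCGCTGGCGCGGGGAACAC" + spacer_2
    + "GTGTTTATCCCCGCTGACGCGGGGAACAC"
)
print(len(extract_spacers(toy_array)))
```

The number of spacers per isolate is then the length of the returned list, and unique spacers across isolates can be collected in a set.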


Table 3.9. Comparison of epidemiologic concordance¹ between PFGE and MLST based on virulence genes and CRISPRs for the selected strains analyzed in the present study

Subtyping method   Enteritidis   Typhimurium   Newport   Heidelberg   I 4,[5],12:i:-   Saintpaul   Montevideo   Muenchen
MLST               0.94          1.00          1.00      0.88         1.00             1.00        1.00         0.39
PFGE²              0.91          0.91          0.93      1.00         1.00             1.00        0.87         0.92

¹ Values for epidemiologic concordance were calculated based on isolates with cluster codes identified by PulseNet.

² The above values for epidemiologic concordance are biased against PFGE, because in some cases outbreaks that contained strains with variations in PFGE patterns (had poor epidemiologic concordance by PFGE) were deliberately selected in the present study.


Figure 3.1. Schematic view of the two CRISPR systems in Salmonella Typhimurium LT2. The terminal direct repeats are represented by white diamonds. L stands for leader sequence. cas genes are in grey while other core flanking genes (ygcF, iap and ptps) are in white. The figure is not drawn to scale.

[Figure panels: CRISPR1 and CRISPR2, each labeled 5' and 3'.]


Figure 3.2. (a) Cluster diagram based on only fimH and sseL. (b) Cluster diagram based on fimH, sseL and CRISPRs (combined allele of CRISPR1 and CRISPR2). ST: sequence type. T: Typhimurium. E: Enteritidis. N: Newport. H: Heidelberg. J: Javiana. I: I 4,[5],12:i:-. Mvo: Montevideo. Mcn: Muenchen. S: Saintpaul. CRISPR1 and CRISPR2 were combined into one allele for the cluster analysis because CRISPR1 and CRISPR2 are spatially linked (50).
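The way allelic types combine into a sequence type, with CRISPR1 and CRISPR2 treated as a single allele, can be sketched in Python as follows. This is an illustration only: numbering alleles and STs in order of first appearance is an assumption, and the isolate names and sequences are hypothetical placeholders, not data from the study.

```python
def number_in_order(items):
    """Number distinct items 1, 2, 3, ... in order of first appearance."""
    numbering = {}
    for item in items:
        numbering.setdefault(item, len(numbering) + 1)
    return [numbering[item] for item in items]


def assign_sequence_types(isolates):
    """Map isolate name -> ((fimH, sseL, CRISPR) allelic profile, ST number)."""
    names = list(isolates)
    fimh = number_in_order([isolates[n]["fimH"] for n in names])
    ssel = number_in_order([isolates[n]["sseL"] for n in names])
    # CRISPR1 and CRISPR2 are paired so that they count as one combined allele.
    crispr = number_in_order([(isolates[n]["CRISPR1"], isolates[n]["CRISPR2"]) for n in names])
    profiles = list(zip(fimh, ssel, crispr))
    sts = number_in_order(profiles)
    return {name: (profile, st) for name, profile, st in zip(names, profiles, sts)}


# Placeholder sequences; iso_c differs from the others only in CRISPR2.
toy_isolates = {
    "iso_a": {"fimH": "ATGAAA", "sseL": "ATGCCC", "CRISPR1": "R-s1-R", "CRISPR2": "R-s2-R"},
    "iso_b": {"fimH": "ATGAAA", "sseL": "ATGCCC", "CRISPR1": "R-s1-R", "CRISPR2": "R-s2-R"},
    "iso_c": {"fimH": "ATGAAA", "sseL": "ATGCCC", "CRISPR1": "R-s1-R", "CRISPR2": "R-s2-R-s3-R"},
}
for name, (profile, st) in assign_sequence_types(toy_isolates).items():
    print(name, profile, f"ST{st}")
```

In this toy input the two identical isolates share one ST, while the isolate with a different CRISPR2 array receives a new combined CRISPR allele and therefore a new ST.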


3.5 Discussion

Several criteria are important when selecting genetic markers for an MLST scheme. First, the selected markers should exhibit enough sequence variation to separate unrelated strains (37). Second, the markers should provide epidemiologically meaningful information so that the MLST scheme exhibits high epidemiologic concordance. Third, the markers should be present in the genomes of all strains of the species of interest. Previous studies demonstrated that MLST schemes based on Salmonella housekeeping genes showed poor discriminatory power when compared to PFGE (16, 28, 49). Inclusion of virulence genes in one published MLST scheme for subtyping S. Typhimurium increased discriminatory power to 0.98, which was comparable to that of PFGE (0.96) (17). Virulence genes also provided epidemiologically meaningful separation and clustering of strains of Listeria monocytogenes (11). In addition to virulence genes, CRISPRs were selected as markers in the current MLST scheme because they are among the fastest-evolving genetic elements in bacterial genomes (43).

In the present study, cluster analyses based on the two virulence genes and two CRISPRs accurately grouped isolates according to their serovars, except for serovars Typhimurium and I 4,[5],12:i:-, which clustered together. Because serovar I 4,[5],12:i:- is a monophasic variant of serovar Typhimurium (14), this result is not unexpected. Virulence genes were also found to accurately identify different serovars of Salmonella in other studies (41, 47, 48).

Addition of CRISPRs markedly increased discriminatory power (Fig. 3.2) compared to previously published MLST schemes, and individual outbreak clones were identified (Table 3.2). For example, one MLST scheme based on three housekeeping genes (manB, pduF, and glnA) and one virulence gene (spaM) identified only one sequence type among 85 S. Typhimurium isolates, giving a discriminatory power of 0 (16). Another MLST scheme targeted seven housekeeping genes (aroC, dnaN, hemD, hisD, purE, sucA, and thrA) and identified 12 sequence types among a total of 81 S. Newport isolates, which also resulted in poor discriminatory power (0.61) (28). One MLST study based on virulence genes (hilA, pefB and fimH), the 16S rRNA gene and housekeeping genes showed high discriminatory power (0.98) for subtyping S. Typhimurium (17); however, its capacity to discriminate strains from more clonal serovars such as S. Enteritidis was not tested. Overall, the MLST scheme described in the present study has superior discriminatory power compared to previously published MLST schemes for subtyping the major serovars of Salmonella, especially the highly clonal serovar S. Enteritidis.
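Discriminatory power values such as those quoted above are commonly computed as Simpson's index of diversity, D = 1 - [1 / (N(N - 1))] Σ n_j(n_j - 1), as described by Hunter and Gaston (31). The Python sketch below assumes that this index is the measure being reported; the function name is illustrative, and the Muenchen value is calculated here from the sequence types listed in Table 3.2 rather than taken from the thesis.

```python
from collections import Counter


def simpsons_index(type_labels):
    """Simpson's index of diversity: 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    counts = Counter(type_labels).values()
    n_total = sum(counts)
    if n_total < 2:
        raise ValueError("at least two isolates are needed")
    return 1 - sum(n * (n - 1) for n in counts) / (n_total * (n_total - 1))


# One sequence type among 85 isolates, as reported for the scheme in reference 16.
print(simpsons_index(["ST1"] * 85))  # 0.0

# S. Muenchen sequence types from Table 3.2: ST1-ST12 once each, ST13 twice, ST14 once.
muenchen = [f"ST{i}" for i in range(1, 13)] + ["ST13", "ST13", "ST14"]
print(round(simpsons_index(muenchen), 3))  # 0.99
```

A single type shared by every isolate gives an index of 0, which agrees with the value reported for the scheme in reference 16, while a set in which nearly every isolate has its own type approaches 1.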

As mentioned previously, the isolates selected for this study were biased towards those with poor epidemiologic concordance by PFGE, so future studies comparing MLST and PFGE need to be performed using an unbiased strain collection. Generally speaking, though, the current MLST scheme showed high epidemiologic concordance for subtyping the major serovars of Salmonella, except Muenchen (E = 0.39) (Table 3.9). All S. Muenchen isolates had different sequence types, except two isolates, SMcn13 and SMcn15, from the orange juice outbreak (Table 3.2). Interestingly, the allelic types of fimH and sseL were the same for all S. Muenchen isolates except isolate SMcn12 (Fig. 3.2a), which means that CRISPR1 and CRISPR2 provided almost all of the discriminatory power for S. Muenchen isolates (Fig. 3.2b). Perhaps PFGE lacks pattern diversity for S. Muenchen because it cannot detect the subtle but epidemiologically important changes detected by CRISPRs. Alternatively, CRISPRs may be evolving too fast for S. Muenchen outbreak investigations, either because the specific niche where S. Muenchen resides harbors a large number of different phage, or because the phage pools of S. Muenchen are very dynamic, or both. Dramatic differences in the rate of spacer acquisition have been observed among different eubacteria. In Streptococcus thermophilus, CRISPRs are very active and new spacer acquisition appears to be the primary mechanism by which this species defends against phage (12); however, the rate of new spacer acquisition in other bacteria such as E. coli appears to be much slower (50).

The current MLST scheme provided greater separation of S. Enteritidis isolates than PFGE (Table 3.2). The predominant PFGE XbaI patterns for S. Enteritidis in the PulseNet database are JEGX01.0004 and JEGX01.0005. This lack of PFGE pattern diversity is problematic because it sometimes makes it difficult to separate potential outbreak-related isolates from sporadic isolates (19). The discriminatory power of PFGE has been increased by combining multiple restriction enzymes (54); however, that study did not address whether the increased discrimination caused a loss of epidemiologic concordance. The present MLST scheme allowed separation of isolates within the two predominant PFGE patterns of S. Enteritidis (Table 3.2) and resulted in high epidemiologic concordance (Table 3.9). CRISPRs provided most of the discrimination (Fig. 3.2b). CRISPRs in S. Enteritidis are evolving due to plasmids and/or phage present in the environment (52). Fortunately, the rate of spacer insertion and deletion in CRISPRs is slow enough that they do not appear to change during an outbreak (Table 3.2). CRISPRs may also reflect the specific phage and plasmid pool in the environment and hence contain ecologically and geographically meaningful information for bacteria (32, 52). As a result, CRISPRs may be useful for tracing an outbreak clone of Salmonella to the specific farm or food processing plant that serves as the reservoir for the source strain of an outbreak. In summary, the current MLST scheme effectively subtyped the two most common PFGE patterns of S. Enteritidis and thus could enhance cluster detection and outbreak investigation capabilities. This MLST method has the potential to be integrated into public health surveillance laboratories to complement PFGE for S. Enteritidis outbreak investigations.


It has previously been suggested that CRISPRs are poor epidemiological markers in enterobacteria because of the slow rate of spacer acquisition (50). However, that study analyzed CRISPRs in only 16 complete Salmonella genomes, and only four of them were from the same serovars as the strains analyzed in the current study. Additionally, Touchon and Rocha included only one isolate each of serovars Typhimurium, Enteritidis, Newport, and Heidelberg, so the true value of CRISPRs for epidemiologic investigations could not be fully appreciated. Our study analyzed 26, 34, 15 and 20 isolates from these serovars, respectively, and demonstrated that CRISPR sequences may be useful for epidemiologic investigations. We are currently testing this hypothesis using larger numbers of isolates obtained from current and past Salmonella outbreaks.

This MLST scheme has several other advantages that make it a potential subtyping method for routine surveillance of Salmonella. First, the primers in this MLST scheme were designed to have the same annealing temperature for all four markers, so the scheme can be conveniently performed in large-scale epidemiologic investigations. Second, the number of markers targeted was minimized to two virulence genes and two CRISPRs to save time and expense during routine typing of Salmonella strains (37). Third, all four markers (fimH, sseL, CRISPR1 and CRISPR2) are present in the major serovars of Salmonella and in all published genomes of Salmonella serovars, so the current MLST scheme is widely applicable. Although this MLST scheme shows great promise, future research is needed to further validate it for molecular epidemiologic purposes. For example, a random collection of isolates representing a larger number of outbreaks, or otherwise epidemiologically related isolates, is needed to accurately compare the epidemiologic concordance of the present MLST scheme with that of PFGE. In conclusion, the MLST scheme described in the current study may be an excellent subtyping method for tracking the farm-to-fork spread of the most prevalent serovars of Salmonella during outbreaks.


3.6 Acknowledgements

We thank Dr. Bindhu Verghese for technical guidance throughout the study, especially

for the idea of combining CRISPRs into one allele to construct the cluster analysis. We also

acknowledge the Penn State Genomics Core Facility - University Park, PA for DNA sequencing.

This study was supported by a U.S. Department of Agriculture Special Milk Safety grant to the

Pennsylvania State University (contract: 2009-34163-20132).


3.7 References







299:43-51.

3. Brouns, S. J. J., M. M. Jore, M. Lundgren, E. R. Westra, R. J. H. Slijkhuis, A. P. L.

Snijders, M. J. Dickman, K. S. Makarova, E. V. Koonin, and J. van der Oost. 2008.

Small CRISPR RNAs guide antiviral defense in prokaryotes. Science. 321:960-964.



df



df

6. CDC. 2008. Outbreak of Salmonella serotype Saintpaul infections associated with multiple

raw produce items --- United States, 2008. MMWR Morb. Mortal. Wkly. Rep. 57:929-934.

7. CDC. 2008. OutbreakNet Foodborne Outbreak Online Database.

http://wwwn.cdc.gov/foodborneoutbreaks/

8. CDC. 2010. Investigation update: Multistate Outbreak of Human Salmonella Enteritidis

Infections Associated with Shell Eggs. http://www.cdc.gov/salmonella/enteritidis/


9. CDC. 2010. Investigation update: multistate outbreak of human Salmonella Montevideo

infections. http://www.cdc.gov/salmonella/montevideo/index.html

10. CDC. 2010. Preliminary FoodNet data on the incidence of infection with pathogens

transmitted commonly through food --- 10 States, 2009.

http://www.cdc.gov/mmwr/preview/mmwrhtml/mm5914a2.htm






in Streptococcus thermophilus. J. Bacteriol. 190: 1390-1400.

13. Efrain, M. R., M. A. Fair, R. Gautom, D. N. Cameron, S. B. Hunter, B. Swaminathan,

and T. J. Barrett. 2006. Standardization of pulsed-field gel electrophoresis protocols for the

subtyping of Escherichia coli O157:H7, Salmonella, and Shigella for PulseNet. Foodborne

Pathog. Dis. 3:59-67.

14. Echeita, M. A., S. Herrera, and M. A. Usera. 2001. Atypical, fljB-negative Salmonella

enterica subsp. enterica strain of serovar 4,5,12:i:- appears to be a monophasic variant of


15. Endo, T., K. Ikeo, and T. Gojobori. 1996. Large-scale search for genes on which positive

selection may operate. Mol. Biol. Evol. 13:685-690.

16. Fakhr, M. K., L. K. Nolan, and C. M. Logue. 2005. Multilocus sequence typing lacks the

discriminatory ability of pulsed-field gel electrophoresis for typing Salmonella enterica


17. Foley, S. L., D. G. White, P. F. McDermott, R. D. Walker, B. Rhodes, P. J. Fedorka-

Cray, S. Simjee, and S. Zhao. 2006. Comparison of subtyping methods for differentiating


Salmonella enterica serovar Typhimurium isolates obtained from food animal sources. J. Clin.

Microbiol. 44:3569-3577.

18. Foley, S. L., S. Zhao, and R. D. Walker. 2007. Comparison of molecular typing methods

for the differentiation of Salmonella foodborne pathogens. Foodborne Pathog. Dis. 4:253-276.


Ribot, B. Swaminathan, and Pulsenet Taskforce. 2006. PulseNet USA: a five-year update.





158:10-17.

21. Grimont, P. A. D., and F. Weill. 2007. Antigenic formulae of the Salmonella serovars, 9th

ed. WHO Collaborating Centre for Reference and Research on Salmonella. Institut Pasteur,

Paris, France.

22. Grissa, I., and C. Drevet. 2010. CRISPRs web-service. http://crispr.u-psud.fr/crispr/

23. Grissa, I., G. Vergnaud, and C. Pourcel. 2008. CRISPRcompar: a website to compare

clustered regularly interspaced short palindromic repeats. Nucl. Acids Res. 36:W145-148.

24. Groenen, P. M., A. E. Bunschoten, D. van Soolingen, and J. D. van Embden. 1993. Nature of

DNA polymorphism in the direct repeat cluster of Mycobacterium tuberculosis; application

for strain differentiation by a novel typing method. Mol. Microbiol. 10:1057-1065.

25. Guard, J. 2010. Evolutionary trends in two strains of Salmonella enterica subsp. I serovar

Enteritidis PT13a that vary in virulence potential.

http://www.ncbi.nlm.nih.gov/genomes/static/Salmonella_SNPS.html


26. Hale, C. R., P. Zhao, S. Olson, M. O. Duff, B. R. Graveley, L. Wells, R. M. Terns, and M.

P. Terns. 2009. RNA-Guided RNA cleavage by a CRISPR RNA-Cas protein complex. Cell.

139:945-956.

27. Hanning, I. B., J. D. Nutt, and S. C. Ricke. 2009. Salmonellosis outbreaks in the United

States due to fresh produce: sources and potential intervention measures. Foodborne Pathog.

Dis. 6:635-648.

28. Harbottle, H., D. G. White, P. F. McDermott, R. D. Walker, and S. Zhao. 2006.

Comparison of multilocus sequence typing, pulsed-field gel electrophoresis, and

antimicrobial susceptibility typing for characterization of Salmonella enterica serotype

Newport isolates. J. Clin. Microbiol. 44:2449-2457.

29. Hoe, N., K. Nakashima, D. Grigsby, X. Pan, S. J. Dou, S. Naidich, M. Garcia, E. Kahn,

D. Bergmire-Sweat, and J. M. Musser. 1999. Rapid molecular genetic subtyping of

serotype M1 group A Streptococcus strains. Emerg. Infect. Dis. 5:254-263.

30. Horvath, P., and R. Barrangou. 2010. CRISPR/Cas, the immune system of bacteria and

archaea. Science. 327:167-170.

31. Hunter, P. R., and M. A. Gaston. 1988. Numerical index of the discriminatory ability of

typing systems: an application of Simpson's index of diversity. J. Clin. Microbiol. 26:2465-

2466.

32. Kunin, V., S. He, F. Warnecke, B. S. Peterson, M. Haynes, N. Ivanova, L. L. Blackall, M.

Breitbart, F. Rohwer, D. Mcmahon, and P. Hugenholtz. 2008. A bacterial metapopulation

adapts locally to phage predation despite global dispersal. Genome Res. 18:293-297.


major Salmonella enterica clones. Infect. Genet. Evol. 9:996-1005.

34. Lindstedt, B.-A., M. Torpdahl, E. M. Nielsen, T. Vardund, L. Aas, and G. Kapperud.

2007. Harmonization of the multiple-locus variable-number tandem repeat analysis method


between Denmark and Norway for typing Salmonella Typhimurium isolates and closer

examination of the VNTR loci. J. Appl. Microbiol. 102:728-735.

35. Lindstedt, B., E. Heir, E. Gjernes, and G. Kapperud. 2003. DNA fingerprinting of

Salmonella enterica subsp. enterica serovar typhimurium with emphasis on phage type

DT104 based on variable number of tandem repeat Loci. J. Clin. Microbiol. 41:1469-1479.



and comparison with pulsed-field gel electrophoresis Typing. J. Clin. Microbiol. 38:1623-

1627.

37. Maiden, M. C. J. 2006. Multilocus sequence typing of bacteria. Annu. Rev. Microbiol.

60:561-588.

38. Pourcel, C., G. Salvignol, and G. Vergnaud. 2005. CRISPR elements in Yersinia pestis

acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional

tools for evolutionary studies. Microbiology. 151:653-663.

39. Ross, I. L., and M. W. Heuzenroeder. 2005. Use of AFLP and PFGE to discriminate

between Salmonella enterica serovar Typhimurium DT126 isolates from separate food-

related outbreaks in Australia. Epidemiol. Infect. 133:635-644.




41. Scaria, J., R. U. M. Palaniappan, D. Chiu, J. A. Phan, L. Ponnala, P. McDonough, Y. T.

Grohn, S. Porwollik, M. McClelland, C. Chiou, C. Chu, and Y. Chang. 2008. Microarray

for molecular typing of Salmonella enterica serovars. Mol. Cell. Probes. 22:238-243.

42. Schouls, L. M., S. Reulen, B. Duim, J. A. Wagenaar, R. J. L. Willems, K. E. Dingle, F. M.

Colles, and J. D. A. Van Embden. 2003. Comparative genotyping of Campylobacter jejuni


by amplified fragment length polymorphism, multilocus sequence typing, and short repeat

sequencing: strain diversity, host range, and recombination. J. Clin. Microbiol. 41:15-26.

43. Sorek, R., V. Kunin, and P. Hugenholtz. 2008. CRISPR — a widespread system that

provides acquired resistance against phage in bacteria and archaea. Nat. Rev. Microbiol.

6:181-186.

44. Struelens, M. J. 1996. Consensus guidelines for appropriate use and evaluation of microbial

epidemiologic typing systems. Clin. Microbiol. Infect. 2:2-11.

45. Tamada, Y., Y. Nakaoka, K. Nishimori, A. Doi, T. Kumaki, N. Uemura, K. Tanaka, S.

Makino, T. Sameshima, M. Akiba, M. Nakazawa, and I. Uchida. 2001. Molecular typing

and epidemiological study of Salmonella enterica serotype Typhimurium isolates from cattle

by fluorescent amplified-fragment length polymorphism fingerprinting and Pulsed-Field Gel

Electrophoresis. J. Clin. Microbiol. 39:1057-1066.

46. Tamura, K., J. Dudley, M. Nei, and S. Kumar. 2007. MEGA4: molecular evolutionary

genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24:1596-1599.

47. Tankouo-Sandjong, B., A. Sessitsch, E. Liebana, C. Kornschober, F. Allerberger, H.

Hächler, and L. Bodrossy. 2006. MLST-v, multilocus sequence typing based on virulence

genes, for molecular typing of Salmonella enterica subsp. enterica serovars. J. Microbiol.

Methods. 69:23-36.

48. Tankouo-Sandjong, B., A. Sessitsch, N. Stralis-Pavese, E. Liebana, C. Kornschober, F.

Allerberger, H. Hächler, and L. Bodrossy. 2008. Development of an oligonucleotide

microarray method for Salmonella serotyping. Microb Biotechnol. 1:513-522.




184.


50. Touchon, M., and E. P. C. Rocha. 2010. The small, slow and specialized CRISPR and anti-

CRISPR of Escherichia and Salmonella. PLoS One. 5:e11126.

51. USDA-ERS. 2009. Foodborne illness cost calculator: Salmonella.

http://www.ers.usda.gov/Data/FoodborneIllness/salm_Intro.asp.


coevolution past. Proc. R. Soc. Lond., B, Biol. Sci. 277: 2097-2103.



nontyphoidal Salmonella infections in the United States. Clin. Infect. Dis. 38:127-134.


scheme for Salmonella Enteritidis. Emerging Infect. Dis.13:1932-1935.


Chapter 4 Characterization of clinical, poultry and environmental Salmonella

Enteritidis isolates using multilocus sequence typing based on virulence genes

and CRISPRs

Fenyun Liu¹, Subhashinie Kariyawasam², Bhushan M. Jayarao²,³, Rodolphe Barrangou⁴, Peter Gerner-Smidt⁵, Efrain Ribot⁵, Edward G. Dudley¹ and Stephen J. Knabel¹*

¹ Department of Food Science, the Pennsylvania State University, University Park, Pennsylvania 16802;

² Animal Diagnostic Laboratory, Orchard Drive, the Pennsylvania State University, University Park, Pennsylvania 16802;

³ Department of Veterinary and Biomedical Sciences, the Pennsylvania State University, University Park, Pennsylvania 16802;

⁴ Danisco USA Incorporation, 3329 Agriculture Drive, Madison, Wisconsin 53716;

⁵ Centers for Disease Control and Prevention, Atlanta, Georgia 30333

*Corresponding author. Mailing address: 437 Food Science Building, The Pennsylvania

State University, University Park, PA 16802, US. Phone: 814-863-1372. Email:

[email protected]


4.1 Abstract

Salmonella enterica subsp. enterica serovar Enteritidis has consistently been a major cause of foodborne salmonellosis in the United States. Two major food vehicles for S. Enteritidis are contaminated eggs and chicken. Improved subtyping methods are needed to accurately track specific strains of S. Enteritidis related to human salmonellosis throughout the poultry and egg food system. A multilocus sequence typing (MLST) scheme based on virulence genes (fimH and sseL) and CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats) was developed and used to characterize 34 clinical isolates, 70 poultry isolates and 63 hen house environment isolates of S. Enteritidis. A total of 27 sequence types (STs) were identified among the 167 S. Enteritidis isolates. The MLST scheme identified four persistent and predominant STs circulating among U.S. clinical isolates and among poultry and hen house environments in Pennsylvania. It also identified a potential environment-specific sequence type. Moreover, cluster analysis based on fimH and sseL identified three epidemic clones and one outbreak clone of S. Enteritidis, as well as 11 singletons. Significant differences in virulence gene sequences between singletons and the other STs suggested that singletons might differ from other STs in virulence capacity. The MLST scheme may provide information about the ecological origin of S. Enteritidis isolates, potentially identifying strains that differ in virulence capacity.


4.2 Introduction

In the United States, Salmonella is the leading cause of bacterial foodborne disease, with approximately 1.4 million human cases each year since 1996 (41). The second most-reported serovar of Salmonella in human disease is Salmonella enterica subspecies enterica serovar Enteritidis (S. Enteritidis), which causes nearly as many human cases as S. Typhimurium, the most prevalent serovar (8). The major food vehicle for S. Enteritidis is shell eggs: 80% of S. Enteritidis outbreaks and approximately 50,000 to 110,000 cases in the U.S. each year are egg-associated (6, 34). The most recent S. Enteritidis outbreak, in which 1,519 people were infected, was associated with eggs (9). Another common food vehicle is chicken, consumption of which is considered another risk factor for S. Enteritidis infections in humans (22, 30).

The chicken and egg food system is complex and contains a large number of niches that

may be potential sources of S. Enteritidis (Fig. 4.1) (21). S. Enteritidis has been isolated from a

wide variety of animals, such as rodents, wild birds and insects, which could serve as potential

reservoirs (17). Additional potential reservoirs for S. Enteritidis include chicken manure, sewage

and other moist and organic materials in farm environments (6). Furthermore, oral S. Enteritidis

infection in poultry could occur via contaminated water and feed (6). Infection of S. Enteritidis

among chickens can spread rapidly by direct contact with infected birds or with contaminated

materials within densely populated poultry houses (6, 17). Additionally, eggs can become

contaminated internally when the ovaries of layers are colonized by S. Enteritidis; this process is

called vertical transmission (17, 31). Eggs can also be contaminated externally by feces and

environments containing S. Enteritidis; this is referred to as horizontal transmission (14). In order

to control S. Enteritidis in poultry, one of the interventions employed on farms is egg quality

assurance programs, which involve acquisition of S. Enteritidis-free chicks, control of pests


(including rodents and insects), use of S. Enteritidis-free feeds, and routine microbiologic testing

for S. Enteritidis in the farm environment (6).

Characterization of S. Enteritidis isolates in different niches on chicken farms and from

patients can help track dissemination of specific strains related to human salmonellosis and thus

identify reservoirs and routes of transmission. Before considering the epidemiology of

Salmonella, it is important to first define epidemic clone (EC) and outbreak clone (OC). An

epidemic clone is a strain or group of strains descended asexually from a single ancestral cell

(source strain) that is involved in one epidemic, which can often include several outbreaks (11).

An outbreak clone is a strain or group of strains descended asexually from a single ancestral cell

(source strain) that is involved in one outbreak (11).

Several molecular subtyping methods have been developed to characterize strains and

study the epidemiology of S. Enteritidis, including amplified fragment length polymorphism

(AFLP) (27, 15, 19, 36, 38), multiple loci variable number tandem repeat analysis (MLVA) (3, 5,

12, 35), and pulsed-field gel electrophoresis (PFGE) (18). PFGE is currently the "gold standard"

method used by public health surveillance laboratories for tracking foodborne pathogens

including Salmonella (18). The main advantage of PFGE is its high discriminatory power (i.e.

ability to separate unrelated strains) for subtyping most serovars of Salmonella. However, PFGE

lacks discriminatory power for highly clonal serovars like S. Enteritidis (18, 42). For example,

the most recent S. Enteritidis outbreak due to shell eggs was caused by the most common PFGE

pattern for S. Enteritidis in the PulseNet database and thus not all isolates may be related to this

outbreak (9). The lack of adequate discriminatory power makes it difficult to track a specific

strain of S. Enteritidis in the food system. Besides occasional inadequate discriminatory power,

PFGE does not provide appropriate information to infer phylogenetic relationships among

subtypes (33). Another subtyping method, MLVA, was reported to have higher discriminatory

power than PFGE for S. Enteritidis (3, 5, 12, 35). However, in some circumstances, strains that


had the same MLVA type were separated by PFGE (5). Moreover, replicates of the same strains

of Salmonella have been shown to have different numbers of repeat units at a specific locus (7, 13),

which makes accurate interpretation of results difficult.

Compared to PFGE and MLVA, Multilocus Sequence Typing (MLST), which targets

nucleotide sequence differences in several DNA loci, generates discrete, highly informative,

highly portable and reproducible data. Moreover, MLST is a well-accepted tool for studying the

population structure, evolution and diversity of bacteria (25). Recently, a new MLST scheme

based on virulence genes (fimH and sseL) and CRISPRs (Clustered Regularly Interspaced Short

Palindromic Repeats) was shown to provide better separation of S. Enteritidis than PFGE (29).

CRISPRs encode tandem sequences containing 21-47 bp DRs (direct repeats) and spacers of

similar size (Fig. 4.2). Spacers are short DNA sequences obtained from foreign nucleic acids

such as phages or plasmids and are inserted into the bacterial chromosome to protect the cell from

infection by homologous phages and plasmids (2). Therefore, CRISPRs reflect the specific phage

and plasmid pools in the environment and hence contain ecological and geographic information

of the bacteria present there (24, 40). As a result, CRISPRs might be useful for tracing back a

clone of S. Enteritidis to the specific niches in the farm or food processing plant where it

originated. Therefore, the objective of the present study was to characterize S. Enteritidis isolates

from different sources using this scheme to investigate its epidemiology. With a better

understanding of the epidemiology of S. Enteritidis, more effective intervention strategies can be

established to prevent future S. Enteritidis outbreaks due to eggs and poultry meat.
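As a rough illustration of the repeat-spacer structure described above, the following Python sketch splits a toy array into its spacers. It assumes that every direct repeat is an exact copy of one known sequence, which real arrays do not always satisfy. The function name, the 6-bp repeat and the array sequence are invented for illustration and are not taken from any Salmonella genome.

```python
# Minimal sketch: split a CRISPR array into spacers by an exact direct repeat.
# ASSUMPTION: all repeats are identical copies; real arrays may carry variants.

def extract_spacers(array_seq, repeat):
    """Return the sequences lying between consecutive copies of `repeat`."""
    pieces = array_seq.split(repeat)
    # pieces[0] precedes the first repeat and pieces[-1] follows the last
    # repeat, so only the interior pieces are spacers.
    return [p for p in pieces[1:-1] if p]

repeat = "GGTTTA"  # toy 6-bp repeat, far shorter than a real 21-47 bp DR
array_seq = "TT" + repeat + "ACGTAC" + repeat + "TTGCAA" + repeat + "CC"
print(extract_spacers(array_seq, repeat))  # prints ['ACGTAC', 'TTGCAA']
```

Three repeats flank two spacers here, which mirrors the layout in which a terminal repeat closes the array.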

91

4.3 Materials and methods

Bacterial isolates and DNA extraction. All Salmonella Enteritidis isolates used in this

study are summarized in Table 4.1. A total of 167 isolates were obtained as follows:

34 from Centers for Disease Control and Prevention (CDC), 86 from the Pennsylvania Egg

Quality Assurance Program (PEQAP) and 47 from the Animal Diagnostic Lab (ADL) at the

Pennsylvania State University (Table S2). All 34 clinical isolates were previously analyzed (29),

among which 32 isolates were collected from 11 S. Enteritidis outbreaks and the other 2 isolates

were sporadic isolates. Bacterial isolates were stored at -80°C in 20% glycerol. When needed,

isolates were grown overnight in Tryptic Soy Broth (TSB) (Difco Laboratories, Becton Dickinson,

Sparks, MD) at 37°C. For all isolates, DNA was extracted using the UltraClean Microbial DNA

extraction kit (Mo Bio Laboratories, Solana Beach, CA) and stored at -20°C before use.

PCR amplification. Primers for all four markers were designed based upon consensus

alignments of the published S. Typhimurium LT2 genome (accession number AE006468) using

Primer 3.0 (http://frodo.wi.mit.edu/primer3/) (Table 4.2). PCR amplifications were performed

using a Taq PCR master mix kit (Qiagen Inc., Valencia, CA) in a Mastercycler PCR thermocycler

(Eppendorf Scientific, Hamburg, Germany). A 25 µl PCR reaction system contained 12.5 µl Taq

PCR master mix, 9.5 µl PCR-grade water, 1.0 µl DNA template, 1.0 µl forward primer (final

concentration, 0.4 µM) and 1.0 µl reverse primer (final concentration, 0.4 µM). A single PCR

cycling condition was used to amplify all four markers (initial denaturation at 94 °C for 10 min;

28 cycles of 94°C for 1 min, 55°C for 1 min, 72°C for 1 min; final extension at 72°C for 10 min).
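The reaction composition above can be checked with simple dilution arithmetic. The sketch below assumes a 10 µM primer working stock; that concentration is not stated in the text and is only the value implied by 1.0 µl in 25 µl giving 0.4 µM. The function names are illustrative.

```python
# Minimal sketch: check the 25 µl PCR mix described above.
# ASSUMPTION: primer working stocks are 10 µM (not stated in the text).

REACTION_UL = {
    "Taq PCR master mix": 12.5,
    "PCR-grade water": 9.5,
    "DNA template": 1.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
}

def total_volume_ul(components):
    """Sum the component volumes (µl)."""
    return sum(components.values())

def final_concentration_um(stock_um, added_ul, total_ul):
    """C1 * V1 = C2 * V2, solved for the final concentration C2 (µM)."""
    return stock_um * added_ul / total_ul

total = total_volume_ul(REACTION_UL)
primer_final = final_concentration_um(10.0, REACTION_UL["forward primer"], total)
print(f"total volume: {total} µl")
print(f"final primer concentration: {primer_final} µM")
```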

DNA sequencing. After PCR, products for sequencing were treated by adding 1/20

volume of shrimp alkaline phosphatase (1 U/µl, USB Corp. Cleveland, OH) and 1/20 volume of

exonuclease I (10 U/µl, USB Corp). The mixture was then incubated at 37°C for 45 min to


degrade the primers and unincorporated dNTPs. After that, the mixture was incubated at 80°C

for 15 min to inactivate the enzymes. PCR products were then sent to the Genomics Core Facility

at the Pennsylvania State University for sequencing using the ABI Data 3730XL DNA Analyzer.

In order to obtain complete DNA sequences of fimH and sseL, two more primers targeting the

internal regions of these two genes were used together with the forward and reverse primers

(Table 4.2). Both DNA strands of the amplicons were sequenced.

Sequence analysis and sequence type assignment. For fimH and sseL, sequences were

aligned and single nucleotide polymorphisms (SNPs) were identified using MEGA 4.0 (37). For

CRISPR1 and CRISPR2, analyses of the spacer arrangements were performed using

CRISPRcompar (20) and spacers were visualized as described by Deveau et al. (16). Different

allelic types (ATs) (sequences with at least one-nucleotide difference or one-spacer difference in

the case of CRISPRs) were assigned arbitrary numbers. The combination of 4 alleles (fimH, sseL,

CRISPR1 and CRISPR2) determined its allelic profile and each unique allelic profile was

designated as a sequence type (ST).
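The assignment rule described above can be sketched in a few lines of Python. This is an illustration only: the isolate names and sequences are made-up placeholders, and numbering alleles in order of first appearance is just one possible reading of "arbitrary numbers".

```python
# Minimal sketch of allelic type (AT) and sequence type (ST) assignment.
# The isolate names and allele contents below are hypothetical placeholders.

MARKERS = ("fimH", "sseL", "CRISPR1", "CRISPR2")

def assign_types(records, markers=MARKERS):
    """Number each distinct allele per marker, then each distinct profile.

    records: dict of isolate -> dict of marker -> allele representation
             (a DNA string for genes, a tuple of spacer IDs for CRISPRs).
    Returns (profiles, sequence_types), both keyed by isolate.
    """
    allele_numbers = {m: {} for m in markers}  # marker -> allele -> AT number
    st_numbers = {}                            # allelic profile -> ST number
    profiles, sequence_types = {}, {}
    for isolate, alleles in records.items():
        profile = []
        for m in markers:
            seen = allele_numbers[m]
            # Any difference (one nucleotide or one spacer) yields a new AT.
            profile.append(seen.setdefault(alleles[m], len(seen) + 1))
        profile = tuple(profile)
        profiles[isolate] = profile
        sequence_types[isolate] = st_numbers.setdefault(profile, len(st_numbers) + 1)
    return profiles, sequence_types

records = {
    "iso_A": {"fimH": "ACGT", "sseL": "TTGA", "CRISPR1": ("s1", "s2"), "CRISPR2": ("t1",)},
    "iso_B": {"fimH": "ACGT", "sseL": "TTGA", "CRISPR1": ("s1", "s2"), "CRISPR2": ("t1",)},
    "iso_C": {"fimH": "ACGA", "sseL": "TTGA", "CRISPR1": ("s1",), "CRISPR2": ("t1",)},
}
profiles, sequence_types = assign_types(records)
print(profiles)
print(sequence_types)
```

In this toy input, iso_A and iso_B share every allele and therefore one ST, while iso_C differs at fimH and CRISPR1 and receives a new ST.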

Cluster analysis. Cluster analyses were performed based on allelic profile data by the

unweighted pair group method with arithmetic mean (UPGMA) and results were visualized using

the tree drawing tool on PubMLST (www.pubmlst.org). CRISPR1 and CRISPR2 were combined

into one allele for a more accurate cluster analysis, because CRISPR1 and CRISPR2 are likely to

be spatially linked (39).
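The clustering itself was done with the PubMLST tree drawing tool. The sketch below is not that tool; it is a minimal pure-Python UPGMA meant only to show the idea. It assumes that the distance between two profiles is the fraction of loci with different allele numbers, and it uses hypothetical three-locus profiles (fimH, sseL, and one combined CRISPR allele) with invented ST names.

```python
# Minimal UPGMA sketch on allelic profiles (illustrative data only).
from itertools import combinations

def profile_distance(a, b):
    """Fraction of loci at which two allelic profiles differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def upgma(profiles):
    """Return a nested-tuple tree built by average linkage (UPGMA).

    profiles: dict of ST name -> tuple of allele numbers.
    """
    # Map each current cluster (a name or a nested tuple) to its member names.
    clusters = {name: [name] for name in profiles}
    while len(clusters) > 1:
        def avg_dist(pair):
            left, right = pair
            dists = [profile_distance(profiles[i], profiles[j])
                     for i in clusters[left] for j in clusters[right]]
            return sum(dists) / len(dists)
        # Merge the two clusters with the smallest mean pairwise distance.
        left, right = min(combinations(clusters, 2), key=avg_dist)
        members = clusters.pop(left) + clusters.pop(right)
        clusters[(left, right)] = members
    return next(iter(clusters))

# Loci: (fimH, sseL, combined CRISPR1+CRISPR2 allele); values are hypothetical.
profiles = {
    "ST_a": (1, 1, 1),
    "ST_b": (1, 1, 2),
    "ST_c": (2, 2, 3),
}
tree = upgma(profiles)
print(tree)
```

ST_a and ST_b differ only at the combined CRISPR allele and join first; ST_c, which differs at every locus, joins last.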

Nucleotide sequence accession number. DNA sequences of the four genetic MLST

markers were deposited in GenBank under accession numbers HQ329919 to HQ329971.


4.4 Results

Results of MLST and sequence type distribution. In order to gain insights into the

sources and routes of transmission of S. Enteritidis contamination, the 167 isolates were

characterized using an MLST scheme based on virulence genes and CRISPRs which was

previously developed in our laboratories (Dr. Stephen Knabel and Dr. Edward Dudley). fimH,

sseL, CRISPR1 and CRISPR2 identified 12, 13, 14 and 20 allelic types, respectively. In total, 27

sequence types (STs) were identified for all 167 isolates (Fig. 4.6 and Table S2). There were 9,

12 and 15 STs in clinical, poultry and environmental isolates, respectively. For clinical isolates,

the 9 STs were E ST1, 2, 3, 4, 5, 6, 7, 8 and 9. The number of clinical isolates in each sequence

type is listed in Fig. 4.6. Out of the 9 STs found in clinical isolates, 5 STs (E ST2, 5, 6, 7 and 9)

were not found in either poultry or environmental isolates (Fig. 4.6). Those 5 STs came from

California, Georgia, Maine, Michigan and Ohio. For poultry isolates, the 12 STs included E ST1,

3, 4, 8, 12, 15, 21, 22, 23, 24, 25 and 26, and 7 of them (E ST15, 21, 22, 23, 24, 25 and 26) were

only found in poultry isolates. For the 15 STs in environmental isolates (E ST1, 3, 4, 8, 10, 11,

12, 13, 14, 16, 17, 18, 19, 20 and 27), 10 STs (E ST10, 11, 13, 14, 16, 17, 18, 19, 20 and 27) were

exclusively found in the environment.

Predominant STs. An uneven distribution of the 27 STs was observed between different

sources. Overall, the 5 STs designated E ST1, 3, 4, 8 and 10 accounted for 19%, 7%, 17%, 25%

and 8% of the total isolates, respectively, and together made up 76% of all isolates. Out of the 5

predominant STs, 4 of them (E ST1, 3, 4, and 8) were found in clinical, poultry and

environmental isolates. E ST1 made up 12% of clinical isolates, 33% of poultry isolates and 8%

of environmental isolates, respectively. E ST3 was found in 15% of clinical, 9% of poultry and

2% of environmental isolates. E ST4 accounted for 15% of clinical isolates, 30% of poultry


isolates and 3% of environmental isolates. E ST8 comprised 21% of clinical isolates, 14% of

poultry isolates and 40% of environmental isolates. E ST10 was only found in environmental

isolates, where it comprised 21% of all environmental isolates.

Cluster analyses. A cluster diagram based on the virulence genes fimH and sseL identified

three epidemic clones (ECI, ECII and ECIII), each of which included STs from multiple outbreaks

(Fig. 4.4). ECI contained 9 STs: E ST3, 4, 5, 8, 10, 12, 14, 18 and 27. In total, 110 isolates

(66% of total isolates) belonged to ECI, which was the largest cluster and contained 18

clinical, 41 poultry and 51 environmental isolates. ECII contained 22% of all isolates (8 clinical,

24 poultry, and 5 environmental isolates) and 3 STs (E ST1, 9 and 26). ECIII contained 3 STs (E

ST2, 7 and 13) which included 6 clinical isolates and 1 environmental isolate. One outbreak

clone (OC), E ST6, contained 2 clinical isolates from the same outbreak. Besides the 3 ECs and 1

OC, 11 singletons each occupied a single branch on the tree. Among the 11 singletons, 5 STs (E ST11,

16, 17, 19 and 20) were found in the farm environment and 6 (E ST15, 21, 22, 23, 24 and 25)

were found in poultry isolates. These 6 poultry singletons were either sampled from eggs in

broiler hatcheries with hatchability problems or necropsy isolates from sick broilers.

Incorporation of CRISPRs into the MLST method separated isolates within the 3 ECs

(Fig. 4.5). Among the 15 STs in the 3 ECs, 4 STs (E ST1, 3, 4 and 8) were found in all sources

(clinical, poultry and environmental). These STs were also the predominant STs among all

isolates (Fig. 4.3). E ST12 was found in both poultry and environment. The other 10 STs were

from a single source, including 4 STs (E ST2, 5, 7 and 9) found only in clinical isolates, 2 STs (E

ST26 and 27) found only in poultry isolates and 4 STs (E ST10, 13, 14 and 18) found only in

environmental isolates.

Spacer arrangements among STs. Fig. 4.6 shows the differences in spacer

arrangements among STs in CRISPR1 and CRISPR2. In CRISPR1, the number of spacers

ranged from 2 to 25; for CRISPR2, the number of spacers ranged from 3 to 25. Generally, there


were great similarities among STs in the 3 ECs and the OC. The singleton E ST16 also shared

spacers with STs in the 3 ECs and the OC; however, many other spacers were missing. The other

10 singletons contained totally different spacers from each other, except E ST21 and E ST22,

which shared most spacers within CRISPR1, and had identical CRISPR2 loci.
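Comparisons of this kind reduce to asking which spacers two arrays share and which occur in only one of them. The following Python sketch shows one way to do that; the spacer IDs are hypothetical and the helper names are not part of CRISPRcompar or any other tool cited here.

```python
# Minimal sketch: compare two CRISPR spacer arrays (hypothetical spacer IDs).

def compare_arrays(array_a, array_b):
    """Return (shared, only_a, only_b), each kept in array order."""
    set_a, set_b = set(array_a), set(array_b)
    shared = [s for s in array_a if s in set_b]
    only_a = [s for s in array_a if s not in set_b]
    only_b = [s for s in array_b if s not in set_a]
    return shared, only_a, only_b

def same_allele(array_a, array_b):
    """Arrays are one allelic type only if every spacer matches, in order."""
    return list(array_a) == list(array_b)

st_x = ["sp1", "sp2", "sp3", "sp4"]
st_y = ["sp1", "sp3", "sp4"]  # sp2 is absent relative to st_x

shared, only_x, only_y = compare_arrays(st_x, st_y)
print(shared, only_x, only_y)
print(same_allele(st_x, st_y))
```

A single missing spacer is enough to make `same_allele` return False, which matches the one-spacer-difference rule used to define allelic types.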


Table 4.1. Sources, sample types and isolation information for the 167 S. Enteritidis isolates analyzed in the present study

Sources      No. of    Samples                                           States¹    Source²     Year
             isolates
Clinical     34        Clinical (stool; foods related to outbreaks)      13 states  CDC         2001-2009
Poultry      70        Poultry: 46 egg and necropsy isolates of broiler  PA         ADL         2007-2009
                       Poultry: 3 egg isolates from layer houses         PA         PEQAP       1998-1999
                       Poultry: 21 egg isolates from layer houses        PA         PEQAP       2007-2010
Environment  63        Environmental: 46 drag swabs                      PA         PEQAP       1998-1999
                       Environmental: 17 drag swabs                      PA         PEQAP; ADL  2007-2010

¹ Clinical isolates were from 13 states, including CA, CO, CT, GA, ID, ME, MI, MN, OH, OR, PA, SC and WV.

² Isolates were received from CDC (Centers for Disease Control and Prevention), PEQAP (Pennsylvania Egg Quality Assurance Program) and ADL (Animal Diagnostic Lab) in Pennsylvania State University.


Table 4.2. Primers used to amplify and sequence the four MLST markers

Marker   Primer sequence (5'-3')  Note
fimH     CGTCGTCATAAAAGGAAAAA     Forward primer for both amplification and sequencing
         GAACAAAACACAACCAATAGC    Reverse primer for both amplification and sequencing
         CTCGCCAGACAATGTTTACT     Reverse primer for sequencing internal region
         CATTCACTTCGCAGTTTTG      Forward primer for sequencing internal region
sseL     AGGAAACAGAGCAAAATGAA     Forward primer for both amplification and sequencing
         TAAATTCTTCGCAGAGCATC     Reverse primer for both amplification and sequencing
         GGAGTTGAAAATCTTTGGTG     Reverse primer for sequencing internal region
         TTTACCGAGAGAAAAGGTGA     Forward primer for sequencing internal region
CRISPR1  GATGTAGTGCGGATAATGCT     Forward primer for both amplification and sequencing
         GGTTTCTTTTCTTCCTGTTG     Reverse primer for both amplification and sequencing
CRISPR2  ACCAGCCATTACTGGTACAC     Forward primer for both amplification and sequencing
         ATTGTTGCGATTATGTTGGT     Reverse primer for both amplification and sequencing


Figure 4.1. Potential routes of transmission of S. Enteritidis contamination throughout the egg

food system.


Figure 4.2. Schematic view of the two CRISPR systems in Salmonella Enteritidis strain P125109.


The terminal direct repeats are represented by white diamonds. Numbers of direct repeats and

spacers are represented by the numbers of diamonds and white rectangles, respectively. L stands

for leader sequence. cas genes are in grey while other core flanking genes (ygcF, iap and ptps)

are in white. Primer targeting sites are indicated by upward pointing arrows. The figure is not

drawn to scale.


Figure 4.3. Frequency of the five predominant sequence types (E ST1, 3, 4, 8 and 10) in clinical,

poultry and environmental isolates.

The five predominant sequence types accounted for 76% of all isolates analyzed in the present

study. All unlabeled pie slices in panels (b), (c) and (d) are STs unique to the given source, except the

pie slice of E ST10.


Figure 4.4. Cluster diagram based on only fimH and sseL for all 27 sequence types.

MLST identified three epidemic clones (ECI, ECII and ECIII) and one outbreak clone. Clinical,

poultry and environmental isolates are designated by c, p and e, respectively. The number of

isolates from each source is indicated before the source designation.


Figure 4.5. Cluster diagram based on virulence genes and CRISPRs for all 27 sequence types.

Clinical, poultry and environmental isolates are designated by c, p and e, respectively. The

number of isolates from each source is indicated before the source designation.


ST       Source                   Cluster
P125109
EST3     c(5), p(6), e(1)         ECI
EST10    e(13)                    ECI
EST2     c(5)                     ECIII
EST13    e(1)                     ECIII
EST7     c(1)                     ECIII
EST9     c(4)                     ECII
EST12    p(3), e(5)               ECI
EST1     c(4), p(23), e(5)        ECII
EST8     c(7), p(10), e(26)       ECI
EST6     c(2)                     OC
EST14    e(3)                     ECI
EST26    p(1)                     ECII
EST27    e(1)                     ECI
EST4     c(5), p(21), e(2)        ECI
EST18    e(2)                     ECI
EST5     c(1)                     ECI
EST16    e(1)                     Singleton
EST21    p(1)                     Singleton
EST22    p(1)                     Singleton
EST17    e(1)                     Singleton
EST20    e(1)                     Singleton
EST11    e(1)                     Singleton
EST23    p(1)                     Singleton
EST15    p(1)                     Singleton
EST24    p(1)                     Singleton
EST25    p(1)                     Singleton
EST19    e(1)                     Singleton

[CRISPR1 and CRISPR2 columns, spacer positions 1 to 25 in each array: colored spacer symbols not reproduced.]

Figure 4.6. Graphic representation of spacer arrangements in CRISPR1 and CRISPR2 of the 27 S.

Enteritidis sequence types.

Clinical, poultry and environmental isolates are designated by c, p and e, respectively. The

number of isolates from each source is indicated after the source designation. Repeats are not

included; only spacers are listed. The direction of the spacer arrangements is from the 5’ to 3’

leader end. Each unique spacer is represented by a unique combination of the background color

and the color of a particular character. Spacers were aligned and gaps represent the absence of a

particular spacer. Singletons had very distinct arrays of spacers and thus could not be aligned.

P125109 is the strain of S. Enteritidis that has a whole genome sequence deposited in GenBank.


4.5 Discussion

S. Enteritidis outbreaks due to consumption of eggs and chicken pose a great public-

health and financial burden (6, 22, 30, 34). To prevent future outbreaks, it is important to identify

the reservoirs, sources and routes of transmission of S. Enteritidis throughout the chicken and egg

food system. Because S. Enteritidis is a highly clonal serovar, PFGE is often unable to accurately

differentiate outbreak clones (9, 42). Likewise, many other subtyping methods, including

ribotyping and plasmid profiling, were also not discriminatory enough to differentiate S. Enteritidis

outbreak clones and thus were not useful tracking tools (26). Recently an MLST scheme based

on virulence genes and CRISPRs was shown to provide better discrimination for S. Enteritidis

than PFGE and may be a potential tool for tracking S. Enteritidis (29). To test its ability to track

specific subtypes of S. Enteritidis and to enhance our understanding of the epidemiology of S.

Enteritidis, a total of 167 S. Enteritidis isolates were subtyped by this MLST scheme and 27 STs

were identified (Fig. 4.6). Nine STs were observed among clinical isolates and 5 STs (E ST2, 5, 6,

7 and 9) were only found in clinical isolates (Fig. 4.6). It should be noted that our clinical isolates

were from various geographical locations within the U.S.; however, the poultry and

environmental isolates were all from Pennsylvania. Therefore, these STs may represent strains

that were not common in the PA poultry environment. In contrast, the other 4 STs (E ST1, 3, 4

and 8) were found in poultry and environmental isolates in PA (Fig. 4.6). For example, E ST4 was

found in 3 clinical isolates, 21 poultry and 2 environmental isolates from PA (Fig. 4.6 and S1).

This suggests that PA hen houses might be one of the sources of these clinical STs and thus the

MLST scheme based on virulence genes and CRISPRs might be used to track specific clinical

STs back to their geographic origin.


The present MLST scheme identified 5 predominant STs among all 167 isolates included

in the present study (Fig. 4.3). Because four STs (E ST1, 3, 4 and 8) were found in all sources

(clinical, poultry and environmental), they might represent four predominant STs circulating

among humans in the U.S. and poultry and hen house environments in PA. Moreover, these 4

STs were isolated over a period of ten years (Table 4.1 and S1), so they are also persistent STs

among humans in the U.S. and poultry industries in PA. As we were unable to find E ST10

among chicken and clinical isolates, this ST might represent an environment-adapted clone of S.

Enteritidis. A previous MLST study observed clustering of environmental versus clinical isolates

and suggested that environmental isolates might have decreased virulence to humans (23).

Likewise, we speculate that E ST10 might have decreased virulence as well, and plan to test this

hypothesis in the future.

In the present study, virulence genes were used to study the molecular epidemiology of S.

Enteritidis and identified 3 ECs and 1 OC (Fig. 4.4). Virulence genes have previously been used to

determine epidemic clones of Listeria monocytogenes (10) and also the epidemiology of

Salmonella among cow and human isolates, because they reflect virulence capacities of bacterial

isolates (1). It should be noted here that E ST6 might belong to an epidemic clone; however, no

other outbreak was identified in this cluster, probably due to the limited isolates included in the

present study, and thus it was designated an OC. Among the 3 ECs, ECI was the largest cluster

with 4 of the 5 predominant STs (E ST3, 4, 8 and 10) and 110 isolates (66%) from all 3 sources

(clinical, poultry and environmental) (Fig. 4.4). Therefore, ECI might represent a major epidemic

clone and may be responsible for causing most S. Enteritidis cases in the U.S. ECII might be

another major epidemic clone with the predominant sequence type E ST1. Isolates from all

sources were also found in ECII (Fig. 4.4). The existence of major clones of S. Enteritidis in

humans and poultry was also observed by previous studies using pulsed-field gel electrophoresis

(PFGE) (4, 32). The 11 singletons identified in the present study were distant from the 3 ECs and


the OC on the cluster diagram and their virulence gene sequences varied significantly from other

STs (Fig. 4.4).

CRISPRs also separated singletons out from the 3 ECs and the OC due to dramatic

differences in spacer arrangements in both CRISPRs (Tables 10 and 11). Studies previously

demonstrated that bacteria from distant geographic space had strikingly different spacer

arrangements, most likely due to host-phage coevolution (24). Therefore, spacer arrangements

are a good indicator of bacterial adaptation to different microenvironments, which may be driven

by phages unique to specific niches. Therefore, singletons may represent unique ecotypes that are

distinct from STs in the 3 ECs and the OC. Six singletons (E ST15, 21, 22, 23, 24 and 25) may

be adapted to poultry and only pathogenic to chickens for the following reasons: 1) they were

only found in eggs with hatchability problems or necropsy isolates from sick broilers; 2) their

spacer arrangements were dramatically different from those in the 3 ECs and the OC, with no spacers

shared between them (Fig. 4.6); and 3) they showed significant differences in virulence gene

sequences as compared to those in the 3 ECs and the OC (Fig. 4.4). The other 5 singletons (E ST11, 16,

17, 19 and 20) were only found in the production environment, not in humans or poultry; hence,

they may be adapted to this environment. Spacers and virulence genes in these 5 singletons

differed significantly from those in the ECs and the OC and thus they might be nonpathogenic to

both humans and chickens. Future experiments are planned to compare the virulence of

singletons and clinical isolates. Host-phage coevolution may drive CRISPRs to evolve much

faster than virulence genes because ECs grouped by virulence genes were further separated into

many STs by CRISPRs (Fig. 4.4 and 4.5). This increased discrimination by CRISPRs was very

useful because they appeared to accurately separate outbreak clones within epidemic clones.

In conclusion, the present MLST scheme may be a valuable molecular subtyping method

for tracking the spread of S. Enteritidis throughout the poultry and egg food systems. Additional

research is needed to determine the source of S. Enteritidis contamination and the routes of


transmission using poultry isolates from other states in the U.S., as well as isolates from breeder

flocks.


4.6 Acknowledgements

We thank Dr. Bindhu Verghese for technical guidance throughout the study, especially

for the idea of combining CRISPRs into one allele in the cluster analysis. We also acknowledge

the Penn State Genomics Core Facility - University Park, PA for DNA sequencing. This study

was supported by a U.S. Department of Agriculture Special Milk Safety grant to the Pennsylvania

State University (contract: 2009-34163-20132).


4.7 References

1. Alcaine, S. D., Y. Soyer, L. D. Warnick, W. -. Su, S. Sukhnanand, J. Richards, E. D.

Fortes, P. McDonough, T. P. Root, N. B. Dumas, Y. Grohn, and M. Wiedmann. 2006.

Multilocus sequence typing supports the hypothesis that cow- and human-associated

Salmonella isolates represent distinct and overlapping populations. Appl. Environ. Microbiol.

72:7575-7585.







299:43-51.

4. Boonmar, S., A. Bangtrakulnonth, S. Pornrunangwong, J. Terajima, H. Watanabe, K.

Kaneko, and M. Ogawa. 1998. Epidemiological analysis of Salmonella Enteritidis isolates

from humans and broiler chickens in Thailand by phage typing and pulsed-field gel

electrophoresis. J. Clin. Microbiol. 36:971-974.

5. Boxrud, D., K. Pederson-Gulrud, J. Wotton, C. Medus, E. Lyszkowicz, J. Besser, and J.

M. Bartkus. 2007. Comparison of multiple-locus variable-number tandem repeat analysis,

pulsed-field gel electrophoresis, and phage typing for subtype analysis of Salmonella enterica

serotype Enteritidis. J. Clin. Microbiol. 45:536-543.

6. Braden, C. R. 2006. Salmonella enterica serotype Enteritidis and eggs: a national epidemic

in the United States. Clin. Infect. Dis. 43:512-517.


7. Call, D., L. Orfe, M. Davis, S. Lafrentz, and M. Kang. 2008. Impact of compounding error

on strategies for subtyping pathogenic bacteria. Foodborne Pathog. Dis. 5:505-516.



df

9. CDC. 2010. Investigation update: Multistate Outbreak of Human Salmonella Enteritidis

Infections Associated with Shell Eggs. http://www.cdc.gov/salmonella/enteritidis/




11. Chen, Y., and S. Knabel. 2008. Strain typing, p. 203-239. In D. Liu (ed.), Handbook of

Listeria monocytogenes.

12. Cho, S., D. J. Boxrud, J. M. Bartkus, T. S. Whittam, and M. Saeed. 2007. Multiple-locus

variable-number tandem repeat analysis of Salmonella Enteritidis isolates from human and

non-human sources using a single multiplex PCR. FEMS Microbiol. Lett. 275:16-23.

13. Davis, M. A., K. N. K. Baker, D. R. Call, L. D. Warnick, Y. Soyer, M. Wiedmann, Y.

Grohn, P. L. McDonough, D. D. Hancock, and T. E. Besser. 2009. Multilocus variable-

number tandem-repeat method for typing Salmonella enterica serovar Newport. J. Clin.

Microbiol. 47:1934-1938.

14. De Reu, K., K. Grijspeerdt, W. Messens, M. Heyndrickx, M. Uyttendaele, J. Debevere,

and L. Herman. 2006. Eggshell factors influencing eggshell penetration and whole egg

contamination by different bacteria, including Salmonella Enteritidis. Int. J. Food Microbiol.

112:253-260.


15. Desai, M., E. J. Threlfall, and J. Stanley. 2001. Fluorescent amplified-fragment length

polymorphism subtyping of the Salmonella enterica serovar Enteritidis phage type 4 clone

complex. J. Clin. Microbiol. 39:201-206.



in Streptococcus thermophilus. J. Bacteriol. 190:1390-1400.

17. Gast, R. K., and C. W. Beard. 1990. Production of Salmonella Enteritidis-contaminated

eggs by experimentally infected hens. Avian Dis. 34:438-446.


Ribot, B. Swaminathan, and Pulsenet Taskforce. 2006. Pulsenet USA: a five-year update.





158:10-17.

20. Grissa, I., G. Vergnaud, and C. Pourcel. 2008. CRISPRcompar: a website to compare

clustered regularly interspaced short palindromic repeats. Nucl. Acids Res. 36:W145-148.

21. Guard-Petter, J. 2001. The chicken, the egg and Salmonella Enteritidis. Environ. Microbiol.

3:421-430.

22. Kimura, A., V. Reddy, R. Marcus, P. Cieslak, J. Mohle‐Boetani, H. Kassenborg, S.

Segler, F. Hardnett, T. Barrett, and D. Swerdlow. 2004. Chicken consumption is a newly

identified risk factor for sporadic Salmonella enterica serotype Enteritidis infections in the

United States: a case‐control study in FoodNet sites. Clin. Infect. Dis. 38:S244-S252.


23. Kotetishvili, M., O. C. Stine, A. Kreger, J. G. Morris J., and A. Sulakvelidze. 2002.

Multilocus sequence typing for characterization of clinical and environmental Salmonella

strains. J. Clin. Microbiol. 40:1626-1635.

24. Kunin, V., S. He, F. Warnecke, B. S. Peterson, M. Haynes, N. Ivanova, L. L. Blackall, M.

Breitbart, F. Rohwer, D. Mcmahon, and P. Hugenholtz. 2008. A bacterial metapopulation

adapts locally to phage predation despite global dispersal. Genome Res. 18:293-297.


major Salmonella enterica clones. Infect. Genet. Evol. 9:996-1005.

26. Liebana, E., C. Clouting, L. Garcia-Migura, F. Clifton-Hadley, E. Lindsay, E. J.

Threlfall, and R. H. Davies. 2004. Multiple genetic typing of Salmonella Enteritidis phage-

types 4, 6, 7, 8 and 13a isolates from animals and humans in the UK. Vet. Microbiol.

100:189-195.

27. Lindstedt, B. -., E. Heir, T. Vardund, and G. Kapperud. 2000. A variation of the

amplified-fragment length polymorphism (AFLP) technique using three restriction

endonucleases, and assessment of the enzyme combination BglII-MfeI for AFLP analysis of

Salmonella enterica subsp. enterica isolates. FEMS Microbiol. Lett. 189:19-24.



and comparison with pulsed-field gel electrophoresis typing. J. Clin. Microbiol. 38:1623-

1627.

29. Liu, F., R. Barrangou, P. Gerner-Smidt, E. Ribot, S. Knabel, and E. Dudley. Novel

virulence gene and CRISPR multilocus sequence typing scheme for subtyping the major

serovars of Salmonella enterica subspecies enterica . J. Clin. Microbiol., in press.

30. Marcus, R., J. K. Varma, C. Medus, E. J. Boothe, B. J. Anderson, T. Crume, K. E.

Fullerton, M. R. Moore, P. L. White, E. Lyszkowicz, A. C. Voetsch, and F. J. Angulo.


2006. Re-assessment of risk factors for sporadic Salmonella serotype Enteritidis infections: a

case-control study in five FoodNet sites, 2002–2003. Epidemiol. Infect. 7:1-9.

31. Okamura, M., Y. Kamijima, T. Miyamoto, H. Tani, K. Sasai, and E. Baba. 2001.

Differences among six Salmonella serovars in abilities to colonize reproductive organs and to

contaminate eggs in laying hens. Avian Dis. 45:61-69.

32. Pang, J., T. Chiu, R. Helmuth, A. Schroeter, B. Guerra, and H. Tsen. 2007. A pulsed

field gel electrophoresis (PFGE) study that suggests a major world-wide clone of Salmonella

enterica serovar Enteritidis. Int. J. Food Microbiol. 116:305-312.

33. Parker, C. T., S. Huynh, B. Quinones, L. J. Harris, and R. E. Mandrell. 2010.

Comparative genotypes of Salmonella enterica serovar Enteritidis phage type 30 and 9c

strains isolated from three outbreaks associated with raw almonds. Appl. Environ. Microbiol.

doi: 10.1128/AEM.03053-09.

34. Patrick, M. E., P. M. Adcock, T. M. Gomez, S. F. Altekruse, B. H. Holland, and R. V.

Tauxe. 2004. Salmonella Enteritidis Infections, United States, 1985–1999. Emerging Infect.

Dis. 10:1-7.




36. Scott, F., J. Threlfall, J. Stanley, and C. Arnold. 2001. Fluorescent amplified fragment

length polymorphism genotyping of Salmonella Enteritidis: a method suitable for rapid

outbreak recognition. Clin. Microbiol. Infect. 7:479-485.

37. Tamura, K., J. Dudley, M. Nei, and S. Kumar. 2007. MEGA4: molecular evolutionary

genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24:1596-1599.



114


184.

39. Touchon, M., and P. C. E. Rocha. 2010. The small, slow and specialized CRISPR and anti-

CRISPR of Escherichia and Salmonella. PLoS One. 5:e11126.


coevolution past. Proc. R. Soc. Lond., B, Biol. Sci. 277: 2097-2103.



nontyphoidal Salmonella infections in the United States. Clin. Infect. Dis. 38:127-134.


scheme for Salmonella Enteritidis. Emerging Infect. Dis.13:1932-1935.

115

Chapter 5

Conclusions and future research

5.1 Conclusions

In the current study, virulence genes identified three epidemic clones and one outbreak clone of S. Enteritidis among clinical, poultry and environmental isolates. Moreover, this study suggested that virulence genes may be used to identify strains that differ in virulence capacity, which is valuable for epidemiologic investigations. However, virulence genes could not separate individual outbreak clones within an epidemic clone.

The present study suggested that CRISPRs are good epidemiological markers and may be implemented for outbreak investigations. Addition of CRISPR sequences dramatically improved the discriminatory power of this MLST method by accurately differentiating individual outbreak clones. CRISPRs evolve in response to plasmids and/or phage present in the environment, so they may reflect the specific phage and plasmid pool in that environment and hence contain ecologically and geographically meaningful information about the bacteria. As a result, CRISPRs may be useful for tracing an outbreak clone of Salmonella to the specific farm or food processing plant that serves as the reservoir for the source strain of an outbreak.
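The gain in discriminatory power can be illustrated with a minimal Python sketch. It is not part of the typing workflow used in this study; the allelic profiles are copied from Table S2, while the function name group_by_loci and the choice of five sequence types are assumptions made only for illustration.

```python
# Allelic profiles (fimH, sseL, CRISPR1, CRISPR2) for a few S. Enteritidis
# sequence types, copied from Table S2. The selection is illustrative.
PROFILES = {
    "E ST1": (1, 1, 1, 1),
    "E ST3": (1, 3, 1, 3),
    "E ST4": (1, 3, 1, 4),
    "E ST8": (1, 3, 1, 1),
    "E ST9": (1, 1, 1, 6),
}


def group_by_loci(profiles, loci):
    """Group ST names by their allele numbers at the selected locus indices."""
    groups = {}
    for st_name, alleles in profiles.items():
        key = tuple(alleles[i] for i in loci)
        groups.setdefault(key, []).append(st_name)
    return groups


# Hypothetical comparison: virulence genes alone versus all four markers.
virulence_only = group_by_loci(PROFILES, loci=(0, 1))
with_crispr = group_by_loci(PROFILES, loci=(0, 1, 2, 3))

print(len(virulence_only), "groups from fimH and sseL alone")
print(len(with_crispr), "groups once CRISPR1 and CRISPR2 are added")
```

In this subset, fimH and sseL alone place E ST3, E ST4 and E ST8 in a single group, whereas the CRISPR alleles separate all five sequence types.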

This study also showed that MLST based on virulence genes and CRISPRs has the potential to be integrated into public health surveillance laboratories to complement PFGE for Salmonella outbreak investigations. First, the current MLST scheme accurately differentiated outbreak clones of the nine major serovars of Salmonella. Second, the MLST scheme described in the present study has superior discriminatory power compared to previously published MLST schemes for subtyping the major serovars of Salmonella. Third, the current MLST scheme effectively subtyped the two most common PFGE patterns of S. Enteritidis and thus could enhance cluster detection and outbreak investigation capabilities for this highly clonal serovar. Fourth, the MLST scheme was highly congruent with serotyping for the nine major serovars of Salmonella. To summarize, the MLST scheme described in the current study may be an excellent subtyping method for tracking the farm-to-fork spread of the major serovars of Salmonella during outbreaks.

Additionally, the present MLST scheme based on virulence genes and CRISPRs may be an excellent tool to study the epidemiology of S. Enteritidis. MLST identified four STs (E ST1, 3, 4 and 8) that might represent four predominant and persistent STs circulating among humans in the U.S. and in poultry and hen house environments in PA. It also identified an environment-specific sequence type, E ST10, which might represent an environment-adapted clone of S. Enteritidis. Moreover, cluster analysis based on fimH and sseL identified three epidemic clones and one outbreak clone of S. Enteritidis, as well as 11 singletons. Significant differences in virulence gene sequences and spacer arrangements between singletons and the other STs suggested that singletons might have a different virulence capacity than other STs.

In conclusion, the MLST scheme based on virulence genes and CRISPRs may be an excellent tool for subtyping this important foodborne pathogen during outbreak investigations. Additionally, the present MLST scheme may be a valuable molecular subtyping method for tracking the spread of S. Enteritidis throughout the poultry and egg food systems by providing information about the ecological origin of S. Enteritidis isolates.


5.2 Future research

Although this MLST scheme shows great promise, future research is needed to further validate it for molecular epidemiologic purposes. First, future research is needed using a random collection of isolates representing a larger number of outbreaks, or otherwise epidemiologically related isolates, to accurately compare the epidemiologic concordance of the present MLST scheme with that of PFGE. An unbiased strain collection needs to be included in future studies because the isolates selected for this study were biased towards those for which PFGE had poor epidemiologic concordance. Second, isolates from other major serovars of Salmonella should be tested with the present MLST scheme. For instance, serovars Braenderup, Oranienburg, Agona, Infantis and Thompson are of epidemiologic importance because they cause many cases of human salmonellosis. Third, future studies are required to investigate the ability of the present MLST scheme to differentiate outbreak clones of S. Muenchen. Poor epidemiologic concordance for S. Muenchen was observed in the present study, and research is needed to test the following hypotheses: 1) PFGE lacked discriminatory power for subtyping S. Muenchen; 2) CRISPRs evolve too fast in S. Muenchen.

Future research is required to investigate why the current MLST scheme was congruent with serotyping. I speculate that the correlation between serotyping and the MLST scheme results from possible correlations between the markers targeted by each subtyping method. Serotyping targets the O antigens (lipopolysaccharide) and H antigens (peritrichous flagella), which might be involved in recognition of and attachment to host cells. The fimH gene encodes a fimbrial adhesin that allows Salmonella to recognize and adhere to different receptors on host cells, while sseL helps Salmonella to survive and replicate inside the host. CRISPRs defend against phage and plasmids in the specific environment. These markers are all involved in the interaction of Salmonella with the host environment, which might explain the congruence between serotyping and the present MLST scheme. Future research is required to test this hypothesis.

Future studies are required to compare the current MLST scheme with Multiple Loci VNTR Analysis (MLVA) for subtyping the highly clonal serovar S. Enteritidis. A number of MLVA studies showed better discriminatory power than PFGE for subtyping S. Enteritidis. Furthermore, MLVA has been used successfully along with other subtyping methods for S. Enteritidis outbreak investigations. Therefore, a comparison of discriminatory power and epidemiologic concordance between MLVA and the current MLST scheme, using a collection of well-characterized isolates of S. Enteritidis from a number of outbreaks, is needed.
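One common way to put a number on discriminatory power is Simpson's index of diversity, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), where N is the number of isolates typed and n_j is the number assigned to type j. The following Python sketch is illustrative only: this index is not necessarily the measure used in this study, and the isolate labels and method names are hypothetical.

```python
from collections import Counter


def simpsons_index(type_labels):
    """Simpson's index of diversity: 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    n_isolates = len(type_labels)
    if n_isolates < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_type_pairs / (n_isolates * (n_isolates - 1))


# Hypothetical typing results for the same six isolates under two methods.
method_a = ["A1", "A1", "A1", "A1", "A2", "A2"]
method_b = ["B1", "B1", "B2", "B3", "B4", "B4"]

print(round(simpsons_index(method_a), 3))
print(round(simpsons_index(method_b), 3))
```

A higher value means that two isolates drawn at random are more likely to be assigned to different types, so under these made-up labels method_b would be the more discriminatory of the two.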

Previous studies and the current study showed that PFGE banding patterns were not stable during outbreak investigations, which suggested that the genomic content of Salmonella might change during outbreaks. This raises the question of how stable CRISPRs are during outbreaks. The current study showed that CRISPRs appeared to be stable among isolates from the same outbreak; however, future studies are needed to further test the stability of CRISPRs by mimicking the farm-to-fork transmission of Salmonella during outbreaks and comparing CRISPRs in isolates from different transmission points.

Moreover, future research is needed to examine how CRISPR1 and CRISPR2 evolve in Salmonella. The present study suggested that the rate of spacer acquisition and deletion in CRISPRs is suitable for Salmonella outbreak investigations. However, no studies have measured the rate of CRISPR spacer acquisition and deletion. Additionally, future studies should investigate the individual roles of CRISPR1 and CRISPR2 in defending the bacteria against phage. The present study observed more variability in spacer arrangements in CRISPR2 than in CRISPR1 in most serovars, except S. Muenchen, in which CRISPR1 appeared to be more active than CRISPR2. Studies are needed to uncover the reasons for these observations in order to better understand the roles and mechanisms of CRISPRs in different serovars of Salmonella.
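Tracking spacer acquisition and deletion comes down to comparing ordered spacer arrays between related isolates. The following minimal Python sketch reports which spacers differ between two arrays. The spacer identifiers and both arrays are hypothetical and are not data from this study; the function name compare_spacer_arrays is likewise an assumption.

```python
def compare_spacer_arrays(reference, query):
    """Return (missing, gained): spacers found only in the reference array
    and spacers found only in the query array, each in array order."""
    reference_set, query_set = set(reference), set(query)
    missing = [spacer for spacer in reference if spacer not in query_set]
    gained = [spacer for spacer in query if spacer not in reference_set]
    return missing, gained


# Hypothetical spacer arrays for two isolates, listed 5' to 3'.
farm_isolate = ["sp1", "sp2", "sp3", "sp4", "sp5"]
clinical_isolate = ["sp0", "sp1", "sp2", "sp4", "sp5"]

missing, gained = compare_spacer_arrays(farm_isolate, clinical_isolate)
print("missing from clinical isolate:", missing)
print("gained in clinical isolate:", gained)
```

Counting such differences across isolates with known sampling dates is one possible starting point for estimating how quickly spacer arrays change, although a real analysis would also need to account for duplicated spacers and sequencing errors.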

Additional research is also needed to determine the source of S. Enteritidis contamination and the routes of transmission throughout the egg and poultry food system. First, the present study only included poultry and environmental isolates from Pennsylvania, so it would be interesting to include poultry isolates from other states in the U.S. Second, analysis of S. Enteritidis isolates from breeder flocks would help determine whether or not they serve as the original source of S. Enteritidis sequence types causing illness in both poultry and humans. Third, isolates from other potential contamination sources in the egg and poultry food system should be characterized by the present MLST scheme. By comparing sequence types of isolates from different sources along the production system with clinical sequence types, the original contamination source and routes of transmission could be identified. Lastly, the present study identified two major epidemic clones (ECI and ECII) in the U.S. It would be interesting to investigate whether they are also major epidemic clones worldwide. Previous studies have observed the existence of major clones of S. Enteritidis in humans and poultry in different countries by PFGE. Therefore, future studies are required to characterize S. Enteritidis isolates from other countries using the present MLST scheme.
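The comparison of sequence types across sources can be sketched in a few lines of Python. The rows are a small subset copied from Table S2; the variable names and the grouping into clinical versus non-clinical sample types are assumptions for illustration, not the analysis performed in this study.

```python
# (isolate code, sample type, sequence type): a subset of rows from Table S2.
ISOLATES = [
    ("CDC 1", "Clinical", "E ST1"),
    ("CDC 3", "Clinical", "E ST2"),
    ("CDC 6", "Clinical", "E ST3"),
    ("CDC 8", "Clinical", "E ST4"),
    ("CDC 22", "Clinical", "E ST8"),
    ("PEQAP 76", "Env", "E ST10"),
    ("PEQAP 83", "Env", "E ST8"),
    ("PEQAP 88", "Egg", "E ST12"),
    ("ADL 2", "Egg", "E ST1"),
    ("ADL 9", "Fecal", "E ST4"),
    ("PEQAP 3", "Egg", "E ST3"),
]

# Split sequence types by whether the isolate came from a human clinical case.
clinical_sts = {st for _, kind, st in ISOLATES if kind == "Clinical"}
other_sts = {st for _, kind, st in ISOLATES if kind != "Clinical"}

shared = clinical_sts & other_sts          # found in both humans and poultry/environment
clinical_only = clinical_sts - other_sts   # no match along the production system
other_only = other_sts - clinical_sts      # not seen among clinical isolates

print("shared:", sorted(shared))
print("clinical only:", sorted(clinical_only))
print("non-clinical only:", sorted(other_only))
```

For this subset the shared set is E ST1, E ST3, E ST4 and E ST8, which matches the four sequence types reported in Section 5.1 as circulating among both humans and poultry.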

Future experiments are required to compare the virulence of singletons with that of clinical isolates. Data in the present study suggested that six singletons (E ST15, 21, 22, 23, 24 and 25) may be adapted to poultry and pathogenic only to chickens, and that the other five singletons (E ST11, 16, 17, 19 and 20) might be nonpathogenic to both humans and chickens. Future experiments that examine the virulence of these singletons can test these hypotheses. We also speculate that E ST10, the environment-adapted sequence type, might have decreased virulence, and this hypothesis should also be tested in the future.


Future studies are required to investigate sequence type distributions of major serovars in other animals that serve as reservoirs for Salmonella, such as turkeys, cattle and swine. Characterization of isolates from these animals by the current MLST scheme can broaden our understanding of the epidemiology of Salmonella in animals as a whole. Many questions need to be answered. What are the contamination sources and routes of transmission of Salmonella on farms? Are there major epidemic clones that circulate among different animals and cause most cases of salmonellosis? Does sequence type distribution vary among different animals, and are there animal-specific sequence types?

Lastly, future studies are required to investigate virulence genes and CRISPRs as MLST markers for subtyping other important foodborne pathogens. In the present study, MLST based on virulence genes and CRISPRs allowed accurate identification of outbreak clones for the major serovars of Salmonella, so similar MLST schemes could be developed for other foodborne pathogens. CRISPRs have been identified within the genomes of many bacterial species, including many foodborne pathogens. For example, Campylobacter, Vibrio and Yersinia are common foodborne pathogens that cause many human infections and also harbor CRISPR loci. Development of MLST schemes based on virulence genes and CRISPRs for these pathogens could potentially aid outbreak investigations and prevent future disease.


APPENDIX

Supplemental materials

Table S1. Primers used to amplify and sequence other virulence genes

Marker   Primer type¹   Primer sequence (5'-3')
hilA     Forward        TTAATCGTCCGGTCGTAGTG
         Reverse        TCTGCCAGCGCACAGTAAGG
fimH2    Forward        CCTCTTTTATTTGCTCTGCT
         Reverse        GTTATAAGCGAGGTCGTCAG
pipB     Forward        GGGCCTCTGTTTGAATACTT
         Reverse        ACAAAAATCACCTTATATCTTTTT
sopE     Forward        CGTCGCCATAAAAATGAATA
         Reverse        TGCATAGTTATCTAAAAGGAGAA
sseF     Forward        CGCAATCAAGATGAGTTATG
         Reverse        CACTCTCCATATTGGTTTCC
sseJ     Forward        CACTATGCCATTGAGTGTTG
         Reverse        ACCGGCACTATGATATTGAG
siiA     Forward        ATCAGGAGACAACATGGAAG
         Reverse        ATACCGGGAAAAGATAAAGC
sifB     Forward        TCGAATACCACCTATTCCAG
         Reverse        CAGGGGATTGTAAATCCATA
stdA     Forward        CAGGTATTTCAGGGTGTAGG
         Reverse        GTATGATGTATGGCGCTTCT
fimA     Forward        TATTGCGAGTCTGATGTTTG
         Reverse        TGACGGGATTATTCGTATTT
bcfC     Forward        TGCTTAAAAATATGGGGGTA
         Reverse        AAGGAAGGCTGTCGAATAAT
phoQ     Forward        CGATCCACAGTAAAGGAATG
         Reverse        TTGATAAAACCACCTTTCGT

¹ Primers were used for both amplification and sequencing.


[Figure S1 graphic not recoverable from text extraction. Layout: one row per sequence type (ST) or reference strain, with aligned spacer positions shown in two panels, CRISPR1 and CRISPR2. Row labels that survive: SE str. P125109, EST1-4, EST9; SN str. sl254, NST1-6; ST str. LT2, TST1-8; SH str. sl476, HST1-6; SS str. sara23, SST1-6, SST8; JST2-5, JST7, JST9, JST10; IST1-4; McnST1-3, McnST6-12; MvoST1-6.]

Figure S1. Graphic representation of spacer arrangements in CRISPR1 and CRISPR2.

Repeats are not included; only spacers are listed. The spacer arrangements are shown in the 5' to 3' direction. Each unique spacer is represented by a unique combination of background color and character color. Spacers were aligned, and gaps represent the absence of a particular spacer. Singletons had very distinct arrays of spacers and thus could not be aligned. SE str. P125109, SN str. sl254, ST str. LT2, SH str. sl476 and SS str. sara23 are strains of S. Enteritidis, S. Newport, S. Typhimurium, S. Heidelberg and S. Saintpaul, respectively, that have whole genome sequences deposited in GenBank.


Table S2. Source, isolate information and MLST results for the 167 isolates analyzed in the

present study

Code  Source¹  Sample type²  State  Hen house  Year  Bird type  MLST allelic profile (fimH  sseL  CRISPR1  CRISPR2)  MLST ST³

CDC 1 CDC Clinical MN NA 2006 NA 1 1 1 1 E ST1


CDC 3 CDC Clinical CA NA 2001 NA 1 2 2 2 E ST2



CDC 6 CDC Clinical ME NA 2006 NA 1 3 1 3 E ST3


CDC 8 CDC Clinical PA NA 2007 NA 1 3 1 4 E ST4


CDC 10 CDC Clinical GA NA 2005 NA 1 3 1 5 E ST5







CDC 17 CDC Clinical OH NA 2005 NA 1 1 1 6 E ST9



CDC 20 CDC Clinical OH NA 2005 NA 1 1 1 6 E ST9

CDC 21 CDC Clinical NA NA 2001 NA 1 2 2 2 E ST2

CDC 22 CDC Clinical OR NA 2005 NA 1 3 1 1 E ST8


CDC 24 CDC Clinical WV NA 2009 NA 1 3 1 3 E ST3

CDC 25 CDC Clinical NA NA 2001 NA 1 2 2 2 E ST2

CDC 26 CDC Clinical CO NA 2008 NA 1 3 1 3 E ST3

CDC 27 CDC Clinical OR NA 2005 NA 1 3 1 1 E ST8

CDC 28 CDC Clinical SC NA 2005 NA 1 3 1 1 E ST8

CDC 29 CDC Clinical ID NA 2005 NA 1 3 1 1 E ST8


CDC 31 CDC Clinical CO NA 2008 NA 1 3 1 3 E ST3

CDC 32 CDC Clinical MI NA 2007 NA 1 1 1 6 E ST9

CDC 33 CDC Clinical MI NA 2007 NA 1 1 1 6 E ST9

CDC 34 CDC Clinical CT NA 2007 NA 1 3 1 4 E ST4

PEQAP 76 PEQAP Env PA A 1998-1999 NA 1 3 1 19 E ST10

PEQAP 77 PEQAP Env PA E 1998-1999 NA 22 10 14 20 E ST11


PEQAP 79 PEQAP Env PA B 1998-1999 NA 1 3 1 19 E ST10




PEQAP 83 PEQAP Env PA F 1998-1999 NA 1 3 1 1 E ST8


PEQAP 85 PEQAP Env PA NA 1998-1999 NA 1 3 1 1 E ST8

PEQAP 86 PEQAP Env PA C 1998-1999 NA 1 3 1 6 E ST12


PEQAP 88 PEQAP Egg PA B 1998-1999 NA 1 3 1 6 E ST12



PEQAP 91 PEQAP Egg PA B 1998-1999 NA 1 3 1 6 E ST12

PEQAP 92 PEQAP Env PA D 1998-1999 NA 1 3 1 1 E ST8











PEQAP 102 PEQAP Egg PA G 1998-1999 NA 1 3 1 6 E ST12























ADL 1 ADL Egg PA H 2008 Broiler 1 3 1 1 E ST8

ADL 2 ADL Egg PA I 2008 Broiler 1 1 1 1 E ST1







ADL 9 ADL Fecal PA NA 2008 Broiler 1 3 1 4 E ST4

ADL 10 ADL Necropsy PA J 2009 Broiler 1 3 1 4 E ST4

ADL 11 ADL Necropsy PA I 2009 Broiler 1 3 1 4 E ST4

ADL 12 ADL Egg PA K 2009 Broiler 1 3 1 4 E ST4

ADL 13 ADL Env PA K 2009 Broiler 1 3 1 4 E ST4



ADL 16 ADL Necropsy PA J 2009 Broiler 16 18 24 31 E ST25


ADL 18 ADL Egg PA L 2007 Broiler 1 1 1 1 E ST1































PEQAP 2 PEQAP Env PA M 2007 Layer 1 3 1 1 E ST8

PEQAP 3 PEQAP Egg PA M 2008 Layer 1 3 1 3 E ST3



PEQAP 6 PEQAP Env PA N 2008 Layer 1 1 1 1 E ST1

PEQAP 7 PEQAP Egg PA N 2008 Layer 1 1 1 1 E ST1

PEQAP 8 PEQAP Egg PA O 2007 Layer 1 3 1 4 E ST4

PEQAP 10 PEQAP Egg PA P 2009 Layer 1 3 1 1 E ST8

PEQAP 11 PEQAP Env PA P 2010 Layer 1 3 1 1 E ST8

PEQAP 12 PEQAP Egg PA P 2010 Layer 1 3 1 1 E ST8

PEQAP 13 PEQAP Env PA Q 2009 Layer 1 3 1 1 E ST8

PEQAP 14 PEQAP Egg PA Q 2009 Layer 1 3 1 1 E ST8

PEQAP 15 PEQAP Env PA Q 2010 Layer 1 3 1 1 E ST8

PEQAP 16 PEQAP Env PA R 2009 Layer 1 3 1 4 E ST4

PEQAP 18 PEQAP Egg PA S 2008 Layer 1 3 1 1 E ST8

PEQAP 19 PEQAP Env PA T 2009 Layer 1 3 1 22 E ST14

PEQAP 22 PEQAP Env PA U 2007 Layer 1 3 1 22 E ST14

PEQAP 23 PEQAP Env PA U 2007 Layer 1 3 1 71 E ST27

PEQAP 25 PEQAP Egg PA V 2007 Layer 1 3 1 4 E ST4

PEQAP 26 PEQAP Env PA W 2010 Layer 1 3 1 3 E ST3

PEQAP 27 PEQAP Egg PA W 2010 Layer 1 3 1 3 E ST3



PEQAP 30 PEQAP Env PA X 2010 Layer 1 3 1 1 E ST8

PEQAP 31 PEQAP Egg PA X 2010 Layer 1 3 1 4 E ST4

PEQAP 32 PEQAP Env PA Y 2010 Layer 1 3 1 1 E ST8

PEQAP 33 PEQAP Egg PA Y 2010 Layer 17 19 65 73 E ST15

PEQAP 34 PEQAP Env PA Z 2010 Layer 1 3 1 1 E ST8

PEQAP 35 PEQAP Egg PA Z 2010 Layer 1 3 1 1 E ST8

PEQAP 36 PEQAP Egg PA Z 2010 Layer 1 3 1 1 E ST8

PEQAP 38 PEQAP Egg PA AA 2010 Layer 1 3 1 1 E ST8

PEQAP 39 PEQAP Env PA AB 2008 Organic layer 1 3 1 1 E ST8

PEQAP 40 PEQAP Env PA AB 2008 Organic layer 1 1 1 1 E ST1

PEQAP 41 PEQAP Egg PA AB 2008 Organic layer 1 1 1 71 E ST26

PEQAP 42 PEQAP Egg PA AB 2008 Organic layer 1 1 1 1 E ST1

PEQAP 43 PEQAP Env PA AC 2008 Organic layer 1 1 1 1 E ST1

PEQAP 44 PEQAP Egg PA AC 2010 Organic layer 1 1 1 1 E ST1

¹ Isolates were obtained from the CDC (Centers for Disease Control and Prevention), PEQAP (Pennsylvania Egg Quality Assurance Program) and the ADL (Animal Diagnostic Lab) at The Pennsylvania State University.

² Sample types include clinical, egg, fecal, necropsy and environmental isolates. Env stands for environment.

³ ST: sequence type. E: S. Enteritidis. For instance, E ST1 stands for sequence type 1 of S. Enteritidis.
