DNA-Based Signatures Defend Against Biological Warfare Agents and Their Makers


Upload: herbalbiz

Post on 21-Jan-2015



DNA-BASED SIGNATURES DEFEND AGAINST BIOLOGICAL WARFARE AGENTS AND THEIR MAKERS

ARCHANA TIWARI, SUSMIT KOSTA, ROOPESH JAIN

With the end of the Cold War, the threat of nuclear holocaust faded, but another threat emerged: attack by terrorists or even nations using biological agents such as bacteria, viruses, biological toxins, and genetically altered organisms. The former Soviet Union once had a formidable biological weapons program. Now, several countries and extremist groups are believed to possess or to be developing biological weapons that could threaten urban populations, destroy livestock, and wipe out crops. Even terrorists with limited skills and resources could make biological weapons without much difficulty. It's not complex, it's not expensive, and you don't need a large facility. For these reasons, biological weapons have been dubbed the poor man's atomic bomb.

Contributing to the ease of making and concealing biological weapons is the dual-use nature of the materials used to produce such weapons, because they are found in many legitimate medical research and agricultural activities as well. The agents used in biological weapons are difficult to detect and to identify quickly and reliably. Yet early detection and identification are crucial for minimizing their potentially catastrophic human and economic cost.

A major objective of biological warfare defense is developing better equipment, both fixed and portable, to detect biological agents. However, any detection system is dependent on knowing the signatures of organisms likely to be used in biological weapons. These signatures are telltale bits of DNA unique to pathogens (disease-causing microbes). Without proper signatures, medical authorities could lose hours or days trying to determine the cause of an outbreak, or they could be treating victims with ineffective antibiotics. Because of their importance, biological signatures are a key thrust of the effort to improve response to terrorist attacks. Over the next several years, scientific teams expect to produce species-level signatures for all the most likely biological warfare pathogens. The teams also expect to have an initial set of species-level signatures for likely agricultural pathogens, because an attack on a nation's food supply could be just as disruptive as an attack on the civilian population.

Modern health and security concerns have raised interest in the real-time detection and identification of pathogenic microbes. Bacterial and viral pathogens have always represented one of the greatest threats to human health, and in recent times this threat has increased due to the possibility of engineered biological agents. For these and other reasons, the genome sequencing field has targeted and sequenced the complete genomes of hundreds of bacteria and thousands of viruses over the past decade, with many more sequences expected to appear in the near future. These sequences now make it possible to develop probe-based assays capable of identifying any of hundreds of organisms in environmental and clinical samples. Such assays rely on detecting a DNA sequence that distinguishes the target organism from all other known bacteria and viruses and from background material, which could include DNA from humans, other animals, plants, or other species. A probe that accurately distinguishes between a target genome, or set of genomes, and all other background genomes is termed a signature sequence. DNA signatures are nucleotide sequences that can be used to detect the presence of an organism and to distinguish that organism from all other species.
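The idea of a signature sequence can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a candidate signature is a fixed-length substring (a k-mer) found in every target genome and in no background genome; the function names and the toy sequences are hypothetical and are not taken from the chapter.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def candidate_signatures(targets, backgrounds, k):
    """K-mers present in every target genome and absent from all backgrounds.

    Both strands of each background genome are checked, because a probe
    can bind to either strand of double-stranded DNA.
    """
    shared = set.intersection(*(kmers(t, k) for t in targets))
    background = set()
    for b in backgrounds:
        background |= kmers(b, k)
        background |= kmers(reverse_complement(b), k)
    return shared - background


# Toy sequences invented for illustration only (hypothetical data).
target_strains = ["ATGGCGTACCTGA", "TTGGCGTACCAGA"]
background_genomes = ["ATGGCGAAACTGA"]
signatures = candidate_signatures(target_strains, background_genomes, k=6)
print(sorted(signatures))
```

Real genomes are millions of bases long and real probes must tolerate sequencing errors and near matches, so a practical system would need indexed data structures and inexact matching; this sketch only shows the set logic.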

Several Levels of Signature

The prime aim is to develop strain-level signatures for the top suspected agents. Strains are a subset of a species, and their DNA may differ by about 0.1 percent within the species. A species, in turn, is a member of a larger related group (genus), and its DNA may differ by a percent or so from that of other members of the genus. Strain-level signatures are essential for determining the native origin of a pathogen associated with an outbreak; such information could help law enforcement identify the group or groups behind the attack.
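To give a feel for these percentages, the following Python sketch converts a percent divergence into a number of differing bases and measures the divergence between two aligned sequences. The 5-million-base genome length is an assumption taken from the upper end of the range quoted elsewhere in this chapter, and the function names are hypothetical.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * mismatches / len(seq_a)


def expected_differences(genome_length, percent):
    """Number of differing bases implied by a given percent divergence."""
    return round(genome_length * percent / 100.0)


# Assumed genome length of 5 million bases (illustrative only).
strain_level = expected_differences(5_000_000, 0.1)   # about 0.1 percent
species_level = expected_differences(5_000_000, 1.0)  # about 1 percent
print(strain_level, species_level)
```

Under these assumptions, two strains would differ at only a few thousand positions out of millions, which is why strain-level signatures are harder to find than species-level ones.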

The biological foundations work aims to provide validated signatures useful to public health and law enforcement agencies as well as classified signatures for the national security community. In developing these signatures, biological foundations researchers are also shedding light on poorly understood aspects of biology, microbiology, and genetics, such as immunology, evolution, and virulence. Increased knowledge in these fields holds the promise of better medical treatments, including new kinds of vaccines. The biological foundations work is one element in DOE's Chemical and Biological Nonproliferation Program. Livermore's component of this work is managed by its Nonproliferation, Arms Control, and International Security Directorate. Other components of the overall program include detection, modeling and prediction, decontamination, and technology demonstration projects. Livermore researchers were among the first to recognize, in the early 1990s, the tremendous potential of detectors based on DNA signatures. "We knew that a lot of work was necessary to develop the signatures the new detectors would need," says Weinstein. In particular, the researchers recognized several pitfalls. For example, if signatures are overly specific, they do not identify all strains of the pathogen and so can give a false-negative reading. On the other hand, if signatures are based on genes that are widely shared among many different bacteria, they can give a false-positive reading. As a result, signatures must be able, for example, to separate a nonpathogenic vaccine strain from an infectious one.

An ideal signature:
• Has few short regions (<0.1% total)
• Occurs only in the pathogen
  - No false positives
• Occurs in all variants (strains)
  - No false negatives
• Is necessary for virulence
  - "Unspoofable"
  - In engineered organisms
• Tests for antibiotic resistance
• Provides redundancy

Several Levels of Identification

To enhance their detection development effort, researchers are exploring advanced methods that distinguish slight differences in DNA. They are using a multidisciplinary approach: DNA signature development involves a team of microbiologists, molecular biologists, biochemists, geneticists, and computer experts. Much of the work is focused on screening the two to five million bases that make up a typical microbial genome to design unique DNA markers.

This phylogenetic tree is a simple representation of the bacterial kingdom. All human bacterial pathogens belong to the Gram-positive (red) or Proteobacteria (magenta) divisions. The other divisions consist of nonpathogenic bacteria associated with diverse environments. Biological signatures must be able to differentiate infectious bacteria from hundreds of thousands of harmless ones. Each genus of bacteria has many species, and each species can have thousands of different strains.

Researchers are performing suppressive subtractive hybridization to distinguish the DNA of various species of virulent organisms and to find markers that will identify the microbe. The markers, called primer pairs, typically contain segments of about 30 bases and bracket specific regions of DNA that are a few hundred bases long (Figure 1). The bracketed regions are replicated many thousands of times with a detector that uses polymerase chain reaction (PCR) technology. Then they are processed to unambiguously identify and characterize the organism of interest.

Figure 1: Bacterial chromosomes (DNA) form loops, unlike human chromosomes, which form strands. In the loop, between two and five million bases of bacterial DNA are screened to locate unique regions (circled), which are marked with primer pairs. The marked regions are amplified thousands of times using polymerase chain reaction technology and then processed to identify and characterize an organism.

Different signatures will be needed for different levels of resolution. For example, authorities trying to characterize an unknown material or respond to a suspected act of bioterrorism will begin with fairly simple signatures that flag potentially harmful pathogens within a few minutes. Typically, such a signature would encompass one or two primer pairs and be sufficient for identification at the genus level (Yersinia or Bacillus, for example) or below. A signature in the next level of resolution is needed for unambiguously identifying a pathogen at the species level (Yersinia pestis, for example). This signature involves about 10 primer pairs. Currently, it takes several days to obtain conclusive data for a species-level signature. The goal is to reduce that time to less than 30 minutes.

The third signature level is used in pathogen characterization, identifying any features that could affect medical response (for example, harmless vaccine materials versus highly virulent or antibiotic-resistant pathogens). This signature level involves some 20 to 30 primer pairs. Together, the primer pairs offer a certainty of correct identification. Currently, providing such a high level of confidence requires several days; the goal again is to reduce the time to less than 30 minutes. The final signature level, intended primarily for law enforcement use, will permit detailed identification of a specific strain of a pathogen (for example, Yersinia pestis KIM) and correlate that strain with other forensic evidence. Such data will help to identify and prosecute attackers. The typical time lag for results is currently a few weeks, and the goal is to reduce that to a few days.

Biological scientists assemble a list of natural pathogens most likely to be used in a domestic attack. The list includes bacteria, viruses, and other classes of threats, such as agricultural pathogens. Two extremely virulent pathogens head the list: B. anthracis and Y. pestis, which cause anthrax and plague in humans, respectively. Bacillus anthracis has few detectable differences among its strains, whereas Y. pestis strains can vary considerably in genetic makeup. Unraveling the significant differences between the two organisms will give national laboratory researchers experience vital for facing the challenges of the next few years, as they develop signatures for a wide spectrum of microbes.


Mechanisms of offensive biological warfare: creating harmful biological agents

In their natural state, bacteria, viruses, and fungi can make pretty good biological weapons. Throw some genetic engineering into the mix, however, and more harmful agents can emerge.

Each of these organisms maintains its genetic information in the form of DNA or, in some viruses, RNA. This genetic material contains genes, which encode all of the information the organism needs to survive and replicate. Some of these genes govern the organism's pathogenicity, or its ability to infect a cell of a plant or animal. Through genetic engineering, pathogenicity genes may be manipulated to make the organism more infectious, or more resistant to a therapy or cure.

During the Cold War, the Soviet Union ran several offensive biowarfare programs to develop so-called "Super Bugs." One such program, Project Bonfire, worked to create bacteria that were resistant to about ten varieties of antibiotics (Figure 2). This was done by identifying and cutting out genes that conferred antibiotic resistance in many different strains of bacteria. By pasting these genes into the DNA of the anthrax bacterium, the Project Bonfire researchers created a strain of anthrax that resisted any existing cure, making it impossible to treat.

Figure 2: Project Bonfire

The Hunter Program was another Soviet biological warfare research program that focused on combining whole genomes of different viruses to produce completely new hybrid viruses. These artificial viruses could cause unpredictable symptoms that have no known treatment. In an innovative twist, the Hunter Program researchers also created bacteria strains that carried pathogenic viruses inside them (Figure 3).

Figure 3: Hunter Project

These strains would be double trouble: a person who contracted the bacterial disease would likely be treated with an antibiotic, which would stop the infection by disrupting the bacterial cells. This would release the virus, resulting in an outbreak of viral disease. Such a scenario would confuse medical personnel, making treatment very difficult.


Defensive biological warfare: vaccines and detection methods

In 1969, President Richard M. Nixon terminated the U.S. offensive biological warfare program and ordered stockpiles destroyed. The biological warfare research focus shifted from offensive to defensive techniques. Three years later, at the 1972 Biological and Toxin Weapons Convention, more than 100 nations signed a treaty prohibiting the possession of deadly biological agents, except for purposes of defensive research. Nations around the world concentrated on developing vaccines as well as enhancing detection of biological agents. The Soviet Union signed the treaty, but instead of dismantling their offensive program, they stepped up their pace. The Soviet program was not terminated until 1992, after the collapse of the Soviet Union, when Russian President Boris Yeltsin banned all offensive biological weapons-related activity. All biological weapons stockpiles were destroyed and research was considerably downsized, but it is unknown if Russia has completely dissolved the old Soviet program.

Vaccines

Traditionally, vaccines consisted of a preparation of the infectious agent itself, either living, weakened, or killed. Introducing the vaccine into the body activates the immune system, resulting in the production of antibodies against that particular agent. If a vaccinated person is later exposed to the infectious agent, he or she will already have built up immunity against it. More recently, researchers have started using fragments of the pathogen's DNA genome as a vaccine, rather than the entire organism. This approach helps eliminate the risk of infection that comes with using traditional vaccines.

Detection methods

While vaccination helps protect a population from known infectious agents, rapid detection of a suspected act of biowarfare allows fast action to be taken to control the spread of disease. Current detection methods take advantage of the fact that each biological agent maintains its own unique DNA signature. Rapid detection methods use a technique called polymerase chain reaction (PCR) to make a billion copies of a single DNA strand within minutes. This method positively identifies an infectious agent, by means of its DNA signature, using even the tiniest samples.

Putting It All Together

The Dark Winter project gave us a glimpse of how a biowarfare scenario might unfold. But what is the government's planned response to such a scenario?

• Although it may be difficult to confirm right away that an unusual illness in a community is caused by a biological attack, the local health officer is immediately notified. It is this person's responsibility to inform the state health department, which in turn notifies the federal Centers for Disease Control and Prevention (CDC).

• The state health department conducts a full investigation to determine whether the incident is an act of biological warfare. To protect themselves from any potentially harmful biological or chemical agents, investigators at the scene are outfitted in protective suits and self-contained breathing apparatus (SCBA) respirators (similar to the "SCUBA" gear used underwater).

• Investigators collect samples from patients and the surrounding environment, then test them for the presence of harmful biological agents. In order to know which agents to test for, investigators evaluate all of the evidence they collect at the scene, including signs and symptoms shown by patients and patterns of disease transmission. Through a process of deduction, investigators can narrow the list of suspected pathogens to just a few candidates.

• While testing can take place in existing laboratories, it can be performed more quickly in temporary field laboratories, using compact, portable detection units such as Idaho Technology's R.A.P.I.D., which stands for "Ruggedized Advanced Pathogen Identification Device." The R.A.P.I.D. detection unit uses PCR to identify the unique DNA signatures of suspected pathogens (Figure 4). Genomic DNA or RNA extracted from collected samples is added to a cocktail of reagents that will amplify a particular pathogen's DNA signature. If that specific pathogen is present in the sample collected, it will be positively identified using this approach. The entire process from sample preparation to detection takes less than 60 minutes.

• If a biowarfare incident is confirmed or thought to be probable, the state investigators notify the FBI and local law enforcement agencies immediately. Law enforcement and health officials work together to implement a plan to contain the site of contamination, clean it up, and pinpoint the source of the attack.

Figure 4: Idaho Technology's R.A.P.I.D. detection unit.
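The PCR-based detection described in this section can be sketched in Python as a simplified "in-silico PCR". The sketch assumes exact primer matching on a single template strand and ideal doubling of copies in every cycle; real PCR tolerates some mismatches and is less than perfectly efficient. The function names and the toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_amplicon(template, forward_primer, reverse_primer):
    """Return the region bracketed by a primer pair, or None if absent.

    The forward primer must match the template strand exactly. The reverse
    primer binds the opposite strand, so its reverse complement is searched
    for downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    rc = reverse_complement(reverse_primer)
    end = template.find(rc, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(rc)]


def cycles_for_copies(copies):
    """Smallest number of ideal doubling cycles that yields at least `copies`."""
    cycles = 0
    while 2 ** cycles < copies:
        cycles += 1
    return cycles


# Toy sample and primers, invented for illustration only.
sample = "TTTT" + "ATGCCG" + "AAAAAAAA" + "GGTCAC" + "TTTT"
amplicon = find_amplicon(sample, "ATGCCG", "GTGACC")
print(amplicon)
print(cycles_for_copies(1_000_000_000))
```

Under the ideal-doubling assumption, about 30 cycles turn a single strand into roughly a billion copies, which is consistent with the figure quoted in the text. A sample lacking either primer site yields no amplicon, which corresponds to a negative test.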


Figure 5: Two extremely virulent organisms head the list of pathogens most likely to be used by terrorists: B. anthracis (left) and Y. pestis (right), which cause anthrax and plague in humans, respectively.

Focus on Plague

The main focus is on Y. pestis, Francisella tularensis (a bacterium causing a plague-like illness in humans), and several other microbes that threaten human and animal health. Eleven species and many thousands of strains belong to the Yersinia genus. The most notorious species, Y. pestis, causes bubonic plague and is usually fatal unless treated quickly with antibiotics. The disease is transmitted by rodents and their fleas to humans and other animals. The seemingly subtle DNA differences among many Yersinia species mask important differences.

One species causes gastroenteritis, another is often fatal, and a third is virtually harmless; yet all have very similar genetic makeup. Researchers are using insertion sequence-based fingerprinting to understand these slight genetic differences. Insertion sequences are mobile sections of DNA that replicate on their own. Analyzing for their presence will not only help refine signatures for Y. pestis but also shed light on how microorganisms evolve into strains that produce lethal toxins. This understanding, in turn, should give ammunition to researchers seeking an antidote or vaccine. To better understand the genetic differences among species and strains, researchers are comparing the genetic complement of Y. pestis with that of another member of the Yersinia group (Y. pseudotuberculosis) that causes an intestinal disease. The two are closely related, and yet they cause such different diseases.

Better and Faster, with More Uses

There are a number of methods that allow more rapid identification and characterization of unique segments of DNA. Each method has advantages and drawbacks, with some more applicable to one organism than another. In addition to the insertion sequence method, another promising technique is called suppressive subtractive hybridization. The method takes an organism and its near neighbor, hybridizes the DNA from both, and determines the fragments not in common as the basis of a signature. One goal is to simultaneously analyze 96 strains of DNA. The technique is also aiding the poultry industry by providing a handy way to detect Salmonella enteritidis. This bacterium can cause illness if eggs are eaten raw or undercooked. Subtractive hybridization results have been so successful that the signature can now be used to distinguish between subtypes of the salmonella bacterium.

In addition to the DNA-based pathogen detection methods, researchers are developing detection capabilities using antibodies that can tag a pathogen by attaching to a molecular-level physical feature of the organism. Antibody assays are likely to play an important role in pathogen detection because they are generally fast and easy to use (commercial home-use medical tests use this form of assay). Researchers are working to improve these detection methods as well. Studying a bacteriophage (a bacteria-killing virus) that attacks only Y. pestis and none of its cousins, they discovered that the virus produces a unique protein component to attach to the bacterial cell wall at a certain site and gain entry. Recognizing the distinct site could form the basis of a foolproof antibody signature. If this can be achieved with Y. pestis, it may be possible to do it with other pathogens.

Sensing Virulence

As more information about pathogens and their disease mechanisms becomes available and as genetic engineering tools to transplant genes become cheaper and simpler to use, the threat of genetically engineered pathogens increases. Biodetectors must be able to sense the virulence signatures of genetically engineered pathogens, or they will be blind to an entire class of threats. The ultimate objective is to identify several specific virulence factors that might be used in engineered biological warfare organisms so that we can detect these engineered organisms and break their virulence pathway. One key factor useful for detecting engineered organisms is an antibiotic resistance gene. When transplanted into an infectious microbe, the gene could greatly increase the effectiveness of a biological attack and complicate medical response. Some antibiotic resistance genes are widely shared among bacteria and are easily transferred with elementary molecular biology methods. In fact, a standard biotechnology research technique is introducing antibiotic resistance genes into bacteria as an indicator of successful cloning. Such genes need to be recognized rapidly so that the medical response is appropriate.

Another telltale indication of genetic tampering is the presence of virulence genes in a microbe that should not contain them. Virulence genes are often involved in producing toxins or molecules that cause harm or that simply evade a host defense. When a series of genes is made available to perform their functions at the right time, they could cause real damage. If interfering with the action of one of these genes or its proteins interrupts the virulence pathway, the disease process can be halted. Identifying and characterizing important virulence genes and determining their detailed molecular structure will greatly aid the development of vaccines, drugs, and other medical treatments. As an example, Y. pestis disables the immune system in humans by injecting proteins into macrophages, one of the body's key defenders against bacterial attack.


Because the protein acts as an immunosuppressant to disable the macrophage, understanding its structure not only would help scientists fashion a drug that physically blocks the protein but also would shed light on autoimmune diseases such as arthritis and asthma.

Virulence Genes in Common

Virulence genes spread naturally among pathogens and thus are also found in unrelated microbial species. Therefore, virulence genes alone are not sufficient for species-specific DNA-based detection. To differentiate the virulence genes in natural organisms from those in engineered organisms, researchers are using different methods for picking out virulence genes from among the thousands of genes comprising the genomes of pathogens. One technique looks for genes that start making proteins at the internal temperatures of mammals. For example, some genes of Y. pestis become much more active at 37°C. It seems a safe bet that many of these genes are associated with the bacterium multiplying within a warm-blooded host. Researchers are also studying the sequence of the three plasmids (bits of DNA located outside the microorganism's circular chromosome) that contain most of the virulence genes required for full development of the bubonic plague in animals and humans. Plasmids sometimes transfer their genes to neighboring bacteria in what is called lateral evolution. (Antibiotic resistance genes are also located on plasmids.) The Y. pestis strain that causes bubonic plague, for example, may have evolved some 20,000 years ago. Such understanding is relevant to HIV, which may not have become infectious for humans until the 20th century.
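The temperature-based screen mentioned in this section can be sketched as a simple filter over expression measurements. Everything specific in the sketch is an assumption made for illustration: the cooler reference temperature of 26°C, the twofold threshold, the gene names, and the expression values are hypothetical and are not from the chapter.

```python
def temperature_induced(expression, fold_threshold=2.0):
    """Genes whose expression at 37 C is at least fold_threshold times that at 26 C.

    expression maps gene name -> (level_at_26C, level_at_37C).
    Genes with no measurable expression at 26 C are skipped to avoid
    dividing by zero.
    """
    induced = []
    for gene, (cool, warm) in expression.items():
        if cool > 0 and warm / cool >= fold_threshold:
            induced.append(gene)
    return sorted(induced)


# Hypothetical expression levels in arbitrary units; not measured data.
example = {
    "geneA": (10.0, 85.0),
    "geneB": (40.0, 42.0),
    "geneC": (5.0, 9.0),
    "geneD": (12.0, 30.0),
}
candidates = temperature_induced(example)
print(candidates)
```

Genes flagged this way would only be candidates; confirming a role in virulence would still require laboratory work.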

Working with End Users

There needs to be a strong relationship between development of biological signatures and detection technologies and their end uses. One aim is to make diagnostic tools available to regional public health agencies and thus create a national mechanism for responding quickly to bioterrorism threats. Currently, many health agencies use detection methods that are not sufficiently sensitive, selective, or fast. For example, one culture test for detecting anthrax takes two days. Major damage and even death may have occurred in that time. DNA signatures will be thoroughly validated before being released, because their use might lead to evacuations of subways, airports, or sporting events, and such evacuations cannot be undertaken lightly. As part of the validation effort, researchers are characterizing natural microbial backgrounds to make sure that the signatures are accurate under actual conditions. To that end, researchers are collecting background microbial samples in air, water, and soil, as well as in human blood, urine, and saliva. B. anthracis is related to B. thuringiensis, a naturally occurring harmless microbe that lives in dirt and can give a false-positive reading for anthrax if the signature used is not adequately specific. The characterization effort is being aided by a device called the Gene Chip. The device simultaneously monitors the expression of thousands of genes. Equally important, the researchers envision a strong mechanism linking biomedical scientists with public health and law enforcement officials to develop new signatures speedily and cost-effectively to stay several steps ahead of terrorists.

DNA signatures are nucleotide sequences that can be used to detect the presence of an organism and to distinguish that organism from all other species. Here we describe Insignia, a new, comprehensive system for the rapid identification of signatures in the genomes of bacteria and viruses. With the availability of hundreds of complete bacterial and viral genome sequences, it is now possible to use computational methods to identify signature sequences in all of these species, and to use these signatures as the basis for diagnostic assays to detect and genotype microbes in both environmental and clinical samples. The success of such assays critically depends on the methods used to identify signatures that properly differentiate between the target genomes and the sample background. We have used Insignia to compute accurate signatures for most bacterial genomes and made them available through our Web site. A sample of these signatures has been successfully tested on a set of 46 Vibrio cholerae strains, and the results indicate that the signatures are highly sensitive for detection as well as specific for discrimination between these strains and their near relatives. Our approach, whereby the entire genomic complements of organisms are compared to identify probe targets, is a promising method for diagnostic assay development, and it provides assay designers with the flexibility to choose probes from the most relevant genes or genomic regions. The Insignia system is freely accessible via a Web interface and has been released as open source software at: http://insignia.cbcb.umd.edu.

Occurrence and Expression of Insignia

Modern health and security concerns have raised interest in the real-time detection and identification of pathogenic microbes. Bacterial and viral pathogens have always represented one of the greatest threats to human health, and in recent times this threat has increased due to the possibility of engineered biological agents. For these and other reasons, the genome sequencing field has targeted and sequenced the complete genomes of hundreds of bacteria and thousands of viruses over the past decade, with many more sequences expected to appear in the near future. These sequences now make it possible to develop probe-based assays capable of identifying any of hundreds of organisms in environmental and clinical samples. Such assays rely on detecting a DNA sequence that distinguishes the target organism from all other known bacteria and viruses and from background material, which could include DNA from humans, other animals, plants, or other species. A probe that accurately distinguishes between a target genome (or set of genomes) and all other background genomes is termed a signature sequence.


anthracis whose 16S rRNA sequences are identical [Keim et al., 1999, 2000].

Although these methods are effective, they only provide a limited number of signatures, which are not always sufficient to identify bacteria or viruses in a new sample; in particular, if the sample contains an unknown strain, it might contain genetic variability in precisely the region for which assays are designed. Thus, in general, one would like to have as many assays available as possible. Insignia addresses this by using the complete genome to generate all unique signatures, from which the assay designer can choose those that are best suited for a particular application.

Recent increases in the amount of available genomic sequence have made it possible to largely automate the design and screening of probes via computational search algorithms. Large-scale computational prediction of DNA signatures was first undertaken for the Biological Aerosol Sentry and Information System (BASIS), deployed at the Salt Lake City Olympic Games in 2002 [Fitch et al., 2002, 2003]. The related BioWatch project operates by collecting and analyzing airborne microbial samples for known pathogens, using PCR probe-based detection methods. Newer aerosol detection systems, such as the Autonomous Pathogen Detection System (APDS) [McBride et al., 2003], automate the process and can identify a known bioweapon in 0.5 to 1.5 hours [Brown, 2004]. Similar techniques are not limited to aerosols, and can be used in clinical or agricultural settings [Lim, 2005].

The success of these assays depends on both the available sequence databases and the computational methods used to identify signatures that differentiate the threat organisms from the background. Signature design for both BASIS and BioWatch was handled by Lawrence Livermore National Laboratory (LLNL), and what began as a simple proof-of-concept BLAST search at LLNL evolved into the sophisticated KPATH signature pipeline [Slezak et al., 2003]. KPATH identifies sequences shared by a collection of target genomes, yet unique with respect to all other microbial genomes, and is notable for its ability to handle such a large search space. Other methods for probe selection more rigorously address hybridization efficiency (binding energy, self-hybridization, etc.), but do not scale well for large target and background sets [Kaderali and Schliep, 2002; Gordon and Sensen, 2004; Nordberg, 2005; Li F, 2001]. Most notable are the approaches that promise the scalability of KPATH combined with the hybridization considerations of the other methods [Tembe et al., 2007; Rahmann, 2003].

Because of its history of use in real-world diagnostic systems, a more detailed description of KPATH is warranted. It consists of four major components. First, a whole-genome multi-alignment is performed on a set of target genomes. This produces a "consensus gestalt," which represents the sequences that are conserved in all the target genomes. Next, this consensus is matched against a database of background sequences using Vmatch [Kurtz, 2003]. This step computes all exact matches between the target consensus and

By definition, a signature sequence must be conserved among a set of target genomes and dissimilar to any sequence in the surrounding environment. To detect a target with existing technology such as qPCR assays, signatures must be relatively short; however, if they are too short, they will not be unique. For example, because there are only 4^10 ≈ 1 million 10-bp (base-pair) sequences, and a typical bacterial genome is more than 1 million bp in length, most 10-mers will be shared by many genomes and therefore make unsuitable signatures. Increasing the length, k, of the signature alleviates this problem, but if k is too large, it may not be possible to find a signature shared by a set of target genomes. Therefore, there is a tradeoff between signature sensitivity (the number of genomes that share the signature) and specificity (the number of genomes that do not possess the signature). For instance, a long signature may be highly specific to a particular strain or isolate, but it may not be sensitive enough to detect closely related strains that might cause the same disease or have other shared phenotypic characteristics. Because genomic sequence is nonrandom, and only a small sample of genomes has been sequenced, it is difficult to estimate an optimal signature length. In practice, signature length is usually determined by the constraints of the detection technology (e.g., ~20 bp for PCR primers).
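The arithmetic behind the 10-mer example can be checked with a short sketch. It assumes an idealized, uniformly random genome, which the text itself notes is not realistic, so the numbers are only a rough guide; the function names are illustrative and not part of Insignia.

```python
def kmer_space(k: int) -> int:
    """Number of distinct DNA sequences of length k over {A, C, G, T}."""
    return 4 ** k


def expected_occurrences(k: int, genome_length: int) -> float:
    """Expected count of one fixed k-mer on a single strand of a uniformly
    random genome (an idealization; real genomes are nonrandom)."""
    positions = max(genome_length - k + 1, 0)
    return positions / kmer_space(k)


if __name__ == "__main__":
    # Compare short and long signature lengths against a 1 Mbp genome.
    for k in (10, 14, 18, 20):
        print(k, kmer_space(k), expected_occurrences(k, 1_000_000))
```

Under this assumption a given 10-mer is expected roughly once per 1 Mbp genome, while a given 20-mer is expected far less than once, which is the intuition for why longer signatures are more specific.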

Current probe-based technologies are generally based on either PCR or microarray hybridization. These methods are beginning to replace traditional gel-based fingerprinting because they can more effectively differentiate between closely related microbes [Willse et al., 2004]. Microarray methods are particularly promising because of their ability to multiplex many probes on a single chip [Willse et al., 2004; Wang et al., 2002; Volokhov et al., 2004], improving both the redundancy and capabilities of the diagnostic. PCR does not multiplex as nicely; however, it remains popular because of its robustness, speed, and low cost [Slezak et al., 2003; O'Connell et al., 2006; Moser, 2006]. Unlike restriction fingerprinting, both PCR and microarray methods require explicit knowledge of the underlying DNA sequence, therefore necessitating probe design.

Traditional probe design strategies have focused on single genes or other loci that are determined a priori to be useful in distinguishing one target organism from another. Examples include genes that are associated with phylogenetic distance (e.g., 16S rRNA genes) and variable number tandem repeats (VNTRs). In the former case, where the gene or locus is conserved among target and nontarget organisms, gene sequence alignments would be used to aid in probe design. Probes would then be manually designed and screened for sensitivity and specificity to the target. Those assays failing to identify all target organisms, or producing false positives, would be invalidated and the design revised. This manual screening made diagnostic assay design expensive and only worth doing for a few select pathogens. Alternatively, variable number tandem repeats (VNTRs) have proven very useful in classifying and distinguishing many closely related strains of bacteria, such as Bacillus


the matches may take days to compute, the signatures can be extracted from this cached information in seconds.

Match Pipeline

The function of the match pipeline is to identify exact matches between all pairs of target and background sequences in the database. The size of the Insignia sequence database is currently about 60 billion nucleotides, and even with the linear-time algorithms described below, this is too large to search in real time. Some computational effort is saved by limiting targets to microbial genomes only, but the process of matching all pairs of target and background genomes remains expensive.

To complete the matching phase within a reasonable amount of time, all exact matches of 18 bp or longer are first identified using MUMmer [Delcher et al., 1999; Delcher et al., 2002; Kurtz et al., 2004], a linear time and space suffix tree matching algorithm. To expedite the process, MUMmer searches are partitioned across a 192-node Linux cluster. Even with the use of an efficient search algorithm, however, the size of the database and the high repeat content of many genomes cause the size of the output (the number of matches between all pairs of genomes) to reach unmanageable levels (e.g., the number of matches can be quadratic with respect to the size of the genomes). To combat this problem, matches are converted to a minimalized "match cover" data structure, described next. This structure saves substantial space and later provides a convenient mechanism for computing signatures.

The match cover is not a lossless conversion, however, because it discards information about where a match occurred in the background. The information is nonetheless sufficient for signature computation, where it suffices to know which regions of a target are unique. Furthermore, by excluding irrelevant background match positions, large background databases can be accommodated without drastically increasing the match cover size, and draft quality genomic sequences can be incorporated without difficulty. As the next section will show, the match cover encapsulates all the necessary information for signature discovery and allows for the rapid construction of signatures for any set of target and background genomes in linear time.
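To make the idea of a match cover more concrete, here is a minimal sketch. It assumes the cover is simply the set of maximal matched intervals on the target, with background positions thrown away; the text does not give Insignia's actual data layout, so the interval convention (half-open `[start, end)`) and both function names are illustrative assumptions.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the target sequence


def match_cover(matches: List[Interval]) -> List[Interval]:
    """Keep only maximal intervals: drop any match contained in another."""
    cover: List[Interval] = []
    # Sort by start ascending, then end descending, so a containing
    # interval is always visited before the intervals it contains.
    for start, end in sorted(set(matches), key=lambda m: (m[0], -m[1])):
        if cover and end <= cover[-1][1]:
            continue  # contained in the last kept interval
        cover.append((start, end))
    return cover


def unique_kmer_starts(target_len: int, cover: List[Interval], k: int) -> List[int]:
    """Start positions of k-mers that are not wholly inside any cover
    interval, found in one linear sweep over the target."""
    starts: List[int] = []
    j = 0
    for i in range(target_len - k + 1):
        # Intervals ending before this k-mer ends cannot contain it,
        # nor any later k-mer, so they are skipped permanently.
        while j < len(cover) and cover[j][1] < i + k:
            j += 1
        covered = j < len(cover) and cover[j][0] <= i
        if not covered:
            starts.append(i)
    return starts


if __name__ == "__main__":
    cover = match_cover([(0, 5), (1, 4), (3, 9), (3, 9), (12, 15)])
    print(cover)
    print(unique_kmer_starts(15, cover, 4))
```

Because the kept intervals have strictly increasing start and end coordinates, a single forward pointer is enough, which is one way to see why signature extraction from such a structure can run in linear time.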

For perspective, it is worth mentioning that the match cover is an equivalent, interval representation of matching statistics [Chang & Lawler, 1994; Gusfield, 1991]. Both formalizations represent the longest contiguous match beginning at any position of a sequence, but our interval representation is space-efficient and easier to interpret in the context of signature discovery. Rahmann also leverages the properties of matching statistics in describing a "jump list" for the discovery of DNA probes [Rahmann, 2003], and it is interesting to note that although the match cover and jump list were arrived at independently, they are analogous given their shared utilization of matching statistics.
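The relationship between matching statistics and an interval view can be illustrated with a deliberately naive sketch. It uses plain substring search rather than the suffix-tree methods cited in the text, so it is only a reference for small inputs; the function names and the `min_len` threshold parameter are assumptions made for illustration.

```python
from typing import List, Tuple


def matching_statistics(target: str, background: str) -> List[int]:
    """ms[i] is the length of the longest prefix of target[i:] that occurs
    somewhere in background (naive reference version, not linear time)."""
    ms: List[int] = []
    for i in range(len(target)):
        length = 0
        while (i + length < len(target)
               and target[i:i + length + 1] in background):
            length += 1
        ms.append(length)
    return ms


def to_intervals(ms: List[int], min_len: int) -> List[Tuple[int, int]]:
    """Interval view of matching statistics: keep [i, i + ms[i]) when it is
    at least min_len long and not contained in the previously kept match."""
    out: List[Tuple[int, int]] = []
    for i, length in enumerate(ms):
        if length < min_len:
            continue
        if out and i + length <= out[-1][1]:
            continue  # wholly inside the previous interval
        out.append((i, i + length))
    return out


if __name__ == "__main__":
    ms = matching_statistics("ACGTACGG", "TTACGTTT")
    print(ms)
    print(to_intervals(ms, 3))
```

The interval list carries the same information about long matches as the per-position array but stores one entry per maximal match instead of one per position.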


the background. Matching sequences are masked out to create a "uniqueness gestalt," which represents all sequences that are shared between target genomes and unique with respect to the background. Third, signature sequences are supplied to the Primer3 program [Rozen and Skaletsky, 2000], which designs PCR assays based on those sequences. Primer3 produces a set of oligos suitable for testing by a TaqMan PCR assay: a forward primer, a reverse primer, and an intervening probe oligomer [Livak et al., 1995]. Finally, assay candidates are screened using BLAST [Altschul et al., 1990] for near matches that might disrupt the hybridization process, and ranked according to their satisfaction of PCR experimental constraints. The result of this four-stage process is a set of ranked, prescreened assays, which are then subjected to rigorous laboratory validation. The transition to these computational methods from previously manual design methods has resulted in greatly increased design efficiency by limiting the number of assays that fail during laboratory validation.

In addition to the computational restrictions, limitations of TaqMan PCR have been demonstrated for rapidly diverging target genomes, such as hepatitis and HIV viruses [Gardner et al., 2004; Gardner et al., 2003]. However, for typical bacterial targets, TaqMan assays remain one of the most rapid and sensitive methods for signature detection. In the case where TaqMan is inadequate, different detection technologies, such as chip-hybridization methods, could be used to remove the TaqMan requirement for three adjacent probes and to provide greater signature redundancy. Insignia would easily support the design of such assays.

Viruses pose significant challenges for all detection methods because of their small genomes and high mutation rates. The Insignia database contains thousands of viral genomes; however, for large target sets there are often no conserved signatures. To address highly divergent targets, future Insignia versions may include the ability to identify signatures with degenerate bases, for cases where no exact signature is shared between them. An alternative is to compute the minimum signature set, where each signature might not identify every target, but the set contains at least one identifying signature for each target. This approach is particularly suited for chip assays where signatures can be multiplexed. A related approach selects combinations of non-unique probes, such that certain viral strains can be identified by their hybridization pattern [Urisman et al., 2005]. Insignia support for specialized viral diagnostics is left for future work.
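The minimum signature set idea is an instance of the set cover problem. The sketch below uses the standard greedy heuristic, which is an assumption on our part: the text does not say how such a set would be computed, and greedy selection gives a small set but not necessarily the smallest one. The signature and target labels are hypothetical.

```python
from typing import Dict, List, Set


def greedy_signature_set(coverage: Dict[str, Set[str]],
                         targets: Set[str]) -> List[str]:
    """Pick signatures until every target is identified by at least one.

    coverage maps a signature label to the set of targets it identifies.
    Ties are broken by label order so the result is deterministic.
    """
    uncovered = set(targets)
    chosen: List[str] = []
    while uncovered:
        best = max(sorted(coverage),
                   key=lambda sig: len(coverage[sig] & uncovered),
                   default=None)
        gained = coverage[best] & uncovered if best is not None else set()
        if not gained:
            raise ValueError("some targets have no identifying signature")
        chosen.append(best)
        uncovered -= gained
    return chosen


if __name__ == "__main__":
    coverage = {
        "sigA": {"t1", "t2"},
        "sigB": {"t2", "t3"},
        "sigC": {"t3", "t4"},
        "sigD": {"t1"},
    }
    print(greedy_signature_set(coverage, {"t1", "t2", "t3", "t4"}))
```

On a multiplexed chip each chosen signature would occupy one probe position, so a smaller set leaves more room for redundancy.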

Insignia provides real-time signature retrieval for an arbitrary set of target and background genomes. This requires the vast majority of computational work to be done in advance and cached, so that a minimum amount of computation is necessary at the time of the query. To accommodate this, Insignia is designed as two separate components: the match pipeline and the signature pipeline. This distinction separates the computationally intensive matching step from the much simpler signature generation step, and allows sequence matches to be recomputed offline as new genomes become available. While


Signature Pipeline

The function of the signature pipeline is to generate valid signatures for any set of target and background genomes. Because there are thousands of possible targets and many more backgrounds, combinatorics rules out the pre-computation of all signatures; however, it is possible to generate signatures from the match information with minimal overhead. The pipeline for doing so is divided into two parallel stages, corresponding to the two primary criteria a valid signature must meet:

1. a signature must be shared by all genomes in the target set; and

2. a signature must not exist in any genome in the background set.
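The two criteria above can be stated directly as set operations on k-mers. The following is a brute-force sketch for toy inputs only, not the cached match-cover approach Insignia uses; it ignores reverse-complement strands for brevity, and the function names are illustrative.

```python
from typing import Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(targets: Iterable[str],
                    backgrounds: Iterable[str],
                    k: int) -> Set[str]:
    """k-mers present in every target (criterion 1) and absent from
    every background (criterion 2)."""
    target_sets = [kmers(t, k) for t in targets]
    if not target_sets:
        return set()
    shared = set.intersection(*target_sets)   # criterion 1
    for background in backgrounds:
        shared -= kmers(background, k)        # criterion 2
    return shared


if __name__ == "__main__":
    print(signature_kmers(["AACGTT", "GACGTA"], ["TTTT"], 4))
```

Materializing every k-mer set is far too expensive at the database sizes described earlier, which is why the real pipeline works from precomputed match information instead.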

Occurrence and Expression of MannDB

MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from GenBank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the GenBank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO.

MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports. Access to MannDB is freely available at http://manndb.llnl.gov/.

MannDB was created to meet a need for rapid, comprehensive sequence analysis with an emphasis on protein processing, surface characteristics, and functional classification to support selection of pathogen or virulence-associated proteins suitable as targets for driving the development of protein-based reagents (e.g., antibodies, non-natural amino-acid ligands, synthetic high-affinity ligands) for pathogen detection [Slezak et al., 2003; Zhou et al., 2005]. Because comprehensive analyses of this type required using a large number of open-source tools, and because it was necessary to scale the computations for analysis of whole proteomes, we built a fully automated system for executing sequence analysis tools and for storage, integration, and display of protein sequence analysis and annotation data. In order to be able to rapidly examine and compare whole bacterial and viral proteomes for selection of suitable target proteins for bio-defense applications, we compiled data for whole proteomes from representative organisms from all categories of biological threat agents listed by several governmental agencies: APHIS, CDC, HHS, USDA, USFDA, NIAID, and WHO [APHIS Agricultural Select Agent Program select agent and toxin list; CDC bioterrorism agents/diseases list; HHS and USDA select agents and toxins list; USFDA Bad Bug Book; NIAID category A, B and C priority pathogens; WHO list of major zoonotic diseases; WHO list of diseases covered by the Epidemic and Pandemic Alert and Response (EPR)], as well as taxonomic near-neighbor species as appropriate. Therefore, the scope of MannDB is automated sequence analysis and evidence integration for proteins from all currently recognized bio-threat pathogens. Emphasis is placed upon analyses that are most useful in characterizing potential protein targets and surface motifs that could be exploited for development of detection reagents. The content of MannDB is updated on a regular basis.

In recent years several software systems and accompanying databases have been developed for microbial genome annotation, each with a particular emphasis [Andrade et al., 1999; Frishman et al., 2001; Gattiker et al., 2003; Goesmann et al., 2005; Markowitz et al., 2005; Meyer et al., 2003; Vallenet, 2006; Van Domselaar et al., 2005]. Some databases place an emphasis on gene prediction and DNA-based analyses vs. protein sequence-based analyses, or provide automated (primary) vs. curated (secondary) annotations. Although microbial annotation databases frequently include predictions of biological, chemical, structural, and physical properties of proteins (e.g., antigenicity, post-translational modifications, hydrophobicity, membrane helices), none currently offers the comprehensive suite of analyses (see MannDB website for complete list of tools) contained within MannDB for characterizing viral as well as bacterial proteins from human and agricultural/veterinary pathogens of interest to the bio-defense community and for rapidly identifying putative virulence-associated proteins for development of functional assays. The MannDB database was built and linked to MvirDB [MvirDB microbial virulence database] in order to meet these requirements. In addition, we focus on sequence analyses that assist in selection of protein features (e.g., surface characteristics) most suited for targeting detection reagent development.

Construction and Content

MannDB is implemented as an Oracle 10g relational database. The schema for MannDB data organization is available on the website. MannDB captures results from our fully automated, high-throughput, whole-proteome sequence analysis process pipeline, depicted in Fig. 6. Proteomes (lists of hypothetical and known proteins) representing human bacterial and viral pathogens and near-neighbor species are downloaded from GenBank and parsed into MannDB. Whenever possible, we begin with gene calls on finished genomes. However, the system can be used to predict genes on draft genomes, and can be used to analyze arbitrary lists of protein sequences. Reference genomes are updated on a quarterly basis to ensure that the software tools are being run on current sequence data. Annotations from SwissProt are downloaded when GenBank entries contain SwissProt identifiers, or when identical sequences are detected by blasting MannDB entries against the SwissProt protein fasta database. MannDB contains at least one reference genome for each category of pathogen listed as a bio-threat organism on websites maintained by APHIS, CDC, HHS, USDA, USFDA, NIAID, and WHO.

Open-source tools are run either on local systems or by means of batch submission to external servers. As of this writing the system executes 36 tools, which are listed on the MannDB web site. Automated sequence analyses include predictions of post-translational modifications, structural conformation, chemical properties, functional assignment, and antigenicity, as well as motif detection and pre-computed BLAST against protein and nucleic acid sequences in MvirDB, our database of microbial virulence factors, protein toxins, and antibiotic resistance genes [MvirDB microbial virulence database]. Tools that are run in-house are updated periodically to ensure that the system is running the most recent software versions against the most recent data sets. Tools are selected, and input parameters are set, according to the taxon of the organism from which the protein set is constructed. For example, some tools (e.g., NetPicoRNA [Blom et al., 1996]) are run only on specific organisms, whereas others (e.g., SignalP [Bendtsen et al., 2004]) have taxon-specific settings. In some cases we run more than one tool for a similar prediction. TMHMM and TopPred both predict membrane helices, but results may differ, for example, in the start and end residues for a given segment. Our strategy is to employ more than one tool, when available, so that conflicting results can be noted and evaluated by the user. In parsing results from each tool, data are inserted into one of nine tables (see schema on web site) depending on the type of prediction (e.g., protein chemistry); tools that make similar predictions tend to produce similarly structured output (although formatting differs considerably), which facilitates data storage and retrieval.
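As a rough illustration of how conflicting results from two predictors might be surfaced for the user, the sketch below compares two lists of predicted membrane-helix segments. It does not call TMHMM or TopPred and does not reflect MannDB's schema; the segment coordinates, the `tolerance` parameter, and the function name are all hypothetical.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int]  # inclusive (start, end) residue positions
Row = Tuple[Optional[Segment], Optional[Segment], str]


def compare_segments(first: List[Segment], second: List[Segment],
                     tolerance: int = 0) -> List[Row]:
    """Pair overlapping segments from two predictors and label each pair
    as agreeing, differing at the boundaries, or found by one tool only."""
    def overlaps(a: Segment, b: Segment) -> bool:
        return a[0] <= b[1] and b[0] <= a[1]

    report: List[Row] = []
    for seg_a in first:
        partners = [seg_b for seg_b in second if overlaps(seg_a, seg_b)]
        if not partners:
            report.append((seg_a, None, "only in first tool"))
        for seg_b in partners:
            same = (abs(seg_a[0] - seg_b[0]) <= tolerance
                    and abs(seg_a[1] - seg_b[1]) <= tolerance)
            report.append((seg_a, seg_b,
                           "agree" if same else "boundaries differ"))
    for seg_b in second:
        if not any(overlaps(seg_a, seg_b) for seg_a in first):
            report.append((None, seg_b, "only in second tool"))
    return report


if __name__ == "__main__":
    # Made-up predictions standing in for the output of two tools.
    for row in compare_segments([(10, 30), (50, 70)], [(12, 31), (90, 110)]):
        print(row)
```

Keeping both predictions side by side, rather than merging them, matches the stated strategy of letting the user evaluate disagreements.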

A web client browser enables viewing of automated analysis results, annotations, and links to MvirDB. The user first selects a proteome, then a specific protein for which to view summary results, and finally selects the specific categories of analysis to be viewed. Only analyses returning results are displayed. Hyperlinks to external data sources are provided for additional information whenever external database identifiers are returned. The MannDB toolset includes a BLAST interface, which can be used to quickly identify an entry of interest by its sequence, when the gene name or locus tag is unknown, or to identify protein sequences related to a sequence of interest. A query tool allows the user to construct 3 types of searches: 1) free-text searches against database fields that contain descriptive information, including fields containing gene names or external database identifiers; 2) structured searches against specific analysis types; and 3) a search for proteins linked to entries in MvirDB either by common unique identifier or by pre-computed blast homology. Reports and results sets from the query tool can be downloaded into Excel.

Zhou et al., BMC Bioinformatics 2006, 7:459. doi:10.1186/1471-2105-7-459

Utility

MannDB provides users with pre-computed sequence analyses for complete proteomes of bacterial and viral pathogens from several governmental agencies' lists of bio-threat agents. The genomes and tools are maintained up to date, with predictions being re-run every 3 months. The user can browse proteomes, or can blast sequences against MannDB to pull up related entries and associated data. MannDB provides a convenient source of automated sequence analyses and downloaded annotation information for whole proteomes of human pathogenic bacteria and viruses and has a high degree of integration with external databases.

MannDB provides sequence analysis information of primary interest to researchers in the bio-defense community. We have been using MannDB for several years to "annotate" DNA signatures [Slezak et al., 2003] and more recently to assist collaborators in efforts to down-select from whole bacterial and viral genomes to identify suitable protein targets and protein features for driving the development of detection reagents [Zhou et al., 2005]. For example, a common requirement for a detection assay is that it be performed with minimal sample disruption. Therefore, an initial down selection for proteins expected to be on the surface of a bacterial particle might entail identification of proteins that are predicted to be secreted or membrane bound by using tools such as PSORT [Gardy et al., 2005; Nakai and Horton, 1999], TMHMM [Krogh et al., 2001], SignalP, TargetP [Emanuelsson et al., 2000], TopPred [Claros et al., 1994], and HMMTOP [Tusnady and Simon, 1998]. Having results from several tools that provide similar predictions but use different algorithms or slightly different approaches allows us to compare predictions and make selections with greater confidence. Identification of surface features for targeting of detection reagents is done primarily by means of additional sequence- and structure-based analyses [Zhou et al., 2005], although predictions pertaining to post-translational modifications (e.g., glycosylation, cleavage) are taken into consideration as they may affect protein recognition.

Availability and Requirements

MannDB is freely accessible at http://manndb.llnl.gov/. Although the software that populates and updates MannDB is not open-source, the user may request collaborative sequence analysis services by contacting [email protected].

List of abbreviations

BLAST = Basic Local Alignment Search Tool.

APHIS = Animal and Plant Health Inspection Service.

CDC = Centers for Disease Control and Prevention.

HHS = Health and Human Services.

USDA = United States Department of Agriculture.

USFDA = United States Food and Drug Administration.

NIAID = National Institute of Allergy and Infectious Diseases.

WHO = World Health Organization.

Comparative genomics tools applied to bioterrorism defence

Rapid advances in the genomic sequencing of bacteria and viruses over the past few years have made it possible to consider sequencing the genomes of all pathogens that affect humans and the crops and livestock upon which our lives depend. Recent events make it imperative that full genome sequencing be accomplished as soon as possible for pathogens that could be used as weapons of mass destruction or disruption. This sequence information must be exploited to provide rapid and accurate diagnostics to identify pathogens and distinguish them from harmless near-neighbours and hoaxes. The Chem-Bio Non-Proliferation (CBNP) programme of the US Department of Energy (DOE) began a large-scale effort of pathogen detection in early 2000 when it was announced that the DOE would be providing bio-security at the 2002 Winter Olympic Games in Salt Lake City, Utah. Our team at the Lawrence Livermore National Lab (LLNL) was given the task of developing reliable and validated assays for a number of the most likely bioterrorist agents. The short timeline led us to devise a novel system that utilised whole-genome comparison methods to rapidly focus on parts of the pathogen genomes that had a high probability of being unique. Assays developed with this approach have been validated by the Centers for Disease Control (CDC). They were used at the 2002 Winter Olympics, have entered the public health system, and have been in continual use for non-publicised aspects of homeland defence since autumn 2001. Assays have been developed for all major threat list agents for which adequate genomic sequence is available, as well as for other pathogens requested by various government agencies.

Collaborations with comparative genomics algorithm developers have enabled our LLNL team to make major advances in pathogen detection, since many of the existing tools simply did not scale well enough to be of practical use for this application. It is hoped that a discussion of a real-life practical application of comparative genomics algorithms may help spur algorithm developers to tackle some of the many remaining problems that need to be addressed. Solutions to these problems will advance a wide range of biological disciplines, only one of which is pathogen detection. For example, exploration in evolution and phylogenetics, annotating gene coding regions, predicting and understanding gene function and regulation, and untangling gene networks all rely on tools for aligning multiple sequences, detecting gene rearrangements and duplications, and visualising genomic data. Two key problems currently needing improved solutions are: (1) aligning incomplete, fragmentary sequence (e.g. draft genome contigs or arbitrary genome regions) with both complete genomes and other fragmentary sequences; and (2) ordering, aligning and visualising non-colinear gene rearrangements and inversions in addition to the colinear alignments handled by current tools.
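The notion of focusing on parts of a pathogen genome that are likely to be unique can be sketched with a simple k-mer comparison. This is an illustrative stand-in only: the text describes whole-genome comparison methods, whereas the code below uses exact k-mer matching, and the function names, toy sequences and choice of k are assumptions. Each k-mer is stored in canonical form (the smaller of itself and its reverse complement), so a region that appears inverted in a near-neighbour is still recognised as shared.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Set of k-mers in `seq`, each reduced to a strand-independent form."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def unique_regions(target, neighbours, k):
    """Half-open (start, end) intervals of `target` covered by k-mers that
    occur in none of the `neighbours` on either strand."""
    background = set()
    for genome in neighbours:
        background |= canonical_kmers(genome, k)

    regions = []
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if min(kmer, reverse_complement(kmer)) in background:
            continue
        if regions and i <= regions[-1][1]:
            # Overlaps or touches the previous unique stretch: extend it.
            regions[-1] = (regions[-1][0], i + k)
        else:
            regions.append((i, i + k))
    return regions


if __name__ == "__main__":
    # Toy sequences: the target shares its first 8 bases with the neighbour.
    print(unique_regions("ATGGCGTATTTCTC", ["ATGGCGTAGGGG"], 4))
```

Because every k-mer is held in memory, a sketch like this does not scale to large genome collections, which is the kind of limitation the text attributes to many existing tools.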

DNA-based signatures are needed to quickly and accurately identify biological warfare agents and their makers. DNA signatures are nucleotide sequences that can be used to detect the presence of an organism and to distinguish that organism from all other species. Insignia is a new, comprehensive system for the rapid identification of signatures in the genomes of bacteria and viruses. With the availability of hundreds of complete bacterial and viral genome sequences, it is now possible to use computational methods to identify signature sequences in all of these species, and to use these signatures as the basis for diagnostic assays to detect and genotype microbes in both environmental and clinical samples. The success of such assays critically depends on the methods used to identify signatures that properly differentiate between the target genomes and the sample background. Insignia has been used to compute accurate signatures for most bacterial genomes, and these are available through the Web site. A sample of these signatures has been successfully tested on a set of 46 Vibrio cholerae strains, and the results indicate that the signatures are highly sensitive for detection as well as specific for discrimination between these strains and their near relatives. Comparing the entire genomic complement of organisms to identify probe targets is a promising method for diagnostic assay development, and it provides assay designers with the flexibility to choose probes from the most relevant genes or genomic regions. The Insignia system is freely accessible via a Web interface and has been released as open source software at http://insignia.cbcb.umd.edu.
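What it means for a signature to be sensitive and specific can be made concrete with a small calculation. The sketch below is not the Insignia algorithm; it assumes exact, single-strand substring matching, and the function names and toy sequences are invented for illustration. A candidate signature is taken to be a k-mer found in every target genome and in no background genome. Sensitivity is then the fraction of target strains in which it is detected, and specificity is the fraction of near relatives in which it is absent.

```python
def kmers(seq, k):
    """All length-k substrings of `seq` as a set."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_signatures(targets, background, k):
    """k-mers present in every target genome and absent from all
    background genomes (exact match, given strand only)."""
    common = set.intersection(*(kmers(t, k) for t in targets))
    for genome in background:
        common -= kmers(genome, k)
    return common


def sensitivity_specificity(signature, target_strains, near_relatives):
    """Return (sensitivity, specificity) for exact-match detection."""
    detected = sum(signature in s for s in target_strains)
    false_pos = sum(signature in s for s in near_relatives)
    sensitivity = detected / len(target_strains)
    specificity = 1 - false_pos / len(near_relatives)
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy data: two "target strains" and two "near relatives".
    strains = ["AACCGGTT", "TTCCGGAA"]
    relatives = ["AACCGTT", "GCCGGA"]
    print(shared_signatures(strains, relatives[:1], 4))
    print(sensitivity_specificity("CCGG", strains, relatives))
```

In this toy case the candidate is detected in both target strains but also occurs in one of the two near relatives, so it would be sensitive but not specific enough to serve as a signature against that background.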

MannDB is a genome-centric database containing comprehensive automated sequence analysis predictions for protein sequences from organisms of interest to the bio-defense research community. Computational tools for the MannDB automated pipeline were selected based on customer needs in providing down selections from large sets of proteins (e.g., whole proteomes) to short lists of proteins most suitable for developing reagents to be used in field assays for detection of pathogens. For that reason we have focused our efforts on applying tools that would enable selection of proteins that meet assay requirements, such as cellular localization, that would assist in determining the value of a surface feature for targeting ligand binding, or that would identify antigenic sub-sequences of particular value in antibody development. As the goals of some of these assays have been to detect toxins or proteins associated with virulence, we constructed hard links between protein sequences in MannDB and entries in MvirDB in order to conveniently identify and characterize protein targets and features for these applications. We believe that MannDB will be of general use to the bio-defense and medical research communities as a resource for predictive sequence analyses and virulence information.

References

Altschul SF, Gish W, Miller W, Myers EW and Lipman DJ (1990): Basic local alignment search tool. J Mol Biol, 215, 403-410.

Andrade MA, Brown NP, Leroy C, Hoersch S, de Daruvar A, Reich C, Franchini A, Tamames J, Valencia A, Ouzounis C and Sander C (1999): Automated genome sequence analysis and annotation. Bioinformatics, 15, 391-412.

APHIS Agricultural Select Agent Program select agent and toxin list [http://www.aphis.usda.gov/programs/ag_selectagent/ag_bioterr_toxinslist.html]

Bendtsen JD, Nielsen H, von Heijne G and Brunak S (2004): Improved prediction of signal peptides: SignalP 3.0. Journal of Molecular Biology, 340, 783-795.

Blom N, Hansen J, Blaas D and Brunak S (1996): Cleavage site analysis in picornaviral polyproteins: Discovering cellular targets by neural networks. Protein Science, 5, 2203-2216.

Brown K (2004): Biosecurity. Up in the air. Science, 305, 1228-1229.

CDC bioterrorism agents/diseases list [http://www.bt.cdc.gov/agent/agentlist-category.asp]

Chang WI and Lawler EL (1994): Sublinear expected time approximate string matching and biological applications. Algorithmica, 12, 327-344.

Claros MG and von Heijne G (1994): TopPred II: An improved software for membrane protein structure predictions. CABIOS, 10, 685-686.

Delcher AL, Kasif S, Fleischmann RD, Peterson J, White O et al. (1999): Alignment of whole genomes. Nucleic Acids Res, 27, 2369-2376.

Delcher AL, Phillippy A, Carlton J and Salzberg SL (2002): Fast algorithms for large-scale genome alignment and comparison. Nucleic Acids Res, 30, 2478-2483.

Emanuelsson O, Nielsen H, Brunak S and von Heijne G (2000): Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. Journal of Molecular Biology, 300, 1005-1016.

Fitch JP, Gardner SN, Kuczmarski TA, Kurtz S, Myers R et al. (2002): Rapid development of nucleic acid diagnostics. Proc IEEE, 90, 1708-1721.

Fitch JP, Raber E and Imbro DR (2003): Technology challenges in responding to biological or chemical attacks in the civilian sector. Science, 302, 1350-1354.

Frishman D, Albermann K, Hani J, Heumann K, Metanomski A, Zollner A and Mewes H-W (2001): Functional and structural genomics using PEDANT. Bioinformatics, 17, 44-57.

Gardy JL, Laird MR, Chen F, Rey S, Walsh CJ, Ester M and Brinkman FSL (2005): PSORTb v.2.0: expanded prediction of bacterial protein subcellular localization and insights gained from comparative proteome analysis. Bioinformatics, 21, 617-623.

Gardner SN, Lam MW, Mulakken NJ, Torres CL, Smith JR et al. (2004): Sequencing needs for viral diagnostics. J Clin Microbiol, 42, 5472-5476.

Gardner SN, Kuczmarski TA, Vitalis EA and Slezak TR (2003): Limitations of TaqMan PCR for detecting divergent viral pathogens illustrated by hepatitis A, B, C, and E viruses and human immunodeficiency virus. J Clin Microbiol, 41, 2417-2427.

Gattiker A, Michoud K, Rivoire C, Auchincloss AH, Coudert E, Lima T, Kersey P, Pagni M, Sigrist CJA, Lachaize C, Veuthey A-L, Gasteiger E and Bairoch A (2003): Automated annotation of microbial proteomes in SWISS-PROT. Computational Biology and Chemistry, 27, 49-58.

Goesmann A, Linke B, Bartels D, Dondrup M, Krause L, Neuweger H, Oehm S, Paczian T, Wilke A and Meyer F (2005): BRIGEP - the BRIDGE-based genome-transcriptome-proteome browser. Nucleic Acids Research, 33, W710-W716.

Gordon PM and Sensen CW (2004): Osprey: A comprehensive tool employing novel methods for the design of oligonucleotides for DNA sequencing and microarrays. Nucleic Acids Res, 32, e133.

Gusfield D (1997): Algorithms on strings, trees, and sequences: Computer science and computational biology. New York: Cambridge University Press. 554 p.

HHS and USDA select agents and toxins list [http://www.cdc.gov/od/sap/docs/salist.pdf]

Hohl M, Kurtz S and Ohlebusch E (2002): Efficient multiple genome alignment. Bioinformatics, 18, S312-S320.

Kaderali L and Schliep A (2002): Selecting signature oligonucleotides to identify organisms using DNA arrays. Bioinformatics, 18, 1340-1349.

Keim P, Klevytska AM, Price LB, Schupp JM and Zinser G (1999): Molecular diversity in Bacillus anthracis. J Appl Microbiol, 87, 215-217.

Keim P, Price LB, Klevytska AM, Smith KL and Schupp JM (2000): Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis. J Bacteriol, 182, 2928-2936.

Krogh A, Larsson B, von Heijne G and Sonnhammer ELL (2001): Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes.

Kurtz S (2003): A time and space efficient algorithm for the substring matching problem. Technical Report. Hamburg: Zentrum für Bioinformatik, Universität Hamburg.

Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M et al. (2004): Versatile and open software for comparing large genomes. Genome Biol, 5, R12.

Li F and Stormo GD (2001): Selection of optimal DNA oligos for gene expression arrays. Bioinformatics, 17, 1067-1076.

Lim DV, Simpson JM, Kearns EA and Kramer MF (2005): Current and developing technologies for monitoring agents of bioterrorism and biowarfare. Clin Microbiol Rev, 18, 583-607.

Livak KJ, Flood SJ, Marmaro J, Giusti W and Deetz K (1995): Oligonucleotides with fluorescent dyes at opposite ends provide a quenched probe system useful for detecting PCR product and nucleic acid hybridization. PCR Methods Appl, 4, 357-362.

McBride MT, Masquelier D, Hindson BJ, Makarewicz AJ and Brown S (2003): Autonomous detection of aerosolized Bacillus anthracis and Yersinia pestis. Anal Chem, 75, 5293-5299.

Markowitz VM, Korzeniewski F, Palaniappan K, Szeto P, Ivanova N and Kyrpides NC (2005): The integrated microbial genomes (IMG) system: a case study in biological data management. Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005, 1067-1078.

Meyer F, Goesmann A, McHardy AC, Bartels D, Bekel T, Clausen J, Kalinowski J, Linke B, Rupp O, Giegerich R and Pühler A (2003): GenDB - an open source genome annotation system for prokaryote genomes. Nucleic Acids Research, 31, 2187-2195.

Peterson JD, Umayam LA, Dickinson TM, Hickey EK and White O (2001): The comprehensive microbial resource. Nucleic Acids Research, 29, 123-125.

MvirDB microbial virulence database [http://mvirdb.llnl.gov]

Moser MJ, Christensen DR, Norwood D and Prudent JR (2006): Multiplexed detection of anthrax-related toxin genes. J Mol Diagn, 8, 89-96.

Nakai K and Horton P (1999): PSORT: a program for detecting the sorting signals of proteins and predicting their subcellular localization.

NIAID category A, B and C priority pathogens [http://www3.niaid.nih.gov/biodefense/bandc_priority.htm]

Nordberg EK (2005): YODA: Selecting signature oligonucleotides. Bioinformatics, 21, 1365-1370.

O'Connell KP, Bucher JR, Anderson PE, Cao CJ, Khan AS et al. (2006): Real-time fluorogenic reverse transcription-PCR assays for detection of bacteriophage MS2. Appl Environ Microbiol, 72, 478-483.

Pruitt KD, Tatusova T and Maglott DR (2007): NCBI reference sequences (RefSeq): A curated non-redundant sequence database of genomes, transcripts, and proteins. Nucleic Acids Res, 35, D61-D65.

Rahmann S (2003): Fast and sensitive probe selection for DNA chips using jumps in matching statistics. Proc IEEE Comput Soc Bioinform Conf, 2, 57-64.

Rozen S and Skaletsky H (2000): Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol, 132, 365-386.

Slezak T, Kuczmarski T, Ott L, Torres C, Medeiros D, Smith J, Truitt B, Mulakken N, Lam M, Vitalis E, Zemla A, Zhou C and Gardner S (2003): Comparative genomics tools applied to bioterrorism defense. Briefings in Bioinformatics, 4, 133-149.

Tembe W, Zavaljevski N, Bode E, Chase C, Geyer J et al. (2007): Oligonucleotide fingerprint identification for microarray-based pathogen diagnostic assays. Bioinformatics, 23, 5-13.

Tusnady GE and Simon I (1998): Principles governing amino acid composition of integral membrane proteins: applications to topology prediction.

Urisman A, Fischer KF, Chiu CY, Kistler AL, Beck S et al. (2005): E-Predict: A computational strategy for species identification based on observed DNA microarray hybridization patterns. Genome Biol, 6, R78.

USFDA Bad Bug Book [http://www.cfsan.fda.gov/~mow/intro.html]

Vallenet D, Labarre L, Rouy Z, Barbe V, Bocs S, Cruveiller S, Lajus A, Pascal G, Scarpelli C and Medigue C (2006): MaGe: a microbial genome annotation system supported by synteny results. Nucleic Acids Research, 34, 53-65.

Van Domselaar GH, Stothard P, Shrivastava S, Cruz JA, Guo A, Dong X, Lu P, Szafron D, Greiner R and Wishart DS (2005): BASys: a web server for automated bacterial genome annotation. Nucleic Acids Research, 33, W455-W459.

Volokhov D, Pomerantsev A, Kivovich V, Rasooly A and Chizhikov V (2004): Identification of Bacillus anthracis by multiprobe microarray hybridization. Diagn Microbiol Infect Dis, 49, 163-171.


Wang D, Coscoy L, Zylberberg M, Avila PC, Boushey HA et al. (2002): Microarray-based detection and genotyping of viral pathogens. Proc Natl Acad Sci USA, 99, 15687-15692.

WHO list of major zoonotic diseases [http://www.who.int/zoonoses/diseases/en/]

WHO list of diseases covered by the Epidemic and Pandemic Alert and Response (EPR) [http://www.who.int/csr/disease/en/]

Willse A, Straub TM, Wunschel SC, Small JA, Call DR et al. (2004): Quantitative oligonucleotide microarray fingerprinting of Salmonella enterica isolates. Nucleic Acids Res, 32, 1848-1856.

Zhou CEZ, Zemla A, Roe D, Young M, Lam M, Schoeniger J and Balhorn R (2005): Computational approaches for identification of conserved/unique binding pockets in the A chain of ricin. Bioinformatics, 21, 3089-3096. [http://bioinformatics.oxfordjournals.org/cgi/reprint/21/14/3089]
