Evolution of Salmonella O antigen variation by interspecific gene transfer on a large scale



REVIEWS

Many cell surface components, such as human blood groups, are polymorphic. In bacteria, about 60 forms of the O antigen have been reported for Salmonella enterica¹ and about 160 for the related Escherichia coli. The O antigen is a polysaccharide present on the surface of many bacterial species where it is part of the lipopolysaccharide that replaces phospholipid in the outer leaflet of the outer membrane of Gram-negative bacteria (Fig. 1).

Each type of O antigen consists of many repeats of an oligosaccharide subunit (O unit) with 3-6 or more sugars per O unit. The forms differ enormously both in the specific sugars present and in the arrangement of sugars in the O unit (see Ref. 2 for review). The forms discussed here are shown in Fig. 2.

The antigenic properties of the O antigen have been well documented because it is used in the major typing schemes for S. enterica and E. coli, and also because for a long time new serotypes in S. enterica were given full species status, thus meriting a short paper describing each new species! Most O antigen forms have yet to be characterized chemically but it is clear that many quite unrelated forms occur.

Maintenance of the polymorphism

How is this polymorphism maintained and how did it originate? The variation is such that in general it cannot have involved recent mutations, and must therefore be maintained by some form of balanced selection. The very large number of forms is particularly difficult to explain using population genetics theory, which was developed for outbreeding diploid species. However, bacterial populations are clonal³, although the level of genetic transfer between clones is still under discussion. In E. coli and S. enterica many clones appear to be adapted to specific hosts or to specific forms of infection⁴. O antigen specificity is often correlated with the mode of pathogenesis⁴,⁵, and probably plays a part in the adaptation of clones: a theoretical basis for maintenance of clonal adaptation by niche-specific selection is reported elsewhere⁶.

O antigen genes of group B

The O unit is synthesized on a lipid carrier and then polymerized on the same carrier before transfer to the oligosaccharide 'core' part of the presynthesized lipid A/core precursor (see Fig. 1). Only the O antigen shows extensive variation, the genetic basis of which is being studied in S. enterica (building on the work of Stocker, Mäkelä and Nikaido in particular⁷). Genes specific to synthesis of the O unit are located in the rough B (rfb) gene cluster, so named because mutants lacking O antigen produce colonies with a 'rough' appearance. In strain LT2 (serovar typhimurium, group B) there are four genes for synthesis of TDP-rhamnose, five for synthesis of CDP-abequose and two for synthesis of GDP-mannose, plus genes encoding four transferases that transfer sugar from nucleotide sugars to oligosaccharides. UDP-galactose has other roles in the metabolism of S. enterica and hence its biosynthesis is not controlled from the rfb cluster. The path for CDP-abequose biosynthesis is shown in Fig. 3, together with those of other dideoxyhexoses present in related strains.

©1993 Elsevier Science Publishers Ltd (UK) 0168-9525/93/$06.00

PETER REEVES

The O antigen is a bacterial surface polysaccharide made up of repeats of a short oligosaccharide. There are about 60 forms of O antigen in Salmonella, and genetic analysis indicates that these were acquired by interspecific gene transfer.

We have cloned and sequenced the rfb cluster from several groups of S. enterica⁸⁻¹⁰. Figure 4 compares the rfb genes of group B with those of five other groups. Twelve of the 15 genes expected have been identified and the others provisionally located, leaving only one open reading frame (ORF) not accounted for. This ORF, named rfbX in Fig. 4, encodes a protein with 12 predicted transmembrane segments and occurs in each cluster. It is possible that this gene is involved in some aspect of O antigen processing⁹.

FIG. 1. A section of the cell membrane and cell wall of Salmonella enterica. The lipid of the outer leaflet of the outer membrane is lipopolysaccharide that has three components: lipid A with fatty acids anchoring the molecule in the membrane, an oligosaccharide 'core' (dark shading) and the long chain polysaccharide O antigen (light shading). The repeat unit shown is that of group B.

[Figure labels: O antigen repeat unit (mannose, abequose, rhamnose, galactose); lipopolysaccharide; outer membrane; peptidoglycan; cytoplasmic membrane; phospholipid.]

TIG JANUARY 1993 VOL. 9 No. 1


FIG. 2. Structures of the repeat units of six Salmonella enterica O antigens. Each structure defines a particular group (B, A, D, E1, C2 or C1) and one serovar of each group is named. Shading indicates residues not always present and determined by genes outside the rfb cluster. These residues are thought to be added after completion of the main O unit and are not discussed here. Abbreviations: Abe, abequose; Gal, galactose; GlcNAc, N-acetylglucosamine; Man, mannose; Par, paratose; Rha, rhamnose; Tyv, tyvelose.

[Structure diagrams not reproduced. Serovars shown, in order: Typhimurium (group B), Paratyphi A (group A), Typhi (group D), Anatum (group E1), Muenchen (group C2), Montevideo (group C1). Legible main chains: Man-Rha-Gal for Typhimurium, Paratyphi A, Typhi and Anatum, with an Abe, Par or Tyv side chain in the first three; Rha-Man-Man-Gal with an Abe side chain for Muenchen; Man-Man-Man-Man-GlcNAc for Montevideo.]

Five closely related O antigens

The O antigens of groups A, B and D differ only in a dideoxyhexose side chain sugar, which is paratose, abequose or tyvelose, respectively, and have similar structures and biosynthetic pathways (Fig. 3). Sequence analysis¹¹⁻¹³ reveals the expected differences (Fig. 4). The rfbJ gene of group B is replaced by an rfbS gene in groups A and D. The genes rfbJ and rfbS are clearly homologous, but have diverged to such an extent that there is now only 26% identity at the amino acid level, with several gaps present¹².

The rfbE gene that is involved in the conversion of CDP-paratose to CDP-tyvelose was expected, and found, in group D but was also, unexpectedly, found in group A, the only difference being that in the group A strain the gene has an early frameshift mutation and is therefore nonfunctional¹². Southern blotting has shown that several other group A strains have the (presumably nonfunctional) rfbE gene, suggesting that all of these are recent derivatives of group D strains. However, the difference between the rfbE genes was sufficient to define a new species under the old system of nomenclature for Salmonella.

The O antigens of groups E1 and C2 are related in structure to the three just discussed (Fig. 2) and this has been confirmed by genetic analysis¹⁴⁻¹⁶ (Fig. 4). In general the predicted genes were found, plus a gene encoding a membrane protein. In all five strains the genes for each biosynthetic pathway are grouped and occur in the order: rhamnose, dideoxyhexose (absent in group E1) and mannose. In all five groups, galactose is transferred to a lipid carrier as the first step in O unit assembly and the galactose transferase gene rfbP is located downstream of the mannose pathway genes. Each strain also has some group-specific genes that mostly lie between the dideoxyhexose and mannose pathway genes. The transferase genes other than rfbP are in this central region and often differ between strains: note that the mannosyl-rhamnose linkage is β in group E1 but α in the others and that the two mannose transferases are not very similar. The linkages in groups B and C2 are completely different apart from the galactosyl linkage (to the carrier lipid) and, as expected, rfbP is the only transferase gene conserved between the two.

Interspecific transfer among rfb gene clusters

The G+C content of the genes from group B ranges from 0.32 to 0.46 (Fig. 5). This pattern is in contrast to the normal situation in which all the coding DNA of a given bacterial species has the same G+C content, which in the case of E. coli and S. enterica is about 0.51.

Sueoka¹⁹ has put forward a generally accepted explanation for interspecific variation in G+C content. He argues that the likelihood of mutation from A.T to G.C may not be the same as the likelihood of mutation from G.C to A.T. (A.T signifies either the AT or TA base pair and G.C either the GC or CG pair.) Mutation is due to errors in replication that are not subsequently corrected, and there is no reason why A.T to G.C should occur at the same frequency as the reverse, since different kinds of errors are involved. If the frequency differs, there will be a bias towards G.C or A.T in the mutations available to be fixed either by random drift or natural selection. Thus, in Staphylococcus, which has a low G+C content, mutations from G.C to A.T must occur more frequently than the reverse; this then leads over a long period of time to a low G+C content for all genes.
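The consequence of unequal mutation rates can be illustrated with a minimal sketch. It assumes a deliberately simplified two-state model that is not taken from the article: each A.T pair mutates to G.C at rate u per generation, each G.C pair mutates to A.T at rate v, and selection is ignored. The function names and the rates used are hypothetical and chosen only for illustration.

```python
def equilibrium_gc(u, v):
    """Equilibrium G+C fraction when A.T->G.C occurs at rate u
    and G.C->A.T occurs at rate v (two-state model, no selection)."""
    return u / (u + v)


def simulate_gc(p0, u, v, generations):
    """Iterate the expected G+C fraction forward from a starting value p0."""
    p = p0
    for _ in range(generations):
        # A.T pairs (fraction 1 - p) gain G.C at rate u;
        # G.C pairs (fraction p) are lost to A.T at rate v.
        p = p + u * (1.0 - p) - v * p
    return p


if __name__ == "__main__":
    # Illustrative rates only: G.C->A.T twice as frequent as the reverse.
    u, v = 1e-3, 2e-3
    print(equilibrium_gc(u, v))
    print(simulate_gc(0.5, u, v, 10_000))
```

Under these assumptions the G+C fraction settles at u/(u+v) regardless of where it starts, so a gene that has spent long enough in a species with a given bias takes on that species' G+C content.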

The G+C content of the rfb genes provides strong evidence for interspecific transfer. The variation in G+C content indicates that the gene clusters were assembled from several sources, and the generally low G+C content indicates that they must have evolved in species with low G+C content, each rfb form subsequently being transferred to S. enterica.
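A profile of this kind can be computed with a few lines of code. The following is a minimal sketch, not the method used for the published figure: it reports the G+C fraction in consecutive 100 bp spans, the span size quoted for Fig. 5. The function names and the toy sequence are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def windowed_gc(seq, window=100, step=100):
    """G+C fraction for each full window of the given size along seq."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: an AT-rich span followed by a GC-rich span.
    toy = "AT" * 50 + "GC" * 50
    print(windowed_gc(toy))
```

Comparing each window, or each gene, with the species average of about 0.51 is what marks the rfb genes as atypical; partial windows at the end of the sequence are ignored in this sketch.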

The origin of the polymorphism

The generally conserved gene order indicates homology, but the extreme sequence differences in some genes (Fig. 4) tell us that the clusters have been diverging for a very long time. Most of the rhamnose and abequose pathway genes are essentially identical wherever they occur, but the two mannose pathway enzymes of group C2 differ from those of the other forms at 22% and 30% of amino acids, and the abequose synthase (rfbJ) of group C2 differs from the group B homologue at an enormous 64% of residues. There are also substantial differences between the rfbD and rfbN genes of group E1 and those of the other groups.
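Percentages of this kind are obtained by comparing aligned protein sequences position by position. The sketch below is an assumption-laden illustration rather than the procedure used in the cited studies: it takes two sequences that have already been aligned (gaps written as "-"), skips gapped positions, and reports the share of compared residues that differ. The function name and the toy sequences are hypothetical.

```python
def percent_difference(aligned_a, aligned_b, gap="-"):
    """Percentage of aligned, ungapped positions at which two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped positions are not counted
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * differing / compared


if __name__ == "__main__":
    # Toy peptides, not real Rfb protein sequences.
    print(percent_difference("MKVLA", "MKILA"))
```

Whether gapped positions are skipped or counted as differences changes the figure, so published values depend on the convention used by each study.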

Thus, homologous genes in these rfb clusters fall into two classes: essentially identical or substantially divergent. The level of variation in the first group resembles that of most other genes and presumably genetic exchange between clones allows the genes to drift in concert. Why has this not applied to all genes with conserved function? Those genes in the second group, which, although homologous, show extensive divergence, are always adjacent to group-specific genes and this offers a clue.

The central region of dissimilarity may reduce recombination nearby and facilitate accumulation of neutral variation between forms, which itself will further reduce recombination. If at any time such variation reaches a level at which recombination cannot balance the increase in variation by mutation and genetic drift within forms, then the forms would from that time drift independently and continue to diverge¹⁵,¹⁶, each associated with a separately maintained O antigen form. We suggest that the variation in the genes rfbJ, rfbM, rfbK, rfbP, rfbD and rfbN arose and is maintained in this way, and that the differences reflect random genetic drift rather than adaptive change. The variation in these genes is far greater than that between homologous genes of E. coli and S. enterica (which characteristically differ at 15-20% of bases to give mostly synonymous substitutions, and at up to 4% of amino acids), indicating very long periods of divergence for O antigen gene clusters.

Gene acquisition in the C1 rfb cluster

The group C1 O antigen and rfb gene cluster show little similarity to the set of five discussed above²⁰,²¹, although the rfbM and rfbK genes of the mannose pathway are clearly homologous (Fig. 4). Within the rfb cluster are segments of different G+C content, mostly of low G+C content ranging down to 0.29. It appears that this cluster evolved independently, but also in an organism with a low G+C content, and was independently acquired and incorporated into the same locus in Salmonella. However, the rfbK gene has a G+C content of 0.61 and, surprisingly, is almost identical to the nearby cpsG gene.

The cps gene cluster contains the genes for the M antigen, a surface capsular polysaccharide comprising glucose, galactose and fucose. The cpsB and cpsG genes from strain LT2 have been sequenced and are isogenes of rfbM and rfbK, respectively, encoding enzymes that synthesize GDP-mannose (in this case as an intermediate in the synthesis of GDP-fucose, a component of M antigen). However, whereas the two rfb genes of group B have a G+C content of 0.4, the two cps genes have a G+C content of 0.61 (Ref. 22), and must have entered S. enterica by interspecific transfer from a species with a high G+C content. The isogenes are clearly homologous, but must have spent a very long time in species of different G+C content before transfer to S. enterica.

FIG. 3. Biosynthetic pathway for abequose, paratose and tyvelose, indicating the points at which various Salmonella enterica gene products act.

[Pathway diagram not reproduced. Legible labels: glucose-1-phosphate; CDP-D-glucose (step labelled rfbF); CDP-abequose; CDP-paratose; CDP-tyvelose (group D).]

In the C1 strain both cpsG and the nearly identical rfbK(C1) are present²⁰, although the cps cluster genes have not yet been tested for function. The C1 rfb cluster must also have transferred to S. enterica from another source and it is hard to believe that it did not originally include an rfbK gene with a G+C content of 0.4. At some time after transfer this gene must have been replaced by a copy of the cpsG gene. Perhaps the original rfbK gene became nonfunctional either at, or some time after, transfer, and the cpsG gene was then duplicated to provide the necessary additional activity. If this were followed by translocation to place the additional gene into the rfb cluster it would give the sequence that exists now²¹. This scenario may seem farfetched, but the facts point to such a series of events. Although the incorporation of a copy of a cpsG gene as an rfbK gene must have postdated the transfer of both to S. enterica, the low level of divergence does not mean that the duplication was as recent as this would normally imply, since concerted evolution could act to cause the two genes to remain very similar in sequence during random genetic drift.

Origins of O antigen variation

The picture we have developed is of O antigen gene clusters that are mosaics of genes from different sources, put together in species not yet identified, and ultimately transferred to S. enterica. Where does this variety of gene clusters come from and what function did they serve beforehand? Polysaccharides comprising repeats of a short oligosaccharide are characteristic of bacteria²³. They occur not only as O antigens but also, for example, as capsules, of which there are two types in E. coli. These two types have many forms, those of type 1 mapping near the rfb gene cluster and those of type 2 mapping to the unlinked kps gene cluster. The M antigen discussed above is probably a form of type 1 capsule²⁴. There are many other repeat unit polysaccharides, some that are apparently invariant, like the enterobacterial common antigen, and others, like the capsules of Streptococcus pneumoniae, that are as variable as the O antigen². This wide variety of repeat unit polysaccharides provides a rich potential source for variation in the rfb gene clusters found in S. enterica.

How did the rfb gene clusters get into S. enterica?

It must be remembered that in order to explain the maintenance of so many O antigen forms substantial selection pressure for various forms was assumed. The same selection pressure will facilitate transfer and establishment of desirable forms from outside the species. We envisage that occasionally a gene cluster for a polysaccharide not previously present in S. enterica is incorporated, perhaps by transduction or by being transferred to a plasmid that is then transferred into S. enterica. The genes can only become established if they are expressed, if the polysaccharide reaches the surface, and if the new form is advantageous. It seems unlikely that a new form could replace an existing form in one event but presumably natural selection can operate to optimize the expression and regulation of the newly acquired genes and the attachment of the new polysaccharide to the lipopolysaccharide core as an O antigen. Once this process has been completed, expression of the old antigen could be lost by mutation. Although O antigen is readily lost in culture by mutation at rfb, it is present in virtually all natural isolates of E. coli and S. enterica, probably because it acts as a protective layer. Therefore, the pre-existing O antigen is presumably not lost until the new form is properly expressed.

This unlikely set of events must have occurred many times, since S. enterica and E. coli have only two known O antigen forms in common² and in at least one of these cases the gene clusters are very divergent²⁵. It seems that most or even all have been acquired by interspecific transfer since the two species diverged an estimated 140 million years ago²⁶, about the same time that the monotremes, marsupials and eutherian mammals diverged. Since then there have been many species of mammals, reptiles and birds, and as these are the hosts for the two bacterial species, it seems plausible that the range of preferred O antigen forms changed over this time frame to provide the strong selection pressure required to produce new O antigens. Over a span of 140 million years, thousands, or even a few million, years could be allowed for each successful transfer and, as several such transfers can occur in parallel, one could still explain the replacement of an ancestral set of O antigen forms by a different set in each of the two new species.

FIG. 4. The rfb region from six groups of Salmonella enterica. Genes for different functions are colour coded. The height of colour shading reflects the level of amino acid identity with the homologous gene of group B. Genes with letters (e.g. A, meaning rfbA) have been identified by subcloning and assaying gene products⁸⁻¹⁰,¹⁷ except for rfbC and rfbD, which were identified because together with the products of rfbA and rfbB they allow TDP-rhamnose synthesis. The genes rfbC and rfbD are distinguished from each other by homology with strM and strL, respectively, of Streptomyces griseus¹⁸. All groups have one gene that expresses a protein product predicted to have 12 transmembrane segments and that has an unknown function (although it could be involved in antigen export). Other genes are named according to their position in the sequence (e.g. orf7.6 lies 7.6 kb along the sequence). The genes orf7.6 and orf10.4 are provisionally allocated to unidentified genes of the dideoxyhexose pathway (rfbH and rfbI) because their presence is correlated with products of this pathway. Transferases that transfer sugar from nucleotide sugars to oligosaccharides are indicated by heavy black lines. The position of the gene encoding abequose transferase (rfbV) has been allocated by elimination, and awaits a confirming assay. Group A is as group D but has a mutation in rfbE and one segment triplicated. Lines between groups connect breaks in homology. Stippled genes, including the phosphogluconate dehydrogenase (gnd) gene, are non-rfb genes. A vertical band of colour in the centre of a box indicates a gene that is part of a corresponding pathway but has no homology with group B.

[Figure key: rhamnose pathway; mannose pathway; galactose pathway; dideoxyhexose pathway; 12-membrane-segment proteins; transferases; non-rfb genes.]

FIG. 5. The G+C content of the rfb gene cluster of group B, strain LT2 of Salmonella enterica. The G+C content of each gene is shown below the cluster (which is marked with a scale in kilobases). The graph gives the fraction of G+C bp in each 100 bp span along the cluster. The colour coding of the genes is as for Fig. 4.

[Per-gene G+C values as printed in the figure (percent): 51, 44, 50, 46, 40, 40, 43, 43, 45, 32, 31, 32, 33, 3?, 40, 41, 37, 51.]

Each O antigen form in S. enterica seems to occur in several or many different clones (the serovars, previously given species status, quite often correspond to clones²⁷⁻²⁹). Thus, loss of an O antigen from one clone would not in general mean its loss from the species. However, there is probably a limit to the amount of variation that can be maintained and over a period of time the loss of old O antigen forms will balance the gain of new forms.

It is harder to explain why the genes for O antigen synthesis always end up at the same chromosomal location. In the case of the closely related set of five studied here, the answer is straightforward: once one form had been incorporated, the others could enter the chromosome most easily by recombination in the homologous DNA at the ends of the clusters. However, this does not apply in the case of group C1, and probably not in the majority of cases, where there seems to be little or no homology between groups. Perhaps the genes are always located at the same site, giving a polymorphic system, because it is important to ensure that each strain makes only one O antigen.

Only the interspecific transfer of pre-existing gene clusters has so far been considered. The pattern of G+C content for the rfb clusters indicates that each is a mosaic of genes assembled from a range of different sources. More clusters must be studied to give more detail to this story, but already we can see that the genes of a given pathway can come from different sources. The dideoxyhexose pathway provides the clearest example of this, with the first four genes having a G+C content of 0.4-0.44, while the rfbE, rfbJ and rfbS genes have a G+C content of 0.32. This pathway must have been assembled from at least two sources.

Conclusion

These studies of the O antigen were carried out with the intention of learning about the origin of its very extensive polymorphism, which appeared to have evolved since S. enterica diverged from its 'close' relative E. coli. The results have been a constant source of surprise, although perhaps in retrospect interspecific transfer should have been seen as more likely than evolution in situ.

Studies on other polymorphic repeat unit polysaccharides support the picture obtained for S. enterica O antigens. About eight rfb gene clusters from E. coli have been cloned. Southern blotting with a probe from an O antigen group 101 clone³⁰ showed both regions of similarity and regions of difference in restriction maps of O antigen groups 18 and 78 of E. coli³¹, which clearly fits the pattern found in S. enterica. There are cases of specific oligosaccharides occurring as repeat units of O antigens and also of capsules of a different species², again suggesting interspecific gene transfer. In the case of the kps cluster (encoding a type 2 capsule) of E. coli and capsule gene clusters of Neisseria meningitidis and Haemophilus influenzae, interspecific gene transfer is supported both by the common organization of the gene clusters and, for genes of one well-documented form present in all three, by sequence homology³²,³³.

Interspecific gene transfer could turn out to be a general mechanism for generating polymorphism involving repeat unit polysaccharides. This process involves a series of gene clusters that survive for very long periods by changing 'host' species from time to time, on a time frame perhaps measured in millions of years. The gene clusters themselves appear to have evolved by assembly from modules taken from different sources over a much longer time frame.

Acknowledgements

The work done in my laboratory was supported by grants from the Australian Research Council. Much of it was carried out by research students and research assistants: it is a pleasure to acknowledge their contribution to both lab work and the development of the project through discussions over several years.

References

1 Ewing, W.H. (1986) Edwards' and Ewing's Identification of the Enterobacteriaceae, Elsevier Science Publishers
2 Kenne, L. and Lindberg, B. (1983) in Bacterial Polysaccharides (Aspinall, G.O., ed.), pp. 287-363, Harcourt Brace Jovanovich
3 Selander, R.K., Caugant, D.A. and Whittam, T.S. (1987) in Genetic Structure and Variation in Natural Populations of Escherichia coli (Neidhardt, F.C., Ingraham, J.L., Magasanik, B., Schaechter, M. and Umbarger, H.E., ed.), pp. 1625-1648, American Society for Microbiology
4 Sussman, M. (1985) in Escherichia coli in Human and Animal Disease (Sussman, M., ed.), pp. 7-45, Academic Press
5 Ørskov, I. and Ørskov, F. (1977) Med. Microbiol. Immunol. 103, 99-110
6 Reeves, P.R. (1992) FEMS Microbiol. Lett. 100, 509-516
7 Mäkelä, P.H. and Stocker, B.A.D. (1984) in Genetics of Lipopolysaccharide (Rietschel, E.T., ed.), pp. 59-137, Elsevier Science Publishers
8 Brahmbhatt, H.N., Wyk, P., Quigley, N.B. and Reeves, P.R. (1988) J. Bacteriol. 170, 98-102
9 Jiang, X.M. et al. (1991) Mol. Microbiol. 5, 695-713
10 Wyk, P. and Reeves, P.R. (1989) J. Bacteriol. 171, 5687-5693
11 Verma, N.K., Quigley, N.B. and Reeves, P.R. (1988) J. Bacteriol. 170, 103-107
12 Verma, N. and Reeves, P.R. (1989) J. Bacteriol. 171, 5694-5701
13 Liu, D., Verma, N.K., Romana, L.K. and Reeves, P.R. (1991) J. Bacteriol. 173, 4814-4819
14 Brown, P.K., Romana, L.K. and Reeves, P.R. (1991) Mol. Microbiol. 5, 1873-1881
15 Brown, P.K., Romana, L.K. and Reeves, P.R. (1992) Mol. Microbiol. 6, 1385-1394
16 Wang, L., Romana, L.K. and Reeves, P.R. (1992) Genetics 130, 429-443
17 Romana, L.K., Santiago, F.S. and Reeves, P.R. (1991) Biochem. Biophys. Res. Commun. 174, 846-852
18 Stockmann, M. and Piepersberg, W. (1992) FEMS Microbiol. Lett. 90, 185-190
19 Sueoka, N. (1988) Proc. Natl Acad. Sci. USA 85, 2653-2657
20 Lee, S.J., Romana, L.K. and Reeves, P.R. (1992) J. Gen. Microbiol. 138, 305-312
21 Lee, S.J., Romana, L.K. and Reeves, P.R. (1992) J. Gen. Microbiol. 138, 1843-1855
22 Stevenson, G.S., Lee, S.J., Romana, L.K. and Reeves, P.R. (1991) Mol. Gen. Genet. 227, 173-180
23 Aspinall, G.O. (1983) in Classification of Polysaccharides (Aspinall, G.O., ed.), pp. 1-9, Academic Press
24 Keenleyside, W.J., Jayaratne, P., MacLachlan, P.R. and Whitfield, C. (1992) J. Bacteriol. 174, 8-16
25 Bastin, D.A., Romana, L.K. and Reeves, P.R. (1991) Mol. Microbiol. 5, 2223-2231
26 Ochman, H. and Wilson, A.C. (1987) J. Mol. Evol. 26, 74-86
27 Selander, R.K. et al. (1990) Infect. Immun. 58, 2262-2275
28 Selander, R.K. and Smith, N.H. (1990) Rev. Med. Microbiol. 1, 219-228
29 Selander, R.K. et al. (1990) Infect. Immun. 58, 1891-1901
30 Heuzenroeder, M.W., Beger, D.W., Thomas, C.J. and Manning, P.A. (1989) Mol. Microbiol. 3, 295-302
31 Beger, D.W., Heuzenroeder, M.W. and Manning, P.A. (1989) FEMS Microbiol. Lett. 57, 317-322
32 Kroll, J.S. and Moxon, E.R. (1990) J. Bacteriol. 172, 1347-1379
33 Frosch, M. et al. (1991) Mol. Microbiol. 5, 1251-1263

P. REEVES IS IN THE MICROBIOLOGY DEPARTMENT (G08), UNIVERSITY OF SYDNEY, NSW 2006, AUSTRALIA.

The search for the familial breast/ovarian cancer gene
DONALD M. BLACK AND ELLEN SOLOMON

Familial breast cancer is a very common autosomal dominant disorder in women. A predisposing gene for breast and ovarian cancer (BRCA1) has been mapped by linkage analysis to the long arm of chromosome 17. In almost all families with breast and ovarian cancer and half of those with only breast cancer, the disease is linked to this gene. The BRCA1 gene, which is also believed to be involved in sporadic breast and ovarian cancer, should soon be identified.

Breast cancer is the most common malignancy in women, affecting more than one in ten. In the UK each year, at least 25 000 women, more than half of those diagnosed as having breast cancer, die from the disease. Unlike lung cancer and smoking, no major environmental cause of breast cancer has been identified to date and there is no way of substantially reducing the risk. There are, however, certain factors that may contribute to the risk, the most important being a family history of the disease¹.

Genetics

Most cases of breast cancer appear to be sporadic, that is, they arise without a clear genetic susceptibility because they lack an obvious inherited component. However, there are some families in which there are several affected individuals, indicating either a chance clustering of sporadic cases or an inherited genetic effect. In other, rarer families, there is a clear inherited susceptibility to breast cancer that can be traced through consecutive generations. In these families breast cancer usually occurs at an early age and often with bilateral disease. Numerous epidemiological studies have investigated the transmission in such familial cases, and have shown that at least a proportion of breast cancer cases can be explained by inherited mutations in one or more autosomal dominant genes²⁻⁴. Estimates of the population frequency of these mutant alleles have varied between 0.0006 and 0.008; the mutant alleles confer on female carriers a lifetime probability of breast cancer of over 90%, as opposed to less than 10% in noncarriers³,⁴. The
