evolution of lanthipeptide synthetases - pnas · evolution of lanthipeptide synthetases qi zhang,...

Supporting information

Evolution of Lanthipeptide Synthetases

Qi Zhang, Yi Yu, Juan E. Vélasquez, and Wilfred A. van der Donk

SI Materials and Methods

Materials. Restriction endonucleases and T4 DNA ligase were purchased from New England

Biolabs. iProof™ high-fidelity DNA polymerase was purchased from Bio-Rad Laboratories. All

oligonucleotides were purchased from Integrated DNA Technologies. Media components for

bacterial cultures were purchased from Difco Laboratories. Chemicals were purchased from

Sigma-Aldrich unless noted otherwise. Endoproteinase GluC (sequencing grade) was purchased

from Roche Biosciences. Trypsin (modified, sequencing grade) was purchased from Worthington

Biosciences.

General Methods. All PCRs were carried out on a C1000™ thermal cycler (Bio-Rad). E. coli

DH5α was used as host for cloning and plasmid propagation, and E. coli BL21 (DE3) was used as

a host for coexpression. DNA sequencing was performed by ACGT, Inc. MALDI-TOF MS was

carried out on a Bruker Ultraflex TOF/TOF. Liquid chromatography electrospray ionization (ESI)

tandem mass spectrometry was carried out and processed using a Synapt ESI quadrupole TOF

Mass Spectrometry System (Waters) equipped with an Acquity Ultra Performance Liquid

Chromatography system (Waters).

Data retrieval. The nucleotide and amino acid sequences of lanthipeptide synthetases and the

genomic sequences of their hosts were obtained from the National Center for Biotechnology

Information (NCBI) sequence database. To identify the putative lanthipeptide synthetases,

iterative BlastP searches were performed using the protein sequences of various known and

newly identified lanthipeptide synthetases as the queries. Hits were usually selected with an expected

value < 10⁻⁸ and query coverage > 75%. The BlastP search results were preliminarily aligned

using ClustalX (1) with default parameters and were inspected to exclude improper sequences,

such as truncated proteins and LanM/LanC hits without the conserved zinc-binding motif, from

the data collection. The accession numbers and source organisms of the four classes of lanthipeptide

synthetases used in this study are listed in Tables S2-S5. Other analyses such as sequence

comparison were performed by using the web service of EMBL-EBI (2).
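
The hit-selection thresholds described above can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `BlastHit` record, the `select_hits` helper, and the example hits are all hypothetical, and the text notes that the cutoffs were applied "usually" rather than strictly.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    subject_id: str
    evalue: float
    query_coverage: float  # percent of the query covered by the alignment


def select_hits(hits, max_evalue=1e-8, min_coverage=75.0):
    """Keep hits with E-value below max_evalue and coverage above min_coverage."""
    return [h for h in hits
            if h.evalue < max_evalue and h.query_coverage > min_coverage]


# Hypothetical hits, for illustration only (not data from this study).
example_hits = [
    BlastHit("hit_A", 1e-50, 92.0),
    BlastHit("hit_B", 1e-5, 88.0),   # fails the E-value cutoff
    BlastHit("hit_C", 1e-20, 60.0),  # fails the coverage cutoff
]
selected_ids = [h.subject_id for h in select_hits(example_hits)]
print(selected_ids)
```

Hits that pass such a filter would still need the manual inspection described above, for example to exclude truncated proteins.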

Phylogenetic analysis. The amino acid sequences of selected lanthipeptide synthetases were

aligned in ClustalX with iteration at each alignment step, and the alignments were manually

fine-tuned afterwards to minimize hypothetical insertion/deletion events. For Bayesian MCMC

inference, analyses were performed using the program MrBayes (version 3.2) (3). Final analyses

consisted of two sets of eight chains each (one cold and seven heated), run for about 2-10 million

generations with trees saved and parameters sampled every 100 generations. Analyses were run to

reach convergence, with a standard deviation of split frequencies < 0.01. Posterior probabilities

were averaged over the final 75% of trees (25% burn-in). The analysis utilized a mixed amino acid

model with a proportion of sites designated invariant (+I), and rate variation among sites modeled

after a gamma distribution (+G) divided into 8 categories, with all variable parameters estimated

by the program based on random starting trees. Bayesian analyses were also performed for the

separate alignments of the major subdivisions of each class of lanthipeptide synthetase, which

normally show similar tree topologies and posterior probability supports. The figures of Bayesian

phylograms were prepared by using TreeView (4) and MEGA4 (5).
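
The burn-in and convergence criteria above can be illustrated with a small sketch. It assumes each sampled tree is represented as a set of splits (each split a frozenset of taxon names), uses the sample standard deviation across two runs, and ignores splits below a frequency cutoff of 0.10 in both runs; these representation choices and the cutoff are assumptions of this sketch, not a description of MrBayes internals.

```python
from collections import Counter
from statistics import stdev


def split_frequencies(sampled_trees, burnin_frac=0.25):
    """Fraction of post-burn-in trees that contain each split."""
    start = int(len(sampled_trees) * burnin_frac)
    kept = sampled_trees[start:]
    counts = Counter(split for tree in kept for split in tree)
    return {split: n / len(kept) for split, n in counts.items()}


def average_sd_of_split_frequencies(freqs_run1, freqs_run2, min_freq=0.10):
    """Average over splits of the standard deviation of frequencies across two runs."""
    splits = [s for s in set(freqs_run1) | set(freqs_run2)
              if max(freqs_run1.get(s, 0.0), freqs_run2.get(s, 0.0)) >= min_freq]
    if not splits:
        return 0.0
    sds = [stdev([freqs_run1.get(s, 0.0), freqs_run2.get(s, 0.0)]) for s in splits]
    return sum(sds) / len(sds)


# Toy samples for five hypothetical taxa A-E (not data from this study).
ab, cd, ce = frozenset("AB"), frozenset("CD"), frozenset("CE")
run1 = [{ab, cd}] * 4
run2 = [{ab, cd}, {ab, ce}, {ab, cd}, {ab, cd}]
value = average_sd_of_split_frequencies(split_frequencies(run1),
                                        split_frequencies(run2))
print(f"{value:.4f}")
```

In this toy case the two runs disagree on one split, so the value is far above the 0.01 threshold used here; longer runs would be needed before summarizing posterior probabilities.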

Maximum likelihood analyses were performed using the program PhyML (6) with the

WAG+I+G+F (7) model, which was selected by using ProtTest (8). Gamma distribution was

divided into 8 categories and the tree topologies were estimated by SPR + NNI branch swapping,

with 20 random starting trees. Branch support was determined by SH-like approximate

likelihood-ratio test (aLRT) statistics (6, 9).

Nucleotide base composition and codon usage analysis. GC3s, the frequency of GC nucleotides

present at the third position of synonymous codons, was analyzed for each lanthipeptide

synthetase gene (Fig. S16A). The effective number of codons (Nc) (10) used by lanthipeptide

synthetase genes from firmicutes and actinobacteria was also analyzed and plotted against GC3s

(Nc-plot) (Fig. S16B). GC content, GC3s, and Nc analyses were performed using CodonW 1.4.2

(http://codonw.sourceforge.net/).
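
As an illustration of the GC3s definition given above, the following minimal sketch counts G or C at the third position of synonymous codons. It assumes the standard genetic code, so that codons for Met (ATG) and Trp (TGG), which have no synonyms, and the three stop codons are excluded; the function name and the toy sequence are hypothetical, and this is not the CodonW implementation.

```python
def gc3s(cds):
    """Fraction of G or C at the third position of synonymous codons."""
    excluded = {"ATG", "TGG", "TAA", "TAG", "TGA"}  # Met, Trp, stop codons
    cds = cds.upper()
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable_length, 3)]
    usable = [c for c in codons if c not in excluded and set(c) <= set("ACGT")]
    if not usable:
        raise ValueError("no synonymous codons found")
    return sum(c[2] in "GC" for c in usable) / len(usable)


# Hypothetical toy coding sequence: ATG GCT GCC AAA TGG TTG TAA
toy_cds = "ATGGCTGCCAAATGGTTGTAA"
print(gc3s(toy_cds))
```

Only GCT, GCC, AAA, and TTG contribute in the toy sequence, and two of them end in G or C.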

Cloning, production, and purification of modified NisA-ElxA. The fragment of the gene elxA

encoding the core peptide was amplified by PCR from Staphylococcus epidermidis 15X154

genomic DNA using primers NisA-ElxA-FP and NisA-ElxA-RP (for primer sequences, see Table

S6). A mutation that allows removal of the leader peptide with GluC was introduced in the primer.

The PCR product contained annealing regions to the pRSF.His6-NisAB plasmid (11), which

encodes hexahistidine-tagged NisA and untagged NisB, allowing replacement of the nisA

core region with elxA by PCR amplification of the entire plasmid. After treatment with DpnI and

transformation of E. coli DH5α cells, the plasmid pRSF.His6-NisAN-ElxAC.NisB was generated.

This plasmid encodes the hexahistidine-tagged chimera NisA-ElxA and NisB.

Electrocompetent E. coli BL21(DE3) cells were transformed with pRSF.His6-NisAN-ElxAC.NisB

(negative control), or cotransformed with pRSF.His6-NisAN-ElxAC.NisB and pACYC.NisC (11).

Single colonies were inoculated in 5 mL of LB medium containing the appropriate antibiotics (50

μg/mL kanamycin and 12.5 μg/mL chloramphenicol) and grown for 12 h at 37 °C with shaking.

Aliquots of 2.5 mL were used to inoculate 250 mL of LB medium containing the same antibiotics

followed by incubation at 37 °C until OD600 = 0.6. IPTG was added to a final concentration of 0.2

mM and the cultures were shaken for 20 h at 18 °C. The cells were harvested by centrifugation

(6,500 × g for 20 min; Beckman JLA-10.500 rotor). The cell pellet was resuspended in 30 mL of

denaturing buffer (6 M guanidine hydrochloride, 20 mM NaH2PO4, 500 mM NaCl, pH 7.5), and

cell lysis was carried out using a MultiFlex C3 homogenizer (Avestin). The lysed cells were

centrifuged at 23,700 × g for 60 min at 4 °C. The supernatant was loaded onto a HiTrap

high-performance (HP) nickel affinity column (GE Healthcare) preequilibrated with start buffer

(20 mM Tris, pH 8.0, 500 mM NaCl, 10% glycerol). The column was washed with wash buffer

(start buffer containing 30 mM imidazole), and the peptide was eluted from the column with

elution buffer (start buffer containing 500 mM imidazole).
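
The culture additions described above follow ordinary dilution arithmetic (C1·V1 = C2·V2). A minimal sketch is shown below; the stock concentrations (1 M IPTG and 50 mg/mL kanamycin) are assumptions chosen for illustration and are not stated in the source, and the small volume added by the stock itself is ignored.

```python
def stock_volume_ml(final_conc, final_volume_ml, stock_conc):
    """Volume of stock (mL) needed to reach final_conc in final_volume_ml.

    final_conc and stock_conc must be given in the same unit.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_conc * final_volume_ml / stock_conc


# IPTG: 0.2 mM final in a 250 mL culture, from an assumed 1 M (1000 mM) stock.
iptg_ml = stock_volume_ml(0.2, 250.0, 1000.0)
# Kanamycin: 50 ug/mL final, from an assumed 50 mg/mL (50000 ug/mL) stock.
kan_ml = stock_volume_ml(50.0, 250.0, 50000.0)
print(f"IPTG stock: {iptg_ml * 1000:.0f} uL; kanamycin stock: {kan_ml * 1000:.0f} uL")
```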

The eluate was desalted by preparative-scale RP-HPLC on a Waters Delta-pak C4 (15 μm;

300 Å; 25 × 100 mm) PrepPak cartridge. A gradient of 2–100% solvent B (0.086% TFA in 80%

MeCN/20% water) was used (solvent A, 0.086% TFA in 2% MeCN/98% water). After

lyophilization, the peptides were resuspended in buffer (50 mM HEPES, pH 7.5) and incubated

with the endopeptidase GluC (2 ng/μL) at room temperature overnight to remove the leader

peptide. Samples were analyzed directly by ESI-MS/MS, or were desalted by using ZipTipC18

before MALDI-MS analysis.

Construction of pET15b derivatives for expression of chimeric peptides containing the

ProcA3.2 leader peptide. The genes for the chimeric peptides ProcA3.2-K-NisA and

ProcA3.2-LctA were generated using nested PCR with previously reported expression plasmids as

templates. For ProcA3.2-LctA, the DNA encoding the N-terminal region of the chimeric peptide

(ProcA3.2 leader) was amplified using primers ProcA3.2 NdeI-FP and

ProcA3.2Lea-LctAStr-Conn-RP (Table S6) with plasmid pET15b-ProcA3.2 (12) as template. The

DNA encoding the C-terminal region of the chimeric peptide was amplified using primers

ProcA3.2Lea-LctAStr-Conn-FP and LctA XhoI-RP with plasmid pET15b-LctA (13) as template.

The primers ProcA3.2Lea-LctAStr-Conn-FP and ProcA3.2Lea-LctAStr-Conn-RP were designed

to provide overlap of the two PCR products and the primers ProcA3.2 NdeI-FP and LctA XhoI-RP

contained NdeI and XhoI restriction sites, respectively. The final insert was obtained by overlap

extension PCR using the two products of the first two PCR reactions and subsequent amplification

using the ProcA3.2 NdeI-FP and LctA XhoI-RP primers. The chimeric peptide gene was then

cloned into a pET15b vector between the NdeI and XhoI restriction sites. The sequence of the

resulting plasmid (pET15b-ProcA3.2Lea-LctAStr) was confirmed by DNA sequencing.

For constructing the gene encoding ProcA3.2-K-NisA, the DNA encoding the N-terminal region

of the chimeric peptide was again amplified using primers ProcA3.2 NdeI-FP and ProcA3.2Lea-

K-NisAStr-Conn-RP with plasmid pET15b-ProcA3.2 as template. The DNA encoding the

C-terminal region of the chimeric peptide was amplified using primers

ProcA3.2Lea-K-NisAStr-Conn-FP and NisA XhoI-RP with plasmid pET15b-NisA (14) as

template. The primers ProcA3.2Lea-K-NisAStr-Conn-FP and ProcA3.2Lea-K-NisAStr-Conn-RP

were designed to provide overlap of the two PCR products and to introduce a Lys at the junction

of the leader and core peptide for cleavage with trypsin, and the primers ProcA3.2 NdeI-FP and

NisA XhoI-RP contained NdeI and XhoI restriction sites, respectively. The final insert was

obtained by overlap extension PCR using the two products of the first two PCR reactions, and

subsequent amplification using the ProcA3.2 NdeI-FP and NisA XhoI-RP primers. The chimeric peptide

gene was then cloned into a pET15b vector between the NdeI and XhoI restriction sites. The

sequence of the resulting plasmid (pET15b-ProcA3.2Lea-K-NisAStr) was confirmed by DNA

sequencing.
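
The overlap extension step used for both chimeric genes can be pictured as joining two PCR products through the sequence they share at their junction. The sketch below is only a string-level illustration of that idea: the function name, the minimum overlap length, and all three toy sequences are hypothetical and are not the primers or inserts of this study.

```python
def overlap_join(upstream, downstream, min_overlap=15):
    """Join two fragments that share an identical terminal overlap."""
    longest = min(len(upstream), len(downstream))
    for k in range(longest, min_overlap - 1, -1):
        if upstream.endswith(downstream[:k]):
            return upstream + downstream[k:]
    raise ValueError("no overlap of the required length found")


# Hypothetical toy fragments (not sequences from this study).
leader_part = "CATATGAGCGAAGAACAACTG"   # starts with an NdeI site (CATATG)
junction = "GCTGGTGGTCATGGTGCA"         # overlap introduced by the connecting primers
core_part = "AAAGGTGGCAGCTAACTCGAG"     # ends with an XhoI site (CTCGAG)

product_1 = leader_part + junction      # first PCR product
product_2 = junction + core_part        # second PCR product
insert = overlap_join(product_1, product_2)
print(insert)
```

The joined product carries the junction once and keeps the terminal restriction sites, which is what allows it to be cloned between NdeI and XhoI.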

Construction of pRSFDuet-1 derivatives for co-expression of ProcM and chimeric peptides

containing the ProcA3.2 leader peptide. The procM gene was inserted into multiple cloning

site-2 of the pRSFDuet-1 vector between the NdeI and KpnI restriction sites (11). The chimeric

peptide genes were amplified using their pET15b constructs (see above) as templates and the

primer pairs ProcA3.2 EcoRI-FP/LctA NotI-RP and ProcA3.2 EcoRI-FP/NisA NotI-RP,

respectively. The PCR fragments were then individually cloned between the EcoRI and NotI

restriction sites of multiple cloning site-1 of the pRSFDuet-1 vector with the procM gene already

inserted in the multiple cloning site-2. The sequences of the resulting plasmids

(pRSF-Duet1-ProcA3.2Lea-LctAStr/ProcM, and pRSF-Duet1-ProcA3.2Lea-K-NisAStr/ProcM)

were confirmed by DNA sequencing.

Overexpression and purification of ProcM-modified chimeric peptides. Electrocompetent E.

coli BL21 (DE3) cells were transformed with the pET15b constructs containing the N-terminal

hexa-histidine chimeric peptide fusion gene or the pRSF-Duet1 constructs containing both the

N-terminal hexa-histidine chimeric peptide fusion gene and procM. Overexpression and

purification of the chimeric peptides were performed as described for NisA-ElxA.

Iodoacetamide assays for detection of free cysteines. A peptide sample (~50 μM) was

incubated in 250 mM Tris (pH 9), 25 mM iodoacetamide, and 1 mM TCEP in the dark at 25 °C

for 3 h. Samples were desalted using ZipTipC18 and subjected to MALDI-TOF MS analysis.
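
The readout of this assay is a mass shift: each free cysteine alkylated by iodoacetamide adds a carbamidomethyl group (+57.021464 Da, monoisotopic), while each dehydration of Ser or Thr removes one water (18.010565 Da). A minimal sketch of the bookkeeping follows; the function names and the example masses are hypothetical, and the same shifts apply to the singly protonated ions observed by MALDI-TOF MS.

```python
WATER = 18.010565            # monoisotopic mass of H2O lost per dehydration (Da)
CARBAMIDOMETHYL = 57.021464  # mass added per Cys alkylated by iodoacetamide (Da)


def expected_mass(unmodified_mass, dehydrations, alkylated_cys):
    """Expected mass after a number of dehydrations and Cys alkylations."""
    return unmodified_mass - dehydrations * WATER + alkylated_cys * CARBAMIDOMETHYL


def free_cysteines(mass_before, mass_after):
    """Estimate the number of alkylated (free) cysteines from the mass shift."""
    return round((mass_after - mass_before) / CARBAMIDOMETHYL)


# Hypothetical example: a 3000.00 Da peptide that is dehydrated four times.
dehydrated = expected_mass(3000.0, 4, 0)
# If two cysteines had not cyclized, alkylation would shift the mass accordingly.
alkylated = expected_mass(3000.0, 4, 2)
print(f"{dehydrated:.2f} {alkylated:.2f} {free_cysteines(dehydrated, alkylated)}")
```

A fully cyclized peptide has no free thiols, so its mass should not change after iodoacetamide treatment.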

Agar diffusion growth inhibition assay. The antibacterial activity of modified NisA-ElxA was

determined against Staphylococcus carnosus grown on Brain Heart Infusion (BHI) agar (3.2%

BHI, 1.5% agar). Briefly, the liquid agar medium was cooled to 42 °C and seeded with 200 μL of

dense overnight culture (approximately 10⁸–10⁹ CFU mL⁻¹) of the indicator strain. After agar

solidification in a Petri dish, samples were applied to a small well created in the medium, and the

plates were incubated at 30 °C for 15-20 h. A similar procedure was used for bioactivity assays

with ProcA3.2-LctA and ProcA3.2-K-NisA against Lactococcus lactis HP grown on GM17 agar

(4% M17, 0.5% glucose, 1.5% agar).

Tandem mass spectrometry analysis. The peptides of interest were separated from leader

peptide-derived peptides by UPLC using a gradient of 3% mobile phase A (0.1% formic acid in

water) to 97% mobile phase B (0.1% formic acid in methanol) over 12 min. The instrument

settings included a capillary voltage of 3500 V, a cone voltage of 40 V, a source temperature of

120 °C, a desolvation temperature of 300 °C, a cone gas flow of 150 L/h, and a

desolvation gas flow of 600 L/h. A transfer collision energy of 4 V was used, while the trap

collision energy was set to 20-30 V for MS/MS.

Fig. S1. Statistical support for the LanC phylogeny shown in Fig. 2A and 2B in the main text. The

Bayesian posterior probability (bPP) and the maximum likelihood nonparametric aLRT statistics

are shown as bPP/aLRT, or if space did not allow, they are shown above and below the lines,

respectively. If the branch is not present in the maximum likelihood tree, support is indicated as a

“-”; if the branch was present but with low statistical support (aLRT < 50), it is indicated as a “C”.

Fig. S2. Representative examples of the nisin-group of class I lanthipeptides. A shorthand notation

used for these structures is shown below each chemical structure in the black box.

Fig. S3. Representative examples of the epidermin-group and the Pep5-group of class I

lanthipeptides. Structures are similarly represented as in Fig. S2.

[Structure drawings not recoverable from the extracted text. Labeled structures: Pep5, epidermin, mutacin I, gallidermin, epilancin 15X, streptin, and BsaA2. Group labels: epidermin group and Pep5 group.]

Fig. S4. Bayesian MCMC phylogeny of LanB enzymes. Other Ser/Thr dehydratases such as those

involved in goadsporin (15) and thiopeptide (16) biosynthesis were not suitable for use as the

outgroup because of their low sequence similarities with LanB enzymes (e.g. the global sequence

identities of GodF for goadsporin biosynthesis and TsrC/TsrJ for thiostrepton biosynthesis with

NisB calculated by the Needleman-Wunsch algorithm (2) are 2.0% and 0.2%, respectively; the

same calculation for any two LanBs normally gives >20%). Therefore, the LanB trees were rooted

by using all members of a sister clade of enzymes as the outgroup (17). Bayesian inferences of

posterior probabilities are indicated by line width. Lanthipeptides in each tree are shown by

differently colored boxes according to their structural types. Lanthipeptides proposed in this study

are shown in yellow font.
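
The global identities quoted in the Fig. S4 legend come from Needleman-Wunsch alignment. The sketch below shows the principle with a deliberately simple scoring scheme (match +1, mismatch -1, linear gap penalty -2), which is an assumption of this illustration; alignment services for proteins generally use a substitution matrix and affine gap penalties, so the numbers from this sketch would not reproduce the values in the legend. Identity is computed here as identical positions divided by alignment length, gaps included.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Percent identity from a Needleman-Wunsch global alignment."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns and all columns.
    i, j = n, m
    matches = columns = 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                matches += a[i - 1] == b[j - 1]
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * matches / columns if columns else 0.0


# Hypothetical toy peptides (not sequences from this study).
print(global_identity("ACDE", "ACE"))
```

In the toy case the best alignment places one gap opposite D, giving three identical positions over four alignment columns.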

Fig. S5. Statistical support for the LanB phylogeny. The Bayesian posterior probability (bPP) and

the maximum likelihood nonparametric aLRT statistics are shown as bPP/aLRT, or if space did not

allow, they are shown above and below the lines, respectively. If the branch is not present in the

maximum likelihood tree, support is indicated as a “-”; if the branch was present but with low

statistical support (aLRT < 50), it is indicated as a “C”.

Fig. S6. Structures of streptin and the proposed analog streptin B.

Fig. S7. Statistical support for the LanM phylogeny. The Bayesian posterior probability (bPP) and

the maximum likelihood nonparametric aLRT statistics are shown as bPP/aLRT, or if space did not

allow, they are shown above and below the lines, respectively. If the branch is not present in the

maximum likelihood tree, support is indicated as a “-”; if the branch was present but with low

statistical support (aLRT < 50), it is indicated as a “C”.

Fig. S8. Representative examples of the type IIA (lacticin 481-like) group of lanthipeptides. This

group of lanthipeptides contains a conserved globular C-terminus that is highlighted by the orange

boxes, and a linear N-terminus.

Fig. S9. Representative examples of class II lanthipeptides other than the IIA group. Structures are

similarly represented as in Fig. S2.

[Structure drawings not recoverable from the extracted text. Labeled structures: mersacidin, lacticin 3147 A1 (Ltnα), lacticin 3147 A2 (Ltnβ), lactocin S, haloduracin α, haloduracin β, cinnamycin, actagardine, and michiganin A. Labeled unusual residues: (2S,9S)-lysinoalanine, D-alanine, and erythro-3-hydroxy-L-aspartic acid.]

Fig. S10. Bayesian MCMC phylogeny of LanM enzymes from group 1 in Fig. 2C in the main text.

Analyses using either natural sequences (A) or sequences with artificially changed

zinc-binding residues for five representative enzymes (B) give almost identical results.

The enzymes with Cys-to-His and His-to-Cys changes are highlighted by the red and blue stars,

respectively.

Fig. S11. (A) Mechanism for labionin formation, and (B) representative examples of class III

lanthipeptides. Structures are similarly represented as in Fig. S2. The structure of labionin is

shown in the black box. Xn and Xm are different peptide linkers.

Fig. S12. Bayesian MCMC phylogeny of LanKC and LanL enzymes. Bayesian posterior

probability supports for the clades are indicated by line width.

Fig. S13. Statistical support for the LanKC and LanL phylogeny. The Bayesian posterior

probability (bPP) and the maximum likelihood nonparametric aLRT statistics are shown as

bPP/aLRT, or if space did not allow, they are shown above and below the lines, respectively. If the

branch is not present in the maximum likelihood tree, support is indicated as a “-”; if the branch

was present but with low statistical support (aLRT < 50), it is indicated as a “C”.

Fig. S14. (A) MALDI-TOF mass spectra of ProcA3.2-LctA chimeric peptide coexpressed with

ProcM in E. coli followed by trypsin digestion to remove the ProcA leader. The unmodified and

ProcM-modified peptides are shown as the red and black traces, respectively. We note that ProcM

partially dehydrates Ser4, which is not dehydrated by LctM, resulting in a 5-fold dehydrated peptide;

(B) MALDI-TOF mass spectra of peptides from (A) that were treated with iodoacetamide (IAA)

prior to trypsin cleavage; (C) ESI-MS/MS data of the 4-fold dehydrated product from (A); (D)

Antimicrobial assays against L. lactis HP. (1) 5 μL of 250 μM ProcM-modified and

trypsin-digested ProcA3.2-LctA. We note that treatment with trypsin generated Δ1-lacticin 481 lacking

the N-terminal Lys residue. Previous studies have shown that Δ1-lacticin 481 is about 3-fold less

active than authentic lacticin 481 (18). (2) Negative control using 5 μL of 250 μM unmodified

ProcA3.2-LctA that was treated with trypsin; (3) 5 μL of 10 μM authentic lacticin 481; (4) 5 μL of

10 μM (roughly estimated) LctA that was modified by LctM in vitro and digested by trypsin.

Fig. S15. (A) MALDI-TOF mass spectra of the ProcA3.2-K-NisA chimeric peptide coexpressed

with ProcM in E. coli followed by trypsin digestion. The unmodified and ProcM-modified

peptides are shown as the gray and black traces, respectively. The nisin core sequence contains 3 Lys

residues, and formation of the rings can partially protect the cyclized peptide (but not the linear

unmodified peptide) from trypsin digestion. ESI-MS/MS analysis of the 6-fold dehydrated

product(s) did not give conclusive results regarding its structure; (B) MALDI-TOF spectra of

ProcA3.2-K-NisA chimeric peptide treated with iodoacetamide (IAA) and followed by Glu-C

treatment (resulting in cleavage of the ProcA3.2 leader peptide at Glu−6 (12)). The unmodified

and ProcM-modified peptides are shown as the red and black traces, respectively. It is clear that

most of the peptide is incompletely cyclized, possibly because the incomplete dehydration by

ProcM (wt NisA is dehydrated 8 times) did not generate the dehydro amino acids required to make

the nisin rings; (C) MALDI-TOF mass spectrum of a subfragment of the modified core peptide

generated by treatment of ProcM-modified ProcA3.2-K-NisA with trypsin. For this fragment,

cleavage occurred C-terminal to Lys22 of the NisA core, and the sequence of this fragment

(corresponding to M) is: ITSISLCTPGCKTGALMGCNMK (some of the Ser/Thr residues are

dehydrated in the ions shown). In wild type nisin, this fragment is dehydrated 5 times and it

contains the A, B, and C-rings. The observation of this fragment from trypsin treatment, combined

with the IAA assay result in panel B, suggests that ProcM may not have dehydrated two of the

three C-terminal Ser/Thr residues that are dehydrated in nisin. (D) Antimicrobial assays against L.

lactis HP. For (1)-(4), 5 μL of 125, 500, and 1000 μM of ProcM-modified and trypsin-digested

ProcA3.2-K-NisA were applied, respectively; (5) buffer used for protease digestion; (6), (7), and

(8) show 5 μL of 10, 1, and 0.1 μM of authentic nisin, respectively.

Fig. S16. Base composition and codon usage analysis of lanthipeptide synthetase genes. (A) Base

composition of the lanthipeptide synthetase genes and the associated genomes. GC3s represents the

frequency of GC nucleotides present at the third position of synonymous codons. Compared with

the GC contents, the GC3s values are much lower for genes with GC contents < 0.5 and much

higher for genes with GC contents > 0.5, possibly owing to the nucleotide compositional

constraints. Mostly, the GC3s index shares the same trends as that of GC content, suggesting that

the analysis is not biased by the codon preference of different organisms. (B) Nc-plot of

lanthipeptide synthetase genes from firmicutes and actinobacteria. Nc values represent the bias

away from equal usage of codons within synonymous groups (10). A value of 61 means that all

codons are used equally for all amino acids; a value of 20 is found when only one codon is used

for each amino acid. Neither Nc nor GC3s is sensitive to gene length, and both are independent of the

associated genome background (10), allowing for a comparative analysis of the patterns of

synonymous codon usage bias of lanthipeptide synthetase genes from different organisms. The red

line shows the theoretically expected correlation of GC3s and Nc under the assumption of only

mutational bias with no natural selection for the genes, which can be described as

Nc = 2 + GC3s + 29/[(GC3s)² + (1 − GC3s)²] (10).

“B” in the Nc-plot denotes the bifidobacterial genes. Most of the

firmicutes genes have Nc values close to the red line, indicating that most of the firmicutes genes

have not been subjected to selections such as selection for transcriptional and translational

efficiency. Alternatively, it could represent rapid evolution of the firmicute genes. On the contrary,

most of the actinobacterial genes show some deviations from the red line, suggesting the codons

of these genes have been optimized to some extent. These results reflect the dynamic evolutionary

process of lanthipeptide synthetases, particularly in firmicutes strains.
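
The expected curve in the Nc-plot can be evaluated directly from the relation given in the legend, Nc = 2 + GC3s + 29/[(GC3s)² + (1 − GC3s)²]. The short sketch below does this for a few GC3s values; the function name is hypothetical.

```python
def expected_nc(gc3s):
    """Expected Nc under mutational bias alone, for GC3s between 0 and 1."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("GC3s must lie between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


for value in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"GC3s = {value:.2f}  expected Nc = {expected_nc(value):.2f}")
```

The curve peaks near GC3s = 0.5, close to the upper limit of 61, and falls toward about 31 to 32 at the extremes, so a gene lying well below the curve uses fewer codons than its base composition alone would predict.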

Table S1. Statistical analysis of the GC content deviation for the selected 25 core genes in

Streptococcus strains. These 25 genes include adenine phosphoribosyltransferase (Apt),

elongation factor Ts (EF-Ts), Holliday junction DNA helicase protein RuvB, DNA repair protein

RecN, DNA repair protein RadA, ATP-dependent DNA helicase RecG, A/G-specific adenine

glycosylase, 50S ribosomal protein L11 methyltransferase (PrmA), leucyl-tRNA synthetase,

glutamyl-tRNA synthetase, threonyl-tRNA synthetase, asparaginyl-tRNA synthetase,

tryptophanyl-tRNA synthetase, tyrosyl-tRNA synthetase, histidyl-tRNA synthetase, lysyl-tRNA

synthetase, phenylalanyl-tRNA synthetase alpha subunit, phenylalanyl-tRNA synthetase beta

subunit, alanyl-tRNA synthetase, isoleucyl-tRNA synthetase, valyl-tRNA synthetase, glycyl-tRNA

synthetase alpha subunit, arginyl-tRNA synthetase, prolyl-tRNA synthetase, seryl-tRNA

synthetase.

Strain                  Accession number  Genome GC (%)  Mean GC (%)  Standard deviation (σ) (%)
S. pyogenes MGAS1882    NC_017053.1       38.40          41.48        2.49
S. mutans UA159         NC_004350.2       36.83          40.30        2.67
S. salivarius CCHSS3    FR873481.1        39.93          42.58        1.92
S. agalactiae A909      NC_007432.1       35.62          39.00        2.86
S. pasteurianus         NC_015600.1       37.38          40.68        1.27
S. uberis 0140J         NC_012004.1       36.63          39.29        2.01
S. sanguinis SK36       NC_009009.1       43.40          47.28        2.37
S. thermophilus LMG     CP000023.1        39.09          42.2         1.96
S. pneumoniae CGSP14    NC_010582.1       39.46          44.08        2.25
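
One way such a table can be used is to express the GC content of a gene of interest as a deviation from the core-gene mean in units of σ. The sketch below is an illustration under that assumption and is not necessarily the calculation performed in the study; the mean and σ are taken from the S. pyogenes MGAS1882 row, while the 30.0% gene GC content is a hypothetical value.

```python
def gc_z_score(gene_gc_percent, core_mean_percent, core_sd_percent):
    """Deviation of a gene's GC content from the core-gene mean, in units of sigma."""
    return (gene_gc_percent - core_mean_percent) / core_sd_percent


# Mean GC and sigma from the S. pyogenes MGAS1882 row of Table S1;
# the gene GC content of 30.0% is hypothetical.
z = gc_z_score(30.0, 41.48, 2.49)
print(f"{z:.2f}")
```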

Table S2. Accession numbers of LanB and LanC enzymes in this study.

Strain LanB LanC Lanthipeptide

Actinomyces sp. oral taxon 848 str. F0332 ZP_06162153.1 ZP_06162154.1 Bacillus cereus AH1273 ZP_04178050.1 ZP_04178052.1 Bacillus cereus F65185 ZP_04205873.1 ZP_04205869.1 Bacillus clausii KSM-K16 YP_177053.1 YP_177052.1 Bacillus megaterium QM B1551 YP_003565953.1 YP_003565951.1 Bacillus mycoides DSM 2048 ZP_04172085.1 ZP_04172087.1 Bacillus subtilis BSn5 YP_004206153.1 YP_004206154.1 Bacillus subtilis subsp. spizizenii ATCC 6633 ZP_06872918.1 ZP_06872916.1 subtilin Bacillus thuringiensis IBL 200 ZP_04075567.1 ZP_04075568.1 Bacillus subtilis A1/3 AAL15564.1 AAL15566.1 ericin A/S Bacillus cereus F65185 ZP_04205873.1 ZP_04205869.1 Bacteroides sp. 2_1_56FAA ZP_08590997.1 Bifidobacterium longum subsp. infantis ATCC 15697 YP_002322012.1 Brevibacillus laterosporus GI-9 CCF16798.1 CCF16796.1 Catenulispora acidiphila DSM 44928 YP_003114660.1 YP_003114661.1 Chitinophaga pinensis DSM 2588 YP_003121134.1 YP_003121137.1 Clostridium cellulovorans 743B YP_003845682.1 YP_003845681.1 Clostridium cellulovorans 743B YP_003845682.1 YP_003845687.1 Clostridium perfringens CPE str. F4969 YP_473415.1 YP_473413.1 Desmospora sp. 8437 ZP_08465087.1 ZP_08465089.1 Enterococcus faecalis Fly1 ZP_05578969.1 ZP_05578971.1 Frankia sp. CcI3 YP_480236.1 YP_480235.1 Frankia sp. CcI3 YP_480926.1 YP_480927.1 Frankia sp. CcI3 YP_482401.1 YP_482400.1 Frankia sp. CN3 ZP_09165893.1 ZP_09165894.1 Frankia sp. EAN1pec YP_001507128.1 YP_001507129.1 Frankia sp. EAN1pec YP_001504430.1 YP_001504429.1 Frankia sp. EAN1pec YP_001505690.1 YP_001505691.1 Frankia sp. EuI1c YP_004020784.1 YP_004020783.1 Frankia sp. 
EUN1f ZP_06417290.1 ZP_06417291.1 Frankia alni ACN14a YP_712916.1 YP_712915.1 Geobacillus thermodenitrificans NG80-2 YP_001124395.1 YP_001124397.1 geobacillin Haliangium ochraceum DSM 14365 YP_003267499 Kitasatospora setae KM-6054 YP_004907608.1 YP_004907609.1 Kordia algicida OT-1 ZP_02161439.1 Lactococcus lactis 61-14 BAG71480.1 BAG71482.1 nisin Q Lactococcus lactis 6F3 CAA48381.1 CAA48383.1 nisin A Microbispora corallina ADK32556.1 ADK32555.1 ADK32556.1 microbisporicin Microscilla marina ATCC 23134 ZP_01689156.1 ZP_01689160.1 Micromonospora aurantiaca ATCC 27029 YP_003837207.1 YP_003837208.1 Odoribacter laneus YIT 12061 ZP_09642632.1 Paenibacillus polymyxa SC2 YP_003945785.1 YP_003945783.1 Paenibacillus elgii B69 ZP_09077227.1 ZP_09077225.1 Paenibacillus polymyxa E681 YP_003869832.1 YP_003869828.1 Paenibacillus alvei ADG29283.1 Pedobacter heparinus DSM 2366 YP_003090840.1 Pseudoalteromonas haloplanktis ANT/505 ZP_08411471.1 ZP_08411472.1 Ruminococcus obeum A2-162 CBL24757.1 Saccharomonospora marina XMU15 ZP_09744179.1 ZP_09744180.1 Saccharomonospora paurometabolica YIM 90007 ZP_09032097.1 ZP_09032096.1 Spirosoma linguale DSM 74 YP_003389481.1 YP_003389482.1 Stackebrandtia nassauensis DSM 44728 YP_003514142.1 YP_003514143.1 Staphylococcus caprae C87 ZP_07840598.1 ZP_07840600.1


Staphylococcus epidermidis CAA90025.1 CAA90026.1 Pep5
Staphylococcus epidermidis CAA74350.1 CAA74351.1 epicidin 280
Staphylococcus epidermidis Tu 3298 CAA44253.1 CAA44254.1 epidermin
Staphylococcus gallinarum ABC94903.1 ABC94904.1 gallidermin
Staphylococcus aureus subsp. aureus ED133 ADI98309.1 ADI98308.1 Bsa
Staphylococcus aureus subsp. aureus YP_494457.1 YP_494456.1 Bsa
Streptomyces clavuligerus ATCC 27064 ZP_08218308.1 ZP_08218309.1
Streptomyces clavuligerus ATCC 27064 ZP_06771940.1 ZP_06771941.1
Streptomyces sp. W007 ZP_09400509.1 ZP_09400510.1
Streptomyces ambofaciens ATCC 23877 CAJ88053.1 CAJ88054.1
Streptomyces bingchenggensis BCW-1 YP_004965200.1 YP_004965201.1
Streptomyces cattleya NRRL 8057 YP_004910077.1 YP_004910078.1
Streptomyces coelicolor A3(2) NP_624599.1 NP_624600.1
Streptomyces griseoflavus Tu4000 ZP_07315085.1 ZP_07315084.1
Streptomyces griseus subsp. griseus NBRC 13350 YP_001825359.1 YP_001825358.1
Streptomyces lividans TK24 ZP_06533438.1 ZP_06533437.1
Streptomyces scabiei 87.22 YP_003488857.1 YP_003488858.1
Streptomyces sp. Mg1 ZP_04997226.1 ZP_04997227.1
Streptomyces violaceusniger Tu 4113 YP_004811344.1 YP_004811345.1
Streptomyces violaceusniger Tu 4113 YP_004815225.1 YP_004815226.1
Streptococcus dysgalactiae subsp. equisimilis YP_002996025.1
Streptococcus mutans AAF99579.1 AAF99580.1 mutacin I
Streptococcus pasteurianus ATCC 43144 YP_004559249.1 YP_004559248.1
Streptococcus pyogenes MGAS10270 YP_598531.1 YP_598530.1 streptin
Streptococcus salivarius AEX55164.1 AEX55161.1 salivaricin D
Streptococcus sanguinis SK408 EGF18911.1 EGF18912.1
Streptococcus uberis strain 42 ABA00879.1 ABA00881.1 nisin U
Thermomonospora curvata DSM 43183 YP_003302206.1 YP_003302207.1
Tannerella forsythia ATCC 43037 YP_005013067.1


Table S3. Accession numbers of LanM enzymes used in this study.

Strain LanM Lanthipeptide

Actinoplanes garbadinensis ACR33053.1 actagardine
Amycolicicoccus subflavus DQS3-9A1 YP_004495491.1
Anabaena variabilis ATCC 29413 YP_320138.1
Azorhizobium caulinodans ORS 571 YP_001526124.1
Azospirillum amazonense Y2 ZP_08869074.1
Bacillus cereus Q1 YP_002532646.1
Bacillus halodurans C-125 NP_241321.1 haloduracin α
Bacillus halodurans C-125 NP_241318.1 haloduracin β
Bacillus licheniformis ATCC 14580 YP_081205.1 lichenicidin α
Bacillus licheniformis ATCC 14580 YP_081203.2 lichenicidin β
Bacillus sp. strain HIL-Y85/54728 CAB60261.1 mersacidin
Bifidobacterium angulatum DSM20098 ZP_04448254.1
Bifidobacterium longum DJO10A YP_001955594.1
Bifidobacterium longum DJO10A ZP_00206332.1
Blautia hansenii DSM 20583 ZP_05855760.1
Caldicellulosiruptor bescii DSM 6725 YP_002572981.1
Catenulispora acidiphila DSM 44928 YP_003115146.1
Catenulispora acidiphila DSM 44928 YP_003114474.1
Clavibacter michiganensis NCPPB 382 YP_001222710.1 michiganin A
Clostridium hylemonae DSM15053 ZP_03780146.1
Clostridium perfringens B ATCC 3626 ZP_02636763.1
Clostridium perfringens D str. JGS1721 ZP_02952302.1
Corallococcus coralloides DSM 2259 AFE09191.1
Corallococcus coralloides DSM 2259 AFE09192.1
Corynebacterium diphtheriae NCTC 13129 NP_939126.1
Corynebacterium matruchotii ATCC 33806 ZP_03711708.1
Corynebacterium matruchotii ATCC 33806 ZP_03711704.1
Coxiella burnetii Dugway 5J108-111 YP_001424603.1
Coxiella burnetii Dugway 5J108-111 YP_002303280.1
Coxiella burnetii MSU Goat Q177 ZP_01946732.1
Coxiella burnetii CbuGQ212 YP_002303280.1
Cyanothece sp. Strain PCC7425 YP_002485891.1
Cyanothece sp. Strain PCC7425 YP_002483601.1
Cyanothece sp. Strain PCC7425 YP_002484655.1
Cyanothece sp. Strain PCC7425 YP_002483742.1
Cyanothece sp. Strain PCC8801 YP_002372173.1
Cyanothece sp. Strain PCC8802 YP_003137732.1
Enterococcus faecalis pAD1 AAA62650.1 cytolysin
Gemmatimonas aurantiaca T-27 YP_002763400.1
Geobacillus thermodenitrificans NG80-2 YP_001126159.1
Herpetosiphon aurantiacus DSM 785 YP_001544639.1
Kribbella flavida DSM 17836 YP_003384323.1
Ktedonobacter racemifer DSM 44963 ZP_06966363.1
Lactobacillus sakei CAA91110.1 lactocin S
Lactococcus lactis DPC3147 NP_047321.1 Ltnα
Lactococcus lactis DPC3147 NP_047323.1 Ltnβ
Lactococcus lactis subsp. lactis AAC72258.1 lacticin 481
Lysobacter sp. ATCC 53042 AEH59095.1
Microcoleus chthonoplastes PCC 7420 ZP_05025883.1
Moorea producta 3L ZP_08425581.1
Myxococcus xanthus DK 1622 YP_631068.1
Myxococcus xanthus DK 1622 YP_634512.1


Mycobacterium marinum M YP_001849230.1
Mycobacterium kansasii ATCC 12478 ZP_04749218.1
Myxococcus fulvus HW-1 YP_004667773.1
Myxococcus fulvus HW-1 YP_004667388.1
Nostoc punctiforme PCC 73102 YP_001869999.1
Nostoc punctiforme PCC 73102 YP_001866693.1
Nostoc punctiforme PCC 73102 YP_001868329.1
Nostoc punctiforme PCC 73102 YP_001866601.1
Nostoc sp. strain PCC 7120 NP_486065.1
Oscillatoria sp. PCC 6506 ZP_07114145.1
Planctomyces brasiliensis DSM 5305 YP_004271311.1
Prochlorococcus marinus MIT 9313 NP_894083.1 prochlorosin
Prochlorococcus marinus MIT 9303 YP_001018107.1 prochlorosin
Ruminococcus gnavus CAB93674.2 ruminococcin
Saccharopolyspora erythraea NRRL 2338 YP_001106583.1
Salinispora arenicola CNS-205 YP_001535270.1
Staphylococcus warneri NP_940773.2 nukacin ISK-1
Staphylococcus aureus pETB NP_478387.1 staphylococcin
Staphylococcus aureus pETB NP_478385.1 staphylococcin
Stenotrophomonas sp. strain SKA14 ZP_05136619.1
Stigmatella aurantiaca DW4/3-1 ZP_01460524.1
Streptococcus equinus ACA51935.1
Streptococcus macedonicus ABI30229.1 macedonicin
Streptococcus mutans BAD72771.1 Smb
Streptococcus mutans ABK59358.1 mutacin K8
Streptococcus mutans AAC38145.1 mutacin II
Streptococcus pneumoniae CDC0288-04 ZP_02716217.1
Streptococcus pneumoniae ATCC 700669 YP_002511204.1
Streptococcus pneumoniae SP23-BS72 ZP_01834975.1
Streptococcus pyogenes AAB92602.1 streptococcin A-FF22
Streptococcus pyogenes MGAS10750 YP_603221.1
Streptococcus ratti AAZ76597.1 BHT
Streptococcus salivarius ABI63629.1
Streptococcus salivarius ABI54435.1 salivaricin A1
Streptococcus salivarius ABI63640.1 salivaricin B
Streptomyces cinnamoneus CAD60521.1 cinnamycin
Streptomyces griseus subsp. griseus NBRC 13350 YP_001826321.1
Streptomyces sp. SirexAA-E YP_004802651.1
Synechococcus sp. strain RS9916 ZP_01470939.1


Table S4. Accession numbers of LanKC enzymes used in this study.

Strain LanKC Lanthipeptide

Actinomadura namibiensis CAX48971.1 labyrinthopeptin
Actinosynnema mirum DSM 43827 YP_003102521.1
Amycolicicoccus subflavus DQS3-9A1 YP_004483521.1
Anaerococcus prevotii DSM 20548 YP_003142327.1
Bacillus cereus NC7401 YP_005107064.1
Bifidobacterium longum subsp. infantis ATCC 15697 YP_002321962.1
Brevibacterium linens BL2 ZP_05915733.1
Catenulispora acidiphila DSM 44928 YP_003114944.1 catenulipeptin
Clavibacter michiganensis subsp. michiganensis NCPPB 382 YP_001221418.1
Deinococcus gobiensis I-0 AFD28133.1
Finegoldia magna ACS-171-V-Col3 ZP_07269239.1
Kribbella flavida DSM 17836 YP_003380339.1
Kribbella flavida DSM 17836 YP_003384295.1
Lactobacillus delbrueckii subsp. bulgaricus ATCC 11842 YP_618260.1
Lactobacillus iners LactinV 11V1-d ZP_07698467.1
Lysinibacillus fusiformis ZC1 ZP_07051456.1
Micromonospora sp. ATCC 39149 ZP_04604181.1
Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111 YP_003682623.1
Rheinheimera nanhaiensis E407-8 ZP_09989548.1
Saccharopolyspora erythraea NRRL 2338 YP_001106424.1 erythreapeptin
Stackebrandtia nassauensis DSM 44728 YP_003509339.1
Staphylococcus hominis subsp. hominis C80 ZP_07844578.1
Stenotrophomonas maltophilia R551-3 YP_002028517.1
Stigmatella aurantiaca DW4/3-1 YP_003956746.1
Stigmatella aurantiaca DW4/3-1 YP_003956421.1
Streptococcus pneumoniae 70585 YP_002740622.1
Streptococcus pneumoniae AP200 YP_003876963.1
Streptomyces avermitilis MA-4680 NP_828679.1 avermipeptin
Streptomyces cattleya NRRL 8057 YP_004910942.1
Streptomyces chartreusis NRRL 12338 ZP_09956766.1
Streptomyces coelicolor A3(2) NP_630756.1 RamC
Streptomyces ghanaensis ATCC 14672 ZP_06580911.1
Streptomyces griseus subsp. griseus NBRC 13350 YP_001826317.1 griseopeptin
Streptomyces lividans TK24 ZP_06527130.1
Streptomyces viridochromogenes DSM 40736 ZP_07308381.1
Streptosporangium roseum DSM 43021 YP_003341890.1
Thermomonospora curvata DSM 43183 YP_003301182.1
Verrucosispora maris AB-18-032 YP_004406242.1
Xanthomonas oryzae pv. oryzicola BLS256 YP_005628914.1


Table S5. Accession numbers of LanL enzymes used in this study.

Strain LanL Lanthipeptide

Actinoplanes sp. SE50/110 AEV85426.1
Catenulispora acidiphila DSM 44928 YP_003113256.1
Legionella longbeachae D-4968 ZP_06185769.1
Nocardia brasiliensis ATCC 700358 ZP_09836649.1
Saccharopolyspora erythraea NRRL 2338 YP_001106807.1
Saccharopolyspora erythraea NRRL 2338 YP_001106221.1
Stackebrandtia nassauensis DSM 44728 YP_003509405.1
Streptococcus pneumoniae CDC1087-00 ZP_02710581.1
Streptococcus pneumoniae NP141 EHZ96054.1
Streptomyces clavuligerus ATCC 27064 ZP_05005408.1
Streptomyces griseus subsp. griseus NBRC 13350 YP_001821664.1
Streptomyces griseochromogenes AAP03109.1
Streptomyces sp. e14 ZP_06711824.1
Streptomyces venezuelae ATCC 10712 AEA03262.1 venezuelin
Thermomonospora curvata DSM 43183 YP_003298015.1

Table S6. Primer sequences (5’ to 3’).

NisA-ElxA-FP GAAAGATTCAGGTGCATCACCAGAGTCAGCTAGTATTGTTAAAACAACT

Nis-ElxA-RP CTTAAGCATTATGCGGCCGCAAGCTTTTATTTTTTACCAGTAAAGTGAC

ProcA3.2Lea-K-NisAStr-Conn-FP GAAGGTGTGGCTGGGGGAAAAATTACAAGTATTTCG

ProcA3.2Lea-K-NisAStr-Conn-RP CGAAATACTTGTAATTTTTCCCCCAGCCACACCTTC

ProcA3.2Lea-LctAStr-Conn-FP GGTGTGGCTGGGGGAAAAGGCGGCAGTGGAG

ProcA3.2Lea-LctAStr-Conn-RP CTCCACTGCCGCCTTTTCCCCCAGCCACACC

ProcA3.2 NdeI-FP GCAACCTACATATGTCAGAAGAACAACTCAAGGC

NisA XhoI-RP ACAGACGACTCGAGTTATTTGCTTACGTGAATACTACAATGACAAG

LctA XhoI-RP TCAGATCTCGAGTTAAGAGCAGCAAGTAAATAC

ProcA3.2 EcoRI-FP CAGGATCCGAATTCGATGTCAGAAGAACAACTCAAGG

NisA NotI-RP AAGGAAAAAAGCGGCCGCTTATTTGCTTACGTGAAT

LctA NotI-RP GGAAAAAAGCGGCCGCTTAAGAGCAGCAAGTA
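
The primer table can be sanity-checked programmatically. The following is a minimal sketch, not part of the original methods: it tests whether each "Conn" forward/reverse pair is an exact reverse complement and whether each primer named after a restriction enzyme contains that enzyme's site. The sequences are copied from Table S6 (the NisA XhoI-RP entry is joined from its two wrapped lines). The recognition sequences in `SITES` are the standard ones for these enzymes and are an assumption added here, as are the helper names `reverse_complement`, `complementary_pairs`, and `site_offsets`.

```python
# Primer sequences copied from Table S6 (written 5' to 3').
PRIMERS = {
    "ProcA3.2Lea-K-NisAStr-Conn-FP": "GAAGGTGTGGCTGGGGGAAAAATTACAAGTATTTCG",
    "ProcA3.2Lea-K-NisAStr-Conn-RP": "CGAAATACTTGTAATTTTTCCCCCAGCCACACCTTC",
    "ProcA3.2Lea-LctAStr-Conn-FP": "GGTGTGGCTGGGGGAAAAGGCGGCAGTGGAG",
    "ProcA3.2Lea-LctAStr-Conn-RP": "CTCCACTGCCGCCTTTTCCCCCAGCCACACC",
    "ProcA3.2 NdeI-FP": "GCAACCTACATATGTCAGAAGAACAACTCAAGGC",
    # Joined from two wrapped lines in the table.
    "NisA XhoI-RP": "ACAGACGACTCGAGTTATTTGCTTACGTGAATACTACAATGACAAG",
    "LctA XhoI-RP": "TCAGATCTCGAGTTAAGAGCAGCAAGTAAATAC",
    "ProcA3.2 EcoRI-FP": "CAGGATCCGAATTCGATGTCAGAAGAACAACTCAAGG",
    "NisA NotI-RP": "AAGGAAAAAAGCGGCCGCTTATTTGCTTACGTGAAT",
    "LctA NotI-RP": "GGAAAAAAGCGGCCGCTTAAGAGCAGCAAGTA",
}

# Standard recognition sequences (assumption: not listed in the source table).
SITES = {
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
    "EcoRI": "GAATTC",
    "NotI": "GCGGCCGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def complementary_pairs(primers: dict) -> dict:
    """Map each name stem that has both -FP and -RP to True if they are reverse complements."""
    result = {}
    for name, seq in primers.items():
        if name.endswith("-FP"):
            stem = name[:-3]
            partner = primers.get(stem + "-RP")
            if partner is not None:
                result[stem] = reverse_complement(seq) == partner
    return result


def site_offsets(primers: dict, sites: dict) -> dict:
    """Map each enzyme-named primer to (enzyme, 0-based offset of its site); offset is -1 if absent."""
    found = {}
    for name, seq in primers.items():
        for enzyme, site in sites.items():
            if enzyme in name:
                found[name] = (enzyme, seq.find(site))
    return found


if __name__ == "__main__":
    for stem, ok in complementary_pairs(PRIMERS).items():
        print(f"{stem}: reverse-complementary = {ok}")
    for name, (enzyme, offset) in site_offsets(PRIMERS, SITES).items():
        print(f"{name}: {enzyme} site at offset {offset}")
```

Matching the enzyme by substring of the primer name is a convenience that happens to work for this table; a more general script would carry the enzyme as an explicit field.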


References

1. Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., & Higgins, D. G. (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res, 25: 4876-4882.

2. McWilliam, H., et al. (2009) Web services at the European Bioinformatics Institute-2009. Nucleic Acids Res, 37: W6-W10.

3. Ronquist, F., et al. (2012) MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space. Syst Biol, 61: 539-542.

4. Page, R. D. (1996) TreeView: an application to display phylogenetic trees on personal computers. Comput Appl Biosci, 12: 357-358.

5. Tamura, K., Dudley, J., Nei, M., & Kumar, S. (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol, 24: 1596-1599.

6. Guindon, S. & Gascuel, O. (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol, 52: 696-704.

7. Whelan, S. & Goldman, N. (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol, 18: 691-699.

8. Darriba, D., Taboada, G. L., Doallo, R., & Posada, D. (2011) ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics, 27: 1164-1165.

9. Anisimova, M. & Gascuel, O. (2006) Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst Biol, 55: 539-552.

10. Wright, F. (1990) The Effective Number of Codons Used in a Gene. Gene, 87: 23-29.

11. Shi, Y., Yang, X., Garg, N., & van der Donk, W. A. (2011) Production of lantipeptides in Escherichia coli. J Am Chem Soc, 133: 2338-2341.

12. Li, B., et al. (2010) Catalytic promiscuity in the biosynthesis of cyclic peptide secondary metabolites in planktonic marine cyanobacteria. Proc Natl Acad Sci USA, 107: 10430-10435.

13. Xie, L., et al. (2004) Lacticin 481: in vitro reconstitution of lantibiotic synthetase activity. Science, 303: 679-681.

14. Li, B. & van der Donk, W. A. (2007) Identification of essential catalytic residues of the cyclase NisC involved in the biosynthesis of nisin. J Biol Chem, 282: 21169-21175.

15. Onaka, H., Nakaho, M., Hayashi, K., Igarashi, Y., & Furumai, T. (2005) Cloning and characterization of the goadsporin biosynthetic gene cluster from Streptomyces sp. TP-A0584. Microbiology, 151: 3923-3933.

16. Li, C. & Kelly, W. L. (2010) Recent advances in thiopeptide antibiotic biosynthesis. Nat Prod Rep, 27: 153-164.

17. Smith, A. B. (1994) Rooting Molecular Trees - Problems and Strategies. Biol J Linn Soc, 51: 279-292.

18. Levengood, M. R., Knerr, P. J., Oman, T. J., & van der Donk, W. A. (2009) In vitro mutasynthesis of lantibiotic analogues containing nonproteinogenic amino acids. J Am Chem Soc, 131: 12024-12025.