

Bioorganic & Medicinal Chemistry xxx (2013) xxx–xxx

Contents lists available at SciVerse ScienceDirect

Bioorganic & Medicinal Chemistry

journal homepage: www.elsevier.com/locate/bmc

Employing directed evolution for the functional analysis of multi-specific proteins

0968-0896/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.bmc.2013.04.052

⇑ Corresponding author. Tel.: +972 8 6472645; fax: +972 8 6479218.
E-mail address: [email protected] (A. Aharoni).

Please cite this article in press as: Levin, M.; et al. Bioorg. Med. Chem. (2013), http://dx.doi.org/10.1016/j.bmc.2013.04.052

Maxim Levin a, Dotan Amar a, Amir Aharoni a,b,⇑
a Department of Life Sciences, Ben-Gurion University of the Negev, Be’er Sheva 84105, Israel
b National Institute for Biotechnology in the Negev (NIBN), Ben-Gurion University of the Negev, Be’er Sheva 84105, Israel

Article info

Article history: Available online xxxx

Keywords: Multi-specific proteins; Directed evolution; Protein-networks

Abstract

Multi-specific proteins located at the heart of complex protein–protein interaction (PPI) networks play essential roles in the survival and fitness of the cell. In addition, multi-specific or promiscuous enzymes exhibit activity toward a wide range of substrates so as to increase cell evolvability and robustness. However, despite their high importance, investigating the in vivo function of these proteins is difficult, due to their complex nature. Typically, deletion of these proteins leads to the abolishment of large PPI networks, highlighting the difficulty in examining the contributions of specific interactions/activities to complex biological processes and cell phenotypes. Protein engineering approaches, including directed evolution and computational protein design, allow for the generation of multi-specific proteins in which certain activities remain intact while others are abolished. The generation and examination of these mutants both in vitro and in vivo can provide high-resolution analysis of biological processes and cell phenotypes and provide new insight into the evolution and molecular function of this important protein family.

© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

For many years, specificity has been considered an essential property of proteins that enables their function in the complex cellular environment. However, over the past two decades, this hypothesis has been challenged by many studies showing that low or multi-specificity in proteins is far from a rare phenomenon. Moreover, this property was shown to be universal for molecular recognition and was detected in many proteins participating in protein–protein, protein–peptide and protein–DNA interactions. The identification of multi-specific proteins participating in protein–protein interaction (PPI) networks has been systematically investigated by several high-throughput experiments aimed at identifying all possible PPIs in several different organisms.1–3 These studies showed that while most proteins have only two or three interaction partners, a small but significant number of proteins can interact with more than 10 partners. In contrast to the identification of proteins participating in multiple PPIs, the systematic identification of multi-specific enzymes is extremely difficult, due to the lack of a general screening approach for the detection of multiple catalytic reactions mediated by individual enzymes. Still, the abundance of multi-functional Escherichia coli enzymes was recently examined using a comprehensive bioinformatics approach. This study found that around 40% of the E. coli metabolic enzymes are multi-specific, catalyzing 65% of metabolic reactions in the cell.4 Further studies investigating the properties of multi-specific proteins showed that they are more conserved than specific proteins and tend to be essential for organism viability.5,6 In terms of cellular function, these multi-specific proteins play essential roles in enabling the function of complex cellular networks by regulating the activity of numerous network proteins.7,8

Despite the importance of multi-specific proteins for the survival of organisms, the functional roles of these proteins in the cellular environment are difficult to investigate, due to their complex properties. Currently, the in vivo function of proteins is mainly examined following deletion of the gene encoding a target protein or a loss-of-function mutation. These modified strains are subsequently examined for global cell phenotypes, including effects on viability, growth rate and sensitivity to a variety of chemical agents, or for defects in specific cellular processes, such as cell cycle progression, transcription regulation and protein folding.9,10 In the case of multi-specific proteins, deletion of the encoding gene can lead to non-viability as the result of a complete disruption of the multiple interactions/catalytic activities of the protein, leading to the abolishment of large PPI networks. Thus, this all-or-none approach prevents detailed examination of the contribution of specific interactions or catalytic activities to complex biological processes and cell phenotypes.

High-resolution analysis of the importance of specific interactions/activities of multi-specific proteins in promoting biological processes can be performed using engineered mutants in which some of the interactions/enzymatic activities are abolished, while others are maintained. In this short review, we describe multi-specificity and promiscuity in proteins, directed evolution and computational design approaches for generating mutant proteins with altered specificity, and the bioinformatics and experimental analysis of the co-evolution of multi-specific proteins together with their partners/substrates.

2. Examples of multi-specific proteins that play essential roles in the cell

Candidate multi-specific proteins for in vivo functional analysis are proteins that interact with many partners to mediate diverse biological processes. These proteins can interact with multiple other proteins or catalyse the transformation of multiple substrates through the same binding/active site. One example of such a protein is the proliferating cell nuclear antigen (PCNA), which is essential for cell viability. PCNA is a hub protein that mediates DNA replication and repair in all eukaryotes via the recruitment of numerous partners to the DNA template in a highly regulated manner. In yeast, at least 15 different proteins bind PCNA at a highly conserved binding site in a competitive manner, while in humans the number of PCNA partners is more than doubled.11 Despite extensive research, little is known regarding the mechanisms by which PCNA regulates such interactions and switches between its different partners to coordinate complex biological processes with high precision. In this case, protein engineering can be used to generate variants with higher specificity that maintain some PCNA–partner interactions while abolishing others. Alternatively, protein engineering can be used to generate PCNA mutants with high affinity toward specific partners to assess the importance of affinity-based competition between partners for the in vivo progression of DNA replication and repair. Indeed, using directed evolution, we generated a panel of yeast PCNA mutants with enhanced affinity for five different partners. Interestingly, we found that yeast strains containing these mutants as a sole source of PCNA are either non-viable or exhibit severe defects in DNA replication and repair. These results show the importance of affinity-based competition between partners for PCNA-mediated DNA replication and highlight the susceptibility of these processes to subtle increases in PCNA–partner interaction affinities.12

Other multi-specific proteins that can promote diverse biological processes are transcription factors (TFs). Some TFs enable the transcription of hundreds of genes to promote diverse biological processes such as cell cycle progression and stress response. These master regulators enable the transcription of many genes by binding to short consensus sequences located in the promoter regions of their regulated genes. At present, the effects of TF deletion on gene expression and cell phenotypes reveal how TF-mediated gene expression collectively contributes to cell phenotype. However, such on–off analyses do not allow a detailed understanding of the contribution of individual genes, or small groups of genes, to cell phenotype. For example, cell cycle progression is mediated by the coordinated activity of several TFs that collectively transcribe hundreds of different genes. Specifically, in yeast, the Mbp1 and Swi4 TFs promote the progression from G1 to S phase by enabling the regulated transcription of ~300 genes, including DNA replication and bud formation genes.13,14 The generation of Mbp1 and/or Swi4 variants with narrow specificity that can transcribe only a subset of genes, followed by phenotypic analysis of such a series of mutants, will enable addressing questions such as: what is the minimal set of genes (regulon) that is essential for G1 to S progression? What genes act in a coordinated manner to promote specific phenotypes such as bud formation? What changes in the TF binding domain are required for obtaining higher in vivo specificity? Such analysis can significantly increase our understanding of how gene expression promotes cell phenotype and can be applied to many TFs that promote diverse biological processes in health and disease.

Finally, many cellular enzymes exhibit multi-specificity by acting on many physiologically relevant protein substrates to regulate their activity. These multi-specific enzymes include chaperones that promote the folding of multiple protein substrates,15 phosphatases that catalyse the dephosphorylation of many proteins16 and deacetylases. For example, the sirtuin enzyme family catalyzes the NAD+-dependent deacetylation of acetyl-lysine residues within many different protein targets.17 This family of enzymes is conserved from bacteria to humans,18 and overexpression of sirtuins in yeast and higher eukaryotes was shown to increase organism life span. The human sirtuin family is composed of seven members (SIRT1–7) exhibiting diverse cellular localization, catalytic activity and substrate specificity. SIRT1 is the best studied of the human sirtuins and has been shown to be involved in a large number of biological processes, including DNA repair, stress resistance, apoptosis and aging, and in a variety of diseases, including diabetes, cancer and inflammation.19 SIRT1 involvement in central biological processes takes place through the deacetylation of key protein substrates at the same active site, including p53, histones, FOXO, HSF1 and NF-κB, leading to modulation of their biological activities17 (Fig. 1). Thus, SIRT1 constitutes an excellent system for the study of multi-specific enzymes with a diverse set of substrates and cellular functions. Protein engineering can enable the generation of SIRT1 mutants with higher specificity and/or higher catalytic efficiency toward specific substrates. In vivo analysis of these mutants can reveal which activities are essential for SIRT1 function in the cell and which are dispensable, thus allowing new mechanistic analysis of SIRT1 activity. Due to the many cellular substrates of SIRT1, a panel of mutants is needed for detailed analysis. The correlation between the biochemical and in vivo analysis of such mutants can reveal new combinations of protein substrates that, upon deacetylation, act in a coordinated manner to promote SIRT1-mediated activities.

3. The molecular basis for multi-specificity and promiscuity in enzyme catalysis

The term multi-specificity or broad-specificity (as described above) defines enzymes that perform the same reaction on a range of physiologically relevant substrates.20 In contrast, enzyme promiscuity is often used to describe secondary, non-native enzymatic activities that differ from the reaction for which an enzyme evolved.21 The mechanistic and structural aspects of enzyme promiscuity/multi-specificity have been extensively described and discussed in previous reviews.20–22 In general, it was shown that the catalytic activities of these enzymes with native and promiscuous substrates are performed at the same active site, using similar catalytic residues (Fig. 1). Such broad specificity is in many cases linked to conformational diversity and active site flexibility that allow the accommodation of diverse substrates. Based on structural and kinetic analysis, it was shown that the promiscuity of the cytosolic sulfotransferase SULT1A1, glutathione-S-transferases and P450 is linked to active site flexibility, enabling the enzyme to accommodate diverse substrates in a productive mode.23–27 Another interesting property of promiscuous enzymes is the use of different sub-sites within the same active site to catalyze the transformation of different substrates. In this case, the actual set of catalytic residues that lead to enhanced catalysis can differ between substrates and reactions. One notable example is serum paraoxonase (PON1), a mammalian lactonase with promiscuous esterase and phosphotriesterase activities. All PON1 hydrolytic activities rely on a calcium ion in the active site that coordinates and stabilizes the phosphoryl/carbonyl oxygen. However, the hydrolysis of esters and lactones involves the His115-His134 dyad,28 while the hydrolysis of phosphotriesters involves another set of residues.29

Figure 1. Representation of the structure of the deacetylase domain of the Sir2 family, highlighting the common binding site for several peptide substrates. (A) The SIRT1 deacetylase core region structure (PDB: 4I5I) was superimposed with a peptide substrate derived from H3 and the NAD+ cofactor extracted from Sir2 structures (PDB: 2H2G and 2H4F, respectively). The cartoon and surface representation of the SIRT1 DAC is shown in green, NAD+ in red and the H3 peptide in magenta. The acetylated lysine of the H3 peptide is shown in purple. (B) Superposition of three peptide substrates derived from H4 (PDB: 2OD7, magenta), H3 (PDB: 2H2G, cyan) and p53 (PDB: 1MA3, blue) with the Sir2 structure (PDB: 1MA3), which is shown as a grey surface.

4. Directed evolution and computational design for the generation of multi-specific proteins with novel function

Over the past two decades, protein engineering methodologies, including rational design and directed evolution, have proved to be highly valuable for the generation of proteins with novel binding and/or enzymatic activities.30–32 By exerting a defined selection pressure on a known protein to change and adopt a novel function, directed evolution not only provides proteins of broad utility but also yields meaningful insight into enzyme–substrate specificity, protein–protein interaction and protein evolution. The directed evolution process is based on the generation of gene libraries followed by screening or selection assays to isolate improved mutants. In the past few years, it was shown that the combination of bioinformatics and structural information, in conjunction with advances in DNA technologies, allows for the generation of focused gene libraries containing targeted diversity that are significantly enriched in mutants with novel properties.31 These focused gene libraries enable the screening of a relatively small number of variants for the isolation of mutants with desired properties. Moreover, these libraries facilitate the engineering of enzymes catalyzing reactions that are difficult to monitor and are not amenable to high-throughput screening methodologies. The directed evolution of multi-specific proteins or promiscuous enzymes usually requires additional activity assays with a panel of substrates to isolate novel mutants with altered specificity (Fig. 2). In order to screen directly for multi-specificity, a two-step screening assay should be employed. In the first step, the library is examined for activity toward a generic substrate, for example, in the case of SIRT1, acetyl-lysine (Fig. 2C). Viable mutants exhibiting ~WT activity toward this substrate are further screened with a panel of substrates derived from the physiologically relevant target proteins (Fig. 2D). This secondary screening assay enables the identification of mutants with altered specificity, such as mutants exhibiting a dramatic decrease in activity for several substrates but WT-level or improved activity toward other substrates. These mutants can be further characterized using a variety of structural, biochemical, genetic and cell biology approaches to gain novel understanding of the molecular and cellular functions of these proteins in vitro, as well as in the complex cellular environment (Fig. 2).
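The two-step screening logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the thresholds (`keep_fraction`, `loss_fold`) and all activity values are hypothetical and are not taken from the studies reviewed here.

```python
def two_step_screen(generic, panel, wt_generic, wt_panel,
                    keep_fraction=0.5, loss_fold=10.0):
    """Return mutants with altered specificity.

    generic:    mutant -> activity on the generic substrate (step 1)
    panel:      mutant -> {substrate -> activity} (step 2)
    wt_generic: wild-type activity on the generic substrate
    wt_panel:   substrate -> wild-type activity
    Thresholds are hypothetical cut-offs chosen for illustration.
    """
    hits = {}
    for mutant, activity in generic.items():
        # Step 1: discard mutants that lost basal catalytic activity.
        if activity < keep_fraction * wt_generic:
            continue
        # Step 2: activity on each panel substrate relative to wild type.
        rel = {s: panel[mutant][s] / wt_panel[s] for s in wt_panel}
        lost = sorted(s for s, r in rel.items() if r <= 1.0 / loss_fold)
        kept = sorted(s for s, r in rel.items() if r >= keep_fraction)
        # Altered specificity: some activities abolished, others retained.
        if lost and kept:
            hits[mutant] = {"lost": lost, "kept": kept}
    return hits


# Made-up example data (arbitrary activity units).
wt_generic = 100.0
wt_panel = {"p53": 10.0, "H3": 8.0, "FOXO": 5.0}
generic = {"m1": 95.0, "m2": 20.0, "m3": 110.0}
panel = {
    "m1": {"p53": 9.0, "H3": 0.4, "FOXO": 5.5},
    "m2": {"p53": 1.0, "H3": 1.0, "FOXO": 1.0},
    "m3": {"p53": 10.0, "H3": 8.0, "FOXO": 5.0},
}
result = two_step_screen(generic, panel, wt_generic, wt_panel)
```

In this toy data set, `m2` is discarded in the first step, `m3` behaves like the wild type on every substrate, and only `m1` is reported because it has lost activity toward one substrate while retaining the others.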

Many studies published over the last two decades have demonstrated the power of directed evolution in generating promiscuous enzymes with enhanced activity toward specific substrates and altered specificity. In general, it was shown that only a few mutations can lead to dramatic increases in the promiscuous activity of enzymes, often reaching 2–5 orders of magnitude. Successful directed evolution experiments have been reported on hydrolases, including β-lactamase,33 carbonic anhydrase,34 β-glucuronidase,35 phosphotriesterase,36 phosphonate monoester hydrolase37 and serum PON1.38 In addition, directed evolution experiments were also performed on transferases, including DNA polymerases and methyltransferases, GST, SULTs and glycosyltransferases, and on enzymes catalyzing oxidation and reduction of small molecules, including horseradish peroxidase, nitroreductases, P450 and others (for reviews please see Refs. 20,22,39,40). Many of these studies have revealed that the promiscuous activity can be increased with only small decreases in the native activity of the enzyme38 and that these activities can be utilized as a starting point for the evolution of novel protein functions with physiological significance.41 In general, these studies provided tremendous insight into the structure–function relationships of enzymes and a detailed understanding of their mechanisms of action. However, these studies did not utilize the engineered enzymes as tools to study their in vivo function and to understand how alterations in specificity can affect complex processes and cellular phenotypes.

Computational protein design offers a complementary approach for the generation of multi-specific proteins with novel specificities. This approach is based on the search for amino acid sequences adopting a desired structure with a specific function.42,43 In the last decade, computational design approaches served to generate a variety of proteins with novel functions, including novel enzymes catalyzing reactions with no natural biocatalysts, protein–protein interactions, protein–ligand interactions and oligomeric assembly of proteins.43,44 In the case of protein–protein interactions, alterations of the specificity of multi-specific proteins can be achieved by computational design. Redesign of multi-specific interfaces can be performed to enable the protein to recognize some but not all of its binding partners.43,45 A recent computational design study generated two protocols for optimizing the binding of hub proteins to multiple partners and applied them to 20 multi-specific proteins, including GTPases, actin and ubiquitin.46 The first protocol attempted to optimize the shared interface of the protein for recognition of multiple partners (‘multi-constraint’), while the second protocol optimized the binding interface for each partner separately (‘single-constraint’). Comparing the outcome of the two approaches showed that the multi-constraint approach captured the natural interface of multi-specific proteins much better, and suggested how the interface can be changed to obtain altered specificity. Overall, the generation of multi-specific proteins with novel properties through computational design and/or directed evolution can pave the way for high-resolution analysis of the biochemical and in vivo function of these proteins in living cells.

Figure 2. A schematic representation of the directed evolution process used for the generation and characterization of multi-specific enzymes with improved catalytic activity and/or altered specificity. (A) The generation of gene libraries based on sequence, phylogenetic and structural analysis. (B) Transformation and expression of the libraries in E. coli cells in 96-well plate format. (C) Medium-throughput screening of the mutants for catalytic activity with a generic substrate to identify mutants that retain basal catalytic activity. (D) Low-throughput screening testing for activity toward several substrates to identify changes in specificity. (E) Biochemical characterization of selected mutants using kinetic assays to quantify kinetic parameters. (F) In vivo analysis of selected mutants in living cells.

5. The evolution of multi-specific proteins

To maintain the function of multi-specific proteins over largeevolutionary time scales, these proteins must co-adapt or co-evolve together with their partners/substrates. Such co-evolution,manifested in many cases by correlated mutations in the multi-specific protein and partners, can reshape the topology of PPI net-works and the mode of recognition within these networks. In thepast decade, co-evolution of proteins was mainly examined usingbioinformatics approaches to identify interactions in underex-plored organisms and to assign functions to proteins based onthe identification of protein networks.47,48 In addition, bioinfor-matics approaches were developed and used to examine themolecular basis for protein co-evolution and identify correlated

Please cite this article in press as: Levin, M.; et al. Bioorg. Med. Chem. (

mutations between and within proteins. The detailed analysis ofmultiple sequence alignments (MSA) allows for automated detec-tion of partially conserved residues that are conserved withinsub-families of proteins but not throughout the entire family.49–

51 Such residues, termed specificity-determining positions (SDPs),were recently detected on a large scale and analyzed with respectto their location in the protein structure.52 In accordance with theirpotential as SDPs, these residues are enriched in the vicinity of pro-tein active sites and at protein–protein interfaces.52 Analysis ofSDPs in individual protein families, including protein kinases,dehydrogeneases and the Ras family,50,53 shows that automateddetection of such residues, in conjunction with structural informa-tion, is a powerful tool for identifying functionally important sitesin proteins. However, despite the rich bioinformatics data availableon protein co-evolution, very little experimental work has beenperformed addressing the functional implications of natural se-quence changes due to protein co-evolution. Specifically, little iscurrently known regarding the co-evolution of multi-specific pro-teins and the importance of natural sequence changes in maintain-ing the integrity of PPI or protein–DNA interaction (PDI) networksover large evolutionary time scales. Protein engineering ap-proaches enabling the systematic replacement of specific aminoacids, domains or whole proteins with the corresponding ortho-logues will enable experimental examination of co-evolution ofPPI or PDI networks both in vitro and in vivo (Fig. 3).54 These ap-proaches allow the identification of conserved versus co-evolvinginteractions and the identification of major divergent points of pro-tein function through natural evolution. Moreover, biochemicalanalysis of the binding specificity of variants to their cognate andnon-cognate partners can reveal major transitions in specificityduring evolution and can be used as a rich source for detailedstructure–function relationship analysis. This approach was

2013), http://dx.doi.org/10.1016/j.bmc.2013.04.052

Page 5: Bioorganic & Medicinal Chemistry - BGUlifeserv.bgu.ac.il/wp/aaharoni/wp-content/uploads/2017/... · 2017. 5. 25. · Employing directed evolution for the functional analysis of multi-specific

Figure 3. The experimental analysis of protein co-evolution can be performed using genetic and biochemical approaches. Using genetic approaches, a specific gene in a modelorganism (e.g., the yeast Saccharomyces cerevisiae) can be replaced by several orthologues to examine compatibility in a non-native environment. This approach can beperformed in a systematic manner to reveal the dynamics of co-evolution over large evolutionary time scales and to identify genetic barriers for gene transfer betweenorganisms (the fungal phylogenetic tree is shown as an example). It is possible to substitute the native gene with DNA encoding an orthologue or a hybrid protein to examinethe co-evolution of specific domain or proteins containing specific mutations. These genetically manipulated organisms can be examined for phenotypes, including viability,growth rate and temperature or stress sensitivity. In addition, specific phenotypes can be examined, depending on the function of the examined proteins (e.g., proteinsparticipating in replication and repair can be examined by mutation rate analysis and sensitivity to specific DNA-damaging agents). Finally, to examine changes in specificitydue to protein co-evolution, biochemical binding assays (e.g., the yeast two-hybrid approach) or catalytic assays can be performed, using cognate or non-cognate partners/substrates.


recently utilized to investigate co-evolution in the PCNA–partner interaction networks that are essential for the progression of DNA replication and repair.55 The authors discovered that PCNA–partner interactions tightly co-evolved in fungal species, leading to specific modes of recognition. The experimental approach revealed that fungal PCNA–partner interaction networks diverged relatively early in fungal evolution into two distinct groups and that hybrid networks of these groups are functionally non-compatible. This research, moreover, revealed the divergence point of PCNA function in fungal evolution and demonstrated that co-evolution can form functional barriers between fungal species that can serve as a driving force to promote and fix speciation.
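The notion of specificity-determining positions discussed above can be illustrated with a minimal sketch. It assumes a deliberately strict criterion (a column is flagged when it is invariant within every subfamily but differs between at least two subfamilies), whereas the published detection methods cited here rely on statistical scoring rather than this simple rule. The function name `find_sdps`, the subfamily labels and the toy sequences are all hypothetical and serve only as an illustration.

```python
from typing import Dict, List


def find_sdps(subfamilies: Dict[str, List[str]]) -> List[int]:
    """Return 0-based alignment columns that are invariant within every
    subfamily but differ between at least two subfamilies.

    This is a toy criterion, not the scoring scheme of any published method.
    """
    lengths = {len(seq) for seqs in subfamilies.values() for seq in seqs}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()

    sdps = []
    for col in range(n_cols):
        consensus = []
        for seqs in subfamilies.values():
            residues = {seq[col] for seq in seqs}
            # Skip columns that vary within a subfamily or contain gaps.
            if len(residues) != 1 or "-" in residues:
                break
            consensus.append(residues.pop())
        else:
            # Every subfamily is invariant here; keep the column only if
            # the conserved residue differs between subfamilies.
            if len(set(consensus)) > 1:
                sdps.append(col)
    return sdps


# Hypothetical aligned sequences, invented for illustration only.
toy_alignment = {
    "subfamily_A": ["MKTLDE", "MKTLDE", "MRTLDE"],
    "subfamily_B": ["MKSLNE", "MKSLNE", "MRSLNE"],
}
print(find_sdps(toy_alignment))  # [2, 4]
```

In this toy alignment, columns 2 and 4 are flagged because each subfamily carries a single, distinct residue there, while column 1 is rejected because it varies within the subfamilies. Candidate positions of this kind are the ones that could then be exchanged between orthologues and tested experimentally.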

In summary, in this review we described how we anticipate that protein engineering will deepen our understanding of the function of multi-specific proteins in the cell. We believe that the combination of protein engineering with genetic and cell biology analysis will complement the popular gene knock-out approach for the in vivo study of protein function. We provided examples of major multi-specific cellular proteins that can be investigated using this approach to address fundamental biological questions from a new perspective, such as what is the connection between gene


expression and cell phenotype and how robust is the cell to partial deletions of interactions/activities of essential multi-specific proteins. Finally, the engineering and in vivo examination of multi-specific proteins with decreased protein/DNA network connectivity can reveal the minimal biological networks that can still mediate and execute diverse biological processes. We envision that such engineered proteins can be useful for the design of minimal protein networks for executing diverse processes in synthetic biology applications.

References and notes

1. Giot, L.; Bader, J. S.; Brouwer, C.; Chaudhuri, A.; Kuang, B.; Li, Y.; Hao, Y. L.; Ooi, C. E.; Godwin, B.; Vitols, E.; Vijayadamodar, G.; Pochart, P.; Machineni, H.; Welsh, M.; Kong, Y.; Zerhusen, B.; Malcolm, R.; Varrone, Z.; Collis, A.; Minto, M.; Burgess, S.; McDaniel, L.; Stimpson, E.; Spriggs, F.; Williams, J.; Neurath, K.; Ioime, N.; Agee, M.; Voss, E.; Furtak, K.; Renzulli, R.; Aanensen, N.; Carrolla, S.; Bickelhaupt, E.; Lazovatsky, Y.; DaSilva, A.; Zhong, J.; Stanyon, C. A.; Finley, R. L., Jr.; White, K. P.; Braverman, M.; Jarvie, T.; Gold, S.; Leach, M.; Knight, J.; Shimkets, R. A.; McKenna, M. P.; Chant, J.; Rothberg, J. M. Science 2003, 302, 1727.

2. Li, S.; Armstrong, C. M.; Bertin, N.; Ge, H.; Milstein, S.; Boxem, M.; Vidalain, P. O.; Han, J. D.; Chesneau, A.; Hao, T.; Goldberg, D. S.; Li, N.; Martinez, M.; Rual, J. F.; Lamesch, P.; Xu, L.; Tewari, M.; Wong, S. L.; Zhang, L. V.; Berriz, G. F.; Jacotot, L.; Vaglio, P.; Reboul, J.; Hirozane-Kishikawa, T.; Li, Q.; Gabel, H. W.; Elewa, A.; Baumgartner, B.; Rose, D. J.; Yu, H.; Bosak, S.; Sequerra, R.; Fraser, A.; Mango, S. E.; Saxton, W. M.; Strome, S.; Van Den Heuvel, S.; Piano, F.; Vandenhaute, J.; Sardet, C.; Gerstein, M.; Doucette-Stamm, L.; Gunsalus, K. C.; Harper, J. W.; Cusick, M. E.; Roth, F. P.; Hill, D. E.; Vidal, M. Science 2004, 303, 540.

3. Uetz, P.; Giot, L.; Cagney, G.; Mansfield, T. A.; Judson, R. S.; Knight, J. R.; Lockshon, D.; Narayan, V.; Srinivasan, M.; Pochart, P.; Qureshi-Emili, A.; Li, Y.; Godwin, B.; Conover, D.; Kalbfleisch, T.; Vijayadamodar, G.; Yang, M.; Johnston, M.; Fields, S.; Rothberg, J. M. Nature 2000, 403, 623.

4. Nam, H.; Lewis, N. E.; Lerman, J. A.; Lee, D. H.; Chang, R. L.; Kim, D.; Palsson, B. O. Science 2012, 337, 1101.

5. Fraser, H. B.; Hirsh, A. E.; Steinmetz, L. M.; Scharfe, C.; Feldman, M. W. Science 2002, 296, 750.

6. Jeong, H.; Mason, S. P.; Barabasi, A. L.; Oltvai, Z. N. Nature 2001, 411, 41.

7. Agarwal, S.; Deane, C. M.; Porter, M. A.; Jones, N. S. PLoS Comput. Biol. 2010, 6, e1000817.

8. Ekman, D.; Light, S.; Bjorklund, A. K.; Elofsson, A. Genome Biol. 2006, 7, R45.

9. Winzeler, E. A.; Shoemaker, D. D.; Astromoff, A.; Liang, H.; Anderson, K.; Andre, B.; Bangham, R.; Benito, R.; Boeke, J. D.; Bussey, H.; Chu, A. M.; Connelly, C.; Davis, K.; Dietrich, F.; Dow, S. W.; El Bakkoury, M.; Foury, F.; Friend, S. H.; Gentalen, E.; Giaever, G.; Hegemann, J. H.; Jones, T.; Laub, M.; Liao, H.; Liebundguth, N.; Lockhart, D. J.; Lucau-Danila, A.; Lussier, M.; M’Rabet, N.; Menard, P.; Mittmann, M.; Pai, C.; Rebischung, C.; Revuelta, J. L.; Riles, L.; Roberts, C. J.; Ross-MacDonald, P.; Scherens, B.; Snyder, M.; Sookhai-Mahadeo, S.; Storms, R. K.; Veronneau, S.; Voet, M.; Volckaert, G.; Ward, T. R.; Wysocki, R.; Yen, G. S.; Yu, K.; Zimmermann, K.; Philippsen, P.; Johnston, M.; Davis, R. W. Science 1999, 285, 901.

10. Kim, J. K.; Gabel, H. W.; Kamath, R. S.; Tewari, M.; Pasquinelli, A.; Rual, J. F.; Kennedy, S.; Dybbs, M.; Bertin, N.; Kaplan, J. M.; Vidal, M.; Ruvkun, G. Science 2005, 308, 1164.

11. Moldovan, G. L.; Pfander, B.; Jentsch, S. Cell 2007, 129, 665.

12. Fridman, Y.; Palgi, N.; Dovrat, D.; Ben-Aroya, S.; Hieter, P.; Aharoni, A. PLoS Biol. 2010, 8, e1000507.

13. Koch, C.; Nasmyth, K. Curr. Opin. Cell Biol. 1994, 6, 451.

14. Bean, J. M.; Siggia, E. D.; Cross, F. R. Genetics 2005, 171, 49.

15. Martin, J.; Hartl, F. U. Curr. Opin. Struct. Biol. 1997, 7, 41.

16. Roy, J.; Cyert, M. S. Sci. Signalling 2009, 2, re9.

17. Smith, B. C.; Hallows, W. C.; Denu, J. M. Chem. Biol. 2008, 15, 1002.

18. Imai, S.; Armstrong, C. M.; Kaeberlein, M.; Guarente, L. Nature 2000, 403, 795.

19. Donmez, G.; Guarente, L. Aging Cell 2010, 9, 285.

20. Khersonsky, O.; Tawfik, D. S. Annu. Rev. Biochem. 2010, 79, 471.

21. Babtie, A.; Tokuriki, N.; Hollfelder, F. Curr. Opin. Chem. Biol. 2010, 14, 200.

22. Hult, K.; Berglund, P. Trends Biotechnol. 2007, 25, 231.


23. Amar, D.; Berger, I.; Amara, N.; Tafa, G.; Meijler, M. M.; Aharoni, A. J. Mol. Biol. 2012, 416, 21.

24. Berger, I.; Guttman, C.; Amar, D.; Zarivach, R.; Aharoni, A. PLoS ONE 2011, 6, e26794.

25. Gamage, N. U.; Tsvetanov, S.; Duggleby, R. G.; McManus, M. E.; Martin, J. L. J. Biol. Chem. 2005, 280, 41482.

26. Hou, L.; Honaker, M. T.; Shireman, L. M.; Balogh, L. M.; Roberts, A. G.; Ng, K. C.; Nath, A.; Atkins, W. M. J. Biol. Chem. 2007, 282, 23264.

27. Zhao, Y.; Sun, L.; Muralidhara, B. K.; Kumar, S.; White, M. A.; Stout, C. D.; Halpert, J. R. Biochemistry 2007, 46, 11559.

28. Khersonsky, O.; Tawfik, D. S. J. Biol. Chem. 2006, 281, 7649.

29. Yeung, D. T.; Lenz, D. E.; Cerasoli, D. M. FEBS J. 2005, 272, 2225.

30. Aharoni, A.; Griffiths, A. D.; Tawfik, D. S. Curr. Opin. Chem. Biol. 2005, 9, 210.

31. Lutz, S. Curr. Opin. Biotechnol. 2010, 21, 734.

32. Turner, N. J. Nat. Chem. Biol. 2009, 5, 567.

33. Guntas, G.; Kanwar, M.; Ostermeier, M. PLoS ONE 2012, 7, e35998.

34. Gould, S. M.; Tawfik, D. S. Biochemistry 2005, 44, 5444.

35. Geddie, M. L.; Matsumura, I. J. Biol. Chem. 2004, 279, 26462.

36. Tsai, P. C.; Fox, N.; Bigley, A. N.; Harvey, S. P.; Barondeau, D. P.; Raushel, F. M. Biochemistry 2012, 51, 6463.

37. van Loo, B.; Jonas, S.; Babtie, A. C.; Benjdia, A.; Berteau, O.; Hyvonen, M.; Hollfelder, F. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 2740.

38. Aharoni, A.; Gaidukov, L.; Khersonsky, O.; Mc, Q. G. S.; Roodveldt, C.; Tawfik, D. S. Nat. Genet. 2005, 37, 73.

39. Nobeli, I.; Favia, A. D.; Thornton, J. M. Nat. Biotechnol. 2009, 27, 157.

40. O’Brien, P. J.; Herschlag, D. Chem. Biol. 1999, 6, R91.

41. McLoughlin, S. Y.; Copley, S. D. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 13497.

42. Fleishman, S. J.; Baker, D. Cell 2012, 149, 262.

43. Mandell, D. J.; Kortemme, T. Nat. Chem. Biol. 2009, 5, 797.

44. Khare, S. D.; Fleishman, S. J. FEBS J. 2012.

45. Erijman, A.; Aizner, Y.; Shifman, J. M. Biochemistry 2011, 50, 602.

46. Humphris, E. L.; Kortemme, T. PLoS Comput. Biol. 2007, 3, e164.

47. Pazos, F.; Valencia, A. EMBO J. 2008, 27, 2648.

48. Lovell, S. C.; Robertson, D. L. Mol. Biol. Evol. 2010, 27, 2567.

49. Casari, G.; Sander, C.; Valencia, A. Nat. Struct. Biol. 1995, 2, 171.

50. Hannenhalli, S. S.; Russell, R. B. J. Mol. Biol. 2000, 303, 61.

51. Pazos, F.; Sternberg, M. J. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 14754.

52. Rausell, A.; Juan, D.; Pazos, F.; Valencia, A. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 1995.

53. Rojas, A. M.; Fuentes, G.; Rausell, A.; Valencia, A. J. Cell Biol. 2012, 196, 189.

54. Sandler, I.; Abu-Qarn, M.; Aharoni, A. Mol. Biosys. 2013, 9, 175.

55. Zamir, L.; Zaretsky, M.; Fridman, Y.; Ner-Gaon, H.; Rubin, E.; Aharoni, A. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, E406.

Please cite this article in press as: Levin, M.; et al. Bioorg. Med. Chem. (2013), http://dx.doi.org/10.1016/j.bmc.2013.04.052