High-throughput, image-based screening of genetic variant libraries

George Emanuel1,2,3, Jeffrey R. Moffitt1,3, and Xiaowei Zhuang1,3,4

1Howard Hughes Medical Institute, 2Graduate Program in Biophysics, 3Department of Chemistry and Chemical Biology, 4Department of Physics, Harvard University, Cambridge, MA 02138, USA

Correspondence to zhuang@chemistry.harvard.edu

Abstract

Image-based, high-throughput, high-content screening of pooled libraries of genetic perturbations will greatly advance our understanding of biological systems and facilitate many biotechnology applications. Here we introduce a high-throughput screening method that allows highly diverse genotypes and the corresponding phenotypes to be imaged in numerous individual cells. To facilitate genotyping by imaging, barcoded genetic variants are introduced into the cells, each cell carrying a single genetic variant connected to a unique nucleic-acid barcode. To identify the genotype-phenotype correspondence, we perform live-cell imaging to determine the phenotype of each cell, and massively multiplexed FISH imaging to measure the barcode expressed in the same cell. We demonstrated the utility of this approach by screening for brighter and more photostable variants of the fluorescent protein YFAST. We imaged 20 million cells expressing ~60,000 YFAST mutants and identified novel YFAST variants that are substantially brighter and/or more photostable than the wild-type protein.

bioRxiv preprint first posted online May 30, 2017; doi: http://dx.doi.org/10.1101/143966. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. All rights reserved. No reuse allowed without permission.

High-throughput screening of genetic perturbations is playing an increasingly important role in advancing biology and biotechnology. For example, by observing the effects of a large number of amino acid changes within a selected protein, large-scale screening can enable efficient searches for fluorescent proteins better adapted for bioimaging or protein and nucleic acid drugs with desired therapeutic properties. It can also enable the examination of how mutations of a protein affect cell function or physiology. Since each cell contains many genes, high-throughput screening can also allow the effects of inhibition or activation of individual genes or combinations of genes to be tested at the genomic scale, which will help decipher the effects of genes and gene networks on cellular behaviors.

Large-scale screening efforts are greatly facilitated by pooled, high-diversity libraries of genetic perturbations because of the ease and scalability associated with the construction of these pools. By using methods such as error-prone PCR [1] or cloning with large, defined pools of array-synthesized oligonucleotides [2], it is often possible to create pooled libraries with a very large number of genetic variants or perturbations using a similar degree of effort as required to make any individual library member. The screening of pooled libraries then depends critically on the ability to measure not only the desired phenotype but also the genotype of the corresponding library members. The measured phenotypes can be simple in some cases, such as protein affinity as measured by standard pull-down assays [3,4], or cellular fluorescence as determined by FACS [5], and in both cases, the genotype can be determined via sorting or enriching library members with the desired phenotype followed by techniques such as sequencing to determine the genotype. However, there are many phenotypes that cannot be quantified with these techniques. Phenotypes ranging from cellular morphology and dynamics to the intracellular organization of different cellular components require high-resolution, time-lapse optical microscopy to be measured. Unfortunately, it has proven challenging to combine pooled library screening with high-resolution optical microscopy because it is difficult to isolate or recover individual library members based on their imaged phenotype and then determine their genotype. If, however, one had the ability to measure the genotype of individual library members also by imaging, then library member isolation and recovery would not be required, making it possible to combine the ease and scalability of pooled library screening with imaging-based phenotype measurements.

Here we report a novel high-throughput, imaging-based screening method that allows the characterization of both phenotype and genotype for pooled populations of genetically diverse cells. In this method, we associate each genetic variant with a unique barcode composed of a series of short oligonucleotide hybridization sites. After introducing the barcoded genetic variant library into a population of cells and measuring phenotypes with imaging, the cells are fixed and the barcodes are determined using multiplexed error-robust fluorescence in situ hybridization (MERFISH), a method that utilizes combinatorial labeling and sequential imaging to identify a large number of barcodes [6]. We tested the feasibility and quantified the accuracy of this screening approach by screening a library of 1.5 million cells with 80,000 unique barcodes in which cells either did or did not express the fluorescent protein mMaple3 [7]. Based on these measurements, we estimated that our genotype misidentification rate is less than 1% for individual cells. To demonstrate the power of this approach, we utilized it to improve the brightness and photostability of YFAST, a recently developed fluorescent protein that becomes fluorescent upon binding to an exogenous chromophore [8]. With this approach, we efficiently screened 20 million E. coli cells containing ~160,000 unique barcodes and ~60,000 unique YFAST mutants, which resulted in the identification of YFAST variants with substantially increased brightness and photostability. By utilizing high-resolution imaging, we envision that this approach also has the potential to screen the effect of genetic perturbations on other cellular properties that would be difficult to measure with non-imaging methods, such as the morphology and dynamics of the cell, or the intracellular distributions of proteins/RNAs and the spatial organization of the genome.

In our previous demonstration of MERFISH [6], we utilized this approach to massively increase the multiplexing of single-molecule fluorescence in situ hybridization [9,10] and measure a large number of distinct RNA species simultaneously in single cells [6]. We subsequently improved the throughput of MERFISH to allow a large number of cells to be measured in a short period of time [11]. Here we reasoned that we could use this approach to identify the genotype of individual cells if each cell contains a unique DNA barcode that is associated with the gene variant of interest. Thus, each cell could be identified by measuring the identity of the RNA that it expresses from the barcode. To construct such barcodes, we designed nucleotide sequences that contain a concatenation of a fixed number of hybridization sites, where the sequence of each hybridization site was one of a pair of unique sequences associated with that site (Fig. 1A). We term these sequences readout sequences. Effectively, each barcode sequence can be thought of as representing an N-bit binary code, where there is a unique readout sequence to represent a “1” or a “0” in each bit. Thus, each barcode is composed of a set of N hybridization sites drawn from 2N unique readout sequences: readout sequence 1-0, readout sequence 1-1, readout sequence 2-0, readout sequence 2-1, …, readout sequence N-0, readout sequence N-1. For example, a barcode sequence corresponding to the binary word 101…1 would consist of readout sequence 1-1, followed by readout sequence 2-0, then readout sequence 3-1, …, and finally readout sequence N-1. A similar approach could be used to construct alternative barcodes using more unique readout sequences at each site, e.g. three readout sequences per site would correspond to a ternary code, or using the absence of a site as a measurable signal, e.g. a ‘1’ could be represented by the presence of a site while a ‘0’ could be represented by its absence.
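
The encoding described above can be sketched in a few lines of Python. This is only an illustration: the function names and the "i-v" label format are hypothetical and simply mirror the "readout sequence 1-0 / 1-1" naming used in the text, not actual nucleotide sequences.

```python
def barcode_readouts(word: str) -> list[str]:
    """Map an N-bit binary word to its ordered readout-sequence labels.

    Bit i (1-indexed) with value v is represented by the label "i-v".
    """
    if not word or any(c not in "01" for c in word):
        raise ValueError("word must be a non-empty string of 0s and 1s")
    return [f"{i}-{bit}" for i, bit in enumerate(word, start=1)]


def all_readout_labels(n_bits: int) -> list[str]:
    """List the 2N distinct readout sequences available to an N-bit barcode."""
    return [f"{i}-{v}" for i in range(1, n_bits + 1) for v in (0, 1)]


print(barcode_readouts("1011"))     # ['1-1', '2-0', '3-1', '4-1']
print(len(all_readout_labels(21)))  # 42
```

Each barcode uses exactly one of the two labels at every bit position, so an N-bit scheme needs 2N readout sequences but can address 2^N distinct barcodes.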

As an illustration of one way in which this barcoding scheme could be used to identify genetic variants, we created a library of plasmids expressing these N-bit barcodes and a library of plasmids expressing a series of genetic variants of a protein of interest. To create a barcoded library of genetic variants, we fused the barcode sequences and the sequences expressing the genetic variants to create a library of new plasmids, each of which expresses a random combination of a genetic variant and a barcode, and introduced the library into E. coli cells such that each cell only expresses one plasmid (Fig. 1B) (see Materials and Methods for the details). To reduce the chance of a barcode appearing in the library associated with more than one genetic variant, we bottlenecked the diversity of the barcoded genetic variant library by limiting the number of cells expressing the barcoded genetic variant library to 1 to 10% of the total barcode diversity of 2^N. We then determined which barcode is associated with which genetic variant by extracting these plasmids and sequencing them with next-generation sequencing to construct a look-up table. These measurements also detected a remaining small fraction of barcodes that were each associated with more than one genetic variant. If detected, these barcodes were removed from further analysis.

To screen this barcoded genetic variant library, we imaged cellular phenotypes while the cells were still alive (Fig. 1C). Then, the cells were fixed without removing them from the microscope, and the RNAs expressed from the barcodes were read out using multiplexed error-robust fluorescence in situ hybridization (MERFISH) [6].

During the barcode readout process, multiple hybridization rounds were used, and fluorescently labeled readout probes complementary to each readout sequence on the barcode were hybridized in each round to detect which readout sequences are present in which cells. First, readout probe 1-0, complementary to readout sequence 1-0, was introduced so that it could hybridize to cells that contain readout sequence 1-0, namely the cells containing barcodes whose first bit is “0”, causing those cells to become brightly fluorescent. All the cells were imaged and then the fluorescence signal was removed from the sample. Then, readout probe 1-1 was hybridized. Since every barcode contains either readout sequence 1-0 or readout sequence 1-1, the cells that did not become fluorescent in the first round should become fluorescent in the second round. The value for bit 1 for each cell was then assigned based on the fluorescence intensity ratio between these two rounds. This process was iterated until all N bits were probed. To reduce the number of hybridization rounds, we used multiple-color imaging with spectrally distinct fluorescent dyes to probe for the presence of multiple readout sequences simultaneously in each round [11]. Three-color imaging was used in this work, though a readout scheme using more colors is also possible.
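
A minimal sketch of the per-bit assignment is given below. The text only states that each bit was assigned from the intensity ratio between the two rounds; the specific rule used here (the brighter of the two readouts wins, i.e. a ratio cutoff of 1) and the function name are assumptions for illustration.

```python
def call_bits(intensities_0, intensities_1):
    """Call a barcode for one cell from its paired readout intensities.

    intensities_0[i] and intensities_1[i] are the cell's fluorescence when
    probed for the "0" and "1" readout sequence of bit i, respectively.
    Assumed rule: the bit is "1" when the "1" readout is the brighter one.
    """
    if len(intensities_0) != len(intensities_1):
        raise ValueError("both readouts must cover the same number of bits")
    return "".join(
        "1" if i1 > i0 else "0"
        for i0, i1 in zip(intensities_0, intensities_1)
    )


# One cell, three bits: bright for probe 1-0, then for probes 2-1 and 3-1.
print(call_bits([900, 40, 35], [50, 800, 700]))  # 011
```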

Since each cell expresses many copies of its corresponding barcode RNA, the fluorescence signal was very bright, and hence the readout error rate for each bit was very small. Thus, we did not find it necessary to use error-correcting codes in this case, as we previously used for MERFISH [6]; instead, all 2^N possible barcodes could be used in principle. However, in practice, to avoid a barcode appearing paired with multiple mutants in the same library, we bottlenecked the number of unique library members, as described above. The use of only a subset of barcodes allowed the experimental measurement of the frequency with which unused barcodes were detected, which in turn allowed us to quantify the rate of misidentifying the genotype of a cell, as we describe below. With a constant degree of bottlenecking, this type of internal error measurement and error robustness is maintained even as the number of possible binary barcodes increases exponentially with the number of bits. Thus, this barcoding scheme should allow millions of unique barcodes to be measured with only tens of hybridization rounds.
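
The scaling argument can be checked with a short calculation. The sketch below assumes that every bit needs two readouts (one per bit value) and that each hybridization round reads one readout per color; the function names are illustrative.

```python
import math


def library_capacity(n_bits: int) -> int:
    """Number of distinct binary barcodes available with n_bits bits."""
    return 2 ** n_bits


def hybridization_rounds(n_bits: int, n_colors: int = 3) -> int:
    """Rounds needed when each bit has two readouts and each round
    images n_colors readouts simultaneously."""
    return math.ceil(2 * n_bits / n_colors)


print(library_capacity(21))      # 2097152
print(hybridization_rounds(21))  # 14
```

For 21 bits and three colors this gives 42 readouts in 14 rounds, which agrees with the counts reported in Materials and Methods.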

To test the accuracy of this screening approach, we created a library containing only two “genetic variants”, the mTagBFP2 gene and the fusion of the mTagBFP2 and mMaple3 genes (Fig. 2A). Two barcoded libraries were created by merging a 21-bit binary barcode library, consisting of more than 2 million (2^21) unique barcodes, with the two plasmids (one containing the mTagBFP2 gene and the other containing the mTagBFP2-mMaple3 gene), one for each library, by isothermal assembly. Then, each of these complete libraries was electroporated into E. coli cells, and a fixed number of cells were extracted to bottleneck these libraries to ~40,000 unique members. The plasmids were extracted from the cells and sequenced to determine which of the ~2 million possible barcodes were present in each library and, thus, associated with the presence or absence of mMaple3. Sequencing confirmed that in the mixture of the two libraries, ~80,000 unique barcodes were present, as expected, representing 4% of all 2^21 possible barcodes.

We then characterized this combined library using our screening strategy. We first measured the fluorescence properties of the cells expressing mTagBFP2 or mTagBFP2-mMaple3 by illuminating with 405-nm light to measure mTagBFP2 fluorescence, illuminating with 405-nm light for an additional ~4 s in order to switch the mMaple3 protein to its red-shifted fluorescent state, and then measuring the fluorescence intensity of the red-shifted mMaple3 by illuminating with 560-nm light. We then fixed the cells with methanol and acetone and read out the barcodes in each cell using the procedure described above. We utilized alcohol fixation as opposed to crosslinking fixatives such as paraformaldehyde since it has been established that alcohol fixation greatly enhances the rate of hybridization to RNA [12].

During the barcode reading step, we examined the fluorescence signal observed for individual cells during different rounds of hybridization and imaging. As expected, we found that cells that were bright for one readout of a given bit were dim for the other readout (Fig. 2B). For all 1.5 million cells observed, a two-dimensional (2D) histogram of the bit 1 measurements, i.e. the fluorescence intensities determined in the probe 1-0 imaging round and the probe 1-1 imaging round, was constructed, and this histogram suggests there are two distinct populations of cells (Fig. 2C). The first population appeared bright when hybridized to probe 1-0 and dim when hybridized to probe 1-1, while the second population appeared dim when hybridized to probe 1-0 and bright when hybridized to probe 1-1. This observation is consistent with readout sequence 1-0 being present in the first population and readout sequence 1-1 being present in the second population. However, a substantial fraction of cells appeared dark in both imaging rounds, possibly because they were not expressing sufficient barcode RNA, or because they were insufficiently permeabilized for readout probe hybridization. We therefore used a thresholding strategy to remove these dim cells from further analysis. Specifically, we required that the “0” or “1” readout signal for each bit be larger than the median intensity observed for that readout signal across all cells. More than 600,000 measured cells satisfied this conservative criterion, and the barcodes expressed in these cells were determined. Among these cells, 84% of the measured barcodes matched a barcode contained in the library as determined by sequencing (Fig. 2D).
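
One possible reading of this filtering and matching step is sketched below. The interpretation that a cell passes when, for every bit, at least one of its two readout signals exceeds the population median for that readout is an assumption, as are the data layout and function names.

```python
from statistics import median


def per_bit_medians(cells):
    """Median "0" and "1" readout intensity for each bit across all cells.

    cells: list of cells, each a list of (signal_0, signal_1) tuples per bit.
    """
    n_bits = len(cells[0])
    return [
        (median(c[b][0] for c in cells), median(c[b][1] for c in cells))
        for b in range(n_bits)
    ]


def is_bright_enough(cell, medians):
    """Keep a cell only if, for every bit, one readout beats its median."""
    return all(s0 > m0 or s1 > m1 for (s0, s1), (m0, m1) in zip(cell, medians))


def matched_fraction(called_barcodes, library):
    """Fraction of called barcodes that appear in the sequenced library."""
    return sum(bc in library for bc in called_barcodes) / len(called_barcodes)


# Four toy cells with a single bit each; the last two are dim in both rounds.
cells = [[(100, 5)], [(4, 90)], [(3, 4)], [(2, 3)]]
medians = per_bit_medians(cells)
kept = [is_bright_enough(c, medians) for c in cells]
```

With real data, `matched_fraction` applied to the retained cells would correspond to the 84% figure reported above.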

For the unmatched 16% of the cells, an experimental error must have occurred. Either the barcode was present in the library but was not detected by sequencing, or the barcode was not present in the library and an error occurred during barcode imaging that misidentified the barcode. While we did not use these cells in further analysis, the presence of this unmatched fraction can be used to estimate our experimental error rates in barcode identification. We first note that the library only contains 4% of all possible barcodes for the 21-bit binary encoding used, as described above, so assuming that a readout error in the imaging process is equally likely to result in a cell being assigned any of the 2^21 barcodes, there is a 96% chance that the error results in identifying a barcode that does not match one in the library (type 1 error). Next, we denote the probability that the barcode in a cell is incorrectly determined as x. Then this probability x multiplied by 96% should be equal to 16%, the fraction of cells that we found containing barcodes that did not match any one in the library. Hence, x should be equal to 0.167, and the probability that the error results in a different barcode that is present in the library should be only 4% of x, which is ~0.67% (type 2 error). We note that only type 2 errors would affect our final results because only these errors have the capacity to generate a misidentified genotype. Type 1 errors would be detected as unmatched barcodes and discarded. Therefore, our estimated misidentification rate, i.e. the probability that both an error occurs and the error yields a barcode that is already present in the library, is less than one percent.
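
The arithmetic of this estimate can be reproduced directly. The sketch below uses the same assumption as the text, namely that an erroneous readout is equally likely to land on any possible barcode; the function and variable names are illustrative.

```python
def misidentification_rate(unmatched_fraction, library_fraction):
    """Estimate barcode error rates from the fraction of unmatched cells.

    unmatched_fraction: fraction of cells whose barcode is not in the library.
    library_fraction: fraction of all possible barcodes present in the library.
    Returns (x, type2): the total per-cell error probability x, and the
    probability that an error lands on another barcode in the library.
    """
    x = unmatched_fraction / (1.0 - library_fraction)
    type2 = x * library_fraction
    return x, type2


x, type2 = misidentification_rate(0.16, 0.04)
print(f"x = {x:.3f}, type 2 error = {type2:.2%}")  # x = 0.167, type 2 error = 0.67%
```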

The fidelity of the barcode measurement can also be verified by comparing the phenotype of cells measured from the mMaple3 fluorescence level to the phenotype predicted by the measured barcode. We used the phenotype measurements described above to determine the mTagBFP2 fluorescence intensity and the mMaple3 fluorescence intensity of each cell (Fig. 2F). All cells that contained the same measured RNA barcode were grouped, and the median ratio of mMaple3 intensity to mTagBFP2 intensity was calculated for each group containing at least 5 cells. Since from sequencing we know which barcodes should be associated with mMaple3, we calculated the median intensity ratio for cells identified for each of these barcodes and constructed a histogram of these ratios. Using the same approach, we constructed a histogram of the ratios for the barcodes known to only be associated with mTagBFP2 (Fig. 2G). These two distributions are largely separated with only a small overlap. We set a threshold based on the intersection point of the two histograms such that the cells with fluorescence intensity ratios larger than this threshold were classified as containing the mTagBFP2-mMaple3 fusion protein and the cells with intensity ratios below the threshold as containing mTagBFP2. We then compared these phenotype assignments based on fluorescence intensity to the genotypes based on the measured barcodes. We found that less than 1% of cells have a phenotype and genotype disagreement, indicating a rate of misidentification comparable to that estimated above. However, we note that this error rate is likely an overestimate since the natural spread in the intensity distribution of cells in each group should allow for some overlap of these distributions. Based on the above quantifications, we conclude that our high-throughput imaging-based screening approach is capable of accurately associating the phenotype with the genotype in a large number of cells.
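
The grouping step can be sketched as follows. The tuple layout, the function name, and the handling of cells with zero mTagBFP2 signal are assumptions made for illustration; only the grouping by barcode, the median ratio, and the five-cell minimum come from the text.

```python
from collections import defaultdict
from statistics import median


def median_ratio_by_barcode(cells, min_cells=5):
    """Median mMaple3/mTagBFP2 intensity ratio per measured barcode.

    cells: iterable of (barcode, mmaple3_intensity, mtagbfp2_intensity).
    Barcodes observed in fewer than min_cells cells are left out.
    """
    ratios = defaultdict(list)
    for barcode, mmaple3, mtagbfp2 in cells:
        if mtagbfp2 > 0:  # assumed guard against division by zero
            ratios[barcode].append(mmaple3 / mtagbfp2)
    return {bc: median(r) for bc, r in ratios.items() if len(r) >= min_cells}


toy_cells = (
    [("A", 10, 1)] * 5    # fusion-like: high ratio
    + [("B", 1, 10)] * 5  # mTagBFP2-only-like: low ratio
    + [("C", 5, 1)] * 4   # too few cells, excluded
)
group_ratios = median_ratio_by_barcode(toy_cells)
print(group_ratios)  # {'A': 10.0, 'B': 0.1}
```

A threshold placed between the two resulting ratio distributions would then classify each barcode group as expressing the fusion protein or mTagBFP2 alone.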

To demonstrate the utility of our approach for screening a large library of mutants to find proteins with desired properties, we screened for improved variants of a recently developed fluorescent protein, YFAST. YFAST is not itself fluorescent but only becomes fluorescent upon binding to an exogenous, GFP-like chromophore, such as HBR or HMBR (Fig. 3A) [8]. We created a library of YFAST variants, merged it with our 21-bit barcode library, and sequenced the resulting plasmid library to build the lookup table between variant and barcode.

We sought to simultaneously improve two properties of YFAST: the fluorophore brightness and the photobleaching kinetics. We note that while fluorophore brightness is a property that can be measured and selected via more traditional screening methods, e.g. FACS, photobleaching kinetics require a time-lapse measurement of the fluorescence from a single cell and, thus, would be challenging to measure with other approaches. In particular, we observed that the photobleaching of YFAST follows a double-exponential decay with one component decaying much faster than the other. We sought to identify mutants that eliminate this fast-decaying component.

To measure the brightness of different YFAST variants while controlling for potential variations in the expression level, we fused YFAST variants to mTagBFP2 and imaged individual cells with both 405-nm and 488-nm illumination (Fig. 3A). The relative brightness of YFAST was calculated as the ratio of the background-subtracted YFAST fluorescence intensity measured with 488-nm illumination in the presence of the chromophore to the mTagBFP2 intensity measured with 405-nm illumination in the absence of the chromophore. To characterize the photobleaching kinetics of YFAST variants, we measured the decrease in intensity over 20 frames under constant 488-nm illumination alone (Fig. 3B).

We constructed a series of YFAST variant libraries that contain (1) all possible single amino acid substitutions, insertions, and deletions (termed single-amino-acid libraries), (2) multiple mutations surrounding the chromophore binding pocket in each library member (chromophore-adjacent library), (3) a combination of the best mutations identified from the chromophore-adjacent library and the best mutations identified from the single-amino-acid libraries that are also near the chromophore, or (4) all possible single amino acid substitutions, insertions, and deletions based on a favorable mutant derived from the above libraries. In total, we constructed libraries containing roughly 60,000 unique YFAST variants associated with ~160,000 barcodes, and we used our high-throughput, image-based screening method to measure the brightness, photobleaching kinetics, and genotype for this library of variants across 20 million total cells. From each cell’s photobleaching decay curve, we determined the relative amplitude of the fast decay component (fast-photobleaching amplitude). We then grouped cells based on the genotypes measured and computed the median brightness and fast-photobleaching amplitude for each of these cell groups (Fig. 3C).

We observed a wide range of relative brightness values and fast-photobleaching amplitudes. To confirm that these variations represent true phenotypic variability, we first selected two variants, the wild-type YFAST (green dot in Fig. 3C) and a variant with a much smaller (nearly eliminated) fast-photobleaching amplitude (blue dot in Fig. 3C), and plotted the measured photobleaching decay curves for the hundreds of measured individual cells that contained the two barcodes associated with these genotypes. The photobleaching decay curves measured for these two sets of cells clearly separate into two populations that correlate strongly with their genotypes, indicating a high degree of reproducibility in the measurements of phenotypes within individual cells. Next, we individually cloned these two YFAST variants and measured their properties in pure culture. We observed nearly identical improvement in photobleaching kinetics in these measurements as we did when these phenotypes were measured in the context of the variant library. By screening ~60,000 variants of YFAST, we identified mutants that are substantially brighter, mutants that have eliminated the fast-photobleaching component, and mutants with improvements in both aspects (Fig. 3F, G). Moreover, because we have an exact genotype measured for each phenotype, we have produced a rich dataset of YFAST mutations and their phenotypic consequences that could be examined to extract information on the biophysical properties of this protein as well as information that could guide future mutational screens.

In summary, we developed a method for image-based screening of large genetic variant libraries by co-expressing the genetic variants and barcodes that can identify these genetic variants in cells, and determining both the phenotypes of the genetic variants and the barcodes in the same cells using imaging. By reading out barcodes using massively multiplexed FISH, we demonstrated the ability to screen hundreds of thousands of barcodes that correspond to tens of thousands of unique genetic variations. Using this approach, we identified mutations in the YFAST protein, a recently discovered ligand-dependent fluorescent protein, with substantially improved brightness and photostability. We expect that this novel high-throughput, image-based screening method can be applied broadly to improving properties or identifying new properties of proteins and nucleic acids, as well as to deciphering the effects of genes and gene networks on cellular/organism behaviors at the genomic scale.

Materials and Methods

Barcode library assembly

The barcode library consists of a set of plasmids, each containing a DNA barcode sequence that is transcribed from the lpp promoter and encodes an RNA designed to represent a single 22-bit binary word. Every barcode in the library has exactly 22 readout sequences, one corresponding to each bit, designed to be read out by hybridizing fluorescent probes with the complementary sequence. Although 22 bits are present in the barcode, to reduce the number of hybridization rounds, experiments were conducted reading out either 18 or 21 of the possible bits. For each bit position, we assigned one 20-mer sequence to encode a value of 0 and another 20-mer sequence to encode a value of 1. To aid quick hybridization, these encoding sequences were constructed from a three-letter nucleotide alphabet, one with only A, T, and C, in order to destabilize any potential secondary structures [13]. The utilized sequences were drawn from those previously used for MERFISH, with additional sequences designed using approaches described previously [11]. For each barcode, the bits are concatenated with a single G separating each.

We assembled this barcode library by ligating a mixture of short, overlapping oligonucleotides. For each pair of adjacent bits, there are four unique combinations of bit values. Each corresponding sequence was synthesized as a single-stranded oligo. These oligos were then ligated to form complete, double-stranded barcodes that contain concatenated sequences of all bits with all possible bit values.

For the ligation step, all oligos were mixed and diluted so that each oligo was present at a concentration of 100 nM. The mixture was phosphorylated by incubation with T4 polynucleotide kinase (16 µL oligo mixture, 2 µL T4 ligase buffer, 2 µL PNK (NEB, M0201S)) at 37 °C for 30 minutes and ligated by adding 1 µL T4 ligase (NEB, M0202S) and incubating for 1 hour at room temperature.

To prepare a plasmid library containing these barcode sequences along with the desired promoter, we diluted the ligation product 10-fold and amplified it by limited-cycle PCR on a Bio-Rad CFX96 using Phusion polymerase (NEB, M0531S0) and EvaGreen (Biotium, 3100). The PCR product was run in an agarose gel and the band of the expected length was extracted and purified (Zymo Zymoclean Gel DNA Recovery Kit, D4002). The purified product was inserted by isothermal assembly [14] for 1 hour at 50 °C (NEB NEBuilder HiFi DNA Assembly Master Mix, E2621L) into a plasmid backbone fragment containing the colE1 origin, the ampicillin resistance gene, and other elements taken from the pZ series of plasmids [15]. The assembled plasmids were purified (Zymo DNA Clean and Concentration, D4003), eluted into 6 µL water, mixed with 10 µL of electrocompetent E. coli on ice (NEB, C2986K), and electroporated using an Amaxa Nucleofector II. Immediately after electroporation, 1 mL SOC was added and the culture was incubated at 37 °C on a shaker for one hour. Subsequently, the SOC culture was diluted into 50 mL of LB (Teknova, L8000) supplemented with 0.1 mg/mL carbenicillin (ThermoFisher, 10177-012) and placed on the shaker at 37 °C overnight. The following day, the culture was miniprepped (Zymo Zyppy Plasmid Miniprep Kit, D4019), yielding the complete barcode library.

Assembling protein mutant libraries

To create a library of mutant proteins, short nucleotide sequences containing regions of the protein of interest with the desired mutations were synthesized as complex oligonucleotide pools. To then create the desired mutant genes from these pools, we amplified the pool and its corresponding expression plasmid via limited-cycle PCR and assembled these fragments using isothermal assembly [14]. The expression backbone was derived from the colE1 origin and the chloramphenicol resistance gene from the pZ series of plasmids [15]. Oligo pool synthesis is prone to deletions, which could lead to frameshift mutations that produce non-viable proteins. To remove these variants prior to measurement, the protein variants were translationally fused upstream of the chloramphenicol resistance protein. These constructs were electroporated into E. coli, as described above, and these cultures were grown in the presence of chloramphenicol to select only for protein variants that did not have frameshift mutations and which could, thus, translate a functional chloramphenicol resistance protein. These plasmids were re-isolated via plasmid miniprep and the genetic variants extracted via PCR prior to combination with the barcode library.

Merging mutation libraries with the barcode library

To merge a mutant library with the barcode library, the corresponding halves of each plasmid library were amplified by limited-cycle PCR. Of note, the forward primer for amplifying the barcode library contained 20 random nucleotides so that each assembled plasmid contained a 20-mer unique molecular identifier (UMI). Also, the protein mutant half contained the plasmid’s replication origin (colE1) while the barcode half contained the ampicillin resistance gene, ensuring that only plasmids containing both halves were competent for replication and selection. The two halves were assembled by isothermal assembly and electroporated into electrocompetent E. coli as described earlier. After incubating in SOC for 1 hour at 37 °C, the culture was again diluted into 50 mL and grown until it reached an OD600 of ~1. To limit the possibility that a single bacterium had taken up more than one plasmid, plasmids were extracted again from this culture and re-electroporated into fresh E. coli at a defined, low concentration. This culture was then grown and re-diluted to the desired number of cells, assuming an OD600 of 1 corresponded to 800 million cells. The diluted culture was incubated at 37 °C overnight, and the following day it was archived for future imaging experiments by diluting 1:1 in 50% glycerol (Teknova, G1796), separating into 100 µL aliquots, and storing at -80 °C. The remaining culture was miniprepped to use as a PCR template for constructing the barcode to genotype lookup table.

Constructing barcode to genotype lookup table

Since barcodes and mutants are assembled randomly, next-generation sequencing was used to construct a look-up table from barcode to mutant. The total length of the protein mutant and the barcode exceeded the read length of the sequencing platform used (Illumina MiSeq). To circumvent this challenge, multiple fragments were extracted from each library, sequenced independently, and grouped computationally using the UMI.

The miniprepped libraries were prepared for sequencing by two sequential limited-cycle PCRs. The first PCR extracted the desired region while adding the sequencing priming regions, and the second PCR added multiplexing indices and the Illumina adapter sequences. Between PCRs, the product was purified in an agarose gel, and the final product was gel purified prior to sequencing.

For each sequencing read, the corresponding barcode or mutant sequence was extracted. The reads were then grouped by common UMI, and the most frequently occurring barcode and protein mutant seen for each UMI were assigned to that UMI, constructing the barcode to mutant lookup table for every variant in the library.
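
A minimal sketch of this lookup-table construction is shown below, assuming reads have already been parsed into (UMI, sequence) pairs. The function name and input layout are hypothetical; the majority vote per UMI and the removal of barcodes linked to more than one variant follow the description in the text.

```python
from collections import Counter, defaultdict


def build_lookup(barcode_reads, mutant_reads):
    """Build a barcode -> mutant table from UMI-tagged reads.

    barcode_reads: iterable of (umi, barcode) pairs.
    mutant_reads: iterable of (umi, mutant) pairs.
    """
    def consensus(reads):
        # Most frequently observed sequence for each UMI.
        by_umi = defaultdict(Counter)
        for umi, seq in reads:
            by_umi[umi][seq] += 1
        return {umi: c.most_common(1)[0][0] for umi, c in by_umi.items()}

    barcode_of = consensus(barcode_reads)
    mutant_of = consensus(mutant_reads)

    # Join the two fragment types on UMIs seen in both read sets.
    variants = defaultdict(set)
    for umi in barcode_of.keys() & mutant_of.keys():
        variants[barcode_of[umi]].add(mutant_of[umi])

    # Discard barcodes associated with more than one genetic variant.
    return {bc: next(iter(v)) for bc, v in variants.items() if len(v) == 1}


barcode_reads = [("u1", "0101"), ("u1", "0101"), ("u1", "0111"),
                 ("u2", "1100"), ("u3", "1100"), ("u4", "0011")]
mutant_reads = [("u1", "V1"), ("u2", "V2"), ("u3", "V3"), ("u4", "V4")]
lookup = build_lookup(barcode_reads, mutant_reads)
```

In this toy input, barcode 1100 is paired with two different variants and is therefore dropped, while the minority read 0111 for UMI u1 is outvoted.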

Phenotype and barcode imaging

Each library was prepared for imaging by thawing the 100 µL aliquot from -80 °C to room temperature and diluting it into 2 mL LB supplemented with 0.1 mg/mL carbenicillin. Imaging coverslips (Bioptechs, 0420-0323-2) in 60-mm-diameter cell culture dishes were prepared by covering them in 1% polyethylenimine (Sigma-Aldrich, P3143-500ML) in water for 30 minutes followed by a single wash with phosphate-buffered saline (PBS). The E. coli culture was diluted 10-fold into PBS, poured into the culture dish, and spun at 100 g for 5 minutes to adhere cells to the surface.

The sample coverslip was assembled into a Bioptechs FCS2 flow chamber. A peristaltic pump (Gilson, MINIPULS 3) pulled liquid through the chamber while three computer-controlled valves (Hamilton, MVP and HVXM8-5) were used to select the input fluid. The sample was imaged on a custom microscope built around a Nikon Ti-U microscope body with a Nikon CFI Plan Apo Lambda 60x oil immersion objective with 1.4 NA. Illumination was provided at 405, 488, 560, 647, and 750 nm using solid-state single-mode lasers (Coherent, Obis 405 nm LX 200 mW; Coherent, Genesis MX488-1000; MPB Communications, 2RU-VFL-P-2000-560-B1R; MPB Communications, 2RU-VFL-P-1500-647-B1R; and MPB Communications, 2RU-VFL-P-500-750-B1R) in addition to the overhead halogen lamp for brightfield illumination. The Gaussian profile from the lasers was transformed into a top-hat profile using a refractive beam shaper (Newport, GBS-AR14). The intensity of the 488-, 560-, and 647-nm lasers was controlled by an acousto-optic tunable filter (AOTF), the 405-nm laser was modulated by a direct digital signal, and the 750-nm laser and overhead lamp were switched by mechanical shutters. The excitation illumination was separated from the emission using a custom dichroic (Chroma, zy405/488/561/647/752RP-UF1) and emission filter (Chroma, ZET405/488/461/647-656/752m). The emission was imaged onto an Andor iXon+ 888 EMCCD camera. During acquisition, the sample was translated using a motorized XY stage (Ludl, BioPrecision2) and kept in focus using a home-built autofocus system.

Phenotype measurements were conducted immediately after cells were deposited onto the coverslip and inserted into the flow chamber. For the YFAST mutants, images were first acquired in PBS in the absence of the chromophore with 405-nm illumination to measure the mTagBFP2 fluorescence, followed by an image with brightfield illumination for alignment between multiple imaging rounds. Then 10 µM HMBR or HBR in PBS was flowed over the cells, and a fluorescence image was acquired with 488-nm illumination to measure YFAST along with a brightfield image for alignment, followed by twenty or eighty images at 8.4 Hz with constant 488-nm illumination to measure the decrease in intensity upon photobleaching. In each imaging round, images were acquired at 1,427 or 3,223 locations in the sample.

Following the phenotype measurement, the cells were fixed by incubation in a mixture of methanol and acetone at a 4:1 ratio for 30 minutes. To prevent salts from precipitating and clogging the flow system, water was flowed before and after the fixation mixture. Once fixed, the cells were washed in 2x saline-sodium citrate (SSC) and hybridization began.

To determine the RNA barcode expressed within each cell, MERFISH was performed using protocols similar to those described previously (ref. 11). For each hybridization round, the sample was incubated for 30 minutes in hybridization buffer [2x SSC; 5% w/v dextran sulfate (EMD Millipore, 3730-100ML), 5% w/v ethylene carbonate (Sigma-Aldrich, E26258-500G), 0.05% w/v yeast tRNA, and 0.1% v/v Murine RNase inhibitor (NEB, M0314L)] with a mixture of readout probes labeled with either ATTO565, Cy5, or Alexa750 (Bio-Synthesis Inc), each at a concentration of 10 nM. The dyes were linked to the readout probes through a disulfide bond. Then, the hybridization buffer was replaced by an oxygen-scavenging buffer for imaging [2x SSC; 50 mM Tris HCl pH 8, 10% w/v glucose (Sigma-Aldrich, G8270), 2 mM Trolox (Sigma-Aldrich, 238813), 0.5 mg/mL glucose oxidase (Sigma-Aldrich, G2133), and 40 µg/mL catalase (Sigma-Aldrich, C100-500mg)]. Each position in the flow cell was imaged with 750-, 647-, and 560-nm illumination, from longest to shortest wavelength, followed by brightfield illumination before continuing to the next location. Following the imaging of all regions, the disulfide bonds linking the dyes to the readout probes were cleaved by incubating the sample in 50 mM tris(2-carboxyethyl)phosphine (TCEP; Sigma-Aldrich, 646547-10X1ML) in 2x SSC for 15 minutes. The sample was then rinsed in 2x SSC and the next hybridization round started. For each round of hybridization, three readouts with spectrally discernible dyes were hybridized simultaneously. Altogether, with 14 hybridization rounds, all 42 readouts corresponding to 21 bits were measured in 40 hours. For smaller libraries, the imaging area and the number of hybridization rounds were decreased to reduce the measurement time to 22 hours.
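The readout bookkeeping in this protocol (21 bits, two readouts per bit, three dyes per round) can be checked with a short Python sketch. The function name `schedule_readouts`, the "bit-value" labels (borrowed from the "1-0" and "1-1" notation of Fig. 1), and the order in which readouts are assigned to dyes are illustrative assumptions; the assignment actually used in the experiment is not specified in this section.

```python
def schedule_readouts(n_bits, dyes=("Alexa750", "Cy5", "ATTO565")):
    """Group the 2 * n_bits readouts into rounds of len(dyes) readouts each.

    Each round is a dict mapping a dye to a hypothetical readout label
    "bit-value". The dye order follows the imaging order in the text
    (longest to shortest excitation wavelength); the pairing of readouts
    with dyes is an assumption made only for illustration.
    """
    readouts = [f"{bit}-{value}"
                for bit in range(1, n_bits + 1)
                for value in (0, 1)]
    per_round = len(dyes)
    return [dict(zip(dyes, readouts[i:i + per_round]))
            for i in range(0, len(readouts), per_round)]


rounds = schedule_readouts(21)
n_rounds = len(rounds)                      # number of hybridization rounds
n_readouts = sum(len(r) for r in rounds)    # total readouts measured
```

For 21 bits this gives 14 rounds covering 42 readouts, in agreement with the counts stated in the protocol.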

Image analysis

To correct for residual illumination variations across the camera, a flat-field correction was performed. Every image was divided by the mean intensity image for all images with the given illumination color. Then, the images for different rounds corresponding to the same region were registered using the image acquired under brightfield illumination by up-sampled cross-correlation, creating a normalized image stack of all images at each position in the flow chamber. If the radial power spectral density of any given brightfield image did not contain sufficient high-frequency power, the image was designated as out-of-focus and all images for the corresponding region were excluded from further analysis.
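The flat-field step can be illustrated with a minimal Python sketch that divides each image by the pixel-wise mean of all images of the same illumination color. Images are represented as plain nested lists so the example has no dependencies; an array library would normally be used in practice. The function names and the handling of zero-mean pixels are assumptions made for this illustration, and registration and the focus check are not covered.

```python
def mean_image(images):
    """Pixel-wise mean over a list of equally sized 2-D images (lists of rows)."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]


def flat_field_correct(images):
    """Divide every image by the mean image of its illumination color.

    All entries of `images` must share one illumination color. Pixels whose
    mean is zero are set to zero (an assumption made here to avoid division
    by zero; the text does not discuss this case).
    """
    flat = mean_image(images)
    return [[[px / f if f != 0 else 0.0 for px, f in zip(row, flat_row)]
             for row, flat_row in zip(img, flat)]
            for img in images]


# Two toy 2x2 images with the same spatial illumination pattern.
images_488 = [[[2.0, 4.0], [6.0, 8.0]],
              [[4.0, 8.0], [12.0, 16.0]]]
corrected = flat_field_correct(images_488)
```

After correction the spatial pattern shared by both toy images is removed, so each corrected image is uniform and only the overall intensity difference between the two remains.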

To extract cell intensities, the edges of each cell were detected using the Canny edge detection algorithm on the first image acquired with 405-nm illumination. The edges that formed closed boundaries were filled in and closed regions of pixels were extracted. If a given closed pixel region had a filled area of more than 20 pixels and the ratio of the filled area to the area of the convex hull was greater than 0.9, it was classified as a cell. To increase the cell detection efficiency, the detected cells were then removed from the binary image, the image was dilated, filled, and eroded, and cells were extracted again. This allowed cells with gaps in their detected edges to still be detected. For each cell, the mean intensity was extracted for the corresponding pixels in every image.
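The classification rule for a filled region (area above 20 pixels and area-to-convex-hull ratio above 0.9) can be sketched in Python as follows. This covers only the acceptance test, not edge detection or the morphological steps. The helper names are hypothetical, and the convex hull is computed here from pixel corner coordinates, which is one of several possible conventions and may differ slightly from the one used in the original analysis.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def solidity(pixels):
    """Ratio of filled area (pixel count) to convex hull area."""
    corners = [(r + dr, c + dc) for r, c in pixels
               for dr in (0, 1) for dc in (0, 1)]
    return len(pixels) / polygon_area(convex_hull(corners))


def is_cell(pixels, min_area=20, min_solidity=0.9):
    """Apply the area and convexity thresholds quoted in the text."""
    pixels = set(pixels)
    return len(pixels) > min_area and solidity(pixels) > min_solidity


# A compact 5x6 block passes; an L-shaped region of 36 pixels does not.
block = [(r, c) for r in range(5) for c in range(6)]
l_shape = ([(r, c) for r in range(10) for c in range(2)]
           + [(r, c) for r in range(2) for c in range(2, 10)])
```

The convexity criterion rejects the L-shaped region even though it is large enough, which is the behavior one would want for touching or irregular objects.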

From the cell intensities, the phenotypes and barcodes were calculated. For each measured readout sequence, the measured intensity was normalized by subtracting the minimum and dividing by the median of the signal observed for that readout sequence in all cells. To determine whether a barcode contained a “1” or a “0” at each bit, the measured intensities of the “1” readout sequence and the “0” readout sequence for that bit were compared. Specifically, a threshold was selected on the ratio of these two values. If this ratio was above the threshold, the bit was called as a “1”. If it fell below, the bit was called as a “0”. Because the “1” and “0” readout sequences associated with each bit might be assigned to different fluorophores, it was necessary to optimize this threshold for each bit individually. This optimization was performed by randomly selecting 150 barcodes (a training set) from the set of known barcodes contained in the library, as identified by sequencing. An initial set of thresholds was selected and the fraction of cells matching these barcodes was determined. The threshold for each bit was then varied independently to identify the threshold set that maximized this fraction. This optimized threshold set was then used for determining the bit values for all cells.
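A minimal Python sketch of the bit-calling and per-bit threshold optimization is given below. It is an illustration under stated assumptions, not the original analysis code: the function names, the candidate threshold grid, the initial threshold of 1.0, the number of passes, and the treatment of a ratio exactly equal to the threshold (called as "0") are all choices made here.

```python
def call_bits(one_intensities, zero_intensities, thresholds):
    """Call each bit as 1 if I("1" readout) / I("0" readout) exceeds its threshold."""
    bits = []
    for i_one, i_zero, t in zip(one_intensities, zero_intensities, thresholds):
        ratio = i_one / i_zero if i_zero > 0 else float("inf")
        bits.append(1 if ratio > t else 0)
    return tuple(bits)


def matching_fraction(cells, thresholds, training_barcodes):
    """Fraction of cells whose decoded barcode is in the training set."""
    hits = sum(call_bits(c["one"], c["zero"], thresholds) in training_barcodes
               for c in cells)
    return hits / len(cells)


def optimize_thresholds(cells, training_barcodes, candidates, n_bits, n_passes=2):
    """Vary one bit's threshold at a time, keeping the best value found."""
    thresholds = [1.0] * n_bits          # assumed initial threshold set
    for _ in range(n_passes):
        for bit in range(n_bits):
            best_t = thresholds[bit]
            best_f = matching_fraction(cells, thresholds, training_barcodes)
            for t in candidates:
                trial = thresholds[:bit] + [t] + thresholds[bit + 1:]
                f = matching_fraction(cells, trial, training_barcodes)
                if f > best_f:
                    best_t, best_f = t, f
            thresholds[bit] = best_t
    return thresholds


# Toy example with two bits; bit 0's "1" readout sits on a brighter dye.
training = {(1, 0), (0, 1)}
cells = [{"one": [6.0, 0.2], "zero": [1.0, 1.0]},   # true barcode (1, 0)
         {"one": [1.5, 5.0], "zero": [1.0, 1.0]}]   # true barcode (0, 1)
best = optimize_thresholds(cells, training, [0.5, 1.0, 2.0, 4.0], n_bits=2)
```

In the toy data a uniform threshold of 1 miscalls the second cell, while raising the threshold of bit 0 alone recovers both barcodes, which illustrates why the threshold is tuned per bit.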

Once the barcode was determined for each cell, cells were grouped by barcode and the median of the various phenotype values was computed to determine the measured phenotype for the genotype corresponding to that barcode. For YFAST measurements, the normalized intensity was determined by the ratio of the YFAST fluorescence intensities under 488-nm illumination in the presence of the YFAST chromophore to the mTagBFP2 fluorescence intensities under 405-nm illumination in the absence of chromophore. To account for the non-negligible fluorescence background present in E. coli upon 488-nm illumination, the magnitude of the background was subtracted before calculating the fluorescence ratio. The background was estimated by calculating the median intensity upon 488-nm illumination of all cells predicted to contain a non-fluorescent YFAST mutant. Cells, grouped by barcode, were assigned to the dark population if the Pearson correlation coefficient between the 488-nm intensity and the 405-nm intensity for the grouped cells fell below the arbitrary threshold of 0.2. When the two intensities are uncorrelated, the number of YFAST proteins in the cells associated with that barcode does not appear to affect the brightness of the cell, and hence the YFAST should be dark. To determine the relative amplitude of the fast decay component of the photobleaching measurement of each YFAST mutant, we fit the background-subtracted photobleaching curve to the sum of two exponentials, with one of the decay rates set to the fast decay rate determined from the original YFAST. To determine the fast decay rate of the original YFAST, its background-subtracted photobleaching curve was fit to the sum of two exponentials with both decay rates as adjustable parameters and the faster of the two decay rates was selected.
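The brightness calculation can be sketched in Python as follows: cells are grouped by barcode, barcodes whose 488-nm and 405-nm intensities correlate below 0.2 are treated as dark, the median 488-nm intensity of the dark cells serves as background, and the per-barcode brightness is the median background-subtracted ratio. The field names (`i488`, `i405`), the function names, the minimum of three cells per barcode for the correlation test, and taking the median of per-cell ratios (rather than a ratio of medians) are assumptions made for this illustration. The photobleaching fit is not covered.

```python
from collections import defaultdict
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def group_by_barcode(cells):
    groups = defaultdict(list)
    for cell in cells:
        groups[cell["barcode"]].append(cell)
    return groups


def estimate_background(groups, r_threshold=0.2, min_cells=3):
    """Median 488-nm intensity over all cells of barcodes classified as dark."""
    dark = []
    for members in groups.values():
        if len(members) < min_cells:     # assumed cutoff, not from the text
            continue
        r = pearson([c["i488"] for c in members], [c["i405"] for c in members])
        if r < r_threshold:
            dark.extend(c["i488"] for c in members)
    return median(dark)                  # raises if no dark barcode is found


def normalized_brightness(groups, background):
    """Median of (I488 - background) / I405 for each barcode."""
    return {bc: median((c["i488"] - background) / c["i405"] for c in members)
            for bc, members in groups.items()}


# Toy data: barcode "A" scales with expression level, barcode "B" does not.
cells = [{"barcode": "A", "i405": 100.0, "i488": 60.0},
         {"barcode": "A", "i405": 200.0, "i488": 110.0},
         {"barcode": "A", "i405": 300.0, "i488": 160.0},
         {"barcode": "B", "i405": 100.0, "i488": 10.0},
         {"barcode": "B", "i405": 200.0, "i488": 12.0},
         {"barcode": "B", "i405": 300.0, "i488": 8.0}]
groups = group_by_barcode(cells)
background = estimate_background(groups)
brightness = normalized_brightness(groups, background)
```

In this toy case barcode "B" is classified as dark and sets the background, after which barcode "A" has the same normalized brightness in all three cells despite their different expression levels.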

References

1. Cadwell, R. C. & Joyce, G. F. Randomization of genes by PCR mutagenesis. PCR Methods Appl. 2, 28–33 (1992).

2. Kosuri, S. & Church, G. M. Large-scale de novo DNA synthesis: technologies and applications. Nat. Methods 11, 499–507 (2014).

3. Smith, G. P. Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science 228, 1315–1317 (1985).

4. Miltenyi, S., Müller, W., Weichel, W. & Radbruch, A. High gradient magnetic cell separation with MACS. Cytometry 11, 231–238 (1990).

5. Anderson, M. T. et al. Simultaneous fluorescence-activated cell sorter analysis of two distinct transcriptional elements within a single cell using engineered green fluorescent proteins. Proc. Natl. Acad. Sci. 93, 8508–8511 (1996).

6. Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).

7. Wang, S., Moffitt, J. R., Dempsey, G. T., Xie, X. S. & Zhuang, X. Characterization and development of photoactivatable fluorescent proteins for single-molecule-based superresolution imaging. Proc. Natl. Acad. Sci. 111, 8452–8457 (2014).

8. Plamont, M.-A. et al. Small fluorescence-activating and absorption-shifting tag for tunable protein imaging in vivo. Proc. Natl. Acad. Sci. U.S.A. 113, 497–502 (2016).

9. Femino, A. M., Fay, F. S., Fogarty, K. & Singer, R. Visualization of Single RNA Transcripts in Situ. Science 280, 585–590 (1998).

10. Raj, A., van den Bogaard, P., Rifkin, S. A., van Oudenaarden, A. & Tyagi, S. Imaging individual mRNA molecules using multiple singly labeled probes. Nat. Methods 5, 877–879 (2008).

11. Moffitt, J. R. et al. High-throughput single-cell gene-expression profiling with multiplexed error-robust fluorescence in situ hybridization. Proc. Natl. Acad. Sci. 201612826 (2016). doi:10.1073/pnas.1612826113

12. Shaffer, S. M., Wu, M.-T., Levesque, M. J. & Raj, A. Turbo FISH: A Method for Rapid Single Molecule RNA FISH. PLoS One 8, e75120 (2013).

13. Zhang, Z., Revyakin, A., Grimm, J. B., Lavis, L. D. & Tjian, R. Single-molecule tracking of the transcription cycle by sub-second RNA detection. Elife 3, e01775 (2014).

14. Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345 (2009).

15. Lutz, R. & Bujard, H. Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Res. 25, 1203–1210 (1997).

Figures

Fig. 1. A high-throughput, image-based screening method using massively multiplexed fluorescence in situ hybridization. (A) Schematic depiction of an RNA barcode. Each RNA barcode consists of the concatenation of a series of hybridization sequences, each of which can be associated with a different bit in an N-bit binary barcode. Each hybridization sequence can utilize one of two readout sequences unique to that position, with one readout sequence associated with a “1” at that bit and another with a “0”. (B) Schematic depiction of library construction. The library of barcodes is merged with a library of genetic variants and transformed into bacteria. The correspondence between the barcodes and genetic variants is determined by sequencing. (C) Schematic diagram of the image-based phenotype-genotype characterization. The phenotype is first characterized in surface-adhered cells. Then, the cells are fixed, and multiple rounds of hybridization are used to measure the barcodes. During the first round, readout probe 1-0 is added and cells with barcodes that read “0” in the first bit, i.e., which contain the readout sequence 1-0, should bind to the probe and become fluorescent, whereas cells with barcodes that read “1” in the first bit should remain dark. Once readout probe 1-0 is extinguished, readout probe 1-1 is added and the cells with barcodes that read “1” in the first bit, which contain the readout sequence 1-1, should become fluorescent. This difference in fluorescence intensity allows the value of bit 1 to be determined for each cell. This process is repeated for the remaining bits. After measuring all bits, the barcode is determined, revealing the identity of the genetic variant contained in the cell and linking the measured phenotype to the genotype.

Fig. 2. Performance characterization of the screening method by measuring 600,000 cells containing 80,000 unique barcodes associated with two known genotypes and phenotypes. (A) Schematic diagram of the library constituents. Among the 80,000 distinct 21-bit barcodes, half are associated with the mTagBFP2 gene while the other half are associated with the mTagBFP2-mMaple3 fusion gene. Both sets of cells will express RNA barcodes, but only those expressing mTagBFP2-mMaple3 will be fluorescent in the 560-nm channel. (B) Fluorescent images for each readout from a subset of the full 21 bits. Images after the readout sequence corresponding to a “0” (top) or to a “1” (middle) is hybridized are shown. The difference image (bottom) indicates whether a “0” or “1” is assigned to the barcode within that cell in that bit (red and gray indicate “0” and “1”, respectively). (C) Two-dimensional histogram of normalized fluorescence intensities for readout 0 and readout 1 of bit 1 for each cell. The fluorescence intensities are normalized to the median values. The dotted line depicts the threshold used for eliminating cells that appear dim in both readouts. The shade of green indicates the number of cells. (D) The percent of barcodes decoded in the imaging experiment that match barcodes determined to be in the library by sequencing (orange) and the number of cells (magenta) above the readout intensity threshold with varying threshold magnitude. The dotted line corresponds to the threshold of 1 shown in (C). (E) Abundance of each barcode. The abundance is the number of cells in the imaging experiment assigned to each barcode. (F) Fluorescence image of mTagBFP2 and fluorescence image of post-activation mMaple3 of the same region as (B). (G) Histograms of median mMaple3 fluorescence intensity normalized to mTagBFP2 intensity for barcodes associated with the mTagBFP2-mMaple3 fusion gene (red) and for those associated with the mTagBFP2 gene (cyan).

Fig. 3. Screening YFAST mutant libraries for improved brightness and photobleaching kinetics. (A) Schematic diagram of YFAST library design. YFAST is dark on its own, but it becomes fluorescent upon binding to the ligand, for example HMBR (ref. 8). A library of YFAST variants fused to mTagBFP2 for normalization is merged with a library of barcodes and transformed into E. coli cells. (B) The initial intensity and the photobleaching curve (black circles) of the original YFAST measured from a single cell in the library screen measurement. The initial intensity (gray dashed line) is measured as the intensity of YFAST fluorescence under 488-nm illumination after background subtraction, normalized to the mTagBFP2 fluorescence intensity under 405-nm illumination. The photobleaching curve (circles) is measured by illumination with 488-nm light to excite YFAST only. The curve is fit to a double exponential decay (red line) with the background level determined by the intensity of cells that have dark YFAST variants. (C) Scatter plot of the relative brightness and fast-photobleaching amplitude for each measured mutant. Each point depicts the median brightness and fast-photobleaching amplitude of all cells associated with one mutant in the library. The brightness values are normalized to that of the original YFAST. Here only the mutants with at least 10 imaged cells are depicted. The original YFAST (green) and several selected mutants (blue, red, and cyan) with greater brightness and/or much smaller fast-photobleaching amplitudes are marked. (D) The photobleaching curves for individual cells corresponding to the original YFAST (green) and one selected mutant (blue dot in (C)) from the library measurement. The fluorescence intensities of the initial time point are normalized to 1. (E) The average bleaching curves for the original YFAST (green) and one selected mutant (blue dot in (C)) measured in isolation in pure culture. The original YFAST and mutant were individually expressed in the E. coli cells together with mTagBFP2 using the same plasmid construct as described for the library constructs. The initial brightness values of YFAST and mutant are normalized as described in (D). (F and G) Bar charts of the relative brightness values (F) and fast-photobleaching amplitudes (G) for several selected mutants as marked in (C) by the same colors. The brightness values are normalized to that of the original YFAST. * indicates a value close to zero and hence not visible in the bar graph.
