DISTRIBUTED SYSTEMS COMMUNICATION

Last class we discussed the core challenges of building distributed systems (incremental scalability is hard, at scale failures are inevitable, constant attacks, etc.). We've said that the core approach to building distributed systems that address these challenges is to construct layers upon layers of systems and services that progressively raise the level of abstraction through well-defined APIs. Google's storage stack, as well as Spark and other systems, are perfect examples of that layered design for distributed systems.

In this lecture, we look at one of these abstractions: RPC, which enables communication. To cooperate, machines need to first communicate. How should they do that? RPC is the predominant way of communicating in a distributed system (DS). Interestingly enough, RPC, the simplest possible DS abstraction, reflects most of the challenges in distributed systems, showing you how fundamental those things are.

I. Remote Procedure Calls (RPCs)

Overall goal: To simplify the programmer's job when building distributed systems.
- many complexities when building a distributed program
- some are fundamental and must be dealt with by the programmer, while others are mechanical and can be solved by good language and runtime support
- pre-RPC, all of the gory details of socket-level communication were exposed to the programmer
- post-RPC, a remote interaction looks (nearly) identical to a local procedure call, with the system hiding (nearly) all of the differences

Example of (badly written) socket communication code:

    struct foomsg {
        u_int32_t len;
    }

    send_foo(int outsock, char *contents) {
        int msglen = sizeof(struct foomsg) + strlen(contents);
        char buf = malloc(msglen);
        struct foomsg *fm = (struct foomsg *)buf;
        fm->len = htonl(strlen(contents));
        memcpy(buf + sizeof(struct foomsg), contents, strlen(contents));
        write(outsock, buf, msglen);
    }

Any issues with it? Many…
- lots of ugly boilerplate
- bug prone
- portability details are easy to miss
- hard to understand and hard to maintain
- hard to evolve protocol
- vulnerability prone
- …

RPC: RPC tries to make net communication look just like a local procedure call (LPC):

Client-side code:

    z = fn(x, y)

Server-side code:

    fn(x, y) {
        // compute result z
        return z;
    }

Why the LPC abstraction? Because it's simple, elegant, and even novice programmers know how to use function calls! Other models exist, but they are (arguably) harder to think about.

RPC message diagram:

    Client                  Server
       request  --->
                            process
               <---  response

RPC software structure:

(figure taken from the RPC paper by Birrell et al.)
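
To make the software structure a bit more concrete, here is a minimal single-process sketch of the client stub / runtime / server stub split. Everything in it is an assumption made for illustration: the JSON message format, the names client_stub_fn, server_stub, transport, and PROCEDURES, and the choice of addition as the body of fn. A real RPC runtime would move the bytes over a network connection instead of calling the server stub directly.

```python
import json

# --- server side -----------------------------------------------------------
def fn(x, y):
    """The real procedure; it exists only on the server (addition is an assumption)."""
    return x + y

PROCEDURES = {"fn": fn}  # hypothetical dispatch table used by the server stub

def server_stub(request_bytes):
    """Unmarshal the request, call the procedure, marshal the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

# --- stand-in for the RPC runtime and the network --------------------------
def transport(request_bytes):
    """A real runtime would send these bytes over a socket and wait for the reply."""
    return server_stub(request_bytes)

# --- client side -----------------------------------------------------------
def client_stub_fn(x, y):
    """Same signature as fn, but it only marshals, sends, and unmarshals."""
    request_bytes = json.dumps({"proc": "fn", "args": [x, y]}).encode("utf-8")
    reply_bytes = transport(request_bytes)
    return json.loads(reply_bytes.decode("utf-8"))["result"]

z = client_stub_fn(2, 3)  # reads just like the local call z = fn(x, y)
```

The caller never touches bytes or sockets; all of that lives in the stubs, which is why they are good candidates for automatic generation.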

In implementations, stubs are generated automatically by RPC frameworks (libraries), which also provide the RPC runtime. The programmer only writes definitions for their data structures and protocols in an "interface definition language" (IDL). We'll see two examples of RPC libraries later.

Good things about the RPC abstraction:
- Hides gory network/marshaling details that one would have to implement if doing, e.g., network-level communication, such as byte orders (big endian vs. little endian)
- Supports evolution of the communicating components independently. (Many RPC frameworks provide explicit support for evolution, e.g., protocol version numbers, rules of when to remove/rename arguments to procedure calls, etc.)
- Allows for efficient packaging
- Authentication support
- Location independence (particularly if combined with a directory service)
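
As a rough illustration of the marshaling detail in the first bullet, the sketch below frames a message the way the socket example intends to: a 4-byte length in network (big-endian) byte order, followed by the payload. The function names marshal_foo and unmarshal_foo are hypothetical; the point is only that the byte order must be fixed explicitly, which is the kind of detail an RPC library handles for you.

```python
import struct

def marshal_foo(contents: bytes) -> bytes:
    """Length-prefixed message: 4-byte big-endian length, then the payload."""
    return struct.pack("!I", len(contents)) + contents

def unmarshal_foo(message: bytes) -> bytes:
    """Inverse of marshal_foo; rejects messages shorter than the stated length."""
    (length,) = struct.unpack("!I", message[:4])
    payload = message[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return payload

wire = marshal_foo(b"hello")

# The same integer has a different byte layout depending on the byte order,
# so sender and receiver have to agree on one.
big_endian_five = struct.pack(">I", 5)
little_endian_five = struct.pack("<I", 5)
```

If the sender wrote the length in its native order and the receiver assumed the other one, a 5-byte payload would be read as a length of 83,886,080 bytes.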

Problems with the RPC abstraction (i.e., where the distributed nature of RPC peeks through the LPC illusion):
- Latency: LPC is fast, RPC can be really expensive (esp. if it goes over a WAN, but not only)
- Pointer transfers: the local address space isn't shared with the remote process. The programmer/RPC library has to decide whether to follow pointers and serialize in depth, or to exclude that information. Typically, IDLs avoid this problem by requiring one to specify one's messages/data structures to avoid pointer confusion.
- Failures: they are more fundamental than in the single-process/LPC case.
    o If I issue an LPC, then I can be in one of two situations: (1) the LPC returns, therefore I know the procedure has been executed; or (2) the LPC doesn't return, in which case I know the LPC hasn't finished executing (it may still be executing, or the process that hosts both the caller and the callee might have died). So, with LPC, the caller and callee can't have a "split mind." With RPC, they can.
    o If I issue an RPC and get a return, I know the procedure has been executed remotely. But if I don't get a return, what do I know? Has the function completed on the remote side and I just don't know about it because, e.g., the response has been lost over the network? Has it not happened, because, e.g., my request has been lost over the network or the remote machine is dead? Or maybe the machine/network is slow: should I wait longer for the response? How long should I wait for a response? I can't tell the difference between these situations, and I don't know how to act. In certain scenarios, that's not a big problem, but in others it is.

For example, imagine the distributed system is composed of an ATM and its bank back-end system. A person requests a withdrawal of cash from the ATM. The ATM needs to check that the person has sufficient funds in, and withdraw the sum from, their bank account before giving out the money. The ATM requests the money transfer and then waits, and waits, and waits. No response. Should it give out the money or should it not, if it doesn't get a response from the bank? No simple answer. We need some timeout (the person can't wait forever), but what should that be?

Say the ATM refuses to give out the money upon a timeout. But what if the bank already deducted the money from the account, how should we deal with that? The ATM needs to keep trying to talk with the bank to tell it that it hasn't actually finished the transaction, so the bank can put the money back in the person's account. But what if the ATM itself fails at some point? Is the person's money lost forever? I would hate interacting with such an ATM.

Alternatively, say the ATM gives out the money upon a timeout. This of course can open the bank to fraud: a person might quickly go to another ATM and extract the same money again.

Herein lies the biggest difficulty in distributed systems compared to single-process systems: with single machines, you largely have shared fate (i.e., either everything fails or nothing does); with distributed systems, machines/processes can fail independently, so you can have some machines that are up while others are down, and you can also have all machines up but unable to communicate due to network failures and partitions. This leads to coordination challenges, which in turn lead to inconsistencies between the decisions these machines make. So the goal of a DS, providing a "coherent service," is difficult to achieve.
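
The ambiguity at the heart of the ATM example can be sketched in a few lines. This is a toy single-process simulation, not real networking: the Bank class, the Timeout exception, and the fault parameter of rpc_withdraw are all hypothetical names introduced here. It shows that a lost request and a lost response look identical to the caller even though the bank ends up in different states.

```python
class Timeout(Exception):
    """Raised by the simulated network when no reply reaches the client."""

class Bank:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount
        return "OK"

def rpc_withdraw(bank, amount, fault=None):
    """Simulated RPC; fault is None, 'lose_request', or 'lose_response'."""
    if fault == "lose_request":
        raise Timeout()              # the bank never saw the request
    reply = bank.withdraw(amount)    # the bank executes the procedure
    if fault == "lose_response":
        raise Timeout()              # executed, but the ATM never hears back
    return reply

def observe(fault):
    """Return (what the ATM sees, balance left at the bank)."""
    bank = Bank(100)
    try:
        seen = rpc_withdraw(bank, 40, fault)
    except Timeout:
        seen = "TIMEOUT"
    return seen, bank.balance

results = {fault: observe(fault) for fault in (None, "lose_request", "lose_response")}
```

In both failure cases the ATM observes only "TIMEOUT", but the balance is 100 in one case and 60 in the other, so no local decision by the ATM is safe in both.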

II. Semantics

The preceding ATM example leads directly to a question of semantics. What should be the meaning (semantic) of an RPC invocation? Network failures imply timeouts and retransmissions. They imply that clients will end up retransmitting requests, and the server will see duplicates. How should the server behave when it sees duplicates? Below are some potential answers, and different RPC frameworks will implement different semantics.

At-least-once: the RPC call, once issued by the client, is executed eventually at least once, but possibly multiple times. This semantic is the easiest to implement but raises issues. To implement, the RPC library on the client keeps issuing the RPC multiple times until it gets a response from the server. Assuming that failures (of network, server) are temporary, the RPC will be executed at least once. This is fine for idempotent requests (e.g., this email in a user's inbox has been read), but might be problematic for other types of requests (e.g., ATM example, purchasing an item).
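
A minimal sketch of at-least-once retries, under stated assumptions: FlakyServer is a made-up in-process server that executes every request but drops its first few replies, and max_tries is added only so the sketch terminates (the description above retries until a response arrives). It contrasts an idempotent request with a non-idempotent one.

```python
class Timeout(Exception):
    """Raised when the simulated reply is lost."""

class FlakyServer:
    """Executes every request it receives, but drops the first `lost_replies` replies."""
    def __init__(self, lost_replies):
        self.lost_replies = lost_replies
        self.balance = 100
        self.read = set()

    def handle(self, op, arg):
        if op == "withdraw":
            self.balance -= arg          # not idempotent: repeats accumulate
        elif op == "mark_read":
            self.read.add(arg)           # idempotent: repeats change nothing
        if self.lost_replies > 0:
            self.lost_replies -= 1
            raise Timeout()              # the work was done, the reply was lost
        return "OK"

def call_at_least_once(server, op, arg, max_tries=10):
    """Reissue the request until a reply arrives (bounded only for the sketch)."""
    for _ in range(max_tries):
        try:
            return server.handle(op, arg)
        except Timeout:
            continue
    raise Timeout()

mail = FlakyServer(lost_replies=2)
mail_reply = call_at_least_once(mail, "mark_read", "msg-1")

atm = FlakyServer(lost_replies=2)
atm_reply = call_at_least_once(atm, "withdraw", 40)
```

Both calls eventually return "OK", but the withdrawal ran three times on the server, which is exactly the problem with retrying non-idempotent requests.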

At-most-once: the RPC call, once issued by a client, gets executed zero or one time. This semantic is the next easiest to implement, but has complications if you try to do it sensibly. To implement, the server has to remember requests that it has already seen and processed, so that it can detect and squelch duplicates. Clients and servers need to embed unique IDs (XIDs) in messages. Something like the following:

    Server maintains s[xid] (state), r[xid] (return value)

    if s[xid] == DONE:
        return r[xid]
    x = handler(args)
    s[xid] = DONE
    r[xid] = x
    return x
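
The pseudocode above can be turned into a small runnable sketch. The s and r tables and the DONE marker follow the pseudocode; the AtMostOnceServer class, the withdraw handler, the balance dictionary, and the counter used to generate XIDs are assumptions added for illustration. Crashes are deliberately ignored here.

```python
import itertools

DONE = "DONE"

class AtMostOnceServer:
    def __init__(self, handler):
        self.handler = handler
        self.s = {}  # xid -> state
        self.r = {}  # xid -> return value

    def receive(self, xid, args):
        """Run the handler once per XID; replay the saved reply for duplicates."""
        if self.s.get(xid) == DONE:
            return self.r[xid]
        x = self.handler(*args)
        self.s[xid] = DONE
        self.r[xid] = x
        return x

balance = {"alice": 100}

def withdraw(account, amount):
    balance[account] -= amount
    return balance[account]

server = AtMostOnceServer(withdraw)
xids = itertools.count(1)   # hypothetical client-side XID generator

xid = next(xids)
first = server.receive(xid, ("alice", 40))
retry = server.receive(xid, ("alice", 40))            # retransmission, same XID
fresh = server.receive(next(xids), ("alice", 10))     # a genuinely new request
```

The retransmission gets the remembered reply without running withdraw again, while the request with a new XID executes normally.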

One complication: what happens if the server crashes/restarts? The server will lose the s, r tables. Solution: store them on disk [slow, expensive].

Another complication: even if they are stored on disk, what happens if the server crashes just before or after the call to handler()?
- after restart, server can't tell if handler() was called
- server must reply with "CRASH" when client resends request

Solution: server has a "server nonce" that identifies the restart generation
- client obtains nonce during bind with server
- client transmits nonce in every RPC request
- if nonce doesn't match server's state, return CRASH

Exactly once: this is the ideal (it resembles the LPC model most closely and it's easiest to understand), but it's surprisingly hard to implement. This is when the RPC, once issued by the client, is invoked exactly once by the server. Why is it hard to build?
- server failure at an inopportune time can stymie progress: the client blocks forever
- need to implement handler() as a transaction that includes the s[xid], r[xid] commit on the server
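
The server nonce scheme described above can be sketched as follows. This is a simplified in-process model: restart() stands in for a crash plus reboot, the nonce is a random token chosen at each start (an assumption; any value that differs across restarts would do), and the class and method names are hypothetical.

```python
import secrets

CRASH = "CRASH"
DONE = "DONE"

class Server:
    def __init__(self, handler):
        self.handler = handler
        self.restart()

    def restart(self):
        """Simulate crash + restart: in-memory tables are lost, a new nonce is chosen."""
        self.nonce = secrets.token_hex(8)
        self.s, self.r = {}, {}

    def bind(self):
        """The client obtains the current nonce when it binds to the server."""
        return self.nonce

    def receive(self, nonce, xid, args):
        if nonce != self.nonce:
            return CRASH                 # request was issued against an earlier generation
        if self.s.get(xid) == DONE:
            return self.r[xid]           # duplicate within the same generation
        x = self.handler(*args)
        self.s[xid], self.r[xid] = DONE, x
        return x

server = Server(lambda a, b: a + b)
nonce = server.bind()
before = server.receive(nonce, 1, (2, 3))

server.restart()                                        # the dedup tables are gone
after_crash = server.receive(nonce, 1, (2, 3))          # stale nonce is rejected
rebound = server.receive(server.bind(), 2, (2, 3))      # fresh bind works again
```

Returning CRASH pushes the decision back to the client: the server admits it no longer knows whether the old request ran, rather than risking a second execution.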

III. Example Libraries

Some sample code using various RPC libraries can be found at https://columbia.github.io/ds1-class/lectures/02-rpc-handout.pdf.

IV. Other Communication Models

RPC is just one communication model in distributed systems. It works well when the communication pattern looks like this:
- point-to-point, request/process/response to a server or service.
- lots of communication in distributed systems looks like that, but not all.

Here are some other popular communication patterns and the corresponding communication abstractions that support them. We won't study them in depth in this course.

Acknowledgements

The preceding notes were adapted from the University of Washington distributed systems course, Steve Gribble's edition: http://courses.cs.washington.edu/courses/csep552/13sp/lectures/1/rpc.txt