last time - university of cambridge · 2017-01-15 · last time •distributed systems are...

Post on 13-May-2020


1/13/17

Distributed systems
Lecture 2: The Network File System (NFS) and Object Oriented Middleware (OOM)

Dr. Robert N. M. Watson

Last time
• Distributed systems are everywhere
  – Challenges including concurrency, delays, failures
  – The importance of transparency
• Simplest distributed systems are client/server
  – Client sends request as message
  – Server gets message, performs operation, and replies
  – Some care required handling retry semantics, timeouts
• One popular model is Remote Procedure Call (RPC)
  – Client calls functions on the server via network
  – Middleware generates stub code which can marshal/unmarshal arguments/return values
  – e.g. SunRPC/XDR
  – Transparency for the programmer, not just the user
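The marshalling work done by generated stubs can be sketched roughly as follows. This is a toy illustration rather than SunRPC itself: the remote procedure `add` and all function names are hypothetical, the "network" is a direct function call, and only the 4-byte big-endian integer encoding is borrowed from XDR.

```python
import struct
from typing import Tuple

def marshal_add_args(a: int, b: int) -> bytes:
    """Client stub: encode two signed 32-bit integers, big-endian as in XDR."""
    return struct.pack(">ii", a, b)

def unmarshal_add_args(payload: bytes) -> Tuple[int, int]:
    """Server stub: decode the argument bytes back into integers."""
    return struct.unpack(">ii", payload)

def server_add(payload: bytes) -> bytes:
    """Server skeleton: unmarshal, run the real procedure, marshal the result."""
    a, b = unmarshal_add_args(payload)
    return struct.pack(">i", a + b)

def client_add(a: int, b: int, transport=server_add) -> int:
    """Looks like a local call; `transport` stands in for the network."""
    reply = transport(marshal_add_args(a, b))
    (result,) = struct.unpack(">i", reply)
    return result
```

The caller of `client_add` never sees bytes or messages, which is the sense in which RPC gives transparency to the programmer.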

First case study: NFS
• NFS = Networked File System (developed by Sun)
  – Aimed to provide distributed filing by remote access
• Key design decisions:
  – Distributed filesystem vs. remote disks
  – Client-server model
  – High degree of transparency
  – Tolerant of node crashes or network failure
• First public version, NFSv2 (1989), did this via:
  – Unix filesystem semantics (or almost)
  – Integration into kernel (including mount)
  – Simple stateless client/server architecture
• A set of RPC “programs”: mountd, nfsd, lockd, statd, ...

Transparency for users and applications, but also NFS programmers: hence SunRPC

NFS: Client/Server Architecture

[Figure: client side (User Program, Syscall Level, VFS Layer, Local FS, NFS Client) and server side (Syscall Level, VFS Layer, NFS Server, Local FS), connected by an RPC Request and an RPC Response; steps numbered 1 to 5.]

• Client uses opaque file handles to refer to files
• Server translates these to local inode numbers
• SunRPC with XDR running over UDP (originally)

NFS: mounting remote filesystems

[Figure: directory trees of an NFS client (/, /tmp, /home) and an NFS server (/, /home, /bin), with entries x, y, foo, and bar.]

• NFS RPCs are methods on files identified by file handle(s)
• Bootstrap via dedicated mount RPC ‘program’ that:
  – Performs authentication (if any);
  – Negotiates any optional session parameters; and
  – Returns root file handle

NFS file handles and scoping
• Pure names expose no visible semantics (e.g., NFS handle)
• Impure names expose semantics (e.g., file paths)

[Figure: the name types passed between layers (User Program, Syscall Level, VFS Layer, NFS Client, VFS Layer, NFS Server, Local FS): int, file *, vnode *, nfsnode *, fhandle_t, inode *. An example server-defined fhandle_t has fields labelled fsid, len, pad, ino, and gen, annotated “NFS” and “Local filesystem”.]

• Arguments at each layer have specific scopes
  – Layers translate between namespaces for encapsulation
  – Contents of names between layers often opaque
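A server-defined file handle of this kind can be sketched as below. The layout is an assumption made for illustration (three unsigned 32-bit fields: fsid, inode number, generation; the len and pad fields from the figure are omitted), and `ToyServer` is a hypothetical name. The client only ever stores and returns the bytes; the server alone interprets them.

```python
import errno
import struct

HANDLE_FORMAT = ">III"  # assumed layout: fsid, inode number, generation

class ToyServer:
    def __init__(self, fsid):
        self.fsid = fsid
        self.inodes = {}  # inode number -> (generation, file contents)

    def make_handle(self, ino):
        """Build the opaque handle handed out to clients."""
        generation, _ = self.inodes[ino]
        return struct.pack(HANDLE_FORMAT, self.fsid, ino, generation)

    def resolve(self, handle):
        """Translate a handle back to local state, rejecting stale ones."""
        fsid, ino, generation = struct.unpack(HANDLE_FORMAT, handle)
        entry = self.inodes.get(ino)
        if fsid != self.fsid or entry is None or entry[0] != generation:
            raise OSError(errno.ESTALE, "stale file handle")
        return entry[1]
```

In this sketch the generation number is what lets the server notice that an inode has been reused for a different file since the handle was issued.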

NFS is stateless
• Key NFS design decision to ease fault recovery
  – Obviously, filesystems aren’t stateless, so…
• Stateless means the protocol doesn’t require:
  – Keeping any record of current clients
  – Keeping any record of current open files
• Server can crash + reboot, and clients do not have to do anything (except wait!)
• Clients can crash, and servers do not need to do anything (no cleanup etc)

Implications of stateless-ness
• No “open” or “close” operations
  – fh = lookup(<directory fh>, <filename>)
  – All file operations are via per-file handles
• No implied state linking multiple RPCs; e.g.,
  – UNIX file descriptor has “current offset” for I/O:
      read(fd, buf, 2048)
  – NFS file handle has no offset; operations are explicit:
      read(fh, buf, offset, 2048)
• This makes many operations idempotent
  – This use of SunRPC gives at-least-once semantics
  – Tolerate message duplication in network, RPC retries
• Challenges in providing Unix FS semantics…
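The difference an explicit offset makes under at-least-once delivery can be sketched as follows. Both server classes and the `call_with_duplicate` helper are hypothetical; the helper simply executes a request twice, as a retry or a duplicated message would.

```python
class StatelessServer:
    """Serves reads by explicit offset; keeps no per-client state."""
    def __init__(self, files):
        self.files = files  # file handle -> bytes

    def read(self, fh, offset, count):
        return self.files[fh][offset:offset + count]

class StatefulServer:
    """Keeps a current offset per handle, like a UNIX file descriptor."""
    def __init__(self, files):
        self.files = files
        self.offsets = {fh: 0 for fh in files}

    def read(self, fh, count):
        start = self.offsets[fh]
        self.offsets[fh] = start + count  # every execution advances the offset
        return self.files[fh][start:start + count]

def call_with_duplicate(request):
    """Simulate at-least-once delivery: run the request twice, keep the last reply."""
    request()
    return request()
```

With the stateless server a duplicated read returns the same bytes as a single one; with the stateful server the duplicate silently skips ahead in the file.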

Semantic tricks (and messes)
• rename(<old filename>, <new filename>)
  – Fundamentally non-idempotent
  – Strong expectation of atomicity
  – Server-side “cache” of recent RPC replies for replay
• unlink(<old filename>)
  – UNIX requires open files to persist after unlink()
  – What if the server removes a file that is open on a client?
  – Silly rename: clients translate unlink() to rename()
  – Only within client (not server delete, nor for other clients)
  – Other clients will have a stale file handle: ESTALE
• Stateless file locking seems impossible
  – Problem avoided (?): separate RPC protocols
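The reply cache for non-idempotent operations such as rename can be sketched as below. This is a simplified model, not the code of any real NFS server: `RenameServer` is a hypothetical name, replies are plain strings, the cache is keyed on an assumed (client, transaction id) pair, and it is never trimmed, whereas a real cache would be bounded.

```python
class RenameServer:
    def __init__(self, names):
        self.names = set(names)
        self.reply_cache = {}  # (client id, transaction id) -> earlier reply

    def rename(self, client, xid, old, new):
        key = (client, xid)
        if key in self.reply_cache:
            # Retransmission of a request already executed: replay the reply
            # instead of running the rename a second time.
            return self.reply_cache[key]
        if old in self.names:
            self.names.remove(old)
            self.names.add(new)
            reply = "OK"
        else:
            reply = "ENOENT"
        self.reply_cache[key] = reply
        return reply
```

Without the cache, a retried rename whose first reply was lost would find the old name gone and report an error for an operation that in fact succeeded.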

Performance problems
• Neither side knows if other is alive or dead
  – All writes must be synchronously committed on server before it returns success
• Very limited client caching…
  – Risk of inconsistent updates if multiple clients have file open for writing at the same time
• These two facts alone meant that NFSv2 had truly dreadful performance

NFSv3 (1995)
• Mostly minor protocol enhancements
  – Scalability
    • Remove limits on path- and file-name lengths
    • Allow 64-bit offsets for large files
    • Allow large (>8KB) transfer-size negotiation
  – Explicit asynchrony
    • Server can do asynchronous writes (write-back)
    • Client sends explicit commit after some # writes
    • File timestamps piggybacked on server replies allow clients to manage cache: close-to-open consistency
  – Optimized RPCs (readdirplus, symlink)
• But had major impact on performance

NFSv3 readdirplus
• NFSv2 behaviour for “ls -l”
  – readdir() triggers NFS_READDIR to request names and handles
  – stat() on each file triggers one NFS_GETATTR RPC
• NFS3_READDIRPLUS returns names, handles, and attributes
  – Eliminates a vast number of round-trip times
• Principle: mask network latency by batching synchronous operations

[Figure: message sequences. NFSv2 client to NFSv2 server: READDIR followed by three GETATTR RPCs, 4 x RTT (Round-Trip Time). NFSv3 client to NFSv3 server: a single READDIRPLUS, 1 x RTT.]

Example listing:
drwxr-xr-x 55 al565 al565 12288 Feb 8 15:47 al565/
drwxr-xr-x 115 am21 am21 49152 Feb 10 18:19 am21/
drwxr-xr-x 214 atm26 atm26 36864 Feb 1 17:09 atm26/
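The saving from batching can be put into numbers with a small model. The function names are hypothetical, and the model assumes that a single READDIRPLUS reply is large enough to carry the whole directory, which need not hold for big directories.

```python
def round_trips_v2(num_entries: int) -> int:
    """One READDIR, then one GETATTR per directory entry."""
    return 1 + num_entries

def round_trips_v3(num_entries: int) -> int:
    """One READDIRPLUS, assuming the whole listing fits in a single reply."""
    return 1

def listing_latency_ms(round_trips: int, rtt_ms: float) -> float:
    """Latency spent waiting on the network, ignoring server processing time."""
    return round_trips * rtt_ms
```

For the three-entry listing above this gives 4 round trips against 1; for a directory with 1,000 entries the same model gives 1,001 against 1.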

Distributed filesystem consistency
• Can a distributed application expect data written on client A to be visible to client B?
  – After write() on A, will a read() on B see it?
  – What if a process on A writes to a file, and then sends a message to a process on B to read the file?
• In NFSv3, no!
  – A may have freshly written data in its cache that it has not yet sent to the server via a write RPC
  – The server will return stale data to B’s read RPC
• Or:
  – B may return stale data in its cache from a prior read
• This problem is known as inconsistency:
  – Clients may see different versions of the same shared object

NFS close-to-open consistency (1)
• Guaranteeing global visibility for every write() required synchronous RPCs and prevented caching
• NFSv3 implements close-to-open consistency, which reduces synchronous RPCs and permits caching
  1. For each file it stores, the server maintains a timestamp of the last write performed
  2. When a file is opened, the client receives the timestamp; if the timestamp has changed since data was cached, the client invalidates its read cache, forcing fresh read RPCs
  3. While the file is open, data reads/writes for the file can be cached on the client, and write RPCs can be deferred
  4. When the file is closed, pending writes must be sent to the server (and ack’d) before close() can return
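The four steps above can be modelled in a few lines. This is a whole-file toy, not the NFS client algorithm: the `Server` and `Client` classes are hypothetical, the timestamp is a logical counter rather than a real modification time, and method calls stand in for RPCs.

```python
class Server:
    def __init__(self):
        self.data = {}    # file name -> bytes
        self.stamps = {}  # file name -> logical timestamp of the last write
        self.clock = 0

    def write(self, name, content):
        self.clock += 1
        self.data[name] = content
        self.stamps[name] = self.clock  # step 1: record time of last write

    def timestamp(self, name):
        return self.stamps.get(name, 0)

    def read(self, name):
        return self.data.get(name, b"")

class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}    # file name -> (timestamp, bytes)
        self.pending = {}  # file name -> bytes not yet sent to the server

    def open(self, name):
        # Step 2: compare the server timestamp with the cached copy.
        stamp = self.server.timestamp(name)
        cached = self.cache.get(name)
        if cached is not None and cached[0] != stamp:
            del self.cache[name]

    def read(self, name):
        # Step 3: serve from local state where possible.
        if name in self.pending:
            return self.pending[name]
        if name not in self.cache:
            self.cache[name] = (self.server.timestamp(name), self.server.read(name))
        return self.cache[name][1]

    def write(self, name, content):
        self.pending[name] = content  # step 3: the write RPC is deferred

    def close(self, name):
        # Step 4: flush pending writes before close returns.
        if name in self.pending:
            content = self.pending.pop(name)
            self.server.write(name, content)
            self.cache[name] = (self.server.timestamp(name), content)
```

In this model a second client keeps reading its cached copy until it calls `open` again after the writer's `close`, which is exactly the window in which stale reads are permitted.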

NFS close-to-open consistency (2)
• We now have a consistency model that programmers can use to reason about when writes will be visible in NFS:
  – If a program on host A needs writes to a file to be visible to a program on host B, it must close() the file
  – If a program on host B needs reads from a file to include those writes, it must open() it after the corresponding close()
• This works quite well for some applications
  – E.g., distributed builds: inputs/outputs are whole files
  – E.g., UNIX maildir format (each email in its own file)
• It works very badly for others
  – E.g., long-running databases that modify records within a file
  – E.g., UNIX mbox format (all emails in one large file)
• Applications using NFS to share data must be designed for these semantics, or they will behave very badly!

NFSv4 (2003)
• Time for a major rethink
  – Single stateful protocol (including mount, lock)
  – TCP (or at least reliable transport) only
  – Explicit open and close operations
  – Share reservations
  – Delegation
  – Arbitrary compound operations
  – Many lessons learned from AFS (later in term)
• Now seeing widespread deployment

Improving over SunRPC
• SunRPC (now “ONC RPC”) very successful but
  – Clunky (manual program, procedure numbers, etc)
  – Limited type information (even with XDR)
  – Hard to scale beyond simple client/server
• One improvement was OSF DCE (early 90’s)
  – Another project that learned from AFS
  – DCE = “Distributed Computing Environment”
  – Larger middleware system including a distributed file system, a directory service, and DCE RPC
  – Deals with a collection of machines (a cell) rather than just with individual clients and servers

DCE RPC versus SunRPC
• Quite similar in many ways
  – Interfaces written in Interface Definition Notation (IDN), and compiled to skeletons and stubs
  – NDR wire format: little-endian by default!
  – Can operate over various transport protocols
• Better security, and location transparency
  – Services identified by 128-bit “Universally” Unique Identifiers (UUIDs), generated by uuidgen
  – Server registers UUID with cell-wide directory service
  – Client contacts directory service to locate server… which supports service move, or replication
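Location transparency through a directory service can be sketched as follows. This is not the DCE API: `CellDirectory` and its methods are hypothetical, the addresses are made-up strings, and Python's `uuid` module stands in for uuidgen.

```python
import uuid

class CellDirectory:
    """Toy cell-wide directory: maps a service UUID to server addresses."""
    def __init__(self):
        self.locations = {}  # service UUID -> list of addresses

    def register(self, service_id, address):
        self.locations.setdefault(service_id, []).append(address)

    def unregister(self, service_id, address):
        self.locations[service_id].remove(address)

    def locate(self, service_id):
        servers = self.locations.get(service_id)
        if not servers:
            raise LookupError(f"no server registered for {service_id}")
        return servers[0]  # a real directory might balance across replicas

# A client holds only the UUID, never a host name.
service_id = uuid.uuid4()
directory = CellDirectory()
directory.register(service_id, "host-a:5000")
```

Because clients resolve the UUID on each lookup, the service can move to another host, or gain replicas, by changing the registrations alone.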

Object-Oriented Middleware
• SunRPC / DCE RPC forward functions, and do not have support for more complex types, exceptions, or polymorphism
• Object-Oriented Middleware (OOM) arose in the early 90s to address this
  – Assume programmer is writing in OO-style
  – ‘Remote objects’ will behave like local objects, but their methods will be forwarded over the network a la RPC
  – References to objects can be passed as arguments or return values (e.g., passing a directory object reference)
• Makes it much easier to program
  – especially if your program is object oriented!

CORBA (1989)
• First OOM system was CORBA
  – Common Object Request Broker Architecture
  – specified by the OMG: Object Management Group
• OMA (Object Management Architecture) is the general model of how objects interoperate
  – Objects provide services.
  – Clients make a request to an object for a service.
  – Client doesn’t need to know where the object is, or anything about how the object is implemented!
  – Object interface must be known (public)

Object Request Broker (ORB)
• The ORB is the core of the architecture
  – Connects clients to object implementations
  – Conceptually spans multiple machines (in practice, ORB software runs on each machine)

[Figure: a Client with Generated Stub Code and an Object Implementation with Generated Skeleton Code, connected through the ORB.]

Invoking Objects
• Clients obtain an object reference
  – Typically via the naming service or trading service
  – (Object references can also be saved for use later)
• Interfaces defined by CORBA IDL
• Clients can call remote methods in 2 ways:
  1. Static Invocation: using stubs built at compile time (just like with RPC)
  2. Dynamic Invocation: actual method call is created on the fly. It is possible for a client to discover new objects at run time and access the object methods
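The two invocation styles can be contrasted with a toy broker. Nothing here is a CORBA API: `ToyBroker`, `Counter`, and `CounterStub` are hypothetical names, the broker runs in one process, and Python's reflection stands in for an interface repository.

```python
class Counter:
    """Object implementation living on the 'server' side."""
    def __init__(self):
        self.value = 0

    def increment(self, amount):
        self.value += amount
        return self.value

class ToyBroker:
    """Stand-in for an ORB: maps object references to implementations."""
    def __init__(self):
        self.objects = {}

    def bind(self, reference, implementation):
        self.objects[reference] = implementation

    def invoke(self, reference, method, *args):
        """Forward a request, named by a string, to the target object."""
        return getattr(self.objects[reference], method)(*args)

    def operations(self, reference):
        """Let a client discover an object's public methods at run time."""
        target = self.objects[reference]
        return sorted(name for name in dir(target)
                      if not name.startswith("_") and callable(getattr(target, name)))

class CounterStub:
    """Static invocation: a stub whose methods are fixed ahead of time."""
    def __init__(self, broker, reference):
        self.broker = broker
        self.reference = reference

    def increment(self, amount):
        return self.broker.invoke(self.reference, "increment", amount)
```

A client using `CounterStub` must know the interface when it is written; a client using `operations` and `invoke` directly can call methods on an object it first learned about at run time.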

CORBA IDL
• Definition of language-independent remote interfaces
  – Language mappings to C++, Java, Smalltalk, …
  – Translation by IDL compiler
• Type system
  – basic types: long (32 bit), long long (64 bit), short, float, char, boolean, octet, any, …
  – constructed types: struct, union, sequence, array, enum
  – objects (common supertype Object)
• Parameter passing
  – in, out, inout (= send remote, modify, update)
  – basic & constructed types passed by value
  – objects passed by reference

CORBA Pros and Cons
• CORBA has some unique advantages
  – Industry standard (OMG)
  – Language & OS agnostic: mix and match
  – Richer than simple RPC (e.g. interface repository, implementation repository, DII support, …)
  – Many additional services (trading & naming, events & notifications, security, transactions, …)
• However:
  – Really, really complicated / ugly / buzzwordy
  – Poor interoperability, at least at first
  – Generally to be avoided unless you need it!

Summary + next time
• NFS as an RPC, distributed-filesystem case study
  – Retry semantics vs. RPC semantics
  – Scoping, pure vs. impure names
  – Close-to-open consistency
  – Batching to mask network latency
• DCE RPC
• Object-Oriented Middleware (OOM)
• CORBA

• Java remote method invocation (RMI)
• XML-RPC, SOAP, etc, etc, etc.
• Starting to talk about distributed time