Hashing in Networked Systems
COS 461: Computer Networks, Spring 2011
Mike Freedman
http://www.cs.princeton.edu/courses/archive/spring11/cos461/
Hashing
• Hash function
  – Function that maps a large, possibly variable-sized datum into a small datum, often a single integer that serves to index an associative array
  – In short: maps n-bit datum into k buckets (k << 2^n)
  – Provides time- & space-saving data structure for lookup
• Main goals:
  – Low cost
  – Deterministic
  – Uniformity (load balanced)
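The definition above can be sketched in a few lines of Python. This is only an illustration: the choice of SHA-1, the helper name bucket_of, and the "item-i" keys are assumptions, not part of the lecture.

```python
import hashlib
from collections import Counter

def bucket_of(datum: bytes, k: int) -> int:
    """Map an arbitrary-length datum to one of k buckets (0..k-1)."""
    digest = hashlib.sha1(datum).digest()          # fixed-size 160-bit digest
    return int.from_bytes(digest[:8], "big") % k   # reduce to a bucket index

# Hypothetical keys, used only to look at how evenly the buckets fill.
counts = Counter(bucket_of(f"item-{i}".encode(), 8) for i in range(8000))
```

The same datum always lands in the same bucket (deterministic), and with a well-mixed hash the counts per bucket come out roughly equal (uniformity).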
Today's outline
• Uses of hashing
  – Equal-cost multipath routing in switches
  – Network load balancing in server clusters
  – Per-flow statistics in switches (QoS, IDS)
  – Caching in cooperative CDNs and P2P file sharing
  – Data partitioning in distributed storage services
• Various hashing strategies
  – Modulo hashing
  – Consistent hashing
  – Bloom filters
Uses of Hashing
Equal-cost multipath routing (ECMP)
• ECMP
  – Multipath routing strategy that splits traffic over multiple paths for load balancing
• Why not just round-robin packets?
  – Reordering (leads to triple duplicate ACKs in TCP?)
  – Different RTT per path (for TCP RTO)…
  – Different MTUs per path
Equal-cost multipath routing (ECMP)
• Path selection via hashing
  – # buckets = # outgoing links
  – Hash network information (source/dest IP addrs) to select outgoing link: preserves flow affinity
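A minimal sketch of hash-based path selection, assuming CRC32 as the hash and made-up link names; real switches use their own hardware hash functions, so treat ecmp_link as a hypothetical helper.

```python
import zlib
from typing import Sequence

def ecmp_link(src_ip: str, dst_ip: str, links: Sequence[str]) -> str:
    """Pick an outgoing link from the source/destination address pair."""
    key = f"{src_ip}->{dst_ip}".encode()
    # Number of buckets equals number of outgoing links.
    return links[zlib.crc32(key) % len(links)]

links = ["eth0", "eth1", "eth2", "eth3"]   # assumed equal-cost next hops
first = ecmp_link("10.0.0.1", "10.0.1.9", links)
again = ecmp_link("10.0.0.1", "10.0.1.9", links)
```

Because the choice depends only on the addresses, every packet of a flow takes the same link, which avoids the reordering that per-packet round-robin would cause.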
Now: ECMP in datacenters
• Datacenter networks are multi-rooted trees
  – Goal: Support for 100,000s of servers
  – Recall Ethernet spanning tree problems: No loops
  – L3 routing and ECMP: Take advantage of multiple paths
Network load balancing
• Goal: Split requests evenly over k servers
  – Map new flows to any server
  – Packets of existing flows continue to use same server
• 3 approaches
  – Load balancer terminates TCP, opens own connection to server
  – Virtual IP / Dedicated IP (VIP/DIP) approaches
    • One global-facing virtual IP represents all servers in cluster
    • Hash client's network information (source IP:port)
    • NAT approach: Replace virtual IP with server's actual IP
    • Direct Server Return (DSR)
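The two goals above (new flows may go anywhere, existing flows stay put) can be sketched with a small connection table. The class name, the CRC32 hash, and the addresses are illustrative assumptions, not a description of any particular load balancer.

```python
import zlib

class LoadBalancer:
    def __init__(self, dips):
        self.dips = list(dips)   # dedicated IPs (DIPs) of the backend servers
        self.flows = {}          # (client_ip, client_port) -> chosen DIP

    def pick(self, client_ip, client_port):
        flow = (client_ip, client_port)
        if flow not in self.flows:
            # New flow: hash the client's source IP:port to choose a server.
            h = zlib.crc32(f"{client_ip}:{client_port}".encode())
            self.flows[flow] = self.dips[h % len(self.dips)]
        # Existing flow: keep using the server chosen earlier.
        return self.flows[flow]

    def add_server(self, dip):
        self.dips.append(dip)

lb = LoadBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
chosen = lb.pick("198.51.100.7", 40000)
lb.add_server("10.0.0.4")            # cluster grows while the flow is active
still = lb.pick("198.51.100.7", 40000)
```

In the NAT approach the balancer would then rewrite the destination from the virtual IP to the returned DIP; the table is what keeps an established connection on its server even after k changes.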
Load balancing with DSR
• Servers bind to both virtual and dedicated IP
• Load balancer just replaces dest MAC addr
• Server sees client IP, responds directly
  – Packets in reverse direction do not pass through load balancer
  – Greater scalability, particularly for traffic with asymmetric bandwidth (e.g., HTTP GETs)
[Figure: load balancer (LB), switches, and server cluster]
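As a rough sketch of the DSR forwarding step, the load balancer can be modeled as a function that changes only the destination MAC address. The Frame type, its fields, and the MAC strings are hypothetical and chosen just for illustration.

```python
import zlib
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_ip: str
    src_port: int
    dst_ip: str   # the virtual IP; left untouched under DSR

def dsr_forward(frame: Frame, server_macs):
    """Rewrite only the destination MAC; IP addresses stay as the client sent them."""
    h = zlib.crc32(f"{frame.src_ip}:{frame.src_port}".encode())
    return replace(frame, dst_mac=server_macs[h % len(server_macs)])

macs = ["02:00:00:00:00:01", "02:00:00:00:00:02"]   # assumed server MACs
inbound = Frame("02:00:00:00:00:ff", "198.51.100.7", 40000, "203.0.113.10")
outbound = dsr_forward(inbound, macs)
```

Since the server is bound to the virtual IP and still sees the client's source IP, it can answer the client directly without the reply crossing the load balancer.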
Per-flow state in switches
• Switches often need to maintain connection records or per-flow state
  – Quality-of-service for flows
  – Flow-based measurement and monitoring
  – Payload analysis in Intrusion Detection Systems (IDSs)
• On packet receipt:
  – Hash flow information (packet 5-tuple)
  – Perform lookup to check if packet belongs to known flow
  – Otherwise, possibly create new flow entry
  – Probabilistic match (false positives) may be okay
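The packet-receipt steps can be sketched as a fixed-size table indexed by a hash of the 5-tuple. The table size, the CRC32 hash, and the choice to store only a packet counter per slot are assumptions made for this example.

```python
import zlib

class FlowTable:
    def __init__(self, size=1024):       # assumed table size
        self.size = size
        self.packets = [0] * size        # one packet counter per slot

    def slot(self, five_tuple):
        # Hash (src IP, dst IP, src port, dst port, protocol) to a slot index.
        key = "|".join(map(str, five_tuple)).encode()
        return zlib.crc32(key) % self.size

    def on_packet(self, five_tuple):
        """Count the packet; return True if its slot was empty (new entry)."""
        i = self.slot(five_tuple)
        is_new = self.packets[i] == 0
        self.packets[i] += 1
        return is_new

table = FlowTable()
flow = ("10.0.0.1", "10.0.0.2", 1234, 80, "tcp")
first_seen = table.on_packet(flow)
second_seen = table.on_packet(flow)
```

No full 5-tuple is stored, so two flows that hash to the same slot are treated as one. That is the probabilistic match with false positives mentioned above, traded for constant memory per slot.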
Cooperative Web CDNs
• Tree-like topology of cooperative web caches
  – Check local
  – If miss, check siblings/parent
• One approach
  – Internet Cache Protocol (ICP)
  – UDP-based lookup, short timeout
• Alternative approach
  – A priori guess whether siblings/children have content
  – Nodes share hash table of cached content with parent/siblings
  – Probabilistic check (false positives) okay, as actual ICP lookup to neighbor could just return false
[Figure: web cache hierarchy; labels: public Internet, parent web cache]
Hash tables in P2P file-sharing
• Two-layer network (e.g., Gnutella, Kazaa)
  – Ultrapeers are more stable, not NATted, higher bandwidth
  – Leaf nodes connect with 1 or more ultrapeers
• Ultrapeers handle content searches
  – Leaf nodes send hash table of content to ultrapeers
  – Search requests flooded through ultrapeer network
  – When ultrapeer gets request, checks hash tables of its children for match
Data partitioning
• Network load balancing: All machines are equal
• Data partitioning: Machines store different content
• Non-hash-based solution
  – “Directory” server maintains mapping from O(entries) to machines (e.g., Network File System, Google File System)
  – Named data can be placed on any machine
• Hash-based solution
  – Nodes maintain mappings from O(buckets) to machines
  – Data placed on the machine that owns the name's bucket
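The hash-based solution can be sketched as a small bucket-to-machine table. The bucket count of 16, the CRC32 hash, and the machine names are assumptions for illustration only.

```python
import zlib

NUM_BUCKETS = 16   # assumed fixed number of buckets

class BucketMap:
    def __init__(self, machines):
        # O(buckets) state: one table entry per bucket, not one per data item.
        self.table = [machines[b % len(machines)] for b in range(NUM_BUCKETS)]

    def bucket_of(self, name: str) -> int:
        return zlib.crc32(name.encode()) % NUM_BUCKETS

    def machine_for(self, name: str) -> str:
        # Data lives on whichever machine owns the name's bucket.
        return self.table[self.bucket_of(name)]

    def reassign(self, bucket: int, machine: str) -> None:
        self.table[bucket] = machine

bmap = BucketMap(["m0", "m1", "m2"])
b = bmap.bucket_of("photo.jpg")
owner_before = bmap.machine_for("photo.jpg")
bmap.reassign(b, "m3")
owner_after = bmap.machine_for("photo.jpg")
```

Unlike a directory server, the table size does not grow with the number of stored items; the cost is that a name cannot be placed on an arbitrary machine, only on its bucket's owner.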
Examples of data partitioning
• Akamai
  – 1000 clusters around Internet, each >= 1 servers
  – Hash (URL's domain) to map to one server
  – Akamai DNS aware of hash function, returns machine that
    1. is in geographically-nearby cluster
    2. manages particular customer domain
• Memcached (Facebook, Twitter, …)
  – Employ k machines for in-memory key-value caching
  – On read:
    • Check memcache
    • If miss, read data from DB, write to memcache
  – On write: invalidate cache, write data to DB
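The memcached read and write paths above can be sketched with plain dictionaries standing in for the k cache machines and the database. The class name, the CRC32-mod-k placement, and the db_reads counter are assumptions added for illustration; this does not use the real memcached client API.

```python
import zlib

class CachedStore:
    def __init__(self, k: int):
        self.caches = [dict() for _ in range(k)]   # stand-ins for k cache machines
        self.db = {}                               # stand-in for the database
        self.db_reads = 0                          # counts DB reads, for illustration

    def _cache_for(self, key: str) -> dict:
        return self.caches[zlib.crc32(key.encode()) % len(self.caches)]

    def read(self, key: str):
        cache = self._cache_for(key)
        if key in cache:                  # check memcache
            return cache[key]
        self.db_reads += 1                # miss: read data from DB ...
        value = self.db.get(key)
        if value is not None:
            cache[key] = value            # ... and write it to memcache
        return value

    def write(self, key: str, value) -> None:
        self._cache_for(key).pop(key, None)   # invalidate cache
        self.db[key] = value                  # write data to DB

store = CachedStore(k=4)
store.write("user:1", "alice")
r1 = store.read("user:1")    # miss, goes to the DB
r2 = store.read("user:1")    # hit, served from cache
```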
How Akamai Works – Already Cached
[Figure: request flow among end-user, cnn.com (content provider), DNS root server, Akamai high-level DNS server, Akamai low-level DNS server, and the nearby hash-chosen Akamai server in a cluster. Visible step labels: 1, 2, 7, 8, 9, 10. Requests shown: GET index.html, GET /cnn.com/foo.jpg]
Hashing Techniques
Basic Hash Techniques
• Simple approach for uniform data
  – If data distributed uniformly over N, for N >> n
  – Hash fn = <data> mod n
  – Fails goal of uniformity if data not uniform
• Non-uniform data, variable-length strings
  – Typically split strings into blocks
  – Perform rolling computation over blocks
    • CRC32 checksum
    • Cryptographic hash functions (SHA-1 has 64 byte blocks)
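The "rolling computation over blocks" idea can be illustrated with Python's standard zlib and hashlib modules, both of which accept input incrementally. The 64-byte block size and the helper names are assumptions for the example.

```python
import hashlib
import zlib

def crc32_blocks(data: bytes, block_size: int = 64) -> int:
    """CRC32 computed block by block, carrying the running value forward."""
    crc = 0
    for i in range(0, len(data), block_size):
        crc = zlib.crc32(data[i:i + block_size], crc)
    return crc

def sha1_blocks(data: bytes, block_size: int = 64) -> str:
    """SHA-1 fed one block at a time through the incremental update() call."""
    h = hashlib.sha1()
    for i in range(0, len(data), block_size):
        h.update(data[i:i + block_size])
    return h.hexdigest()

data = b"hashing in networked systems " * 20
```

Feeding the blocks one at a time yields the same result as hashing the whole string at once, which is what lets these functions handle variable-length input with fixed-size state.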
Applying Basic Hashing
• Consider problem of data partition:
  – Given document X, choose one of k servers to use
• Suppose we use modulo hashing
  – Number servers 0..k-1
  – Place X on server i = (X mod k)
    • Problem? Data may not be uniformly distributed
  – Place X on server i = hash(X) mod k
    • Problem?
      – What happens if a server fails or joins (k → k±1)?
      – What if different clients have different estimates of k?
      – Answer: Nearly all entries get remapped to new nodes!
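The remapping problem is easy to observe numerically. The sketch below assumes SHA-1 as the hash and uses made-up document names; it measures how many keys change servers when k goes from 10 to 11.

```python
import hashlib

def server_for(key: str, k: int) -> int:
    """Modulo hashing: server index = hash(key) mod k."""
    h = int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")
    return h % k

def fraction_moved(keys, k_old: int, k_new: int) -> float:
    moved = sum(server_for(x, k_old) != server_for(x, k_new) for x in keys)
    return moved / len(keys)

keys = [f"doc-{i}" for i in range(10000)]   # hypothetical document names
frac = fraction_moved(keys, 10, 11)         # one server joins
```

With a uniform hash, a key keeps its server only when hash mod k and hash mod (k+1) happen to agree, so about k/(k+1) of the keys move. For k = 10 that is roughly 91%, which is why a single join or failure invalidates most of the placement.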
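Consistent Hashing
• Consistent hashing partitions key-space among nodes
• Contact appropriate node to lookup/store key
  – Blue node determines red node is responsible for key1
  – Blue node sends lookup or insert to red node
[Figure: nodes partitioning a key-space containing key1, key2, key3; the blue node sends lookup(key1) and insert(key1, value) to the red node, which stores key1=value]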
Consistent Hashing
• Partitioning key-space among nodes
  – Nodes choose random identifiers: e.g., hash(IP)
  – Keys randomly distributed in ID-space: e.g., hash(URL)
  – Keys assigned to node “nearest” in ID-space
  – Spreads ownership of keys evenly across nodes
[Figure: ID-space with points 0000, 0010, 0110, 1010, 1100, 1110, 1111; keys URL1, URL2, URL3 labeled 0001, 0100, 1011]
Consistent Hashing
[Figure: circle with buckets at positions 0, 4, 8, 12, 14]
• Construction
  – Assign C hash buckets to random points on mod 2^n circle; hash key size = n
  – Map object to random position on circle
  – Hash of object = closest clockwise bucket
• Desired features
  – Balanced: No bucket has disproportionate number of objects
  – Smoothness: Addition/removal of bucket does not cause movement among existing buckets (only immediate buckets)
  – Spread and load: Small set of buckets that lie near object
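A minimal sketch of the construction, assuming SHA-1 to place buckets and objects on a mod 2^32 circle and taking "clockwise" to mean increasing position. The Ring class and bucket names are hypothetical, and the sketch ignores the unlikely case of two buckets hashing to the same point.

```python
import bisect
import hashlib

def point(name: str, n_bits: int = 32) -> int:
    """Position of a name on the mod 2^n circle."""
    h = int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")
    return h % (1 << n_bits)

class Ring:
    def __init__(self, buckets=()):
        self._points = []    # sorted bucket positions on the circle
        self._owner = {}     # position -> bucket name
        for b in buckets:
            self.add(b)

    def add(self, bucket: str) -> None:
        p = point(bucket)
        bisect.insort(self._points, p)
        self._owner[p] = bucket

    def remove(self, bucket: str) -> None:
        p = point(bucket)
        self._points.remove(p)
        del self._owner[p]

    def lookup(self, key: str) -> str:
        """Return the closest clockwise bucket, wrapping past the top of the circle."""
        i = bisect.bisect_left(self._points, point(key))
        return self._owner[self._points[i % len(self._points)]]

ring = Ring(["A", "B", "C", "D"])
keys = [f"url-{i}" for i in range(2000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("E")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key that changes owner moves to the new bucket E and nothing shuffles among A to D, which is the smoothness property. Balance with only a few buckets can be poor; practical systems often give each bucket several points on the circle, which this sketch leaves out.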
Bloom Filters
• Data structure for probabilistic membership testing
  – Small amount of space, constant time operations
  – False positives possible, no false negatives
  – Useful in per-flow network statistics, sharing information between cooperative caches, etc.
• Basic idea using hash fn's and bit array
  – Use k independent hash functions to map item to array
  – If all array elements are 1, it's present. Otherwise, not
Bloom Filters
Start with an m bit array, filled with 0s.
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
To insert, hash each item k times. If H_i(x) = a, set Array[a] = 1.
  0 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0
To check if y is in set, check array at H_i(y). All k values must be 1.
  0 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0
Possible to have a false positive: all k values are 1, but y is not in set.
  0 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0
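The insert and check steps translate directly into code. This sketch derives the k hash functions by salting SHA-1 with the index i, which is an assumption made for convenience; any k independent hash functions would serve. The URLs are placeholders.

```python
import hashlib

class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = [0] * m              # m-bit array, filled with 0s

    def _positions(self, item: str):
        # H_i(item) for i = 0..k-1, each reduced to an index into the array.
        for i in range(self.k):
            d = hashlib.sha1(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(d[:8], "big") % self.m

    def add(self, item: str) -> None:
        for a in self._positions(item):
            self.bits[a] = 1             # set Array[a] = 1

    def __contains__(self, item: str) -> bool:
        # Present only if all k positions are 1; may be a false positive.
        return all(self.bits[a] for a in self._positions(item))

bf = BloomFilter(m=1024, k=3)
urls = ["example.com/a", "example.com/b", "example.com/c"]
for u in urls:
    bf.add(u)
```

An inserted item always tests as present, since its k bits were set and bits are never cleared. An item that was never inserted usually tests as absent, but can test as present if other insertions happened to set all k of its positions.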
Today's outline
• Uses of hashing
  – Equal-cost multipath routing in switches
  – Network load balancing in server clusters
  – Per-flow statistics in switches (QoS, IDS)
  – Caching in cooperative CDNs and P2P file sharing
  – Data partitioning in distributed storage services
• Various hashing strategies
  – Modulo hashing
  – Consistent hashing
  – Bloom filters