dns and cdn

45
DNS AND CDN

Upload: lael

Post on 23-Feb-2016

157 views

Category:

Documents


0 download

DESCRIPTION

DNS and CDN. What we have learned so far…. Socket programming, Internet Process and Thread Concurrent programming RPC Logical Time (Distributed) Transactions and concurrency control. What we will learn…. ACID vs BASE. We will learn how real systems work: DNS, CDN - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: DNS and CDN

DNS AND CDN

Page 2: DNS and CDN

What we have learned so far… Socket programming, Internet Process and Thread Concurrent programming RPC Logical Time (Distributed) Transactions and concurrency control

Page 3: DNS and CDN

What we will learn… ACID vs BASE. We will learn how real systems work:

DNS, CDN Distributed file system (NFS, AFS, Dropbox) MapReduce/Hadoop SSH, PKI

Page 4: DNS and CDN

Midterm

100 95 90 85 80 75 70 65 60 55 500

1

2

3

4AVG=77

Page 5: DNS and CDN

PA1+PA2(part I)+Midterm

100 95 90 85 80 75 70 65 60 55 500

1

2

3

4

5

Total

Page 6: DNS and CDN

6

Today's Lecture DNS: one of world’s largest distributed system" CDNs: More than 60% of Internet traffic is video. More

than half is served by CDNs.

Page 7: DNS and CDN

What is DNS? Domain Name Service Translates human readable names into IP addresses. Challenge

How do we scale these to the wide area and to many users? How do we make this efficient? How do we support many updates?

Page 8: DNS and CDN

This is what DNS gives you (dig)

dongsuh@ina:~/public_html$ dig ina.kaist.ac.kr

; <<>> DiG 9.8.1-P1 <<>> ina.kaist.ac.kr;; global options: +cmd;; Got answer:;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3211;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, AD-DITIONAL: 2

;; QUESTION SECTION:;ina.kaist.ac.kr.    IN    A

;; ANSWER SECTION:ina.kaist.ac.kr.    7200    IN    A    143.248.56.236

;; AUTHORITY SECTION:kaist.ac.kr.    7200    IN    NS    ns1.kaist.ac.kr.kaist.ac.kr.    7200    IN    NS    n-s.kaist.ac.kr.

;; ADDITIONAL SECTION:ns.kaist.ac.kr.    7200    IN    A    143.248.1.177ns1.kaist.ac.kr.    7200    IN    A    143.248.2.177

;; Query time: 2 msec;; SERVER: 127.0.0.1#53(127.0.0.1);; WHEN: Thu Nov  7 08:14:11 2013;; MSG SIZE  rcvd: 116

Page 9: DNS and CDN

; <<>> DiG 9.8.1-P1 <<>> www.google.com;; global options: +cmd;; Got answer:;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45520;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 4, AD-DITIONAL: 4

;; QUESTION SECTION:;www.google.com.    IN    A

;; ANSWER SECTION:www.google.com.    44    IN    A    74.125.128.103www.google.com.    44    IN    A    74.125.128.105www.google.com.    44    IN    A    74.125.128.104www.google.com.    44    IN    A    74.125.128.106www.google.com.    44    IN    A    74.125.128.147www.google.com.    44    IN    A    74.125.128.99

;; AUTHORITY SECTION:google.com.    268411    IN    NS    ns3.google.com.google.com.    268411    IN    NS    ns1.google.com.google.com.    268411    IN    NS    ns4.google.com.google.com.    268411    IN    NS    ns2.google.com.

;; ADDITIONAL SECTION:ns2.google.com.    265844    IN    A    216.239.34.10ns1.google.com.    265844    IN    A    216.239.32.10ns3.google.com.    265844    IN    A    216.239.36.10ns4.google.com.    265844    IN    A    216.239.38.10

;; Query time: 0 msec;; SERVER: 127.0.0.1#53(127.0.0.1);; WHEN: Thu Nov  7 08:14:32 2013;; MSG SIZE  rcvd: 264

Page 10: DNS and CDN

Why is DNS distributed system? Why can’t it be centralized?

Single point of failure Large amount of requests Must handle large updates!

Does not scale. Why not use distributed transactions?

ACID approach.

Page 11: DNS and CDN

11

Domain Name System Goals

Basically a wide-area distributed database Scalability Decentralized maintenance Robustness Global scope

Names mean the same thing everywhere Don’t need

Atomicity Strong consistency

Page 12: DNS and CDN

12

Typical Resolution Steps for resolving www.google.com

Application calls gethostbyname() (RESOLVER) Resolver contacts local name server (S1) S1 queries root server (S2) for (www.google.com) S2 returns NS record for google.com. (S3) ns1.google.com What about A record for S3?

This is what the additional information section is for (PREFETCHING) S1 queries S3 (ns1.google.com) for www.google.com S3 returns A record for www.google.com

Can return multiple A records what does this mean?

Page 13: DNS and CDN

13

Lookup MethodsRecursive query: Server goes out and searches for

more info (recursive) Only returns final answer or “not

found”Iterative query: Server responds with as much as

it knows (iterative) “I don’t know this name, but ask

this server”

Workload impact on choice? Local server typically does recur-

sive Root/distant server does iterative

requesting hostsurf.eurecom.fr

gaia.cs.u-mass.edu

root name server

local name serverdns.eurecom.fr

1

2

34

5 6authoritative name server

dns.cs.umass.edu

intermediate name serverdns.umass.edu

7

8

iterated query

Page 14: DNS and CDN

14

Workload and Caching Are all servers/names likely to be equally popular?

Why might this be a problem? How can we solve this problem? DNS responses are cached

Quick response for repeated translations Other queries may reuse some parts of lookup

NS records for domains DNS negative queries are cached

Don’t have to repeat past mistakes E.g. misspellings, search strings in resolv.conf

Cached data periodically times out Lifetime (TTL) of data controlled by owner of data TTL passed with every record

Page 15: DNS and CDN

15

Typical Resolution

Client Local DNS server

root & edu DNS server

ns1.cmu.edu DNS server

www.cs.cmu.edu

NS ns1.cmu.eduwww.cs.cmu.edu

NS ns1.cs.cmu.edu

A www=IPaddr

ns1.cs.cmu.eduDNS

server

Page 16: DNS and CDN

16

Subsequent Lookup Example

Client Local DNS server

root & edu DNS server

cmu.edu DNS server

cs.cmu.eduDNS

server

ftp.cs.cmu.edu

ftp=IPaddr

ftp.cs.cmu.edu

Page 17: DNS and CDN

digdongsuh@ina:~/Documents/sdn-cdn-doc/nsdi$ dig www.google.com +trace

; <<>> DiG 9.8.1-P1 <<>> www.google.com +trace;; global options: +cmd.    436533    IN    NS    k.root-servers.net..    436533    IN    NS    d.root-servers.net..    436533    IN    NS    m.root-servers.net..    436533    IN    NS    f.root-servers.net..    436533    IN    NS    l.root-servers.net..    436533    IN    NS    j.root-servers.net..    436533    IN    NS    h.root-servers.net..    436533    IN    NS    b.root-servers.net..    436533    IN    NS    c.root-servers.net..    436533    IN    NS    e.root-servers.net..    436533    IN    NS    a.root-servers.net..    436533    IN    NS    i.root-servers.net..    436533    IN    NS    g.root-servers.net.;; Received 420 bytes from 127.0.0.1#53(127.0.0.1) in 11 ms

com.    172800    IN    NS    d.gtld-servers.net.com.    172800    IN    NS    m.gtld-servers.net.com.    172800    IN    NS    g.gtld-servers.net.com.    172800    IN    NS    j.gtld-servers.net.com.    172800    IN    NS    i.gtld-servers.net.com.    172800    IN    NS    a.gtld-servers.net.com.    172800    IN    NS    f.gtld-servers.net.com.    172800    IN    NS    e.gtld-servers.net.com.    172800    IN    NS    k.gtld-servers.net.com.    172800    IN    NS    b.gtld-servers.net.com.    172800    IN    NS    l.gtld-servers.net.com.    172800    IN    NS    h.gtld-servers.net.com.    172800    IN    NS    c.gtld-servers.net.;; Received 492 bytes from 192.112.36.4#53(192.112.36.4) in 1275 ms

google.com.    172800    IN    NS    ns2.google.com.google.com.    172800    IN    NS    ns1.google.com.google.com.    172800    IN    NS    ns3.google.com.google.com.    172800    IN    NS    ns4.google.com.;; Received 168 bytes from 192.35.51.30#53(192.35.51.30) in 210 ms

www.google.com.    300    IN    A    74.125.128.99www.google.com.    300    IN    A    74.125.128.147www.google.com.    300    IN    A    74.125.128.106www.google.com.    300    IN    A    74.125.128.105www.google.com.    300    IN    A    74.125.128.104www.google.com.    300    IN    A    74.125.128.103;; Received 128 bytes from 216.239.38.10#53(216.239.38.10) in 79 ms

Page 18: DNS and CDN

Switch gTCL and ccTLD (incorrect fig-ure) But you get the idea

Page 19: DNS and CDN

Root name servers (13 servers)

Page 20: DNS and CDN

Actually many physical root name servers

Page 21: DNS and CDN

Several root servers have multiple physical servers They have

Packets routed to “nearest” server by “Anycast” proto-col

http://www.root-servers.org/

Page 22: DNS and CDN

22

Reverse DNS Task

Given IP address, find its name Method

Maintain separate hierarchy based on IP names

Write 128.2.194.242 as 242.194.128.2.in-addr.arpa Why is the address reversed?

Managing Authority manages IP addresses assigned

to it E.g., CMU manages name space 128.2.in-

addr.arpa

edu

cmu

cs

kittyhawk128.2.194.242

cmcl

unnamed root

arpa

in-addr

128

2

194

242

Page 23: DNS and CDN

23

.arpa Name Server Hierarchy

At each level of hierarchy, have group of servers that are authorized to handle that region of hierarchy

128

2

194

kittyhawk128.2.194.242

in-addr.arpa a.root-servers.net • • • m.root-servers.net

chia.arin.net(dill, henna, indigo, epazote, figwort, ginseng)

cucumber.srv.cs.cmu.edu,t-ns1.net.cmu.edut-ns2.net.cmu.edu

mango.srv.cs.cmu.edu(peach, banana, blueberry)

Page 24: DNS and CDN

24

Prefetching Name servers can add additional data to response Typically used for prefetching

CNAME/MX/NS typically point to another host name Responses include address of host referred to in “additional

section”

Page 25: DNS and CDN

25

Mail Addresses MX records point to mail exchanger for a name

E.g. mail.acm.org is MX for acm.org Addition of MX record type proved to be a challenge

How to get mail programs to lookup MX record for mail de-livery?

Needed critical mass of such mailers

Page 26: DNS and CDN

26

DNS (Summary) Motivations large distributed database

Scalability Independent update Robustness

Hierarchical database structure Zones How is a lookup done

Caching/prefetching and TTLs Reverse name lookup

Page 27: DNS and CDN

27

HTTP Caching Clients often cache documents

Challenge: update of documents If-Modified-Since requests to check

HTTP 0.9/1.0 used just date HTTP 1.1 has an opaque “entity tag (ETAG)” (could be a file signa-

ture, etc.) as well When/how often should the original be checked for

changes? Check every time? Check each session? Day? Etc? Use Expires header

If no Expires, often use Last-Modified as estimate

Page 28: DNS and CDN

28

Example Cache Check RequestGET / HTTP/1.1Accept: */*Accept-Language: en-usAccept-Encoding: gzip, deflateIf-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMTIf-None-Match: "7a11f-10ed-3a75ae4a"User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)Host: www.intel-iris.netConnection: Keep-Alive

Page 29: DNS and CDN

29

Example Cache Check ResponseHTTP/1.1 304 Not ModifiedDate: Tue, 27 Mar 2001 03:50:51 GMTServer: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/

2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24

Connection: Keep-AliveKeep-Alive: timeout=15, max=100ETag: "7a11f-10ed-3a75ae4a"

Page 30: DNS and CDN

30

Content Distribution Networks (CDNs) The content providers are the

CDN customers. Content replication CDN company installs hun-

dreds of CDN servers throughout Internet Close to users

CDN replicates its customers’ content in CDN servers. When provider updates content, CDN updates servers

origin server in North America

CDN distribution node

CDN serverin S. America CDN server

in Europe

CDN serverin Asia

Page 31: DNS and CDN

31

Server Selection Service is replicated in many places in network How do direct clients to a particular server?

As part of routing anycast, cluster load balancing As part of application HTTP redirect As part of naming DNS

Which server? Lowest load to balance load on servers Best performance to improve client performance

Based on Geography? RTT? Throughput? Load? Any alive node to provide fault tolerance

Page 32: DNS and CDN

32

Routing Based Anycast

Give service a single IP address Each node implementing service advertises route to ad-

dress Packets get routed routed from client to “closest” service

node Closest is defined by routing metrics May not mirror performance/application needs

What about the stability of routes?

Page 33: DNS and CDN

33

Application Based HTTP support simple way to indicate that Web page has

moved Server gets Get request from client

Decides which server is best suited for particular client and ob-ject

Returns HTTP redirect to that server Can make informed application specific decision May introduce additional overhead multiple connection

setup, name lookups, etc. While good solution in general HTTP Redirect has some de-

sign flaws – especially with current browsers

Page 34: DNS and CDN

34

Naming Based Client does name lookup for service Name server chooses appropriate server address

A-record returned is “best” one for the client What information can name server base decision on?

Server load/location must be collected Information in the name lookup request

Name service client typically the local name server for client

Page 35: DNS and CDN

35

How Akamai Works Clients fetch html document from primary server

E.g. fetch index.html from cnn.com URLs for replicated content are replaced in html

E.g. <img src=“http://cnn.com/af/x.gif”> replaced with <img src=“http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif”>

Client is forced to resolve aXYZ.g.akamaitech.net host-name, where XYZ is the hash of the URL

Page 36: DNS and CDN

36

How Akamai Works How is content replicated? Akamai only replicates static content (*) Modified name contains original file name Akamai server is asked for content

First checks local cache If not in cache, requests file from primary server and caches

file

* (At least, the version we’re talking about today. Akamai actually lets sites write code that can run on Akamai’s servers, but that’s a pretty different beast)

Page 37: DNS and CDN

37

How Akamai Works Root server gives NS record for akamai.net Akamai.net name server returns NS record for g.aka-

maitech.net Name server chosen to be in region of client’s name server TTL is large

G.akamaitech.net nameserver chooses server in region Should try to chose server that has file in cache - How to

choose? Uses aXYZ name and hash TTL is small why?

Page 38: DNS and CDN

38

Simple Hashing Given document XYZ, we need to choose a server to

use Suppose we use modulo Number servers from 1…n

Place document XYZ on server (XYZ mod n) What happens when a servers fails? n n-1

Same if different people have different measures of n Why might this be bad?

Page 39: DNS and CDN

39

Simple Hashing Given document XYZ, we need to choose a server to

use Suppose we use modulo Number servers from 1…n

Place document XYZ on server (XYZ mod n) What happens when a servers fails? n n-1

Same if different people have different measures of n Why might this be bad?

Page 40: DNS and CDN

40

Consistent Hash “view” = subset of all hash buckets that are visible Desired features

Balanced – in any one view, load is equal across buckets Smoothness – little impact on hash bucket contents when

buckets are added/removed Spread – small set of hash buckets that may hold an object

regardless of views Load – across all views # of objects assigned to hash bucket

is small

Page 41: DNS and CDN

41

Consistent Hash – Example

Smoothness addition of bucket does not cause movement between existing buckets

Spread & Load small set of buckets that lie near object

Balance no bucket is responsible for large number of objects

• Construction• Assign each of C hash buckets to random points on mod

2n circle, where, hash key size = n.• Map object to random position on circle• Hash of object = closest clockwise bucket

0

4

8

12Bucket

14

Page 42: DNS and CDN

42

How Akamai Works

End-user

cnn.com (content provider) DNS root server Akamai server

1 2 3

4

Akamai high-level DNS server

Akamai low-level DNS server

Nearby matchingAkamai server

11

67

8

9

10

Get in-dex.html

Get /cnn.com/foo.jpg

12

Get foo.jpg

5

Page 43: DNS and CDN

43

Akamai – Subsequent Requests

End-user

cnn.com (content provider) DNS root server Akamai server

1 2 Akamai high-level DNS server

Akamai low-level DNS server

7

8

9

10

Get in-dex.html

Get /cnn.com/foo.jpg

Nearby matchingAkamai server

Page 44: DNS and CDN

44

Important Lessons Akamai CDN illustrate range of ideas

BASE (not ACID design) Weak consistency Naming of objects location translation Consistent hashing

Why are these the right design choices for this applica-tion?

Page 45: DNS and CDN