Post on 28-Mar-2015


The Internet: A Distributed System

http://people.freebsd.org/~nik/dist-sys.ppt

Copyright © 2002 Nik Clayton

All rights reserved.

Redistribution and use, with or without modification, are permitted provided that the following condition is met:

• Redistributions of this presentation must retain the above copyright notice, this list of conditions and the following disclaimer.

THIS PRESENTATION IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Obligatory biographical bit

• Used to be cs92njc@brunel.ac.uk

• Now nik@FreeBSD.org, nik@slashdot.org

• One of five running mail for Citigroup

– 11m msgs/week, 850MB/day

• Editor, “FreeBSD Handbook”

Looking at…

• How the Internet works

• How the Domain Name System (DNS) works on top of this

• How the Simple Mail Transfer Protocol (SMTP) uses both of these to shuffle e-mail around the place

So, how does the Internet work?

• Three key protocols involved:

– IP: Internet Protocol

– UDP: User Datagram Protocol

– TCP: Transmission Control Protocol, often written TCP/IP

• IP is lowest layer, UDP and TCP sit on top of it.

• Not going to look at the physical layer (ethernet, etc)

• Not going to look at IPv6

Internet and the OSI 7 layer model

7 Application: TELNET (RFC 854), FTP (RFC 959), SMTP (RFC 821), SNMP (RFC 1098), DNS (RFC 1034)

6 Presentation

5 Session

4 Transport: TCP (RFC 793), UDP (RFC 768)

3 Network: ARP (RFC 826), RARP (RFC 903), ICMP (RFC 792), BOOTP (RFC 951), IP (RFC 791)

2 Link: 802.2, 802.3, 802.5, other medium access protocols

1 Physical

The 7 Layer Burrito

7. Sour cream

6. Cheese

5. Guacamole

4. Tomato

3. Lettuce

2. Seasoned rice

1. Refried beans

A Networking Analogy

• Two office blocks, each contains a number of different companies

• Each company has one or more phone numbers (so there are several phone numbers for the office block)

• Each phone number has a few hundred extensions

• To call anyone, you need their company phone number, and their extension

• 4 numbers identify any call -- source phone number, source extension, destination phone number, destination extension

A Networking Analogy (cont.)

• Imagine if everybody agreed on certain standard phone extensions.

– #25 gets you to the mail room

– #80 is the marketing department

– #123 calls the speaking clock

• That’s almost how the Internet works

In an IP network…

• You have a host (an office building)

• Each host has one or more network interfaces (companies within the building)

• Each interface has one or more IP addresses attached to it (phone numbers)

• Each IP address has 65535 ports (extensions)

• Connections are made from a port on an IP address to another port on an IP address

• 4 numbers identify a connection on the Internet -- source IP address, source port, destination IP address, destination port

Packet switching

• Internet is a packet switched network

• Data is split into packets

• Each packet has a source IP/port, and a destination IP/port, as well as other meta-information

• Packets may not arrive in the same order as sent

• Packets may not even arrive at all

IP Address: A definition

• 32 bit number

– So there are 2^32 = 4,294,967,296 of them

• Normally written as 4 octets (8-bit values, 0 to 255) in decimal, separated by dots, e.g., 10.10.1.1 (dotted quad notation)

• Are assigned by the network people, who arranged a block of addresses for the company, who were given them by your ISP, who was allocated them by their regional IP authority, who were assigned a regional block by the Internic.
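
The dotted quad notation above is just the four octets of the 32 bit number printed in decimal. A minimal sketch in C, assuming the address is held as a host-order `uint32_t`; the function name `format_dotted_quad` is made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write a 32 bit IP address into buf as a dotted quad.
 * buf needs room for "255.255.255.255" plus the terminator (16 bytes). */
static void format_dotted_quad(uint32_t ip, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
             (unsigned)((ip >> 24) & 0xFFu),   /* first octet  */
             (unsigned)((ip >> 16) & 0xFFu),   /* second octet */
             (unsigned)((ip >> 8) & 0xFFu),    /* third octet  */
             (unsigned)(ip & 0xFFu));          /* fourth octet */
}
```

For example, passing 0x0A0A0101 fills the buffer with the address used throughout these slides, 10.10.1.1.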

So, tell me what ports are

• Like a telephone extension

• Each IP address has 2^16 - 1 = 65535 ports

• A server listens on an IP address:port pair for incoming connections

• A client is typically allocated a port at random for outgoing connections, and specifies the destination port it wants to connect to

• Some services (mail, web, etc) have “well known ports” assigned that servers are expected to listen on (25, 80, etc)

Networks are groups of IP addresses

• IP addresses are grouped into collections, called networks

• Network membership is determined by the netmask

• Netmask splits the IP address into two portions: the host portion, and the network portion

• Two hosts are in the same network if the network portions of their IP addresses are identical

How netmasks work

• 10.10.1.1 is really

00001010 00001010 00000001 00000001
      10       10        1        1

• and 10.10.2.1 is really

00001010 00001010 00000010 00000001
      10       10        2        1

How netmasks work (cont.)

• Netmask is another 32 bit binary number

• It is binary-ANDed with the IP address

• All bits still on after this form the network portion of the IP address

• All bits left off are the host portion

How netmasks work (cont.)

• IP: 10.10.1.1

• Netmask: 255.255.255.0

    00001010 00001010 00000001 00000001   (10.10.1.1)

AND 11111111 11111111 11111111 00000000   (255.255.255.0)

  = 00001010 00001010 00000001 00000000   (10.10.1.0)

• So this is the .1 host in the 10.10.1.0 network
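
The AND on this slide can be sketched in C. This is a minimal illustration, assuming addresses are held as host-order 32 bit integers; the helper names `make_ip`, `network_part` and `host_part` are invented here and are not part of any standard API.

```c
#include <stdint.h>

/* Pack the four octets of a dotted quad a.b.c.d into one 32 bit number. */
static uint32_t make_ip(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Bits still on after ANDing with the netmask: the network portion. */
static uint32_t network_part(uint32_t ip, uint32_t netmask)
{
    return ip & netmask;
}

/* Bits the netmask leaves off: the host portion. */
static uint32_t host_part(uint32_t ip, uint32_t netmask)
{
    return ip & (uint32_t)~netmask;
}
```

With 10.10.1.1 and 255.255.255.0, `network_part` gives 10.10.1.0 and `host_part` gives 1, matching the ".1 host in the 10.10.1.0 network" result above.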

How netmasks work (cont.)

• Netmask doesn’t have to be a contiguous string of 1s followed by a contiguous string of 0s

– 170.170.170.0

– 10101010 10101010 10101010 00000000

• That would be bloody stupid though

• In practice, netmasks are all 1s, then all 0s

How netmasks work (cont.)

• Leads to another common notation for netmasks, /n

• /24 means 24 x 1, then all 0

– 11111111 11111111 11111111 00000000

– Same as 255.255.255.0

• /16 would be 16 x 1, then all 0

– 11111111 11111111 00000000 00000000

– Same as 255.255.0.0
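
Going from the /n notation to the 32 bit netmask is a single shift. A small sketch, assuming n is between 0 and 32; the name `prefix_to_netmask` is only for illustration.

```c
#include <stdint.h>

/* Build the netmask for /n: n one-bits followed by (32 - n) zero-bits. */
static uint32_t prefix_to_netmask(unsigned n)
{
    if (n == 0)
        return 0;               /* shifting a 32 bit value by 32 is undefined in C */
    if (n >= 32)
        return 0xFFFFFFFFu;     /* every bit is network, no host bits left */
    return (uint32_t)(0xFFFFFFFFu << (32 - n));
}
```

So /24 yields 0xFFFFFF00 (255.255.255.0) and /16 yields 0xFFFF0000 (255.255.0.0), as on the slide.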

How netmasks work (cont.)

• Are these two hosts on the same network?

– 10.10.1.1/24

– 10.10.2.1/24

• No. The first is on the 10.10.1.0 net, the second is on the 10.10.2.0 net

• What about these?

– 10.10.1.1/16

– 10.10.2.1/16

• Yes, they’re both on the 10.10.0.0 net

How netmasks work (cont.)

• Netmasks do not need to be on an octet boundary

– 11111111 11111111 11111111 11000000

– 255.255.255.192

– /26

– 10.10.1.33 = 00001010 00001010 00000001 00100001

– 10.10.1.67 = 00001010 00001010 00000001 01000011
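
The "same network?" questions on the last two slides reduce to comparing network portions. A hedged sketch, assuming host-order 32 bit addresses and a netmask supplied by the caller; `same_network` is a name chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two hosts are on the same network if the network portions
 * of their addresses (address AND netmask) are identical. */
static bool same_network(uint32_t a, uint32_t b, uint32_t netmask)
{
    return (a & netmask) == (b & netmask);
}
```

With 10.10.1.1 (0x0A0A0101) and 10.10.2.1 (0x0A0A0201) this is false for a /24 mask and true for a /16 mask. With the /26 mask, 10.10.1.33 and 10.10.1.67 come out on different networks (.0 and .64).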

The Network Addresses

• Network address is used to indicate the whole network

• No host can be given the network address

• Consists of the network portion as normal, with the host portion set to all zero

• 10.10.1/24, the network address is 10.10.1.0

• 10.10.1/26 defines four networks

– 10.10.1.0 = 00001010 00001010 00000001 00000000

– 10.10.1.64 = 00001010 00001010 00000001 01000000

– 10.10.1.128 = 00001010 00001010 00000001 10000000

– 10.10.1.192 = 00001010 00001010 00000001 11000000

The Broadcast Addresses

• Broadcast address is used to send to all hosts on the network

• No host can be given the broadcast address

• Consists of the network portion as normal, with the host portion set to all ones

• 10.10.1/24, the broadcast address is 10.10.1.255

• 10.10.1/26 defines four networks, each with its own broadcast address

– 10.10.1.63 = 00001010 00001010 00000001 00111111

– 10.10.1.127 = 00001010 00001010 00000001 01111111

– 10.10.1.191 = 00001010 00001010 00000001 10111111

– 10.10.1.255 = 00001010 00001010 00000001 11111111
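
Setting the host portion to all ones is an OR with the inverted netmask. A minimal sketch under the same assumption as before (host-order 32 bit addresses); `broadcast_address` is an illustrative name.

```c
#include <stdint.h>

/* Keep the network portion, set every host bit to 1. */
static uint32_t broadcast_address(uint32_t ip, uint32_t netmask)
{
    return ip | (uint32_t)~netmask;
}
```

Any address inside a subnet gives that subnet's broadcast address: 10.10.1.70 with the /26 mask gives 10.10.1.127, and 10.10.1.1 with the /24 mask gives 10.10.1.255.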

Shrinking address space

• /24 has 256 host addresses available

– .0 through to .255

• Lose .0, reserved for network

• Lose .255, reserved for broadcast

• Leaves you with (256 - 2) = 254 available addresses for hosts

Shrinking address space (cont.)

• /25 creates two networks

• .0 network

– Network address is .0

– Broadcast address is .127

– Host addresses are .1 through to .126 (126 addresses)

• .128 network

– Network address is .128

– Broadcast address is .255

– Host addresses are .129 through to .254 (126 addresses)

• Only (126 * 2) = 252 available host addresses now

Smaller subnets, fewer hosts

• Splitting the /24 into /26s gives four networks

• Each network reserves 2 addresses

• So there are 4 * 2 = 8 addresses reserved

• 256 - 8 = 248 host addresses available

• And so on
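
The "and so on" can be written as one formula: a /24 split into subnets of prefix length n has 2^(n - 24) subnets, each losing 2 addresses. A sketch, assuming n is between 24 and 30; the function name is invented for this example.

```c
/* Host addresses left over when a /24 is split into equal subnets
 * of the given prefix length. Returns 0 for lengths outside 24..30. */
static unsigned usable_hosts_in_slash24(unsigned prefix_len)
{
    if (prefix_len < 24 || prefix_len > 30)
        return 0;
    unsigned subnets = 1u << (prefix_len - 24);   /* /24 -> 1, /25 -> 2, /26 -> 4 */
    return 256u - 2u * subnets;                   /* network + broadcast per subnet */
}
```

This reproduces the figures on the slides: 254 for /24, 252 for /25, 248 for /26.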

Routing

• Hosts on the same network can contact each other directly

• E.g., 10.10.1.1/24 wants to talk to 10.10.1.2/24.

• It puts a packet on the wire with a destination address of 10.10.1.2, and 10.10.1.2 receives it

• It’s like magic, you don’t need to know how this bit works, it just does

• If you become a network administrator, you will learn, in long, tedious detail, how this magic works

Routing (cont.)

• Hosts on two different networks can’t talk directly, they need a router to route the packets between them

• A router is a device with at least 2 network interfaces present on 2 or more different networks

• Hosts send packets for other networks to the router

• Router looks at the destination address information in the packet, and works out where to send it

Routing (cont.)

• Each Internet host has to maintain a routing table

• The routing table details how packets get from a to b

• The routing table only contains information about the networks the host is directly connected to

Routing (cont.)

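[Diagram: workstations 10.10.1.2/24 and 10.10.2.2/24 sit on two networks joined by a router. The router's interfaces are 10.10.1.1/24, 10.10.2.1/24 and 80.194.99.103/24, the last of which faces the Internet]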

Routing (cont.)

• Here’s the routing table for the workstations on the 10.10.1/24 network

• If it’s on the local network then we know we can reach it directly

• Otherwise send it on to the router, and hope that it knows how to deal with it

Destination     Gateway

10.10.1/24      Local interface

Default         10.10.1.1

Routing (cont.)

• Here’s the routing table for the workstations on the 10.10.2/24 network

• If it’s on the local network then we know we can reach it directly

• Otherwise send it on to the router, and hope that it knows how to deal with it

Destination     Gateway

10.10.2/24      Local interface

Default         10.10.2.1

Routing (cont.)

• Here’s the routing table for the router

Destination     Gateway

10.10.1.0/24    Interface 1

10.10.2.0/24    Interface 2

Default         Interface 3
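
The router's decision can be sketched as a table walk. This is a simplified illustration, not how any particular router is implemented: it assumes host-order 32 bit addresses, takes the first matching entry, and relies on the default route being last (real routers pick the longest matching prefix). The names `struct route`, `router_table` and `route_lookup` are made up here.

```c
#include <stddef.h>
#include <stdint.h>

struct route {
    uint32_t network;     /* destination network address */
    uint32_t netmask;     /* 0 for the default route, which matches everything */
    int      interface;   /* interface to send the packet out of */
};

/* The router's table from the slide above. */
static const struct route router_table[] = {
    { 0x0A0A0100u, 0xFFFFFF00u, 1 },   /* 10.10.1.0/24 -> Interface 1 */
    { 0x0A0A0200u, 0xFFFFFF00u, 2 },   /* 10.10.2.0/24 -> Interface 2 */
    { 0x00000000u, 0x00000000u, 3 },   /* Default      -> Interface 3 */
};

/* Return the interface for a destination address, or -1 if nothing matches. */
static int route_lookup(uint32_t dest)
{
    size_t n = sizeof(router_table) / sizeof(router_table[0]);
    for (size_t i = 0; i < n; i++) {
        if ((dest & router_table[i].netmask) == router_table[i].network)
            return router_table[i].interface;
    }
    return -1;
}
```

A packet for 10.10.1.2 goes out of Interface 1, one for 10.10.2.99 out of Interface 2, and anything else (say 216.136.204.117) falls through to the default on Interface 3.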

Routing (cont.)

• This is very scalable

– No host needs to know the complete route to the destination, or the Internet’s topology

– They just need to know the IP address of the nearest router

– The nearest router hands it off to the next nearest router, and so on

User Datagram Protocol (UDP)

• Runs on top of IP

• Connectionless, just send data

– No guarantee packets will be delivered in order, the applications must deal with this

– No guarantee packets will even arrive, applications must resend data as necessary

– A bit like the Post Office

• But very low overhead

Transmission Control Protocol (TCP)

• Runs on top of IP

• Connection oriented (open/send/close)

• Network stack ensures

– Packets are delivered to the application in the correct order

– Missing packets are automatically resent

• Has more overhead than UDP, particularly on the initial connection (three way handshake)

• Handles network congestion well

Internet summary

• Hosts have interfaces

• Interfaces have IP addresses

• IP addresses are subdivided into the network portion and the host portion by the netmask

• Subdividing networks consumes available IP addresses (for network and broadcast address)

• Hosts on the same network can talk to one another directly

• Hosts on different networks need to know the address of the correct router to use

Internet Summary (cont.)

• Data sent using either UDP or TCP

• UDP is faster, but the application has to do more bookkeeping

• TCP starts slower, but the application has to do less work

IP Design Good Points

• Very scalable

• Easy to understand, simple rules

• Does not enforce specific policy

– Networks can be any size

– Does not require particular cabling standard

– Hardware and OS agnostic

• Open

IP Design Bad Points

• Large networks send a lot of meta data around

– Hosts announcing themselves

• Basic IP design is not secure

– Easy to spoof the source address on a packet

– Leads to denial of service attacks

– Malicious router can sniff traffic, or replace data

– Security in layers 5, 6, and 7 (SSL, SSH, etc)

Domain Name System

(DNS)

The Definitive Reference

• DNS and BIND, Paul Albitz & Cricket Liu

• Everything you ever wanted to know about the DNS

• Can’t recommend this book highly enough

IP Addresses are a pain

• Working with IP addresses is

– Cumbersome

– Error prone

– Hard to remember

• We prefer to name things where possible

• Which is why we have domain names

Fully Qualified Domain Names

• FQDN is two or more names, separated by dots

• Reading left to right, the first part is the host name

• The rest is the domain name

• IP addresses are mapped to FQDNs

• FQDNs are mapped back to IP addresses

• How?

One way: The hosts file

10.10.1.1 gateway.example.com

10.10.1.2 me.example.com

10.10.1.3 another.example.com

. . .

This does not scale (!)

So the DNS was invented

• A hierarchical name space, read from right to left

• me.example.com (FQDN) is

. <- The root

.com <- Top level domain

.com.example <- Sub-domain

.com.example.me <- FQDN

• Converting a hostname to an IP address is called “resolving” the address

• “zone” and “domain” are almost interchangeable terms

How the DNS is used

• 3 types of host

– DNS servers know how addresses and names map to one another for one or more domains

– DNS caches, given a domain, know how to find out which DNS server knows about that domain, and query it for info

– DNS clients (resolvers) know how to talk to caches

• DNS clients contact their nearest cache when they need to resolve an address. The cache works out which DNS server will have this information, and makes the queries

The root nameservers

• 13 named root servers (a.root-servers.net through m.root-servers.net), scattered around the world, that know the nameservers immediately below them

• Every DNS server in the world needs to know the IP addresses of the root nameservers

• That’s the only bit of static configuration required

• Everything else is looked up as necessary

• Which is pretty cool

DNS Hierarchy

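[Diagram: a tree with the root nameservers at the top and the GTLD nameservers (.com, .net, .org, .uk, ...) below them. Under .com: citigroup.com. Under .org: freebsd.org (www.freebsd.org, freefall.freebsd.org) and slashdot.org. Under .uk: .co.uk and .ac.uk. Under .ac.uk: brunel.ac.uk (www.brunel.ac.uk) and ic.ac.uk, which contains doc.ic.ac.uk, which contains src.doc.ic.ac.uk]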

Primary and Secondary DNS

• Each domain has exactly one primary (master) DNS server, and 0 to ‘n’ secondary (slave) servers

• To a client, there is no distinction between the two

• DNS information is updated on the primary DNS server

• Secondary servers periodically check for updates, and copy changes over as necessary

DNS in action

• dns.example.com is the local DNS cache

• me.example.com is a host that uses the DNS server

• You are a user running applications on me.example.com

• You type ‘www.freebsd.org’ in your web browser

• What happens?

DNS in action (cont.)

• First, me.example.com checks to see if it knows the IP address of www.freebsd.org

• It doesn’t

• So it sends a DNS query to dns.example.com

• This query says “Please give me the A record for the FQDN www.freebsd.org”

DNS in action (cont.)

• dns.example.com knows nothing about www.freebsd.org

• So it asks one of the root name servers

• They don’t know either, but they say “Go talk to the .org name servers, here’s their IP addresses”

• So dns.example.com goes and asks the .org name servers

DNS in action (cont.)

• They say “We don’t know, but we do know that ns.freebsd.org is the nameserver that’s authoritative for *.freebsd.org, here’s its address, go ask it”

• So dns.example.com says to ns.freebsd.org “Please give me the A record for www.freebsd.org”

• ns.freebsd.org says “Sure, it’s 216.136.204.117”

DNS in action (cont.)

• dns.example.com caches this information (so if it’s asked again it doesn’t need to redo all the above), and sends the info back to me.example.com

• All this happens in a few seconds

• This is what your browser is doing when it says something like “Resolving hostname”

Other types of DNS record

• That example used “A” records

– They map FQDNs to IP addresses

• Called a “Forward” lookup

• Not the only type of records in the DNS

– PTR records map IP addresses to FQDNs

• Called a “Reverse” lookup

– NS records list the domain’s name servers

– MX records are used for mail routing

– SOA record is the ‘Start of Authority’

SOA Record

• Every zone has one SOA record

• Describes characteristics for the zone

– Serial number, which is incremented every time the data changes

– Time-to-live, which says how long data should be cached for

– E-mail address of DNS info maintainer

Example of a DNS Zone File

$ORIGIN brunel.ac.uk.
brunel.ac.uk.   IN SOA  sirius.brunel.ac.uk. hostmaster.brunel.ac.uk. (
                        2002103001  ; Serial number
                        8000        ; Refresh after 2hrs 13min
                        7200        ; Retry after 2hrs
                        604800      ; Expire after 1wk
                        21600       ; Minimum TTL of 6hrs
                        )
                IN NS   sirius.brunel.ac.uk.
                IN NS   ns3.ja.net.
                IN MX   5 nemesis.brunel.ac.uk.
                IN MX   4 eros.brunel.ac.uk.
s70n133         IN A    134.83.70.133
s249n88         IN A    134.83.249.88
s249n90         IN A    134.83.249.90

… … …

IP Characteristics of DNS

• DNS servers listen on port 53

• Generally uses UDP

– Very short communication lifespan

– TCP overhead is too high

– Protocol is simple and robust

• Didn’t get an answer? Just send the query again

• May use TCP where appropriate

– Zone transfers between primary and secondary servers

Smart things about DNS

• Simple mechanism for synchronising primary and secondary servers

• Distributes data throughout the network, no real single point of failure for the Internet

– With the exception of the root nameservers

– DDoS Attacks

Bad things about DNS

• Not secure, you have to trust your DNS server

– Always do a forward lookup after a reverse lookup

• DNS server is a single point of failure for a network’s presence on the Internet

– So make sure that multiple secondary servers exist

– On different, geographically disparate networks

Bad things about DNS (cont.)

• Difficult to do updates ‘on demand’

– There are enhancements that try to address this

– But they’re not widely deployed

– Commercial interests

Simple Mail Transfer Protocol

(SMTP)

SpaM Transport Protocol

What it sometimes feels like

A word from our sponsor…

• Wed 13th to 16th November 2003

• Compass Theatre, Ickenham

• £5.00, £6.50 or £7.50

• 07050 605081

• I’m in it as myself.

• “Nail it to the counter Lord Fergason and damn the cheesmongers!”

An e-mail message consists of…

• Envelope

– Contains addressing information

– Discarded once the message is successfully delivered

• Header

– Contains 1-n “name: value” fields

– From:, To:, CC:, BCC:, Subject:, Date:, Received:, X-Foo:, X-Bar:, etc…

• Body

– Unstructured text of the actual message

Sample SMTP conversation

# telnet eros.brunel.ac.uk 25
220 ************
HELO ngo.dnsalias.org
250 eros.brunel.ac.uk OK
MAIL FROM: nik@freebsd.org
250 2.1.0 OK
RCPT TO: simon.taylor@brunel.ac.uk
250 2.1.5 Recipient OK
DATA
354 Enter Mail, end by a line with only ‘.’
From: nik@freebsd.org (Nik Clayton)
To: simon.taylor@brunel.ac.uk (Simon Taylor)
Subject: Slides for lecture

Sorry mate, no chance I’ll have the slides ready in time, we’ll need to fake something. But keep it to yourself, I don’t think they’ll notice.

Nik
.
250 2.1.5 Submitted & queued (msg.22684-0)
QUIT
221 2.0.0 eros.brunel.ac.uk says goodbye to ngo.dnsalias.org

SMTP Highlights

• Protocol is entirely plain text

– Easy to debug

– Easy to test by hand

– Easy to script

• Protocol is relatively simple

– Easy to write code for (Microsoft excepted)

• Protocol is unambiguous

– All information is contained in the status codes. The explanatory text is useful but ignored by implementations

SMTP Highlights (cont.)

• Protocol is consistent

– 2xx codes indicate success

– 3xx codes indicate ‘send more data’

– 4xx codes indicate temporary failures

– 5xx codes indicate permanent failures

• The ‘xx’s provide further delineation

• SMTP implementations are supposed to be paranoid
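
Because all the information is in the status code, a client only needs the first digit to decide what to do next. A minimal sketch in C, assuming the reply line starts with the three digit code as in the sample conversation; the names `reply_code`, `classify_reply` and the enum are invented for this illustration, and treating anything outside 2xx to 5xx as invalid is this sketch's own (paranoid) choice.

```c
#include <ctype.h>

enum smtp_reply_class {
    SMTP_SUCCESS,        /* 2xx */
    SMTP_SEND_MORE,      /* 3xx */
    SMTP_TEMP_FAILURE,   /* 4xx */
    SMTP_PERM_FAILURE,   /* 5xx */
    SMTP_INVALID         /* anything else */
};

/* Extract the three digit code from the start of a reply line, or -1. */
static int reply_code(const char *line)
{
    if (!isdigit((unsigned char)line[0]) ||
        !isdigit((unsigned char)line[1]) ||
        !isdigit((unsigned char)line[2]))
        return -1;
    return (line[0] - '0') * 100 + (line[1] - '0') * 10 + (line[2] - '0');
}

/* Map a status code to one of the four classes on the slide. */
static enum smtp_reply_class classify_reply(int code)
{
    if (code < 200 || code > 599)
        return SMTP_INVALID;
    switch (code / 100) {
    case 2:  return SMTP_SUCCESS;
    case 3:  return SMTP_SEND_MORE;
    case 4:  return SMTP_TEMP_FAILURE;
    default: return SMTP_PERM_FAILURE;   /* only 5xx can reach here */
    }
}
```

For "250 2.1.5 Recipient OK" this gives success, and for "354 Enter Mail" it gives send-more. The explanatory text after the code is never examined.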

A real SMTP failure

• We had an application that was a buggy SMTP server

• Sometimes it failed to send back a valid SMTP response after generating a bounce message

• The client didn’t know whether or not the message was delivered, temp. failed, or perm. failed

• So it tried, tens of times a second, to resend the message

• This generated thousands of bounce messages very quickly

The Envelope and Bcc:

• From: nik@freebsd.org
To: Simon.Taylor@brunel.ac.uk
Bcc: someone.else@brunel.ac.uk
. . .

• 220 . . .
MAIL FROM: nik@freebsd.org
250 . . .
RCPT TO: simon.taylor@brunel.ac.uk
250 2.1.5 Recipient OK
RCPT TO: someone.else@brunel.ac.uk
250 2.1.5 Recipient OK
DATA
354 . . .
From: nik@freebsd.org (Nik Clayton)
To: simon.taylor@brunel.ac.uk (Simon Taylor)
. . .

Sample Received: Lines

Received: from localhost (nik@localhost.nothing-going-on.org [127.0.0.1])

by crf-consulting.co.uk (8.12.3/8.12.3) with ESMTP id g9GFo4Tk093919

for <nik@localhost>; Wed, 16 Oct 2002 16:50:04 +0100 (BST)

(envelope-from Simon.Taylor@brunel.ac.uk)

Received: from ngo.org.uk [212.219.216.39]

by localhost with POP3 (fetchmail-5.9.11)

for nik@localhost (single-drop); Wed, 16 Oct 2002 16:50:04 +0100 (BST)

Received: from nemesis.brunel.ac.uk (nemesis.brunel.ac.uk [134.83.108.17])

by ngo.org.uk (8.9.3/8.9.3) with ESMTP id RAA07600

for <nik@ngo.org.uk>; Wed, 16 Oct 2002 17:01:18 +0100 (BST)

Received: from csstsjt (actually s76n96.brunel.ac.uk) by nemesis.brunel.ac.uk

with SMTP-BRUNEL (PP) with ESMTP; Wed, 16 Oct 2002 16:47:25 +0100

Re-ordered Received: lines

Received: from csstsjt (actually s76n96.brunel.ac.uk) by nemesis.brunel.ac.uk

with SMTP-BRUNEL (PP) with ESMTP; Wed, 16 Oct 2002 16:47:25 +0100

Received: from nemesis.brunel.ac.uk (nemesis.brunel.ac.uk [134.83.108.17])

by ngo.org.uk (8.9.3/8.9.3) with ESMTP id RAA07600

for <nik@ngo.org.uk>; Wed, 16 Oct 2002 17:01:18 +0100 (BST)

Received: from ngo.org.uk [212.219.216.39]

by localhost with POP3 (fetchmail-5.9.11)

for nik@localhost (single-drop);

Wed, 16 Oct 2002 16:50:04 +0100 (BST)

Received: from localhost (nik@localhost.nothing-going-on.org [127.0.0.1])

by crf-consulting.co.uk (8.12.3/8.12.3) with ESMTP id g9GFo4Tk093919

for <nik@localhost>; Wed, 16 Oct 2002 16:50:04 +0100 (BST)

(envelope-from Simon.Taylor@brunel.ac.uk)

Acronyms

• MTA = Mail Transfer Agent

– The software that routes messages from host to host (Sendmail, Postfix, Qmail, Exchange (cough))

• MUA = Mail User Agent

– The software that lets users send and receive e-mail (Outlook, Eudora, etc)

• PBCK = Problem Between Chair and Keyboard

– A user. See also “DFU”

Mail Routing

• I tap simon.taylor@brunel.ac.uk into my MUA. What happens?

• MUA hands message off to local MTA

• Local MTA uses the DNS to look up MX records for brunel.ac.uk

• MX record?

MX Records

• Are entries in the DNS

• Unlike most other DNS entries (A records, etc), they contain two pieces of information

– A FQDN

– A weight / preference

• A domain (brunel.ac.uk) may have multiple MX records, listing different FQDNs and weights, providing redundancy

• Hosts acting as MXs for a domain do not need to be in the same domain as the domain they are acting as MXs for (!)

Brunel and Citigroup MX records

Weight  Host

4       eros.brunel.ac.uk

5       nemesis.brunel.ac.uk

Weight  Host

50      mail1.citigroup.com

50      mail2.citigroup.com

50      mail3.citigroup.com

50      mail4.citigroup.com

50      mail5.ssmb.com

Mail Routing (cont.)

• The local MTA sorts the MX results in order of their weight (lowest first)

• It does a DNS lookup for the IP address(es) of the first FQDN in the list

• It tries to connect to that IP address on port 25

• If the connection succeeds it tries to deliver the message

• If the connection fails, or the delivery attempt failed with a temporary error, it tries again, with the next MX record in the list
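
The first step above, sorting the MX results lowest weight first, can be sketched with the standard `qsort`. The `struct mx_record` layout and the function names are assumptions made for this example; a real MTA would fill the records in from a DNS query, which is not shown.

```c
#include <stddef.h>
#include <stdlib.h>

struct mx_record {
    int         preference;   /* the weight: lower is tried first */
    const char *host;         /* FQDN of the mail exchanger */
};

/* qsort comparison: order by ascending preference. */
static int by_preference(const void *a, const void *b)
{
    const struct mx_record *x = a;
    const struct mx_record *y = b;
    return (x->preference > y->preference) - (x->preference < y->preference);
}

/* Sort the MX results so delivery attempts start with the lowest weight. */
static void sort_mx(struct mx_record *records, size_t count)
{
    qsort(records, count, sizeof(records[0]), by_preference);
}
```

Given Brunel's two records in either order, eros.brunel.ac.uk (weight 4) ends up first and nemesis.brunel.ac.uk (weight 5) second. The order among records of equal weight, such as Citigroup's, is left unspecified by `qsort`.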

Mail Routing (cont.)

• The MTA will queue messages for a period of time (5 days is typical)

• It will make regular attempts to re-deliver messages that generated temporary failures

– Failure after a certain period (normally 4 hours) may generate a “We are still trying to deliver your message” note to the envelope sender address

• Messages that generate a permanent failure from any of the MX hosts are not retried, and are bounced

• Bounces go to the envelope sender address, not the From: address

Citigroup Mail Backbone Structure

[Diagram. Components shown: Internet, anti-virus, archiving, address re-writing, anti-spam, Exchange servers]

IP Characteristics of SMTP

• SMTP servers listen on port 25

• Always uses TCP

– Relatively long communication lifespan

– TCP overhead is acceptable

– TCP ensures packets are resent as necessary

Extending SMTP

• Turns out that, as originally specified, SMTP doesn’t do some useful things

• So ESMTP was invented

• But how do you do this without breaking all the existing implementations?

• Hmm…

Extending SMTP (cont.)

• Get out clause in the original SMTP spec

• If an SMTP server receives a command it doesn’t understand, it:

– Does not drop the connection

– Returns an error code (5xx)

– Pretends it never received the command

• Robustness in action, and a stroke of genius

Extending SMTP (cont.)

• EHLO - Extended HELO

• Replaces ‘HELO’ at the beginning of the SMTP conversation

• If a server responds to EHLO with a 2xx code you know it speaks ESMTP

• If it responds with a 5xx code then you fall back to regular SMTP, and immediately send a HELO.

EHLO in action

220 issaspam-ny01.ssmb.com ESMTP Go ahead
EHLO ngo.dnsalias.org
250-issaspam-ny01.ssmb.com Hello
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-8BITMIME
250-SIZE 26214400
250-DSN
250-DELIVERBY
250 HELP
MAIL FROM: nik@freebsd.org
250 . . .

EHLO failing

220 smtp.example.com
EHLO ngo.dnsalias.org
502 Error: command not implemented
HELO ngo.dnsalias.org
250 OK
MAIL FROM: nik@freebsd.org
250 . . .

A better way of solving the problem

• Always embed version information into your protocols

• The version should be the first piece of information in any transaction

• Defines the format of the rest of the transaction

• But, still allow unimplemented commands to fail gracefully

Nice things about SMTP

• It’s distributed from the get-go, and it scales

– Need more servers? Add them, and update your MX records

• It’s open and royalty free

– SMTP is fully documented in RFC2821

– Message format is in RFC2822

• Heterogeneous

– Nothing in SMTP ties it to a particular platform

More nice things about SMTP

• It’s resilient, and failures are handled

– MX server not responding? Go try another one

– Are they all down? Wait a bit, and try again

– It distinguishes between temporary errors

• Disk’s full, I can’t accept any mail at the moment, so try again later

– And permanent errors

• The e-mail address you’ve provided is invalid, I’m never going to be able to deliver it.

• Hides implementation details from the user

– User doesn’t need to know the route the message takes

Nice things about SMTP..?

• Secure?

– Not really

– Relatively simple to forge mail

– Harder to forge it perfectly

– Does not address encryption or authentication of message contents

• Nobody’s perfect

Thanks

Questions?

Bonus Slides

Things I wish I knew 10 years ago

• Work for a small company

– You learn a lot very quickly

– The hours can be insane

– You can accomplish a lot very fast

• Work for a large company

– You tend to specialise

– Regular hours

– Bureaucracy is ever-present

More things to know

• Attend conferences

– You learn a lot

– The networking (people kind) is invaluable

– Speaking at them is great for the CV

• It also forces you to think clearly about a subject

– Never neglect the social side

• Travel whenever possible

– San Francisco is great in the summer

Still more things to know

• Always be aware of the Peter Principle

• Read “The Mythical Man Month”, Brooks

• Learn the Perl programming language

• Stay up to date with the technical journals

• Find time to have a life

Pseudo-code for a server

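#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    int s, client;                              // Listening socket, connected client
    struct sockaddr_in addr;                    // The socket address

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   // Any local IP address
    addr.sin_port = htons(80);                  // We'll listen on port 80 (network byte order)

    s = socket(AF_INET, SOCK_STREAM, 0);        // Create socket
    if (s < 0)
        return 1;

    // Assign the address info we specify to the socket
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return 1;

    listen(s, 5);                               // Queue up to 5 pending connections

    while ((client = accept(s, NULL, NULL)) >= 0) {
        // If we're here then something's connected to us.
        // Do whatever we're supposed to do when this happens
        close(client);
    }
    return 0;
}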

Pseudo-code for a client

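#include <netdb.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    int s;                                      // The socket handle
    struct sockaddr_in addr;                    // The socket address
    struct hostent *he;                         // Info about the remote host

    s = socket(AF_INET, SOCK_STREAM, 0);        // Create socket

    // Get the IP address of the host we want to connect to
    he = gethostbyname("www.freebsd.org");
    if (s < 0 || he == NULL)
        return 1;

    // Store the IP address, and the port we connect to
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    memcpy(&addr.sin_addr, he->h_addr_list[0], he->h_length);
    addr.sin_port = htons(80);                  // Network byte order

    if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        // Connected to the remote host.
        // …
    }
    close(s);                                   // All done
    return 0;
}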

[Diagram: the DNS lookup path. Nodes shown: User, me.example.com, dns.example.com, root nameserver, .org nameserver, ns.freebsd.org nameserver]
