Linux TCP/IP Stack

Upload: odessa-spears

Post on 31-Dec-2015

Page 1: Linux TCP/IP Stack

Linux TCP/IP Stack

Page 2: Linux TCP/IP Stack

TCP / IP vs. OSI model

TCP/IP stack                        OSI model
Process                             7: Application, 6: Presentation, 5: Session
Socket layer
Protocol Layer (TCP / IP)           4: Transport, 3: Network
Interface Layer (Ethernet, etc.)    2: Data Link
                                    1: Physical Layer

Page 3: Linux TCP/IP Stack

TCP/IP Stack Overview

Output path (send):
Process
1: sosend (……………... ) - Socket Layer
2: tcp_output ( ……. ) - Protocol Layer (TCP Layer)
3: ip_output ( ……. ) - Protocol Layer (IP Layer)
4: ethernet_output ( ……. ) - Interface Layer (Ethernet Device Driver)
Output Queue
Physical Media

Input path (receive):
Physical Media
2: ethernet_input ( …….. ) - Interface Layer (Ethernet Device Driver)
Input Queue
3: ip_input ( ……... ) - Protocol Layer (IP Layer)
4: tcp_input ( ……... ) - Protocol Layer (TCP Layer)
5: recvfrom(……….) - Process

Page 4: Linux TCP/IP Stack

Process Layer to TCP Layer

Process:
send (int socket, const char *buf, int length, int flags)

Kernel:
sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)
sendit (struct proc *p, int socket, struct msghdr *mp, int flags, int *return_size) - uipc_syscalls.c
sosend (struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *top, struct mbuf *control, int flags) - uipc_socket.c
tcp_usrreq (struct socket *s, int request, struct mbuf *m, struct mbuf *nam, struct mbuf *control) - tcp_usrreq.c

TCP Layer:
tcp_output (struct tcpcb *tp) - tcp_output.c

Page 5: Linux TCP/IP Stack

Socket Layer

sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)

MBUF Chain

The 150 bytes of data in data_buffer are held in a chain of two mbufs of 128 bytes each.

Field              First mbuf        Second mbuf
m_next             (second mbuf)     NULL
m_nextpkt          NULL              NULL
m_len              100               50
m_data             (start of data)   (start of data)
m_type             MT_DATA           MT_DATA
m_flags            M_PKTHDR          0
m_pkthdr.len       150
m_pkthdr.recvif    NULL
Header             28 Bytes          20 Bytes
Data               100 Bytes         50 Bytes
Unused Space                         58 Bytes
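The layout on this slide can be checked with a small user-space model. This is only a sketch: `toy_mbuf`, `toy_capacity` and `toy_fill_chain` are hypothetical names, the flag value is illustrative, and the sizes (128-byte mbuf, 20-byte header, 8 more bytes for the packet header) are taken from the figures on the slide rather than from kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MSIZE       128  /* total size of one mbuf (from the slide) */
#define TOY_MHDR_SIZE    20  /* plain mbuf header (from the slide) */
#define TOY_PKTHDR_SIZE   8  /* extra packet header in the first mbuf: 28 - 20 */
#define TOY_M_PKTHDR   0x02  /* illustrative flag value, not the kernel's */

/* Simplified stand-in for struct mbuf; not the kernel layout. */
struct toy_mbuf {
    struct toy_mbuf *m_next;     /* next mbuf in this chain */
    struct toy_mbuf *m_nextpkt;  /* next packet in a queue; unused here */
    size_t           m_len;      /* bytes of data stored in this mbuf */
    int              m_flags;
    size_t           pkthdr_len; /* total packet length, first mbuf only */
    char             m_dat[TOY_MSIZE - TOY_MHDR_SIZE];
};

/* Data capacity: 100 bytes with a packet header, 108 bytes without. */
static size_t toy_capacity(const struct toy_mbuf *m)
{
    size_t cap = TOY_MSIZE - TOY_MHDR_SIZE;
    return (m->m_flags & TOY_M_PKTHDR) ? cap - TOY_PKTHDR_SIZE : cap;
}

/* Copy len bytes of buf into mbufs taken from pool and link them.
 * Returns the number of mbufs used, or 0 if len is 0 or the pool is too small. */
static size_t toy_fill_chain(struct toy_mbuf *pool, size_t npool,
                             const char *buf, size_t len)
{
    size_t used = 0, off = 0;

    assert(pool != NULL && buf != NULL);
    while (off < len) {
        if (used == npool)
            return 0;
        struct toy_mbuf *m = &pool[used];
        memset(m, 0, sizeof *m);
        if (used == 0) {             /* only the first mbuf has a packet header */
            m->m_flags = TOY_M_PKTHDR;
            m->pkthdr_len = len;
        }
        size_t n = len - off;
        if (n > toy_capacity(m))
            n = toy_capacity(m);
        memcpy(m->m_dat, buf + off, n);
        m->m_len = n;
        if (used > 0)
            pool[used - 1].m_next = m;
        off += n;
        used++;
    }
    return used;
}
```

Filling a two-element pool with 150 bytes gives m_len values of 100 and 50, and leaves 58 bytes unused in the second mbuf, which matches the numbers in the diagram.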

Page 6: Linux TCP/IP Stack

Socket Layer - sosend passes data and control information to the protocol layer

sosend(struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *data_buffer, struct mbuf *control, int flags )

1. Initialize a new memory buffer and variables to hold flags.
2. Is there enough space in the buffer? sbspace(s->sb_snd)
   yes: Copy the data_buffer mbuf.
   no: Free the memory buffers received.
3. More buffers to send?
   yes (1): go back to step 2.
   no (0): continue.
4. int error = tcp_usrreq(s, flags, mbuf, addr, control)
5. Return the value of error to sendto ( ).

Page 7: Linux TCP/IP Stack

TCP Layer - tcp_usrreq(struct socket *s, int request, struct mbuf *data_buffer, mbuf *nam, mbuf *control)

1. Initialize the internet protocol control block inp and the TCP control block tp to store information useful for TCP.
2. Convert the socket to an internet protocol control block: inp = sotoinpcb(so)
3. Convert the internet protocol control block to a TCP control block: tp = intotcpcb(inp)
4. Switch on request. For PRU_SEND: int error = tcp_output(tp)
5. tcp_output returns error to tcp_usrreq( ).

Page 8: Linux TCP/IP Stack

TCP Layer (tcp_output.c) - tcp_output(struct tcpcb *tp)

Called by tcp_usrreq for one of the following reasons:
• To send the initial SYN
• To send a finished_sending message
• To send data
• To send a window update after data has been received

tcp_output ( ) functionality:

1. Determine whether TCP can send a segment or not, depending on:
• flags in the data sent by the socket layer (to send an ACK, etc.)
• size of the window advertised by the receiver's end
• amount of data ready to send
• whether unacknowledged data already exists for the connection

2. Calculate the amount of data to be sent, depending on:
• size of the receiver's window
• number of bytes in the send buffer

3. Check for window shrink.

4. Send a segment:
• Allocate a buffer for the TCP and IP header from the header template.
• Copy the TCP and IP header template into the buffer to be sent.
• Fill the fields in the TCP header.
• Decrement the number of buffers to be sent, so that the end can be checked.
• Set the sequence number and acknowledgement fields.
• Set three fields in the IP header: IP length, TTL and TOS.
• Pass the datagram to IP.

Page 9: Linux TCP/IP Stack

TCP Layer (tcp_output.c) - tcp_output(struct tcpcb *tp)

struct socket *so = tp -> t_inpcb -> inp_socket

Initialize a tcp header tcp_header

idle is true if the max sequence number equals the oldest unacknowledged sequence number, i.e. if an ACK is not expected from the other end.
int idle = (tp -> snd_max == tp -> snd_una)

Check ACK flag (idle):
true: Acknowledgement is not expected, set the congestion window to one segment.
      tp -> snd_cwnd = tp -> t_maxseg;
false: continue.

Page 10: Linux TCP/IP Stack

TCP Layer - tcp_output(struct tcpcb *tp)

Acknowledgement is not expected, set the congestion window to one segment.
tp -> snd_cwnd = tp -> t_maxseg;

off is the offset in bytes from the beginning of the send buffer of the first data byte to send. off bytes have already been sent and acknowledgement of those is awaited.
int off = tp -> snd_nxt - tp -> snd_una

Determine the length of data that should be transmitted and the flags to be used. len is the minimum of the number of bytes in the send buffer and win (the minimum of the receiver's window and the congestion window), less off.
len = min(so -> so_snd.sb_cc, win) - off

Determine the flags like TH_ACK, TH_FIN, TH_RST, TH_SYN
flags = tcp_outflags [ tp -> t_state ]
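The off and len formulas are easier to follow with concrete numbers. The sketch below is a stand-alone illustration, not kernel code: `toy_tcpcb`, `toy_send_off` and `toy_send_len` are hypothetical names, and the struct carries only the four fields the formulas read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the tcpcb fields used here. */
struct toy_tcpcb {
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;   /* next sequence number to send */
    uint32_t snd_wnd;   /* window advertised by the receiver */
    uint32_t snd_cwnd;  /* congestion window */
};

static int32_t toy_min(int32_t a, int32_t b)
{
    return a < b ? a : b;
}

/* off: bytes already sent and still awaiting acknowledgement. */
static int32_t toy_send_off(const struct toy_tcpcb *tp)
{
    assert(tp != 0);
    return (int32_t)(tp->snd_nxt - tp->snd_una);
}

/* len = min(bytes in send buffer, win) - off,
 * where win = min(receiver's window, congestion window). */
static int32_t toy_send_len(const struct toy_tcpcb *tp, int32_t sb_cc)
{
    int32_t win = toy_min((int32_t)tp->snd_wnd, (int32_t)tp->snd_cwnd);
    return toy_min(sb_cc, win) - toy_send_off(tp);
}
```

With snd_una = 1000, snd_nxt = 1200, a receiver window of 4096, a congestion window of 1024 and 3000 bytes in the send buffer, off is 200 and len is 1024 - 200 = 824. Note that len can come out negative when fewer bytes are usable than are already in flight, which is the situation the window-shrink check has to deal with.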

Page 11: Linux TCP/IP Stack

TCP Layer - tcp_output(struct tcpcb *tp)

Determine the flags like TH_ACK, TH_FIN, TH_RST, TH_SYN
flags = tcp_outflags [ tp -> t_state ]

Then check, in order:

1. tp -> t_flags & TF_ACKNOW
   true: Send acknowledgement.
   false: go to the next check.
2. flags & (TH_SYN | TH_RST)
   true: Send sequence number or reset.
   false: go to the next check.
3. flags & TH_FIN
   true: Finished sending.
   false: continue.

Page 12: Linux TCP/IP Stack

Check flags to determine the type of message:
• window probe
• retransmission
• normal data transmission

Allocate an mbuf for the TCP & IP header and data if possible.
MGETHDR ( m, M_DONTWAIT, MT_HEADER)
M_DONTWAIT indicates that if memory is not available for the mbuf, the routine returns with an error instead of waiting.

Length of data < 44 Bytes (100 - 40 - 16)?
yes: Copy the data from the socket send buffer into the new packet header mbuf.
no: Create a new mbuf chain, copy the surplus data and point it to the first mbuf chain.

ip_output(m, tp -> t_inpcb -> inp_options, &tp -> t_inpcb -> inp_route, so -> so_options & SO_DONTROUTE, 0)

Page 13: Linux TCP/IP Stack

ip_output.c

ip_output(struct mbuf *m, struct mbuf *opt, struct route *ro, int flags, struct ip_moptions *imo)
1. Header initialization
2. Route Selection
3. Source address selection and Fragmentation

1. Header initialization

The value of “flags” decides what’s to be done with the data:
• IP_FORWARDING : Forward packet
• IP_ROUTETOIF : Route directly to Interface
• IP_ALLOWBROADCAST : Allow broadcasting of packet
• IP_RAWOUTPUT : Packet contains pre-constructed header

Packets damaged? Check if there were any errors while adding headers in higher layers. Most of the fields of the IP header are predefined by higher layer protocols.
yes: ERROR
no: continue.

if (flags & (IP_FORWARDING | IP_RAWOUTPUT))
yes: Save the header length in hlen for the fragmentation algorithm. If the packet has to be forwarded to another host, i.e. if the machine is acting as a router, then the IP header for forwarded packets should not be modified by ip_output.
no: If the packet is not being forwarded and has to be sent to another host, then construct and initialize the IP header: set ip_v = 4, clear ip_off, assign a unique identifier to ip_id. Length, offset, TTL, protocol, TOS etc. are set by higher layers.
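The header-initialization branch can be sketched as follows. Everything here is an assumption made for illustration: the `TOY_` flag values are arbitrary bit positions rather than the kernel's constants, `toy_ip` holds only the fields named on the slide, and setting ip_hl to 5 (a 20-byte header without options) is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values; the real constants live in the kernel headers. */
enum {
    TOY_IP_FORWARDING = 0x1,  /* packet is being forwarded */
    TOY_IP_RAWOUTPUT  = 0x2   /* packet already carries a complete header */
};

/* Only the header fields this step touches. */
struct toy_ip {
    uint8_t  ip_v;    /* version */
    uint8_t  ip_hl;   /* header length in 32-bit words */
    uint16_t ip_off;  /* fragment offset field */
    uint16_t ip_id;   /* datagram identifier */
};

static uint16_t toy_next_id = 1;  /* source of unique identifiers */

/* Returns hlen, the header length in bytes used later for fragmentation. */
static int toy_header_init(struct toy_ip *ip, int flags)
{
    assert(ip != 0);
    if (flags & (TOY_IP_FORWARDING | TOY_IP_RAWOUTPUT)) {
        /* Header was built elsewhere: leave it alone, just record its length. */
        return ip->ip_hl * 4;
    }
    ip->ip_v   = 4;
    ip->ip_hl  = 5;              /* assumption: no IP options */
    ip->ip_off = 0;
    ip->ip_id  = toy_next_id++;
    return ip->ip_hl * 4;
}
```

The test is a bit mask rather than an equality comparison because several flags can be set in the same call.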

Page 14: Linux TCP/IP Stack

2. Route Selection

Verify the cached route for the destination address:
if (cached_route == destination)
yes: Find the interface on which the packet has to be placed. ifp points to the interface’s ifnet structure.
no: Locate route: call rtalloc(dst_ip) to locate a route to the destination. Find the interface on which the packet has to be placed. ifp points to the interface’s ifnet structure. If rtalloc(dst_ip) fails to find a route, return a host unreachable error.

A cached route may be provided to ip_output as an argument. UDP and TCP maintain a route cache associated with each socket.

Check if the cached route is the correct destination. If a route has not been provided, ip_output sets a temporary route structure called iproute.

If the cached route is provided, find the interface on which the frame has to be sent.

If the packet is being routed, rtalloc locates a route to the address specified by dst. If rtalloc fails, an EHOSTUNREACH error is generated. If ip_forward called ip_output, the error is converted to an ICMP error. If the address is found, then ifp is made to point to the ifnet structure for the interface. If the next hop is not the packet’s final destination, then dst is changed to point to the next hop router.

Page 15: Linux TCP/IP Stack

3. Source address selection and Fragmentation

Check if a valid source address is specified.
no: Select the IP address of the outgoing interface as the source address.
yes: continue.

Does the packet have to be fragmented?
yes: Fragment the packet if its size is greater than the MTU.
no: continue.

If there are no checksum errors, send the data to the if_output function of the selected interface.

The final section of ip_output ensures that the IP header has a valid source IP address. This couldn’t have been done earlier because the route hadn’t been selected yet. If there is no source IP address, then the IP address of the outgoing interface is used as the source IP address.

Larger packets (packets that exceed the MTU) must be fragmented before they can be sent.
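As a rough illustration of how a packet is split, the sketch below plans fragments for a payload that does not fit in the MTU. It is not the ip_output algorithm itself: `toy_frag` and `toy_fragment` are hypothetical names, and the only rule taken from outside the slides is the IP requirement that every fragment except the last carries a multiple of 8 payload bytes.

```c
#include <assert.h>
#include <stddef.h>

/* One planned fragment: where its payload starts and how long it is. */
struct toy_frag {
    size_t offset;  /* payload offset within the original datagram */
    size_t len;     /* payload bytes carried by this fragment */
    int    more;    /* 1 if further fragments follow */
};

/* Plan the fragments for payload_len bytes behind an hlen-byte header.
 * Returns the number of fragments, or 0 if out[] is too small or the
 * MTU cannot carry even 8 payload bytes. */
static size_t toy_fragment(size_t payload_len, size_t hlen, size_t mtu,
                           struct toy_frag *out, size_t max_out)
{
    size_t n = 0, off = 0;

    assert(out != NULL);
    if (max_out == 0 || mtu < hlen + 8)
        return 0;
    if (hlen + payload_len <= mtu) {        /* fits: no fragmentation */
        out[0].offset = 0;
        out[0].len = payload_len;
        out[0].more = 0;
        return 1;
    }
    /* Largest payload per fragment, rounded down to a multiple of 8. */
    size_t chunk = (mtu - hlen) & ~(size_t)7;
    while (off < payload_len) {
        if (n == max_out)
            return 0;
        size_t len = payload_len - off;
        if (len > chunk)
            len = chunk;
        out[n].offset = off;
        out[n].len = len;
        out[n].more = (off + len < payload_len);
        off += len;
        n++;
    }
    return n;
}
```

For a 4000-byte payload, a 20-byte header and an MTU of 1500, this yields three fragments carrying 1480, 1480 and 1040 bytes.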

In either case (fragmented or not) the checksum is computed (in_cksum). If no errors are found, the data is sent to if_output function of the output interface.
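The checksum step can be illustrated with the standard one's-complement Internet checksum. This is a plain, unoptimized sketch under a hypothetical name (`toy_cksum`); the kernel's in_cksum works on mbuf chains and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of 16-bit big-endian words, complemented.
 * A header whose checksum field is already filled in sums to 0. */
static uint16_t toy_cksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL);
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)                       /* odd trailing byte, padded with zero */
        sum += (uint32_t)data[0] << 8;
    while (sum >> 16)                   /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

A sender computes the value with the checksum field set to zero and stores the result in the header; a receiver runs the same routine over the whole header and expects 0.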

Page 16: Linux TCP/IP Stack

Interface Layer (if_ethersubr.c)

ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *routing_entry)
1. Verification
2. Protocol-Specific Processing
3. Frame Construction
4. Interface Queuing


Page 17: Linux TCP/IP Stack

Interface Layer (if_ethersubr.c) - ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *rt_entry)

Function: Takes the data portion of an Ethernet frame, encapsulates it with a 14-byte header and places it on the interface send queue.
Phases: Verification, Protocol-Specific Processing, Frame Construction, Interface Queuing.

Arguments:
• ifp points to the outgoing interface’s ifnet structure
• mbuf is the data to be sent
• destination is the destination address
• rt_entry points to the routing entry

Initialize the Ethernet header: struct eth_header *eh

Verification

Ethernet port up and running? ifp -> if_flags & (IFF_UP | IFF_RUNNING)
no: senderr (ENETDOWN)
yes: continue.

Page 18: Linux TCP/IP Stack

Verification (continued)

1. Route valid? rt_entry = rtalloc1 (destination, 1)
   0: senderr (EHOSTUNREACH)
   1: continue.
2. Next hop a gateway?
   1: rt = rt -> rt_gwroute
   0: continue.
3. Destination responding to ARP requests? Check rt -> rt_flags & RTF_REJECT. If it is not responding, do not send more packets, to avoid flooding.
   no (flag not set): continue to Protocol Specific Processing.

Page 19: Linux TCP/IP Stack

Protocol Specific Processing

Functionality: Finds the Ethernet address corresponding to the IP address of the destination.

Use m_copy( ) to keep a copy of the packet until an ACK is received.

Switch on destination -> sa_family.
AF_INET: Send an ARP broadcast to find the Ethernet address corresponding to the destination IP address.

Next: Frame Preparation

Page 20: Linux TCP/IP Stack

Frame Preparation (follows Protocol Specific Processing)

Make sure there is room for the 14-byte Ethernet header:
M_PREPEND ( m, sizeof(ethernet_header), M_DONTWAIT)

Form the Ethernet header from:
• the Ethernet frame type
• the destination Ethernet MAC address (e.g. that of the default gateway for a host)
• the unicast Ethernet address associated with the output interface
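A minimal sketch of the encapsulation step, assuming the usual 14-byte Ethernet header layout (6-byte destination address, 6-byte source address, 2-byte type) and the IPv4 type value 0x0800. The function works on a flat buffer rather than an mbuf, and `toy_ether_encap` and the `TOY_` constants are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETHER_ADDR_LEN  6
#define TOY_ETHER_HDR_LEN  14      /* 6 + 6 + 2 */
#define TOY_ETHERTYPE_IP   0x0800  /* frame type for IPv4 */

/* Write header + payload into frame. Returns the frame length,
 * or 0 if the buffer cannot hold the header plus the payload. */
static size_t toy_ether_encap(uint8_t *frame, size_t cap,
                              const uint8_t *dst, const uint8_t *src,
                              uint16_t type,
                              const uint8_t *payload, size_t len)
{
    assert(frame != NULL && dst != NULL && src != NULL && payload != NULL);
    if (cap < TOY_ETHER_HDR_LEN + len)
        return 0;
    /* Place the payload behind the space reserved for the header. */
    memmove(frame + TOY_ETHER_HDR_LEN, payload, len);
    memcpy(frame, dst, TOY_ETHER_ADDR_LEN);
    memcpy(frame + TOY_ETHER_ADDR_LEN, src, TOY_ETHER_ADDR_LEN);
    frame[12] = (uint8_t)(type >> 8);   /* type is sent in network byte order */
    frame[13] = (uint8_t)(type & 0xff);
    return TOY_ETHER_HDR_LEN + len;
}
```

In the kernel the room for the header is made by M_PREPEND on the mbuf chain; the flat buffer here only stands in for that.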

Page 21: Linux TCP/IP Stack

Interface Queuing (follows Frame Preparation)

Is the output queue (if_snd) full?
yes: Discard the frame, free the memory buffer, senderr ( ENOBUFS ).
no: Place the frame on the interface’s send queue (if_snd), then call lestart ( ifp ).
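The queue-full check can be modelled with a small bounded queue. This is a sketch only: `toy_ifqueue`, its fixed length of 4 and the two functions are hypothetical; the one detail borrowed from the slide is that a full queue results in ENOBUFS and a dropped frame.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_IFQ_MAXLEN 4  /* arbitrary small limit for the example */

/* Fixed-size ring of frame pointers with a drop counter. */
struct toy_ifqueue {
    const void *slots[TOY_IFQ_MAXLEN];
    size_t      head;   /* index of the oldest queued frame */
    size_t      len;    /* frames currently queued */
    size_t      drops;  /* frames discarded because the queue was full */
};

/* Returns 0 on success, ENOBUFS if the queue is full (frame is dropped). */
static int toy_if_enqueue(struct toy_ifqueue *q, const void *frame)
{
    assert(q != NULL);
    if (q->len == TOY_IFQ_MAXLEN) {
        q->drops++;
        return ENOBUFS;
    }
    q->slots[(q->head + q->len) % TOY_IFQ_MAXLEN] = frame;
    q->len++;
    return 0;
}

/* What a start routine would call: returns the oldest frame, or NULL. */
static const void *toy_if_dequeue(struct toy_ifqueue *q)
{
    assert(q != NULL);
    if (q->len == 0)
        return NULL;
    const void *frame = q->slots[q->head];
    q->head = (q->head + 1) % TOY_IFQ_MAXLEN;
    q->len--;
    return frame;
}
```

A zero-initialized `struct toy_ifqueue` is an empty queue; the caller remains responsible for freeing a frame that was rejected.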

Page 22: Linux TCP/IP Stack

Interface Layer (if_le.c) - lestart(struct ifnet *ifp)

Function: Dequeues frames from the interface output queue and arranges for them to be transmitted by the Ethernet card.

struct le_softc *le = & le_softc [ ifp -> if_unit ]

Check le -> sc_if.if_flags & IFF_RUNNING:
0: return error.
1: Copy the frame in the mbuf to the hardware buffer, then set IFF_OACTIVE to indicate that the device is busy transmitting.