What’s needed to receive? A look at the minimum steps required for programming our 82573L nic to receive packets

Post on 19-Dec-2015

What’s needed to receive?

A look at the minimum steps required for programming our 82573L nic to receive packets

Accessing 82573L registers

• Device registers are hardware mapped to a range of addresses in physical memory

• We can get the location and extent of this memory-range from a BAR register in the 82573L device’s PCI Configuration Space

• We then ask the Linux kernel to set up an I/O ‘remapping’ of this memory-range to ‘virtual’ addresses within kernel-space
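As a rough illustration of the second bullet, the sketch below decodes a raw 32-bit memory BAR value in ordinary user-space C. The function names are hypothetical and the sample values are made up; inside a real kernel module the usual route is the kernel's own helpers (for example pci_resource_start(), pci_resource_len() and ioremap()) rather than hand decoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a BAR is 0 for a memory-space window, 1 for an I/O-port window. */
static bool bar_is_memory(uint32_t bar)
{
    return (bar & 0x1u) == 0;
}

/* Bits 2:1 of a memory BAR encode its width; the value 2 means 64-bit. */
static bool bar_is_64bit(uint32_t bar)
{
    return ((bar >> 1) & 0x3u) == 0x2u;
}

/* The low four bits are flag bits, so masking them off leaves the base. */
static uint32_t bar_mem_base(uint32_t bar)
{
    return bar & ~0xFu;
}
```

For a (made-up) BAR value of 0xFDFC0004 these helpers would report a 64-bit memory window whose low 32 address bits are 0xFDFC0000.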

Linux address-spaces

[Figure: two address-space maps shown side by side.]

physical address-space (64-GB): dynamic ram, nic registers

‘virtual’ address-space, userspace (128-TB): .text, .data, .bss, shared libraries, stack

‘virtual’ address-space, kernel space (128-TB): kernel code/data, dynamic ram, nic registers

Kernel memory allocation

• The NIC requires that some host memory be set aside for packet-buffers and receive descriptors

• The kernel provides a ‘helper function’ for reserving a suitable region of memory in kernel-space which is both ‘non-pageable’ and ‘physically contiguous’ (i.e., kzalloc())

• It’s our job to decide how much memory our network controller hardware will need

Format for an Rx Descriptor

Each descriptor occupies 16 bytes:

bytes 0-7     Base-address (64-bits)
bytes 8-9     Packet-length
bytes 10-11   Packet-checksum
byte 12       status
byte 13       errors
bytes 14-15   VLAN tag

The device-driver initializes this ‘base-address’ field with the physical address of a packet-buffer

The network controller will ‘write-back’ the values for these fields when it has transferred a received packet’s data into this descriptor’s packet-buffer

Suggested C syntax

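typedef struct {
    unsigned long long  base_address;   // physical address of the packet-buffer
    unsigned short      packet_length;  // written back by the nic
    unsigned short      packet_cksum;   // written back by the nic
    unsigned char       desc_status;    // written back by the nic
    unsigned char       desc_errors;    // written back by the nic
    unsigned short      vlan_tag;       // written back by the nic
} RX_DESCRIPTOR;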
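The descriptor must be exactly 16 bytes with each field at a fixed offset, so one cautious variation is to spell the widths out with fixed-width types and let the compiler check the layout. The name RX_DESC_FIXED is hypothetical; this is only a sketch of the same ‘Legacy Format’ record, not a replacement for the structure above.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t base_address;   /* bytes 0-7   */
    uint16_t packet_length;  /* bytes 8-9   */
    uint16_t packet_cksum;   /* bytes 10-11 */
    uint8_t  desc_status;    /* byte 12     */
    uint8_t  desc_errors;    /* byte 13     */
    uint16_t vlan_tag;       /* bytes 14-15 */
} RX_DESC_FIXED;

/* Compilation fails if padding ever changes the size or field offsets. */
_Static_assert(sizeof(RX_DESC_FIXED) == 16, "descriptor must be 16 bytes");
_Static_assert(offsetof(RX_DESC_FIXED, packet_length) == 8, "length at byte 8");
_Static_assert(offsetof(RX_DESC_FIXED, desc_status) == 12, "status at byte 12");
```

Because every field is naturally aligned, no packing attribute should be needed on common compilers, and the static assertions catch the case where that assumption fails.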

‘Legacy Format’ for the Intel Pro1000 network controller’s Receive Descriptors

Ethernet packet layout

• Total size normally can vary from 64 bytes up to 1522 bytes (unless ‘jumbo’ packets and/or ‘undersized’ packets are enabled)

• The NIC expects a 14-byte packet ‘header’ and it appends a 4-byte CRC check-sum

offset 0            destination MAC address (6-bytes)
offset 6            source MAC address (6-bytes)
offset 12           Type/length (2-bytes)
offset 14           the packet’s data ‘payload’ goes here (usually varies from 46 to 1500 bytes)
after the payload   Cyclic Redundancy Checksum (4-bytes)
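The header offsets 0, 6 and 12 can be exercised with a small user-space parser. This is a sketch only: ETH_HEADER and parse_eth_header() are hypothetical names, and the Type/length field is assumed to arrive in network (big-endian) byte order, as it does on the wire.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAC_LEN = 6, ETH_HEADER_LEN = 14 };

typedef struct {
    uint8_t  dst[MAC_LEN];   /* destination MAC address, offset 0 */
    uint8_t  src[MAC_LEN];   /* source MAC address, offset 6      */
    uint16_t type_len;       /* Type/length, host byte-order      */
} ETH_HEADER;

/* Returns 0 on success, or -1 if the buffer is shorter than a header. */
static int parse_eth_header(const uint8_t *frame, size_t len, ETH_HEADER *hdr)
{
    if (len < ETH_HEADER_LEN)
        return -1;
    memcpy(hdr->dst, frame + 0, MAC_LEN);
    memcpy(hdr->src, frame + 6, MAC_LEN);
    /* Offset 12 holds the high byte, offset 13 the low byte. */
    hdr->type_len = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}
```

The payload then begins at frame + 14, which is where a diagnostic dump of a received packet-buffer would start looking for the higher-level protocol data.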

Rx-Descriptor Ring-Buffer

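Circular buffer (128-bytes minimum, and must be a multiple of 128 bytes)

The ring is described by four registers:

RDBA    base-address
RDLEN   length (in bytes)
RDH     head
RDT     tail

[Figure: a ring of 16-byte descriptors at offsets 0x00, 0x10, 0x20, 0x30, 0x40, 0x50, 0x60 and 0x70 from the base-address, ending at offset 0x80; shading distinguishes the descriptors owned by hardware (nic) from those owned by software (cpu).]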
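The ring arithmetic can be sketched in a few helper functions. All names here are hypothetical, and the ownership count relies on the usual convention for this controller family, taken here as an assumption: descriptors from the head up to (but not including) the tail belong to the nic, so head equal to tail means the nic owns none.

```c
#include <stdbool.h>
#include <stdint.h>

enum { RX_DESC_BYTES = 16 };

/* RDLEN must be at least 128 bytes and a multiple of 128 bytes. */
static bool rdlen_is_valid(uint32_t rdlen)
{
    return rdlen >= 128 && rdlen % 128 == 0;
}

/* Number of 16-byte descriptors that fit in a ring of rdlen bytes. */
static uint32_t ring_count(uint32_t rdlen)
{
    return rdlen / RX_DESC_BYTES;
}

/* Advance an index by one descriptor, wrapping at the end of the ring. */
static uint32_t ring_next(uint32_t index, uint32_t count)
{
    return (index + 1) % count;
}

/* Descriptors currently owned by hardware: head up to, not including, tail. */
static uint32_t hw_owned(uint32_t head, uint32_t tail, uint32_t count)
{
    return (tail + count - head) % count;
}
```

With the minimum 128-byte ring there are eight descriptors, so an index of 7 wraps back to 0.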

Packet-buffers and descriptors

• Our ‘nicrx.c’ module allocates 8 buffers of size 2K-bytes (i.e., more than enough for any normal Ethernet packets)

16K + 128 bytes allocated (8 packet-buffers, plus Rx-Descriptor Queue)

for the Rx Descriptor Queue (128 bytes)

for the eight packet-buffers
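The 16K + 128 byte layout can be sketched as follows, assuming the eight packet-buffers come first and the descriptor queue follows them. That ordering, the names, and the made-up physical base address in the usage note are all assumptions for illustration; the actual ‘nicrx.c’ may arrange its memory differently.

```c
#include <stdint.h>
#include <string.h>

enum {
    N_RX_DESC = 8,                               /* eight descriptors     */
    RX_BUFSIZ = 2048,                            /* 2K-bytes per buffer   */
    RX_DESC_BYTES = 16,
    KMEM_SIZE = N_RX_DESC * RX_BUFSIZ + N_RX_DESC * RX_DESC_BYTES
};

typedef struct {
    uint64_t base_address;
    uint16_t packet_length;
    uint16_t packet_cksum;
    uint8_t  desc_status;
    uint8_t  desc_errors;
    uint16_t vlan_tag;
} RX_DESC;

/* Physical address of the descriptor queue, placed after the buffers. */
static uint64_t rx_ring_phys(uint64_t kmem_phys)
{
    return kmem_phys + (uint64_t)N_RX_DESC * RX_BUFSIZ;
}

/* Clear every descriptor and point it at its own 2K packet-buffer. */
static void init_rx_ring(RX_DESC *ring, uint64_t kmem_phys)
{
    for (int i = 0; i < N_RX_DESC; i++) {
        memset(&ring[i], 0, sizeof ring[i]);
        ring[i].base_address = kmem_phys + (uint64_t)i * RX_BUFSIZ;
    }
}
```

If the block started at physical address 0x10000000, the fourth descriptor would point at 0x10001800 and the queue itself would sit at 0x10004000.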

RxDesc Status-field

bit 7   PIF
bit 6   IPCS
bit 5   TCPCS
bit 4   UDPCS
bit 3   VP
bit 2   IXSM
bit 1   EOP
bit 0   DD

DD = Descriptor Done (1=yes, 0=no) shows if nic is finished with descriptor
EOP = End Of Packet (1=yes, 0=no) shows if this packet is logically last
IXSM = Ignore Checksum Indications (1=yes, 0=no)
VP = VLAN Packet match (1=yes, 0=no)
UDPCS = UDP Checksum calculated in packet (1=yes, 0=no)
TCPCS = TCP Checksum calculated in packet (1=yes, 0=no)
IPCS = IPv4 Checksum calculated on packet (1=yes, 0=no)
PIF = Passed In-exact Filter (1=yes, 0=no) shows if software must check
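The status bits translate directly into mask constants. The sketch below uses hypothetical names (RXD_STAT_* and rx_desc_complete()) and shows one plausible test a driver could apply before reading a packet-buffer: the descriptor is done and it holds the end of a packet.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    RXD_STAT_DD    = 1 << 0,   /* Descriptor Done                 */
    RXD_STAT_EOP   = 1 << 1,   /* End Of Packet                   */
    RXD_STAT_IXSM  = 1 << 2,   /* Ignore Checksum Indications     */
    RXD_STAT_VP    = 1 << 3,   /* VLAN Packet match               */
    RXD_STAT_UDPCS = 1 << 4,   /* UDP Checksum calculated         */
    RXD_STAT_TCPCS = 1 << 5,   /* TCP Checksum calculated         */
    RXD_STAT_IPCS  = 1 << 6,   /* IPv4 Checksum calculated        */
    RXD_STAT_PIF   = 1 << 7    /* Passed In-exact Filter          */
};

/* True when the nic has finished with the descriptor and the packet ends here. */
static bool rx_desc_complete(uint8_t status)
{
    const uint8_t need = RXD_STAT_DD | RXD_STAT_EOP;
    return (status & need) == need;
}
```

A freshly initialized descriptor has a status of zero, so this test stays false until the nic writes the descriptor back.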

RxDesc Error-field

bit 7   RXE
bit 6   IPE
bit 5   TCPE
bit 4   reserved=0
bit 3   reserved=0
bit 2   SEQ
bit 1   SE
bit 0   CE

RXE = Received-data Error (1=yes, 0=no)
IPE = IPv4-checksum error (1=yes, 0=no)
TCPE = TCP/UDP checksum error (1=yes, 0=no)
SEQ = Sequence error (1=yes, 0=no)
SE = Symbol Error (1=yes, 0=no)
CE = CRC Error or alignment error (1=yes, 0=no)

Essential ‘receive’ registers

enum {
    E1000_CTRL   = 0x0000,  // Device Control
    E1000_STATUS = 0x0008,  // Device Status
    E1000_RCTL   = 0x0100,  // Receive Control
    E1000_RDBAL  = 0x2800,  // Rx Descriptor Base Address Low
    E1000_RDBAH  = 0x2804,  // Rx Descriptor Base Address High
    E1000_RDLEN  = 0x2808,  // Rx Descriptor Length
    E1000_RDH    = 0x2810,  // Rx Descriptor Head
    E1000_RDT    = 0x2818,  // Rx Descriptor Tail
    E1000_RXDCTL = 0x2828,  // Rx Descriptor Control
    E1000_RA     = 0x5400,  // Receive address-filter Array
};

Programming steps

1) Detect the presence of the 82573L network controller (VENDOR_ID, DEVICE_ID)
2) Obtain the physical address-range where the nic’s device-registers are mapped
3) Ask the kernel to map this address range into the kernel’s virtual address-space
4) Copy the network controller’s MAC-address into a 6-byte array for future access
5) Allocate a block of kernel memory large enough for our descriptors and buffers
6) Ensure that the network controller’s ‘Bus Master’ capability has been enabled
7) Select our desired configuration-options for the DEVICE CONTROL register
8) Perform a nic ‘reset’ operation (by toggling bit 26), then delay until reset completes
9) Select our desired configuration-options for the RECEIVE CONTROL register
10) Initialize our array of Receive Descriptors with the physical addresses of buffers
11) Initialize the Receive Engine’s registers (for Rx-Descriptor Queue and Control)
12) Give ‘ownership’ of all of our Rx-Descriptors to the network controller
13) Enable the Receive Engine
14) Install our ‘/proc/nicrx’ pseudo-file (for user-diagnostic purposes)

NOTE: Steps 1) through 8) are the same as for our ‘nictx.c’ kernel module.
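Steps 9) through 13) amount to a short sequence of register writes. The sketch below replays that sequence in user-space against an ordinary array standing in for the mapped device registers, so it can be compiled and tested anywhere. The helper names, the array, writing zero to the head register, and the tail value passed in by the caller are all assumptions made for illustration; a real module would write through its remapped register addresses (for instance with iowrite32()) and the actual ‘nicrx.c’ may order or choose these writes differently.

```c
#include <stdint.h>

enum {
    E1000_RCTL   = 0x0100,
    E1000_RDBAL  = 0x2800,
    E1000_RDBAH  = 0x2804,
    E1000_RDLEN  = 0x2808,
    E1000_RDH    = 0x2810,
    E1000_RDT    = 0x2818,
    E1000_RXDCTL = 0x2828,
    REG_SPACE    = 0x6000      /* hypothetical size of the stand-in array */
};

static uint32_t fake_regs[REG_SPACE / 4];   /* stand-in for nic registers */

static void reg_write(uint32_t offset, uint32_t value)
{
    fake_regs[offset / 4] = value;
}

static uint32_t reg_read(uint32_t offset)
{
    return fake_regs[offset / 4];
}

static void rx_engine_start(uint64_t ring_phys, uint32_t ring_bytes,
                            uint32_t tail)
{
    reg_write(E1000_RCTL, 0x1440821C);                    /* step 9  */
    /* step 10 fills the descriptors in memory, not in registers */
    reg_write(E1000_RDBAL, (uint32_t)ring_phys);          /* step 11 */
    reg_write(E1000_RDBAH, (uint32_t)(ring_phys >> 32));
    reg_write(E1000_RDLEN, ring_bytes);
    reg_write(E1000_RDH, 0);
    reg_write(E1000_RXDCTL, 0x01010000);
    reg_write(E1000_RDT, tail);                           /* step 12 */
    reg_write(E1000_RCTL, reg_read(E1000_RCTL) | (1u << 1)); /* step 13: EN */
}
```

Calling rx_engine_start(0x10004000, 128, 7) on the stand-in array leaves 0x1440821E in the Receive Control slot, which is the prepared value with the EN bit (bit 1) added.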

Device Control (0x0000)

Register layout for the 82573L (R = reserved):

bit 31       PHYRST
bit 30       VME
bit 29       R=0
bit 28       TFCE
bit 27       RFCE
bit 26       RST
bits 25-21   R=0
bit 20       ADVD3WUC
bit 19       R=0
bit 18       D/UD status
bits 17-13   R=0
bit 12       FRCDPLX
bit 11       FRCSPD
bit 10       R=0
bits 9-8     SPEED
bit 7        R=0
bit 6        SLU
bits 5-4     R=0
bit 3        R=1
bit 2        GIOMD
bit 1        R=0
bit 0        FD

FD = Full-Duplex
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
GIOMD = GIO Master Disable
ADVD3WUC = Advertise Cold Wake Up Capability
SLU = Set Link Up
D/UD = Dock/Undock status
RFCE = Rx Flow-Control Enable
FRCSPD = Force Speed
RST = Device Reset
TFCE = Tx Flow-Control Enable
FRCDPLX = Force Duplex
PHYRST = Phy Reset
VME = VLAN Mode Enable

We used 0x04000A49 to initiate a ‘device reset’ operation
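The reset value 0x04000A49 can be rebuilt from named bits, which makes it easier to see what was selected. The CTRL_* names below are hypothetical, and CTRL_RSVD3 simply stands for the reserved bit 3 that the register layout shows as R=1.

```c
#include <stdint.h>

enum {
    CTRL_FD         = 1 << 0,    /* Full-Duplex                  */
    CTRL_RSVD3      = 1 << 3,    /* reserved bit, shown as R=1   */
    CTRL_SLU        = 1 << 6,    /* Set Link Up                  */
    CTRL_SPEED_1000 = 2 << 8,    /* SPEED field = 10 (1000Mbps)  */
    CTRL_FRCSPD     = 1 << 11,   /* Force Speed                  */
    CTRL_RST        = 1 << 26    /* Device Reset                 */
};

/* Combine the selected options with the reset bit. */
static uint32_t ctrl_reset_value(void)
{
    return CTRL_RST | CTRL_FRCSPD | CTRL_SPEED_1000
         | CTRL_SLU | CTRL_RSVD3 | CTRL_FD;
}

/* Extract the two-bit SPEED field (bits 9-8). */
static unsigned ctrl_speed(uint32_t ctrl)
{
    return (ctrl >> 8) & 0x3u;
}
```

Read this way, the value asks for a device reset with full-duplex, a forced speed of 1000Mbps and the link set up.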

Device Status (0x0008)

Register layout for the 82573L:

bit 31       ? (some undocumented functionality?)
bits 30-20   0
bit 19       GIO Master EN
bits 18-11   0
bit 10       PHYRA
bits 9-8     ASDV
bits 7-6     SPEED
bit 5        0
bit 4        TXOFF
bits 3-2     Function ID
bit 1        LU
bit 0        FD

FD = Full-Duplex
LU = Link Up
TXOFF = Transmission Paused
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
ASDV = Auto-negotiation Speed Detection Value
PHYRA = PHY Reset Asserted

Receive Control (0x0100)

Register layout (R = reserved):

bit 31       R=0
bits 30-27   FLXBUF
bit 26       SECRC
bit 25       BSEX
bit 24       R=0
bit 23       PMCF
bit 22       DPF
bit 21       R=0
bit 20       CFI
bit 19       CFIEN
bit 18       VFE
bits 17-16   BSIZE
bit 15       BAM
bit 14       R=0
bits 13-12   MO
bits 11-10   DTYP
bits 9-8     RDMTS
bits 7-6     LBM
bit 5        LPE
bit 4        MPE
bit 3        UPE
bit 2        SBP
bit 1        EN
bit 0        R=0

EN = Receive Enable
SBP = Store Bad Packets
UPE = Unicast Promiscuous Enable
MPE = Multicast Promiscuous Enable
LPE = Long Packet reception Enable
LBM = Loopback Mode
RDMTS = Rx-Descriptor Minimum Threshold Size
DTYP = Descriptor Type
MO = Multicast Offset
BAM = Broadcast Accept Mode
BSIZE = Receive Buffer Size
VFE = VLAN Filter Enable
CFIEN = Canonical Form Indicator Enable
CFI = Canonical Form Indicator bit-value
DPF = Discard Pause Frames
PMCF = Pass MAC Control Frames
BSEX = Buffer Size Extension
SECRC = Strip Ethernet CRC
FLXBUF = Flexible Buffer size

We used 0x1440821C in RCTL to prepare the ‘receive engine’ prior to enabling it
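Breaking 0x1440821C into named bits shows which options were chosen before the enable bit is set. The RCTL_* names are hypothetical; the field values (FLXBUF = 2, RDMTS = 2) are simply what the hexadecimal value contains, read against the Receive Control bit positions.

```c
#include <stdint.h>

enum {
    RCTL_EN      = 1 << 1,    /* Receive Enable (not yet set)      */
    RCTL_SBP     = 1 << 2,    /* Store Bad Packets                 */
    RCTL_UPE     = 1 << 3,    /* Unicast Promiscuous Enable        */
    RCTL_MPE     = 1 << 4,    /* Multicast Promiscuous Enable      */
    RCTL_RDMTS_2 = 2 << 8,    /* RDMTS field (bits 9-8) = 2        */
    RCTL_BAM     = 1 << 15,   /* Broadcast Accept Mode             */
    RCTL_DPF     = 1 << 22,   /* Discard Pause Frames              */
    RCTL_SECRC   = 1 << 26,   /* Strip Ethernet CRC                */
    RCTL_FLXBUF2 = 2 << 27    /* FLXBUF field (bits 30-27) = 2     */
};

/* The configuration written before the receive engine is enabled. */
static uint32_t rctl_setup_value(void)
{
    return RCTL_FLXBUF2 | RCTL_SECRC | RCTL_DPF | RCTL_BAM
         | RCTL_RDMTS_2 | RCTL_MPE | RCTL_UPE | RCTL_SBP;
}

/* The same configuration with the receive engine switched on. */
static uint32_t rctl_enabled_value(void)
{
    return rctl_setup_value() | RCTL_EN;
}
```

Clearing a single option is then a matter of masking it out; for example, rctl_setup_value() & ~RCTL_UPE gives 0x14408214.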

Rx-Descriptor Control (0x2828)

Register layout:

bits 31-25   0
bit 24       GRAN
bits 23-22   0
bits 21-16   WTHRESH (Writeback Threshold)
bits 15-14   0
bits 13-8    HTHRESH (Host Threshold)
bits 7-6     0
bits 5-0     PTHRESH (Prefetch Threshold)

Recommended for 82573: 0x01010000 (GRAN=1, WTHRESH=1)

“This register controls the fetching and write back of receive descriptors. The three threshold values are used to determine when descriptors are read from, and written to, host memory. Their values can be in units of cache lines or of descriptors (each descriptor is 16 bytes), based on the value of the GRAN bit (0=cache lines, 1=descriptors). When GRAN = 1, all descriptors are written back (even if not requested).” --Intel manual
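The recommended value can be assembled from its four fields. The helper name rxdctl_make() is hypothetical, and the six-bit width used for each threshold is an assumption taken from the bit positions 21-16, 13-8 and 5-0.

```c
#include <stdint.h>

/* Pack GRAN (bit 24) and the three 6-bit thresholds into one register value. */
static uint32_t rxdctl_make(uint32_t gran, uint32_t wthresh,
                            uint32_t hthresh, uint32_t pthresh)
{
    return ((gran    & 0x01u) << 24)    /* 0 = cache lines, 1 = descriptors */
         | ((wthresh & 0x3Fu) << 16)    /* Writeback Threshold              */
         | ((hthresh & 0x3Fu) << 8)     /* Host Threshold                   */
         |  (pthresh & 0x3Fu);          /* Prefetch Threshold               */
}
```

With GRAN=1 and WTHRESH=1, and both other thresholds left at zero, rxdctl_make(1, 1, 0, 0) yields 0x01010000.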

PCI Bus Master DMA

[Figure: DMA transfers between the 82573L and host memory.]

82573L i/o-memory: RX and TX FIFOs (32-KB total), on-chip RX descriptors, on-chip TX descriptors

Host’s Dynamic Random Access Memory: Descriptor Queue and the packet-buffers

Pthresh and Hthresh

• When the number of unprocessed descriptors in the NIC’s on-chip memory has fallen below the Prefetch Threshold, and the number of valid descriptors in host memory which are owned by the NIC is at least equal to the Host Threshold, then the NIC will fetch that number of descriptors in a single ‘burst’ DMA-transfer

Wthresh

• When the number of descriptors waiting in the NIC’s on-chip memory to be written back to Host memory is at least equal to the Writeback Threshold, then the NIC will write back that number of descriptors in a single ‘burst’ DMA-transfer

Experiment #1

• Let’s install our ‘nicrx.c’ kernel module on one host, and use the ‘cat’ command to view its queue of Rx-Descriptors:

$ /sbin/insmod nicrx.ko

$ cat /proc/nicrx

• Then let’s install our ‘nictx.c’ module on a different host on the same local network:

$ /sbin/insmod nictx.ko

• Now look again at the receive descriptors!

Experiment #2

• Install our ‘dram.c’ device-driver module on both of these host-machines, and use our ‘fileview’ utility to look at the contents of each module’s packet-buffers – you’ll find their physical addresses displayed if you use ‘cat’ to see the descriptor-queues:

$ cat /proc/nictx and $ cat /proc/nicrx

Experiment #3

• Our ‘nicrx.c’ module had enabled both the Unicast and Multicast promiscuous modes

• So let’s watch what happens when we use the ‘/sbin/ifconfig’ command (with ‘sudo’) to bring up a secondary network interface on another host on the same segment of our local network

• Do you recognize these new packets?

Experiment #4

• With our ‘nicrx.c’ module installed on one host, log on to two other hosts on the same LAN and bring up their ‘eth1’ network interfaces

• Use the ‘ping’ command on one of these two hosts to try contacting the other one

• What do you observe about any packets that are received by the host where our ‘nicrx.c’ module had been installed?

In-class exercise

• Suppose you turn off the UPE-bit (bit #3) in the Receive Control register (in nicrx.c)

• From another host on the same segment, bring up its ‘eth1’ interface, then adjust its routing table so that all multicast packets are sent out via the secondary interface:

$ sudo /sbin/route add -net 224.0.0.0 netmask 255.0.0.0 device eth1

• If you ‘ping’ a multicast address, will the ICMP datagram be received by ‘nicrx.c’?