Smartcom’s control plane software, a customized version of FreeBSD
Boris Astardzhiev
Smartcom-Bulgaria AD, R&D Department
EuroBSDCon 2014, Sofia, Bulgaria
Who are we?
● Smartcom-Bulgaria AD
  ○ Since 1991
  ○ Approximately 100 employees at present
● 3 main departments
  ○ Integration
  ○ Microelectronics
  ○ Research and development (about 15 people)
Our first manageable switch
● Smart Switch Pro 800
  ○ Motorola CPU
  ○ Based on Realtek
  ○ 8 x 100MBit/s Ethernet copper ports
  ○ Managed through GUI via its ports
The second ones
● SGSv1
  ○ Atmel ARM9 CPU
  ○ Based on Marvell chipsets
  ○ 24 or 8 x 100MBit/s Ethernet ports
  ○ 2 x 1GBit/s Ethernet ports
  ○ GNU/Linux based
  ○ Triple-play focused
● Issues
Meanwhile...
● New customers’ requirements
  ○ …Hardware switch/router?
● Marvell gave us a chance
● 2 SoCs
  ○ Address customers’ requests
  ○ Redesign SGSv1
  ○ Identical registers
The new appliances
● SGSR
  ○ Layer 3 distribution switch
● SGSv2
  ○ Access switch substituting SGSv1
● Designed from the ground up in Smartcom-Bulgaria
SGSR’s hardware
● Marvell SoC platform
  ○ ARMv5 CPU with 1 core
  ○ 800MHz clock speed
  ○ 512MB DRAM
  ○ 512MB USB flash memory
● Modular hot-swap architecture
  ○ Up to 24 1GBit/s ports
  ○ Up to 4 10GBit/s ports
● Layer 2 switching
  ○ Max MAC addresses per system: 16K
  ○ Jumbo frames support (9KB)
  ○ Supported VLANs: 0 - 4094
  ○ IEEE 802.1ad VLAN stacking (QinQ)
● Layer 3 features
  ○ Routing table size: 13K
  ○ ARP table size: 4K
  ○ ACL based routing
● ACLs
● QoS
  ○ Ingress/egress rate limiting
  ○ 8 hardware queues per port
  ○ ACL based traffic classification and QoS profile assignment
● IP Multicast
● Storm controls
SGSv2’s hardware
● Marvell SoC platform
  ○ ARMv5 CPU with 1 core
  ○ 800MHz clock speed
  ○ 128MB DRAM
  ○ 512MB flash memory
● Interfaces
  ○ 24 x 10/100/1000MBit/s SFP/RJ45 ports
  ○ 4 x 1GBit/s combo SFP/RJ45 ports
● Layer 2 switching
  ○ Max MAC addresses per system: 16K
  ○ Jumbo frames support (9KB)
  ○ Supported VLANs: 0 - 4094
  ○ VLAN stacking (QinQ)
● QoS
  ○ 8 hardware queues per port
  ○ Scheduling methods (egress): strict priority and WRR
  ○ 802.1p priority trust and remap
● ACLs
● Storm controls
● L2 Multicast groups: 1K
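The QoS item above names two egress scheduling methods, strict priority and WRR (weighted round robin). On the device this is done by the switch hardware; the following is only a small software model of the two policies, to show how they differ. The struct layout, the function names, the per-round "credit" bookkeeping, and the choice that a higher queue number means higher priority are all assumptions made for this sketch, not details from the slides.

```c
#include <assert.h>
#include <stddef.h>

#define NQUEUES 8   /* the slides give 8 hardware queues per port */

/* Hypothetical software model of one egress port's scheduler state. */
struct port_sched {
    unsigned backlog[NQUEUES]; /* frames waiting in each queue */
    unsigned weight[NQUEUES];  /* WRR weight: frames allowed per round */
    unsigned credit[NQUEUES];  /* frames the queue may still send this round */
};

/* Strict priority: always serve the highest-numbered non-empty queue.
 * Returns the queue served, or -1 if every queue is empty. */
int sp_pick(struct port_sched *s)
{
    assert(s != NULL);
    for (int q = NQUEUES - 1; q >= 0; q--) {
        if (s->backlog[q] > 0) {
            s->backlog[q]--;
            return q;
        }
    }
    return -1;
}

/* WRR: within a round, a backlogged queue sends at most weight[q] frames.
 * When no queue can send, credits are refilled and the scan is retried once.
 * Returns the queue served, or -1 if nothing can be sent. */
int wrr_pick(struct port_sched *s)
{
    assert(s != NULL);
    for (int pass = 0; pass < 2; pass++) {
        for (int q = NQUEUES - 1; q >= 0; q--) {
            if (s->backlog[q] > 0 && s->credit[q] > 0) {
                s->backlog[q]--;
                s->credit[q]--;
                return q;
            }
        }
        /* Round exhausted: start a new round. */
        for (int q = 0; q < NQUEUES; q++)
            s->credit[q] = s->weight[q];
    }
    return -1;
}
```

With weights 2 and 1 on queues 7 and 0 and both queues backlogged, wrr_pick serves queue 7 twice, then queue 0 once, and repeats; sp_pick would instead drain queue 7 completely before touching queue 0.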
The software choice
● Why FreeBSD?
  ○ It’s free due to the BSD license
  ○ The Marvell SoCs were supported in the 8 branch
  ○ NETGRAPH
  ○ Probably the biggest BSD community
● NetBSD had support for our chips as well but...
  ○ No mainline NETGRAPH
● OpenBSD didn’t support our chips
Initializing the hardware
● U-Boot
  ○ API
  ○ USB
  ○ Manage disk’s active slices
● ubldr
  ○ Connect it to U-Boot’s API
● The FreeBSD loader
  ○ CRC32 of a file feature was introduced
● Let’s boot the kernel...
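The loader item above mentions a "CRC32 of a file" feature. As a rough illustration only, the sketch below computes the common reflected CRC-32 (polynomial 0xEDB88320, the variant used by zlib and Ethernet) over a buffer. Which CRC variant the modified loader actually uses, and what it compares the result against, is not stated in the slides; the function name crc32_update is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32, polynomial 0xEDB88320.
 * Assumption: this is the widely used zlib/Ethernet variant; the slides
 * do not say which variant the loader implements.
 * Pass crc = 0 for the first chunk, then feed the previous return value
 * back in to continue over the next chunk of the same file. */
uint32_t crc32_update(uint32_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    assert(buf != NULL || len == 0);
    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

A file would typically be read in fixed-size chunks, passing each chunk through crc32_update and comparing the final value with a checksum stored alongside the image. A table-driven version is faster, but the bitwise form is shorter and easier to check.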
The design
[Diagram. Labels: Hardware (Marvell MAC, DMA, CPU port); Software, kernel space (FreeBSD kernel, sw-0, HW library (kobj), port interfaces p0, p1 ... p27 (sgs_if_port)); Software, user space (userland daemons/tools/facilities, ifconfig), connected through socket and various interfaces.]
The network stack
● Inspired by NETGRAPH
● ifnet
  ○ if_input
  ○ The glues
[Diagram: relationships among Port, Lagg, Unit, Bridge, Interface, Router and Subinterface, drawn with 1 and 1..* multiplicities and XOR constraints. Further labels: vlan, family, property, pvid.]
[Inset: a port’s ifnet structure and a lagg’s softc. Labels: sgs_if_lagg, if_input, lagg_input, if_vlantrunk, NULL.]
Stack optimization on ingress flow
[Flow diagram; the branches carry an XOR label]
● CPU (if_sw): interrupt, fetch a frame, sw_intr_rx(sifp, mbuf)
● sgs_if_port: port_input(mbuf)
  ○ pifp->if_vlantrunk != NULL: unit_input(pifp, mbuf)
  ○ pifp->sgs_if_lagg != NULL: lagg_input(ifp_port, mbuf)
● sgs_if_lagg
  ○ lifp->if_vlantrunk != NULL: unit_input(lifp, mbuf)
● sgs_if_unit
  ○ uifp->sgs_if_bridge != NULL: bridge_input(uifp, mbuf)
● sgs_if_bridge
  ○ bifp->sgs_if_iface != NULL: iface_input(bifp, mbuf)
● sgs_if_iface
  ○ iifp->sgs_if_subiface != NULL: subiface_input(iifp, mbuf)
● sgs_if_subiface
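The ingress flow above amounts to a chain of direct calls, each guarded by a NULL check on a "glue" pointer to the next layer. The sketch below models that chain in plain userland C so it can be compiled and stepped through. It is a simplification under stated assumptions: struct sgs_if is a hypothetical stand-in for the real ifnet and softc structures, each glue pointer refers to a single upper layer (the real if_vlantrunk holds a set of VLANs), every function takes the lower-layer interface explicitly, and each function returns the layer that finally received the frame instead of processing an mbuf.

```c
#include <assert.h>
#include <stddef.h>

struct mbuf;                        /* opaque frame handle in this sketch */

/* Hypothetical, simplified stand-in for the per-layer glue pointers. */
struct sgs_if {
    const char    *name;
    struct sgs_if *if_vlantrunk;    /* unit above a port or a lagg */
    struct sgs_if *sgs_if_lagg;     /* lagg above a port */
    struct sgs_if *sgs_if_bridge;   /* bridge above a unit */
    struct sgs_if *sgs_if_iface;    /* interface above a bridge */
    struct sgs_if *sgs_if_subiface; /* subinterface above an interface */
};

static const struct sgs_if *
subiface_input(const struct sgs_if *iifp, struct mbuf *m)
{
    (void)m;
    return iifp->sgs_if_subiface;   /* top of the chain */
}

static const struct sgs_if *
iface_input(const struct sgs_if *bifp, struct mbuf *m)
{
    const struct sgs_if *iifp = bifp->sgs_if_iface;

    if (iifp->sgs_if_subiface != NULL)
        return subiface_input(iifp, m);
    return iifp;
}

static const struct sgs_if *
bridge_input(const struct sgs_if *uifp, struct mbuf *m)
{
    const struct sgs_if *bifp = uifp->sgs_if_bridge;

    if (bifp->sgs_if_iface != NULL)
        return iface_input(bifp, m);
    return bifp;
}

/* Called with the lower layer (a port or a lagg) that carries the unit. */
static const struct sgs_if *
unit_input(const struct sgs_if *lower, struct mbuf *m)
{
    const struct sgs_if *uifp = lower->if_vlantrunk;

    if (uifp->sgs_if_bridge != NULL)
        return bridge_input(uifp, m);
    return uifp;
}

static const struct sgs_if *
lagg_input(const struct sgs_if *pifp, struct mbuf *m)
{
    const struct sgs_if *lifp = pifp->sgs_if_lagg;

    if (lifp->if_vlantrunk != NULL)
        return unit_input(lifp, m);
    return lifp;
}

/* Entry point after the frame has been fetched from the CPU port.
 * A port either carries units itself or belongs to a lagg. */
const struct sgs_if *
port_input(const struct sgs_if *pifp, struct mbuf *m)
{
    assert(pifp != NULL);
    if (pifp->if_vlantrunk != NULL)
        return unit_input(pifp, m);
    if (pifp->sgs_if_lagg != NULL)
        return lagg_input(pifp, m);
    return pifp;
}
```

The point of the pattern is that a frame climbs the stack through a handful of pointer tests and direct calls, with no lookup or queueing between layers.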
Egress flow
[Flow diagram: Frame → ether_output() → IFQ_HANDOFF(pifp, mbuf) → ENQUEUE on sgs_if_port → (pifp->if_start)(pifp) → DEQUEUE → pifp->if_transmit → if_sw sends a frame → MAC Controller. Stage labels: if_start, if_transmit.]
The unicast router
● Initial tasks in terms of hardware
  ○ TCAM updates and LPM
  ○ Ensure consistency
● How do we handle it?
  ○ Intercept traffic in CPU
    ■ Trigger ARPs
      ● in_arpinput() hook
  ○ Routing messages
    ■ Update network prefixes
    ■ rt_dispatch() hook
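The slide above lists LPM (longest prefix match) among the hardware tasks. On the device the lookup is performed in TCAM; the sketch below is only a software model of the rule itself, a linear scan that keeps the matching route with the longest prefix. The struct fields, the function names and the integer next-hop index are hypothetical, and prefixes are assumed to be stored already masked, in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route entry for this sketch. */
struct route {
    uint32_t prefix;   /* network address, host byte order, already masked */
    uint8_t  plen;     /* prefix length, 0..32 */
    int      nexthop;  /* illustrative next-hop index */
};

/* Netmask for a prefix length; plen 0 is special-cased because shifting
 * a 32-bit value by 32 is undefined in C. */
static uint32_t plen_mask(uint8_t plen)
{
    return plen == 0 ? 0 : (uint32_t)(~UINT32_C(0) << (32 - plen));
}

/* Return the matching route with the longest prefix, or NULL if none. */
const struct route *
lpm_lookup(const struct route *tbl, size_t n, uint32_t dst)
{
    const struct route *best = NULL;

    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        assert(tbl[i].plen <= 32);
        if ((dst & plen_mask(tbl[i].plen)) != tbl[i].prefix)
            continue;               /* destination is outside this prefix */
        if (best == NULL || tbl[i].plen > best->plen)
            best = &tbl[i];
    }
    return best;
}
```

For a table holding 0.0.0.0/0, 10.0.0.0/8, 10.1.0.0/16 and 10.1.2.0/24, a lookup of 10.1.2.3 selects the /24 entry and a lookup of 192.168.0.1 falls through to the default route. A TCAM reaches the same answer in a single lookup, which is why the kernel's routing changes have to be mirrored into it consistently.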
The multicast router
● options MROUTING
  ○ Intercept multicast data traffic in CPU
    ■ Trigger MFC updates and upcalls
  ○ Hooks
    ■ update_mfc_params()
    ■ expire_mfc()
● TCAM activity
● Userland daemons
  ○ Handle upcalls
Implementation and useful tools
● Kernel facilities
  ○ BPF
  ○ callout
  ○ EVENTHANDLER
  ○ ioctl
  ○ kobj
  ○ locks
  ○ socket
  ○ sysctl
  ○ syscall
  ○ taskqueue
  ○ ...
● Userspace facilities
  ○ awk/sed
  ○ cron
  ○ ifconfig
  ○ regtool
  ○ route
  ○ ssh
  ○ ...
Layer 2 features
● Mainly interfaces’ property related
  ○ VLAN 802.1Q tagging, QinQ, auto-learning, link transition dampening, static MACs
● Packet interception oriented
  ○ LACP
  ○ RSTP
  ○ IGMP snooping
    ■ Process group memberships
  ○ DHCP snooping
    ■ Track states
    ■ Option 82 & ACL assisted security
[Diagram. Labels: igmpd, ifconfig, ioctl (twice), BPF, HW library, vlan-10, port-3/1.10. Annotations: "Intercept IGMP packets", "Set membership".]
Layer 3 features
● Mainly packet interception oriented
  ○ Unicast routing
  ○ Inter VLAN Multicast routing
  ○ Policy based routing
  ○ SNMP
    ■ Based on bsnmpd
  ○ PIM-SM
  ○ BGP
    ■ Based on openbgpd
  ○ DHCP relay with ACL assisted security
    ■ Track states and insert option 82
● Non-packet interception oriented
  ○ Routing preferences
[Diagram. Labels: pimd, ip_mroute, ip_input, HW library, MFC upcalls. Annotations: "Intercept PIM, IGMP and multicast frames", "Set some options".]
Quality of Service
● Rate-limiting
● Storm controls
● 8 queues per egress
● ACL based traffic classification and QoS profile assignment
● CPU port
  ○ 8 queues
    ■ Management traffic
    ■ Intercepted traffic
The system as a whole
● How do we upgrade?
  ○ Modified NanoBSD
    ■ Redundancy
    ■ 4 slices
      ● UFS
      ● One active rootfs out of two - /
      ● Config files - /cfg
      ● Misc - /data
    ■ Whole image upgrading is slow
The Ports Collection
● Pretty customized
  ○ Focused on frequently modified user space facilities
● Upgrade only parts of the system
  ○ No or little service disruption
  ○ Convenient for partial upgrades
CLI
● Based on klish
● Hierarchical
● The language
  ○ Mainly Lua and shell scripts
● Database integration
  ○ SQLite3
● Commit oriented instead of enter and shoot
● The preferred way of configuring the device
Development issues
● ARM Debugging
  ○ Kernel space
  ○ User space
● Crash inspections
  ○ Classic dumps to a swap partition
  ○ NETDUMP
● (Back)traces
● Patches and new stuff from FreeBSD
● Tracking latest version of FreeBSD
Quality assurance
● Black box testing
  ○ Equivalence partitioning
  ○ Boundary-value analysis
  ○ Load and stress testing
  ○ Exploratory testing
  ○ Interoperability tests
  ○ System testing in a real topology
● Automation and regression
  ○ CLI and SNMP
  ○ TCL/Expect
Future development
● IPv6
● VRF
● Stacking
● Make our software a module
● Optimize code
● Redesign and reimplement
Q&A
Smartcom-Bulgaria AD, BIC IZOT, Office 317, 133 Tzarigradsko Chaussee Blvd., 7th km, 1784 Sofia, BULGARIA
Tel.: +359 2 9650650, Fax: +359 2 9743469
http://www.smartcom.bg/
e-mail: [email protected]
Thank you! Questions?