based on composite protocols

32
Building a Reliable Multicast Service Building a Reliable Multicast Service based on based on Composite Protocols Composite Protocols Sandeep Subramaniam Master’s Thesis Defense The University of Kansas 06.13.2003 Committee: Dr. Gary J. Minden (Chair) Dr. Joseph B. Evans Dr. Perry Alexander

Upload: others

Post on 06-Jan-2022

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: based on Composite Protocols

Building a Reliable Multicast ServiceBuilding a Reliable Multicast Servicebased on based on

Composite ProtocolsComposite Protocols

Sandeep Subramaniam

Master’s Thesis DefenseThe University of Kansas

06.13.2003

Committee:Dr. Gary J. Minden (Chair)Dr. Joseph B. EvansDr. Perry Alexander

Page 2: based on Composite Protocols

2

Outline

• Motivation• Composite Protocol (CP) Framework• Design of a composable service – multicast example• Implementation of service over Ensemble • Testing and Performance Evaluation• Summary & Future Work

Page 3: based on Composite Protocols

3

Basic definitions

• Protocol component– Single function entity that embeds minimal protocol functionality – E.g. checksum, reliable-delivery, fragment

• Composite protocol – Collection of protocol components arranged in orderly fashion – E.g. An IP-like composite protocol would consist of forwarding,

fragment and checksum protocol components.

• Composable service– Collection of 2 or more co-operating composite-protocols.– E.g. multicast consists of multicast routing, group management and

replication of data

Page 4: based on Composite Protocols

4

Advantages of Composite Protocols

• Protocols implemented as collections of single-function components– Reusability– Flexibility – Aid in formal verification, building correct protocols.– Customization , fine tuned protocol stacks– “Properties-in Protocol-out”

Page 5: based on Composite Protocols

5

Motivation

• Motivation for composable service– Apply the composite protocol approach to wider range of protocols

• Data and control plane protocols • demonstrate feasibility and applicability • Expand the library of protocol components

– Address issues of inter-protocol communication • Building a network service addresses all above needs.• Reliable multicast service chosen as an ideal example

– 3 co-operating protocols operating in tandem • Reliable replication of data (multicast forwarding)• Multicast routing • Group management

Page 6: based on Composite Protocols

6

Composite Protocol Framework

TSM RSM

CompositeProtocol X

CompositeProtocol Y

ENDPOINT A

CompositeProtocol Y

CompositeProtocol X

ENDPOINT B

Packet Memory

Packet Memory

Packet Memory

SLP SLP

SLP SLP

SLP SLP

SLP SLP

SLP SLP

SLP SLP

SLP SLP

SLP SLP

Glo

bal M

emor

y

RSM

RSM

TSM

TSM

TSM RSM

RSM

RSM

TSM

TSM

TSM RSM

Glo

bal M

emor

y

RSM

RSM

TSM

TSM

TSM RSM

RSM

RSM

TSM

TSM

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Page 7: based on Composite Protocols

7

Design of a Composable Service - Steps

• Decomposition– Identify key functional entities in the monolithic counterpart.

• Specification of protocol components using AFSMs– Represent each component as a pair of SMs (TSM & RSM)– Specify local, SLP, packet and global memory requirement– Identify data and control events

• Building the stacks – Linear composition to yield composite protocol

• Deployment– stacks to place on a particular network node

• Global memory objects for inter-stack communication

Page 8: based on Composite Protocols

8

Step 1 - Decomposition (Multicast Service)

• Multicast routing based on DVMRP • Group Management based on IGMP• DVMRP

– Neighbor Discovery– Route Exchange– Spanning Tree– Pruning – Grafting

• IGMP– Join/Leave

• Data– Multicast Forwarding– Reliable Multicast– In-order Multicast

Page 9: based on Composite Protocols

9

Step 2 – Specification

• Each functional component has to confirm to CP specs.– Independent of other components

• Each protocol component specification consists of– Pair of AFSMs – TSM and RSM – Memory requirements

• Packet memory – bits on the wire or component header• Local memory – maintaining local state information• SLP Memory - memory local to the stack but pertaining to packet • Global memory – external memory requirements

– Events: • Data: packet arrival from component above/below• Control : timers , application-component interaction

• Specify assumptions and parameters

Page 10: based on Composite Protocols

10

Building the Stacks – Step 3

• Group related components into composite protocol -linear composition

• Try re-using existing components

Multicast Routing

Pruning

Spanning Tree

Route Exchange

Neighbor Discovery

Fragment

Checksum

Grafting Reliable Multicast

Unicast Forward

Application

Multicast Forward

TTL

Fragment

Checksum

Replicator

Reliable Multicast

Join_Leave

Fragment

Checksum

Application

Group Management

TTL

Multicast Forward

Replicator

Application

Fragment

Checksum

Basic Multicast

Page 11: based on Composite Protocols

11

Building the Stacks - Stack Ordering

• Determine the order of stacking among components• Does order matter ?• Property – Oriented

– Layer N provides a property to Layer N+1– Order of component determines stack behavior – E.g. reliable multicast stack

• Control – Oriented – Components in stack are independent– Layer N does not provide specific property to Layer N+1– Order may affect performance not stack behavior– E.g. multicast routing stack

Multicast Routing

Pruning

Spanning Tree

Route Exchange

Neighbor Discovery

Fragment

Checksum

Grafting

Reliable Multicast

Unicast Forward

Application

Multicast Forward

TTL

Fragment

Checksum

Replicator

Reliable Multicast

Page 12: based on Composite Protocols

12

Deployment – Step 4

C1

C3

L2 L3

L1

S

C2

Multicast core router

Multicast leaf router

end hosts

Multicast routing stack

Group managementstack

Multicast data stack

Page 13: based on Composite Protocols

13

Inter-stack Communication - Global Memory

• Addresses the issue of cross-protocol communication

• Acts as a repository for data shared among different stacks.

• Accessible to all components of all composite protocol instances at that endpoint

• Scope and extent greater than any single protocol accessing it.

• Functional interface for read/write

• Responsible for initialization and maintenance of shared info

CompositeProtocol X

CompositeProtocol Y

ENDPOINT A

Packet Memory

Packet Memory

Packet Memory

SLP SLP

SLP SLP

SLP SLP

SLP SLP

Glo

bal M

emor

y

RSM

RSM RSM

TSM

SM

TSM

SM

SM

TSM

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Local Memory

Page 14: based on Composite Protocols

14

Global Memory – Features

• Separation of Protocols and Data Management– Independence between protocols and global memory data management– Protocol component expresses requirements for global memory access

through its external functions – Protocols that write to /read from global objects need not agree on internal

data format

• Functional Interface– Access to shared data only through write/read functional interfaces– Encapsulates shared data– Hides internal representation of global memory object

GlobalMemoryobject

WRITE

READ

WRITE

READ

Page 15: based on Composite Protocols

15

Global Memory - Features

• Synchronization– Each object solely responsible for providing synchronized access

to its shared data – Synchronization not delegated to users of the shared object.– Access control mechanism is implementation specific – Semaphores or other mechanisms can be used

• Extensibility – Object definition can be extended – Internal data structures / external functions can be added– Backward compatibility easily maintained

Page 16: based on Composite Protocols

16

Implementation

• Overview of Ensemble • Global Memory Implementation• Operational overview of service • Protocol Interaction through global memory

Page 17: based on Composite Protocols

17

Ensemble

• Group communication system developed at Cornell• Used as base framework for building composite protocol• Reasons:

– Written in OCaml functional programming language, aiding for formal analysis of code

– Ensemble uses linear stacking of layers to form stack– Event handlers executed atomically– Unbounded message queues between layers– Provides uniform interface– Support for dynamic linking of components , adding/removing

components from stack at run-time

Page 18: based on Composite Protocols

18

Multicast Global Memory Objects

• Neighbor Table – Stores 1-1 mapping between an interface and neighbor discovered

on that interface

• Routing Table– Repository for unicast routes – Metric and next-hop information for each route prefix stored

• Source Tree– Maintains spanning tree for each multicast source in the n/w– Contains list of dependent downstream neighbors for each source

Page 19: based on Composite Protocols

19

Multicast Global Memory Objects (contd)

• Prune Table – Contains core and leaf interface prune-state information for each

(source/group) pair in the n/w– Interfaces can be in 3 states : un-pruned/pruned/grafted

• Group Table– Stores current list of group members on each leaf interface– Updated when members join/leave groups

Page 20: based on Composite Protocols

20

Global Memory – Linux Shared Memory

• Shared memory – fastest form of IPC – Single chunk of memory shared by 2 or more processes

• Steps in creating a global memory object – Specify read/write functional interface using CamlIDL– Implement functions using Linux shared memory system calls– Handle concurrency issues by using semaphores – Dynamically link global object with stacks at run-time

Page 21: based on Composite Protocols

21

Functional Interface in CamlIDL

• CamlIDL – stub code generator – generates C stub code required for Caml/C interface based on IDL

specification• Neighbor Table

– Write• void write_ntable([in] struct ntable_entry ntable[], [in] int num)

– Read• [int32] int getNeighborForInterface([in,int32] int intf)

struct ntable_entry {int32 intf_addr; // interface IP addressint32 nbor_addr; // neighbor IP address};

Page 22: based on Composite Protocols

22

Multicast Service – Operational Overview

• Initialization

H1

A

C D

B

H2

1

32

1.1

3.2

3.1

2.2

2.1

1.2 Intf Nbor1.2 1.12.1 2.23.1 3.2

Intf Nbor1.1 1.2

Intf Nbor 3.2 3.1

Intf Nbor 2.2 2.1

Net Met Nhop1 0 1.12 1 1.23 1 1.2

Net Met Nhop1 0 1.22 0 2.13 0 3.1

Net Met Nhop1 1 2.12 0 2.23 1 2.1

Net Met Nhop1 1 3.12 1 3.13 0 3.2

Net Dn Nbors1 2.2,3,22 1.1,3.23 1.1, 2.2

G1

G1

G1

G1

prune

src grp intf st1.1 G1 2.1 P

3.1 U1.2 U

prune

src grp intf st1.1 G1 2.1 P

3.1 P1.2 U

prun

e

G1

G1

graft

src grp intf st1.1 G1 2.1 G

3.1 P1.2 U

graf

t

• Routing stack runs

• Neighbor Table updated

• Routing Table updated

• Source Tree updated

• H1, H2 joins group G1• A multicasts to group G1• H1 leaves G1 • H2 leaves G1 • H1 joins G1 again

Page 23: based on Composite Protocols

23

Protocol Interactions thru Global Memory

NeighborTable

RoutingTable

SourceTree

PruneTable

GroupMember

Table

NeighborDiscovery

SpanningTree

Route Exchange

Pruning

Grafting

MulticastForwarding

Global Memoryobjects

JoinLeave

GlobalMemoryobject

WRITE

READ

WRITE

READ

Functional Interface

UnicastForwarding

Write access

Read access

Routing stack

Data stack

Group MgmtWait

RPF check

Multicast Forwarding - RSM

2,5

1 3

10,11 7

4

Leaf IntfCheck

IntfLoop

GroupMemCheck

68

DnstreamCheck

7

RTable

PruneTable

SrcTreeGrp

Table

No leafintfs Group

member exists No Group

member

PruneTable

9

Page 24: based on Composite Protocols

24

Performance Evaluation – Test Setup

• 15 node test network• Ensemble test applications

similar to ping, ttcp used• Metrics

– Stack/Component latency – One-way end-to-end latency– Basic Multicast Throughput – Reliable Multicast Throughput– Join/Leave latency

• Performance Improvement Factors– Native-code ocamlopt

compiler instead of byte-code ocamlc

– Reducing global memory lookups , use of caches

– Order of guards

• Pentium III 800 MHz• 256 MB RAM, 20 GB HDD• 100 Mbps NICs• RedHat 7.1, Linux 2.4.6• OCaml v3.06, native code

Page 25: based on Composite Protocols

25

Stack Latency vs. Message Size

• 5 runs of 1000 pkts each• Global memory lookup 1 in

100 pkts• Stack latency increases with

message size due to checksum component

Stack Latency vs Msg Size

0

20

40

60

80

100

120

140

160

0 2 00 40 0 6 00 8 00 10 00 120 0 14 00

Msg Size in bytes

Late

ncy

in m

icro

-sec

s

Router

Sender

Receiver

STACKMcast_Fw d FragmentChecksumReplicator

(averaged for 1000 pkts, 5 runs each)

Page 26: based on Composite Protocols

26

Throughput vs. Message size

• 5 runs , 1000 pkts each

• Receivers 4 hops from source

• Sharp decrease after 1300 bytes due to fragmentation

• Stack A uses IP fragment

• Stack B uses fragmentcomponent

Throughput vs Msg SizeAverag ed o ver 4 receivers , each 4 ho ps fro m multicas t source

fo r 10 00 packets and 5 runs

0

5

10

15

2 0

2 5

3 0

3 5

4 0

4 5

5 0

0 2 0 0 4 0 0 6 0 0 8 0 0 10 0 0 12 0 0 14 0 0 16 0 0 18 0 0

M s g s iz e in b yt e s

Thro

ughp

ut in

Mbp

s

Stack AStack B

Stack AMcastRepl

Stack BMcastFragChkRepl

Page 27: based on Composite Protocols

27

Reliable Multicast Throughput vs. Error rate

• 5 runs, 1000 packets each 1000 bytes

• Receivers 4 hops from source• Random_Drop component

simulated link-error rate• Receiver-initiated NACK

scheme used for Reliable component

Reliable Multicast Throughput

0

2

4

6

8

10

12

14

16

18

20

0 2 4 6 8 10 12

% Error Rate

Thro

ughp

ut in

Mbp

s

Average Throughput

Averaged over 4 receivers , each 4 hop s from mult icas t so urcefo r 100 0 p ackets , p acket s ize 100 0 b ytes

Rel_Mcas tMcas t_ FwdUcas t_ FwdFragChecksumRnd_ Dro pRep licato r

STACK

Page 28: based on Composite Protocols

28

Other metrics

• Component Latencies at sender and receiver nodes

• Join and Leave latencies

Join Latency(milli-seconds)

1 4052 4583 535

Prune Depth

Timer (seconds)Query 0.1Graft 0.1Prune 0.1

Leave Latency : 146 ms

Msg Size MCAST FRAG CHKSUM REPLNode bytes

Sender 1000 26.09 8.23 20.49 7.61Receiver 1000 3.26 3.37 19.41 5.04

(in micro-seconds)

Component Latencies

Sender rate : 10 pkts/sec

Page 29: based on Composite Protocols

29

Comparison with Linux IP Multicast

• Mrouted 3.9 evaluated on same test network

• Iperf tool used to measure end-to-end throughput

• Composite multicast just worse by a factor of 2-3.– SM execution adds

overhead– Strict layering in

framework prevents pointer arithmetic on buffers

– Ensemble is a user-level program

Mrouted vs Composite Multicast ThroughputAveraged over 4 receivers each 4 hops aw ay from multicast source

for 1000 packets and 5 runs

0

20

40

60

80

100

120

0 500 1000 1500 2000 2500

Packet size in bytes

Th

rou

gh

pu

t in

Mb

ps

Mrouted ThroughputComposite Multicast

Page 30: based on Composite Protocols

30

Summary

• Novel approach for building network services from composite protocols

• Demonstrates applicability and feasibility of composite protocol approach to data-plane and control-plane protocols.

• Addresses challenging issue of inter-stack communication using global memory

• All components and global memory objects implemented and tested for both functionality and performance

Page 31: based on Composite Protocols

31

Future Work

• The multicast service can be extended to support multi-point to multi-point model

• Implement complex multicast protocols like MOSPF/ PIM • Security and network management protocols • Improve performance • Deployment of service on an active network

Page 32: based on Composite Protocols

32

Questions ?