Network Stack Specialization for Performance
Presented by Donghwi Kim
(Some figures are taken from the paper)

Post on 28-Dec-2015

Objective

• The authors try to show an upper bound on network application performance that can be reached through specialization. (In fact, not only the network stack but also the application’s implementation is specialized.)
• A particular class of applications is chosen: those that serve the same content to multiple users.
  • Sandstorm: a web server that serves static web pages
  • Namestorm: a DNS server

Keys to performance

• A complete zero-copy stack
• Aggressive amortization
  • Pre-packetized data
  • Batching to mitigate system-call overhead
• Synchronous, clocked from received packets
  • Improves cache locality
  • Minimizes the latency of sending the first packet of the response
• Intel’s DDIO
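
The synchronous, receive-clocked model combined with batching can be pictured as a run-to-completion loop. The following is only an illustrative Python model, not Sandstorm's code: the rings are plain deques, and `handle_request`, `poll_once`, and `BATCH` are hypothetical names chosen for this sketch.

```python
from collections import deque

BATCH = 4  # hypothetical batch size, chosen only for illustration


def handle_request(pkt: bytes) -> bytes:
    """Hypothetical application handler: build a response for one packet."""
    return b"resp:" + pkt


def poll_once(rx_ring: deque, tx_ring: deque) -> int:
    """Process up to BATCH received packets to completion.

    Each request is handled as soon as it is dequeued, and the responses
    of the whole batch are handed to the (simulated) NIC with one flush,
    which stands in for a single system call.
    """
    out = []
    while rx_ring and len(out) < BATCH:
        out.append(handle_request(rx_ring.popleft()))
    tx_ring.extend(out)  # one "flush" per batch
    return len(out)


rx = deque([b"a", b"b", b"c", b"d", b"e"])
tx = deque()
flushes = 0
while rx:
    poll_once(rx, tx)
    flushes += 1
print(flushes, len(tx))  # 2 flushes for 5 packets
```

Here all work is driven by packet arrival: nothing is queued for a later stage, and the fixed cost of a flush is shared by every packet in the batch.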

Network stack

• libnmio: data-movement and event-notification primitives
• libeth: a lightweight Ethernet layer
• libtcpip: an optimized TCP/IP layer
• libudpip: a UDP/IP layer

A complete zero-copy stack

• Receiving a packet
  • Done by DMA
• Transmitting a packet
  • Aggressive amortization
  • Modify one of the prepared copies of the packet and send it by DMA
  • The modifications are performed in a single pass to use the CPU’s L1 cache efficiently
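
The transmit path above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: `prepacketize` and `patch_header` are hypothetical helpers, the segment size is tiny for readability, and checksum, flags, and the IP/Ethernet headers are left out.

```python
import struct

MSS = 8            # tiny segment size, for illustration only
TCP_HDR_LEN = 20   # TCP header without options


def prepacketize(content: bytes) -> list:
    """Split content once into segments, each with a zeroed 20-byte header."""
    return [bytearray(TCP_HDR_LEN) + content[i:i + MSS]
            for i in range(0, len(content), MSS)]


def patch_header(seg: bytearray, sport: int, dport: int,
                 seq: int, ack: int) -> None:
    """Write ports, sequence number and ACK number in place.

    These four fields occupy the first 12 bytes of a TCP header, so one
    pack call touches them in a single pass; the payload is not copied.
    """
    struct.pack_into("!HHII", seg, 0, sport, dport, seq, ack)


segments = prepacketize(b"hello, static web page")
seq = 1000
for seg in segments:
    patch_header(seg, 80, 40000, seq, 1)
    seq += len(seg) - TCP_HDR_LEN  # advance by payload length
```

The payload is packetized once, when the content is loaded; per request, only a few header bytes of the prepared segment are rewritten before it is handed to the NIC.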

A complete zero-copy stack

• Pre-copy method
  • Maintains more than one copy of each packet
  • Has the potential to thrash the CPU’s L3 cache
• memcpy method
  • Maintains one long-term copy and creates ephemeral copies
  • Requires more work per transmitted packet
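
The difference between the two methods can be made concrete with a small sketch. The classes `PreCopyStore` and `MemcpyStore` are hypothetical and exist only for this illustration; the number of copies (4) and the packet size (1500 bytes) are arbitrary assumptions.

```python
class PreCopyStore:
    """Pre-copy: keep several ready-made copies of each packet."""

    def __init__(self, packet: bytes, copies: int):
        self.pool = [bytearray(packet) for _ in range(copies)]
        self.next = 0

    def get(self) -> bytearray:
        buf = self.pool[self.next]  # no copy at send time
        self.next = (self.next + 1) % len(self.pool)
        return buf

    def resident_bytes(self) -> int:
        return sum(len(b) for b in self.pool)


class MemcpyStore:
    """memcpy: keep one long-term copy, make an ephemeral copy per send."""

    def __init__(self, packet: bytes):
        self.master = bytes(packet)

    def get(self) -> bytearray:
        return bytearray(self.master)  # one copy per transmission

    def resident_bytes(self) -> int:
        return len(self.master)


pkt = bytes(1500)
pre = PreCopyStore(pkt, copies=4)
mc = MemcpyStore(pkt)
print(pre.resident_bytes(), mc.resident_bytes())  # 6000 1500
```

Pre-copy pays in memory (a larger working set competing for the L3 cache), while memcpy pays in CPU work on every send.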

How do the optimizations work?

• Batching increases the TCP RTT
• Amortizing reduces per-request processing
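
The trade-off between these two effects can be illustrated with a toy cost model. The functions and all numbers below are assumptions made up for this sketch (they are not measurements from the paper), and the model assumes the stack waits for a full batch before processing it.

```python
def per_packet_cost_us(fixed_us: float, per_packet_us: float, batch: int) -> float:
    """Average CPU cost per packet when a fixed cost is shared by a batch."""
    return fixed_us / batch + per_packet_us


def added_wait_us(interarrival_us: float, batch: int) -> float:
    """Extra delay of the first packet while the rest of the batch arrives."""
    return interarrival_us * (batch - 1)


# Hypothetical numbers: 2.0 us fixed cost per batch, 0.5 us per packet,
# packets arriving 1.0 us apart.
for batch in (1, 8):
    print(batch, per_packet_cost_us(2.0, 0.5, batch), added_wait_us(1.0, batch))
```

With these assumed numbers, a batch of 8 lowers the per-packet cost from 2.5 us to 0.75 us, but the first packet of the batch waits an extra 7 us, which shows up as a higher RTT.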

Intel’s DDIO

• Direct Data I/O
• On transmission
  • Data is pulled from the L3 cache without a detour through system memory
• On reception
  • DMA can place data directly in the processor’s L3 cache

Evaluation

DDIO

• Pre-copy case: DDIO pulls untouched incoming data into the cache, so the file data cannot be cached
• memcpy case: the CPU loads the file data into the cache

Discussion

• mTCP vs. Sandstorm

• mTCP
  • Provides a UNIX-like socket programming interface
  • Provides fairness
• Sandstorm’s TCP
  • A higher-level stack does not wrap a lower-level stack
  • Each stack is a stand-alone service
  • For example, an application interacts directly with libnmio
  • With amortization, no queueing, and an inaccurate timer, correctness cannot be guaranteed
  • Applicable only to a limited class of applications