THE DEVELOPMENT OF NETWORK PROCESSOR TECHNOLOGY

Adviser: Dr. Gaj

Co-Adviser: Dr. Mark

Post on 11-Jan-2016



Scope of Presentation

1. Introduction to NP

2. Evolution of NP development

3. IXP 1200 network processor

4. Adding security functionality of network processor

5. Conclusion

Introduction

• Why Network Processor?
– widespread adoption of Internet technology
– data explosion
– need to send huge amounts of data over networks at high speed
– CPU based => ASIC based => NP based

CPU Based Router

• A computer with multiple NICs installed

• Running software dedicated to routing

• PC + Linux => small router

• Packet flow: NIC1 buffer -> memory -> CPU (registers, processing) -> memory -> NIC2 buffer

CPU Based Router

• Good: flexibility to program the instructions; update & upgrade through software

CPU Based Router – Drawback

• Cannot keep up with line speed:
– In 1994~2000, network bandwidth grew from 622 Mbps to 10 Gbps
– CPU speed: 100 MHz -> 2 GHz
– an approx. 1 GHz CPU handles a 1 Gbps data rate

CPU Based Router – Drawback

• Demand for Quality of Service
– Internet banking and e-commerce require instant interactions vs. e-mail, WWW
– Rich user vs. poor user
– Video on demand vs. video conference
– Each traffic type has its own priority level
– Traffic management tasks need more complicated algorithms & higher processing speed
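The idea that each traffic type has its own priority level can be sketched as a strict-priority queue. This is only an illustrative sketch: the traffic class names and the numeric priorities in `PRIORITY` are assumptions made here, not values from the slides, and a real traffic manager would use a more elaborate algorithm.

```python
import heapq
from itertools import count

# Hypothetical priority levels (lower number = served first).
# The slides name the traffic types but do not assign values.
PRIORITY = {
    "banking": 0,
    "video_conference": 1,
    "video_on_demand": 2,
    "email": 3,
}


class PriorityScheduler:
    """Strict-priority scheduler: always serve the highest-priority packet."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within one class

    def enqueue(self, traffic_type, packet):
        heapq.heappush(self._heap, (PRIORITY[traffic_type], next(self._seq), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    sched = PriorityScheduler()
    sched.enqueue("email", "e1")
    sched.enqueue("banking", "b1")
    sched.enqueue("email", "e2")
    sched.enqueue("video_conference", "v1")
    while len(sched):
        print(sched.dequeue())
```

Every enqueue and dequeue costs a heap operation per packet, which hints at why traffic management raises the processing demand on the router.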

CPU Based Router – Drawback

• PC is not optimized for network traffic
– PC is suitable for big chunks of data: DMA, loading a file, doing I/O once in a while. Network traffic is I/O-bound traffic
– Voice streams use small 64-byte packets
– Not good for a large number of small packets
– Spends much time idling during memory access (single thread)
– Not good for bit-level operations (using shift)
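A rough worked calculation shows why small packets are hard for a CPU. The sketch below takes the 64-byte packet size, the 10 Gbps line rate, and the 2 GHz clock from the slides; it assumes back-to-back packets and ignores framing overhead such as preamble and inter-frame gap, so the real budget differs somewhat. The function names are hypothetical.

```python
def packet_time_ns(packet_bytes: int, line_rate_gbps: float) -> float:
    """Time one packet occupies the wire in nanoseconds (1 Gbps = 1 bit per ns)."""
    return packet_bytes * 8 / line_rate_gbps


def cycles_per_packet(packet_bytes: int, line_rate_gbps: float, cpu_ghz: float) -> float:
    """CPU cycles available per packet if the CPU must keep up with the line."""
    return packet_time_ns(packet_bytes, line_rate_gbps) * cpu_ghz


if __name__ == "__main__":
    for rate in (1.0, 10.0):
        t = packet_time_ns(64, rate)
        c = cycles_per_packet(64, rate, cpu_ghz=2.0)
        print(f"{rate:>4} Gbps: {t:.1f} ns per 64-byte packet, {c:.1f} cycles at 2 GHz")
```

Under these assumptions a 2 GHz CPU has roughly a hundred cycles per 64-byte packet at 10 Gbps, and a single main-memory access can consume a large share of that.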

ASIC Based Router (Faster)

• The burden is distributed to each NIC

• Instructions for forwarding operations are embedded in the hardware

ASIC Based Router – Drawback

• No re-programmability, not flexible
– Instructions are hardwired: difficult & costly to change for DiffServ, IPSec, IPv6

• Long development time
– 12~18 months
– faster & more complicated applications need longer
– more complex design => fabrication & verification
– longer time to market, less competitive

ASIC Based Router – Drawback

• Layer 2 protocols are in flux
– Ethernet (LAN) standard is OK
– LAN => VLAN (802.1Q)
– WAN: vendors need to integrate them into a single “Multi-service” product => HDLC, Frame Relay, ATM, etc.

Network Processor

• What is NP?
– A software-programmable device that is designed to process data packets at wire speed
– As flexible as a CPU
– As fast as an ASIC at wire speed
– provides all the packet processing functions that the previous technologies can provide

NP Architecture

• PPEs, Network Interface, Dedicated Hardware Unit, Control Processor, Memory Interface

NP Architecture (PPEs)

• Multiple Programmable Processing Engines (PPEs)
– More flexible than ASIC (hard-wired)
– Parallel processing, better than CPU
– The technology most adopted by vendors, ex:
  micro-engines in Intel IXP 1200
  channel processors in Motorola C-5
  pico-engines in IBM NP4GS3

NP Architecture (PPEs)

• Usually use a RISC, pipelined architecture

• Simplified instruction sets to reduce chip area

• Added bit-manipulation functionality

• Topologies can be parallel, pipeline, and pool (see next page)

[Figure: PPE topologies (parallel, pipeline, pool)]

NP Architecture (PPEs)

• Multiple hardware threads to hide memory access delay and get more work done

• Instant context swap (program counter, separate register sets for each context)

• Small internal program memory: reduces instruction fetch time
– good: fast
– bad: program cannot be too long or too complicated

NP Architecture – Network Interface

• Network Interface
– Connects to an external framer or MAC
– Framer: converts a bit stream to packet data
– MAC: Ethernet framer
– Framer or MAC can be built into the NP
  good: saves overall chip area
  bad: limits product flexibility
– Standards: UTOPIA Level 2 and 3, SPI-3, SPI-4.2

NP Architecture – Dedicated Unit

• Dedicated Hardware Unit
– Offloads the burden of computationally intensive operations from the PPEs
  • Lookup Engines
  • Queue Management
  • CRC Calculation
  • Security functions
– Trade-off: more dedicated units mean more chip area and higher cost
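To illustrate why CRC calculation is a natural candidate for a dedicated unit, here is a bit-by-bit software CRC-32 (the reflected polynomial 0xEDB88320 used by Ethernet and zlib). It is a sketch of the software cost, not of how any particular NP implements its CRC unit; the function name is hypothetical.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """Bit-by-bit CRC-32: eight shift/XOR steps for every input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    payload = b"123456789"
    # Cross-check against the standard library implementation.
    print(hex(crc32_bitwise(payload)), hex(zlib.crc32(payload)))
```

In software the inner loop runs eight times per byte, so the cost grows with every byte of every packet; hardware can compute the same function as the data streams past.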

NP Architecture – Control Processor

• Control Processor
– Time-insensitive tasks, ex. routing table update, control and traffic management packets
– (How about time-sensitive tasks?)
– Exceptional packet processing, ex. packets of unknown type
– System bring-up, reboot, system management

NP Architecture – Memory Interface

• SRAM:
– small and frequently accessed data, ex. routing table, queuing information, packet pointers

• DRAM:
– large and rarely accessed data, ex. packet data

NP Advantages

• Less Time to Market (TTM)
– Software programmability: easy to implement, model, and sample a product
– Flexibility: easy to adapt to newer protocols, easy to add new functionality to an existing design

NP Advantages

• Longer Time in Market (TIM)
– new and critical functions can be added by re-programming the network processor within the device
– can be upgraded via a software download to add new features and protocol support

NP Advantages

• Leverage 3rd-party development of applications
– 3rd-party vendors provide software packages and modules for commonly used applications
– software reuse without the need to re-invent the wheel
– creates a new industry

Intel IXP1200

Bit-level Operations

• How to set the 5th bit to one?
– CPU: Y | (1<<5), two instructions (shift, then OR)
– NPU: with the help of the extra shifter, it can be done in one instruction

Microengine

Microengine

• 6 MEs * 4 threads = 24 threads

• ALU: addition, subtraction, logical operations. No multiplication, to save chip area.

• 32-bit registers

• The power of the extra shifter, ex. TTL field: (FFFFFFFF) >> 24 == FF, then FF - 1 == FE
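The TTL example can be sketched as follows. The sketch assumes the TTL occupies the top 8 bits of a 32-bit word, as it does in the third word of an IPv4 header (TTL, protocol, header checksum); the function names are hypothetical, and the checksum update that a real router must also perform is omitted.

```python
def get_ttl(word: int) -> int:
    """Extract the 8-bit TTL from the top byte of a 32-bit header word."""
    return (word >> 24) & 0xFF


def decrement_ttl(word: int) -> int:
    """Return the word with its TTL field reduced by one (checksum not updated)."""
    ttl = get_ttl(word)
    if ttl == 0:
        raise ValueError("TTL already zero; the packet should be dropped")
    return (word & 0x00FFFFFF) | ((ttl - 1) << 24)


if __name__ == "__main__":
    word = 0xFFFFFFFF
    print(hex(get_ttl(word)), hex(decrement_ttl(word)))
```

On a general-purpose CPU the shift and the subtraction are separate instructions; the extra shifter lets the ALU operate on the shifted operand directly.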

Microengine

• Multiple Threading
– 4 program counters
– register sets can be divided into 4 parts for 4 threads, swapping contexts in a single cycle
– context swap can hide memory access latency
– all threads share the same instruction store; each thread can run the same or a different program, but the instruction store is limited (2048 instructions)
– the program in the instruction store is loaded by the StrongARM

Microengine

Microengine

• Separate Register Sets
– 3 types of 32-bit registers:
  128 general-purpose registers
  64 SRAM transfer registers
  64 SDRAM transfer registers

• read bus & write bus are separate, and so are the transfer register sets: 32 for read, 32 for write, no addressing mode

• can be addressed in global mode or thread-local mode

• global mode => shared variable, non-preemptive

Number of Registers per Micro-engine

                    General-Purpose   SRAM Transfer              SDRAM Transfer
                    Registers         Read-only    Write-only    Read-only    Write-only
per micro-engine    128               32           32            32           32
per thread          32                8            8             8            8

Memory Interface

Memory Interface   Minimum Addressable Unit (bytes)   Size (bytes)    Approx. Latency (clks)
Scratchpad         4                                  4K (on chip)    12-14
SRAM               4                                  8M              16-20
SDRAM              8                                  256M            33-40
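These latencies show what the 4 hardware threads per microengine have to hide. The following is a simplified model, not a description of the actual IXP1200 scheduler: it assumes each thread alternates a fixed number of compute cycles with one memory reference, that threads run round-robin, and that the context swap itself costs nothing. The function names are hypothetical.

```python
import math


def utilization(threads: int, compute_cycles: int, latency_cycles: int) -> float:
    """Fraction of cycles the engine does useful work under the simple model."""
    return min(1.0, threads * compute_cycles / (compute_cycles + latency_cycles))


def min_compute_to_hide(threads: int, latency_cycles: int) -> int:
    """Smallest compute burst per memory reference that fully hides the latency.

    While one thread waits, the other (threads - 1) threads must supply
    at least latency_cycles of work.
    """
    if threads < 2:
        raise ValueError("a single thread cannot hide its own memory latency")
    return math.ceil(latency_cycles / (threads - 1))


if __name__ == "__main__":
    for name, latency in (("Scratchpad", 14), ("SRAM", 20), ("SDRAM", 40)):
        print(name, min_compute_to_hide(4, latency), utilization(1, 10, latency))
```

Under this model, with 4 threads and a 40-clock SDRAM access, each thread needs about 14 cycles of work between memory references to keep the engine fully busy, whereas a single thread doing 10 cycles of work per reference would be busy only 20% of the time.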

Security functionality of network processor

• Why? More and more business/corporate and personal e-commerce transactions take place over the Internet

• need data confidentiality and data integrity for information transmitted and received over the Internet and networks

Security processor architecture

• for existing NP and network systems
– a. security co-processor architecture
– b. security accelerator architecture
– c. security in-line processor architecture

• for new design and development of NP & networks:
– a. network processor architecture with security functionality (on-chip core)

Security co-processor architecture

• co-processor performs all security functions: protocol processing and encryption

Security accelerator architecture

• used in conjunction with a host NPU

• host NPU performs protocol processing, e.g., handshake, protocol header processing

• security accelerator performs encryption of the payload

In-line security processor architecture

• this architecture is referred to as a “bump in the wire”

• placed before the NP to receive data packets on one side

• encrypts data packets and sends them on the other side to the NP

• Hence, it duplicates most of the same functions of the NP

On-chip security core architecture

• includes security functionality in the NP architecture
• e.g. encryption of payload, protocol processing
• the IXP2850 implementation has 2 cryptography cores
• more efficient because it reduces the transfer of data packets back and forth between different memories
• secures traffic on the fly with cryptographic engines embedded on the NP
• reduces real estate, power, and memory requirements
• on-chip core architecture is the future design for NP

Survey

Architecture            Vendor     Model              IPSec Data Rate
Security Co-Processor   Cavium     Nitrox-CN1340      3.2 Gbps
Security Accelerator    Broadcom   BCM5841            4.8 Gbps
In-line Processor       Cavium     NitroxII-CN2560    10 Gbps
On-chip Core            Intel      IXP2850            10 Gbps

Considerations for security functionality in network system

– 1. for existing NP and networks: co-processor, accelerator, and in-line architectures are more suitable and cost-effective

– 2. for relatively low data volume and moderate speed, a co-processor is cost-effective

– 3. for high traffic and data volume, the in-line architecture would be the most efficient and the easiest to integrate; however, cost will be high due to duplication of NP functionality

Considerations for security functionality in network system

• for new NP and network design
– 1. on-chip security core architecture gives the best efficiency, lowest power consumption, and smallest footprint
– 2. depending on the demand for such performance, the cost of an on-chip security core NP should be affordable.

Conclusion

• the need for flexibility and speed resulted in a design shift to NP architecture

• the implementation of the IXP1200 demonstrated that NP architecture is efficient and can meet the demand for networking at wire speed.

• the design of NP architecture is still evolving. With the need for security functionality, an on-chip security core seems to be the way forward for future NP architecture, given its advantages in data efficiency, footprint, and power consumption.

Thank You
