sora: high performance software radio using general ... · sora: high performance software radio...

27
Sora: High Performance Software Radio using General Purpose Multi- Core Processors Kun Tan Jiansong Zhang Ji Fang He Liu § Yusheng Ye § Shen Wang § Yongguang Zhang Haitao Wu Wei Wang Geoffrey M. Voelker Microsoft Research Asia Tsinghua University, Beijing, China § Beijing Jiaotong University, Beijing, China UCSD, La Jolla, USA NSDI 2009, Boston, USA 1

Upload: others

Post on 12-Mar-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Sora: High Performance Software Radio using General Purpose Multi-

Core Processors

Kun Tan† Jiansong Zhang† Ji Fang‡ He Liu§ Yusheng Ye§

Shen Wang§ Yongguang Zhang† Haitao Wu† Wei Wang†

Geoffrey M. Voelker◊

† Microsoft Research Asia‡ Tsinghua University, Beijing, China

§ Beijing Jiaotong University, Beijing, China ◊ UCSD, La Jolla, USA

NSDI 2009, Boston, USA 1

Page 3: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Fundamental Challenges

• Large volume of high-fidelity digital signals

– Require a high-speed system I/O

NSDI 2009, Boston, USA 3

Antenna

Processor

SoftwareHardwareDigital

Samples

D/A

A/D

RF

Frontend

1.2Gbps for 802.11 (20MHz channel, 16b A/D, 4x)

~up to 5 Gbps for 11n (4x4MIMO) ;

Over 10Gbps for future high-speed wireless

Page 4: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Fundamental Challenges

• Large volume of high-fidelity digital signals

– Require a high-speed system I/O

• Computation-intensive signal processing

NSDI 2009, Boston, USA 4

InterleavingConvolutional

encoderQAM Mod IFFT GI Addition

Symbol Wave

ShapingScramble

To RF

Demod +

InterleavingFFT

Viterbi

decodingRemove GI

From RF

Descramble

Transmitter:

Receiver:

Bits

@48MbpsBits

@48Mbps

Samples

@512Mbps

Samples

@1.28Gbps

Samples

@640Mbps

Decimation

Samples

@384MbpsBits

@24Mbps

Samples

@1.28Gbps

Samples

@640Mbps

Samples

@512Mbps

Samples

@384MbpsBits

@48Mbps

Bits

@24Mbps

Bits

@24Mbps

From MAC

To MAC

Bits

@24Mbps

Page 5: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Fundamental Challenges

• Large volume of high-fidelity digital signals

– Require a high-speed system I/O

• Computation-intensive signal processing

NSDI 2009, Boston, USA 5

InterleavingConvolutional

encoderQAM Mod IFFT GI Addition

Symbol Wave

ShapingScramble

To RF

Demod +

InterleavingFFT

Viterbi

decodingRemove GI

From RF

Descramble

Transmitter:

Receiver:

Bits

@48MbpsBits

@48Mbps

Samples

@512Mbps

Samples

@1.28Gbps

Samples

@640Mbps

Decimation

Samples

@384MbpsBits

@24Mbps

Samples

@1.28Gbps

Samples

@640Mbps

Samples

@512Mbps

Samples

@384MbpsBits

@48Mbps

Bits

@24Mbps

Bits

@24Mbps

From MAC

To MAC

Bits

@24Mbps

Raw computation power required: 802.11b => 10Gops, 802.11a => 40Gops!(now server-class CPU runs at 3GHz clock)

Page 6: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Fundamental Challenges

• Large volume of high-fidelity digital signals

– Require a high-speed system I/O

• Computation-intensive signal processing

• Hard deadline and accurate timing control

– 802.11 MAC requires response within a few s

– Event trigger timing accuracy at s level

NSDI 2009, Boston, USA 6

Page 7: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Approaches

NSDI 2009, Boston, USA 7

Programmability

Low High

Low

Hig

h

Perf

orm

ance

EmbeddedDSP

Example: Rice WARP, TI SFF-SDR

Programmable hardware

(FPGA)

Example: GNU Radio/USRP(v1&2)• Interface USB/GbE: <1Gbps, >1ms• Achievable wireless xput: ~100Kbps

Low-performanceGPP-based SDR

SoraSora

Resolving the SDR platform dilemma• Commodity PC w/ C program• High performance

• sys tput:10Gbps; ~s latency • target wireless xput:10M~1Gbps

Page 8: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Sora Approach

• New PCIe-based Interface card => high system throughput

• New optimizations to implement PHY algorithms and streamline processing on multi-core CPU=> efficient PHY processing

• Core dedication => real-time support

NSDI 2009, Boston, USA 8

Page 9: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

MemRF

RFRF

Sora

APP

Multi-core CPU

Sora Soft-Radio Stack

PCIe bus

Digital Samples

@Multiple Gbps

RCBA/D

D/A RFSora

APP

APP

APP

APP

APP

Sora Hardware

Sora Architecture

NSDI 2009, Boston, USA9

General radio front-end: 700M/1.8G/2.4G/5GHz

Page 10: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

MemRF

RFRF

Sora

APP

Multi-core CPU

Sora Soft-Radio Stack

PCIe bus

Digital Samples

@Multiple Gbps

RCBA/D

D/A RFSora

APP

APP

APP

APP

APP

Sora Hardware

Radio Control Board

NSDI 2009, Boston, USA10

PCIe-based High-speed Interface card

PCIe is commodity in most modern PCs High throughput: 16Gbps at PCIe-8x Low latency: ~ 1 s Separated with other I/O devices

Page 11: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

RCB Details

NSDI 2009, Boston, USA 11

PCIe-8x interface: up to 16Gbps throughput

Versatile RF interface: up to 8 channels (8x8 MIMO)

Page 12: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Versatile RF interface: up to 8 channels (8x8 MIMO)

RCB Details

NSDI 2009, Boston, USA 12

s

Buffered data path: bridging the synchronous ops at RF and asynchronous processing at CPU (12.3Gbps measured )

Low latency control path for software (0.36 s measured)

A/D

D/ARF Circuit

RF Front-endPCIE

Controller SDRAM

Controller

FIFO

FIFODMA

Controller

DDR

SDRAM

FPGA

RCB

PCIe

bus

Antenna

RF

Controller

Registers

Page 13: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

MemRF

RFRF

Sora

APP

Multi-core CPU

Sora Soft-Radio Stack

PCIe bus

Digital Samples

@Multiple Gbps

RCBA/D

D/A RFSora

APP

APP

APP

APP

APP

Sora Hardware

Sora Software

13

High-performance SDR processing w/ key software techniques

Efficient PHY implementation using SIMD and LUTs Speed up PHY using multi-core streamline processing Core dedication for real-time support

NSDI 2009, Boston, USA

Page 14: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Efficient PHY Implementation

• Exploit large high-speed cache memory– Extensive use of lookup tables (LUT): trade memory

for calculation; still well fit into L2 cache

– Applicable for more than half of the common algorithms; speedup ranges from 1.5x to 22x

NSDI 2009, Boston, USA 14

Direct impl. 8 ops per bit

LUT impl. 2 Look-up op for 8 bits!(size 32KB)

Ex: Convolutional encoder+

+

Tb Tb Tb Tb Tb Tb

Output Data A

Output Data B

Page 15: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Efficient PHY Implementation

• Exploit data parallelism in PHY

– Utilize wide-vector SIMD extension in CPU

– Applicable to many PHY algorithms with significant speedups (1.6x ~ 50x)

NSDI 2009, Boston, USA 15

Ex. (I)FFT

Page 16: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Core 2Core 1

Speed up PHY using multi-core streamline processing

• Efficiently partition and schedule the PHY processing across cores– Interconnecting sub-pipeline with light-weight,

synchronized FIFOs

– Static scheduling of processing modules in PHY pipeline

NSDI 2009, Boston, USA 16

Demod +

InterleavingFFT

Viterbi

decodingRemove GI DescrambleDecimation

Synchronized FIFO

Page 17: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Core Dedication for Real-time Support

• Exclusively allocate enough cores for SDR processing in multi-core systems

– Guarantee the CPU, cache and memory bandwidth resources for predictable performance

– Achieve s-level timing control

– Simple abstraction, and easier to implement in standard OSes than RT-scheduler

• Implemented in WinXP without modifications to Kernel

NSDI 2009, Boston, USA 17

Page 18: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Implementation

• Sora software platform on Win XP

– 14K lines of C code, including PCIe driver framework, memory management, FIFO management, etc

• SoftWiFi: full implementation of IEEE 802.11a/b/g PHY and DCF MAC

– 9K lines of C code; 4 man-month for dev & test

– DSSS 1, 2, 5.5, 11Mbps for 11b; OFDM 6, 9, 12, 18, 24, 36, 48, 54Mbps for 11a/g

NSDI 2009, Boston, USA 18

Page 19: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

0

1

2

3

4

5

6

7

8

9

10

1M 2M 5.5M 11M 6M 24M 54M

Re

qu

ire

d c

om

pu

tati

on

(Gig

a cy

cle

s p

er

seco

nd

)

0

1

2

3

4

5

6

7

8

9

10

1M 2M 5.5M 11M 6M 24M 54M

Re

qu

ire

d c

om

pu

tati

on

(Gig

a cy

cle

s p

er

seco

nd

)

Results: PHY Processing

11.6 11.7 11.7 11.818.3 60.4 132.4

NSDI 2009, Boston, USA 19

>30

x s

peedup

~10

x s

peedup

802.11b 802.11a/g 802.11b 802.11a/g

After Sora Optimization

Page 20: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

0

1

2

3

4

5

6

7

8

9

10

1M 2M 5.5M 11M 6M 24M 54M

Re

qu

ire

d c

om

pu

tati

on

(Gig

a cy

cle

s p

er

seco

nd

)

0

1

2

3

4

5

6

7

8

9

10

1M 2M 5.5M 11M 6M 24M 54M

Re

qu

ire

d c

om

pu

tati

on

(Gig

a cy

cle

s p

er

seco

nd

)

Results: PHY Processing

11.6 11.7 11.7 11.818.3 60.4 132.4

NSDI 2009, Boston, USA 20

>30

x s

peedup

~10

x s

peedup

802.11b 802.11a/g 802.11b 802.11a/g

Sora enables software implementation of today’s high-speed wireless system in

standard PC with a few cores

After Sora Optimization

Page 21: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

0

5

10

15

20

25

1M 2M 5.5M 11M 6M 24M 54M

Th

rou

gh

pu

t (M

bp

s)

Modulation Mode

Sora-Commercial

Commercial-Commercial

Commercial-Sora

Results: End-to-end Throughput

NSDI 2009, Boston, USA 21

Communicating with commercial 802.11a/b/g card

Page 22: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

0

5

10

15

20

25

1M 2M 5.5M 11M 6M 24M 54M

Th

rou

gh

pu

t (M

bp

s)

Modulation Mode

Sora-Commercial

Commercial-Commercial

Commercial-Sora

Results: End-to-end Throughput

NSDI 2009, Boston, USA 22

Communicating with commercial 802.11a/b/g card

Seamlessly interoperate with commercial WiFi• Correctness of all PHY algorithms• Satisfying timing requirements of standards• Commercial equivalent performance

Page 23: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Extensions

NSDI 2009, Boston, USA 23

Jumbo frames in 802.11 TDMA MAC

Page 24: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Extensions: New Applications

NSDI 2009, Boston, USA 24

Page 25: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Conclusion

• Sora is a fully programmable software radio platform on commodity PC architecture– Easy C programming on multi-core CPU

– High performance: high processing speed, low latency, and performance guarantee• Confirmed by SoftWiFi, the first fully interoperable IEEE

802.11 (PHY and MAC) on general purpose processors

• Plan to release Sora SDK to research community– H/W: RCB + 2.4G RF front-end set (~$2K USD)

NSDI 2009, Boston, USA 25

Page 27: Sora: High Performance Software Radio using General ... · Sora: High Performance Software Radio using General Purpose Multi-Core Processors Kun Tan †Jiansong Zhang Ji Fang‡He

Q&A

NSDI 2009, Boston, USA 27