sora- a high performance baseband dsp processor
DESCRIPTION
This project is basically on software defined radio which was published by microsoft asia team which is based on reconfigurable baseband processor architecture, which tries to increase the performance of processor by adding no. cores into processTRANSCRIPT
Sora“High Performance Software
Radio using General Purpose Multi-Core Processors"
Microsoft Research Asia
1
ByHarshit SrivastavaCDS12M001
A Seminar On
MA
G
WiF
SM
Software Radio
BluetoothGPS
3G General RFFrontendi
CD Bluetooth, WiFi,WiMAX, GSM,CDMA, 3G, LTE …
WiMAX
BenefitsPromise of universal connectivity and cost savingProgrammability => faster development cycle, faster to marketOpen platform for wireless research
2
software
Introduction
• This paper presents Sora, a fully programmable software radio platform on commodity PC architectures.• Sora combines the performance and fidelity of hardware
SDR platforms with the programmability and flexibility of general-purpose processor (GPP) SDR platforms.• Sora uses both hardware and software techniques to
address the challenges of using PC architectures for high speed SDR.
• Sora is the first SDR platform that enables users to develop high-speed wireless implementations, such as the IEEE 802.11a/b/g PHY and MAC, entirely in software on a standard PC architecture.
• Software defined radio (SDR) holds the promise of fully programmable wireless communication systems, effectively supplanting current technologies which have the lowest communication layers implemented primarily in fixed, custom hardware circuits.
• Many current SDR platforms are based on either programmable hardware such as field programmable gate arrays or embedded digital signal processors (DSPs).• Such hardware platforms can meet the processing and timing
requirements of modern high-speed wireless protocols, but programming FPGAs and specialized DSPs are difficult tasks.
Introduction Continue..
Fundamental Challenges
• Large volume of high-fidelity digital signals– Require a high-speed system I/O
1.2Gbps for 802.11(20MHz channel, 16b A/D, 4x)
~up to 5 Gbps for 11n (4x4MIMO) ;
Antenna
Over 10Gbps for future high-speed wireless
DigitalSample
sHardware Software
3
ProcessorRF D/A Frontend A/D
Fundamental Challenges
Large volume of high-fidelity digital signals– Require a high-speed system I/O
•
• Computation-intensive signal processing
Samples
@384Mbps
Samples
@512Mbps
Samples
@640Mbps
Samples
@1.28Gbps
Bits@24Mbps
Bits@24Mbps
Bits@48Mbps
Bits@48Mbps
Transmitter:
To RFFrom
MAC Samples
@1.28Gbps
Samples
@640Mbps
Samples
@512Mbps
Samples
@384Mbps
Bits@48Mbps
Bits@24Mbps
Bits@24Mbps
Receiver:
From RF
To MAC
4
Decimation Remove GI FFTDemod
+ Interleaving
Viterbi decoding
DescrambleTo
ScrambleConvolutio
nal encoder
Interleaving
QAM Mod
IFFT GI Addition
Symbol Wave
Shaping
To
Bits Bits Bits Bits Samples Samples Samples
Sa
@24Mbps @24Mbps @48Mbps @48Mbps @384Mbps @512Mbps @640Mbps
@1.2
Convolutional Interleaving
Symbol
WaveScramble encoder QAM Mod IFFT GI Addition
Shaping
From MAC
Samples Samples Samples Samples Bits Bits
[email protected] @640Mbps @512Mbps @384Mbps @48Mbps @24Mbps
@24M
Demod + ViterbiDecimation Remove GI FFT
Interleaving decoding Descramble
Fundamental Challenges
Large volume of high-fidelity digital signals– Require a high-speed system I/O
•
• Computation-intensive signal processing
mples8Gbps
Transmitter:
To RF
Receiver:
To MAC
5
Raw computation power required:802.11b => 10Gops, 802.11a => 40Gops!(now server-class CPU runs at 3GHz clock)
8G
bps
From RF To MA
Fundamental Challenges
Large volume of high-fidelity digital signals– Require a high-speed system I/O
Computation-intensive signal processingHard deadline and accurate timing control
•
••
––
s802.11 MAC requires response within a few
Event trigger timing accuracy at s level
6
Approaches
SoraSoraProgrammablehardware
EmbeddedDSP
Resolving the SDR platform dilemma• Commodity PC w/ C program• High performance
• sysinput:10Gbps; ~s latency• target wireless xput:10M~1Gbps
Example: Rice WARP, TI SFF-SDR
Low-performanceGPP-based SDR
Example: GNU Radio/USRP(v1&2)• Interface USB/GbE: <1Gbps, >1ms• Achievable wireless xput: ~100Kbps
HighProgrammability
Low
7
Per
form
ance
Low
Hig
h
(FPGA)
Low High
Sora Approach
New PCIe-based Interface card =>throughput
• high system
• New optimizations to implement PHYalgorithms and streamline processing onmulti-core CPU=> efficient PHY processing
• Core dedication => real-time support
8
RF
Sora Architecture
Digital Samples@Multiple Gbps
RFRF
PCIe bus
Sora Soft-Radio Stack Sora Hardware
General radio front-end: 700M/1.8G/2.4G/5GHz
9
Mem RCBA/DD/A RF
Multi-core CPU
APP APP APP APP
Sora Sora APP APP
RF
Radio Control Board
Digital Samples@Multiple Gbps
RFRF
PCIe bus
Sora Soft-Radio Stack Sora Hardware
PCIe-based High-speed Interface card
PCIe is commodity in most modern PCs High throughput: 16Gbps at PCIe-8x Low latency: ~ 1 s Separated with other I/O devices
10
Mem RCBA/DD/A RF
Multi-core CPU
APP APP APP APP
Sora Sora APP APP
RCB Details
PCIe-8x interface: up to 16Gbps throughput
Versatile RF interface: up to 8 channels (8x8 MIMO)
11
terface: up to 8 channels (8x8 MIMO)
s
RCB Details
RF Front-endbus
Versatile RF in
12
RF CircuitController Controller
Controller SDRAM
FPGADMA
FIFO RF A/D
FIFO D/A Antenna
PCIe PCIE
Controller
Registers
DDRRCB
SDRAM
Buffered data path: bridging the synchronous ops at RF and asynchronous processing at CPU (12.3Gbps measured )
Low latency control path for software (0.36 s measured)
RF
Sora Software
Digital Samples@Multiple Gbps
RFRF
PCIe bus
Sora Soft-Radio Stack Sora Hardware
High-performance SDR processing w/ key software techniques
Efficient PHY implementation using SIMD and LUTs Speed up PHY using multi-core streamline processing Core dedication for real-time support
13
Mem RCBA/DD/A RF
Multi-core CPU
APP APP APP APP
Sora Sora APP APP
Efficient PHY Implementation
Exploit large high-speed cache memory– Extensive use of lookup tables (LUT): trade
memory for calculation; still well fit into L2 cache– Applicable for more than half of the common
•
algorithms; speedup ranges from 1.5x to 22x
Output Data AEx: Convolutional encoder +
+ Output Data B
14
LUT impl. 2 Look- up op for 8 bits! (size 32KB)
TbTbTbTbTbTb
Direct impl. 8ops per bit
Efficient PHY Implementation
Exploit data parallelism in PHY•––
Utilize wide-vector SIMD extension
Applicable to many PHY algorithmsin CPU
withsignificant
Ex. (I)FFT
speedups (1.6x ~ 50x)
15
Speed up PHY using multi-corestreamline processing
Efficiently partition and schedule the PHYprocessing across cores
•
– Interconnecting sub-pipeline with light-weight,synchronized FIFOs
– Static schedulingpipeline
of processing modules in PHY
Core 1
16
SynchronizedFIFO
Core 2
Remove GI FFTDemod +
InterleavingViterbi
decoding DescrambleDecimation
Core Dedication for Real-timeSupport
Exclusively allocate enough cores for SDRprocessing in multi-core systems
•
– Guarantee the CPU, cache and memorybandwidth resources for predictable performance
Achieve s-level timing controlSimple abstraction, and easier to implement in standard OSes than RT-scheduler
• Implemented in WinXP without modifications to Kernel
––
17
Implementation
Sora software platform on Win XP– 14K lines of C code, including PCIe driver
framework, memory management, FIFO management, etc
SoftWiFi: full implementation of IEEE802.11a/b/g PHY and DCF MAC
– 9K lines of C code; 4 man-month for dev
•
•
& test
– DSSS 1, 2, 5.5, 11Mbps for 11b; OFDM 6, 9, 12, 18,24, 36, 48, 54Mbps for 11a/g
18
7
6
Results: PHY Processing
11.6 11.7 11.7 11.818.3 60.4 132.41010
99
88
7
6
55
44
33
22
11
00
1M 2M 5.5M 11M 6M 24M 54M1M 2M 5.5M 11M 6M 24M 54M
802.11b 802.11a/g 802.11b 802.11a/g
19
~10x speedup
>30x speedup
Requ
ired
com
puta
tion
(Gig
a cy
cles
per
se
cond
)
Requ
ired
com
puta
tion
(Gig
a cy
cles
per
sec
ond)
After Sora Optimization
peedup
c peedupa 6
g iG( 5
n o
it
a 4tupm 3
o cde 2
r iu
q
e 1R
7
6
Results: PHY Processing
11.6 11.7 11.7 11.818.3 60.4 132.41010
9
8
5
4
3
2
1
0
1M 2M 5.5M 11M 6M 24M 54M1M 2M 5.5M 11M 6M 24M 54M
802.11b 802.11a/g 802.11b 802.11a/g
20
~10x s
>30x s
Requ
ired
com
puta
tion
(Gig
a cy
cles
per
sec
ond)
ycle
s pe
r sec
ond) 9
8
7After Sora Optimization
Sora enables software implementation of today’s high-speed wireless system in
standard PC with a few cores0
Results: End-to-end ThroughputCommunicating with commercial 802.11a/b/g card
Modulation Mode21
Th
rou
gh
pu
t (M
bp
s)
25Sora-Commercial
20 Commercial-Commercial
15 Commercial-Sora
10
5
0
1M 2M 5.5M 11M 6M 24M 54M
Commercial-Sora
Results: End-to-end ThroughputCommunicating with commercial 802.11a/b/g card
Modulation Mode22
Th
rou
gh
pu
t (M
bp
s)
• Correctness of all PHY algorithms
25Sora-Commercial
20 Commercial-Commercial
15 Seamlessly interoperate with commercial WiFi
10 • Satisfying timing requirements of standards• Commercial equivalent performance
5
0
1M 2M 5.5M 11M 6M 24M 54M
Extensions
TDMA MACJumbo frames in 802.11
23
• A fully programmable software radio platform that provides the benefits of both SDR approaches, thereby resolving the SDR platform dilemma for developers.
• With Sora, developers can implement and experiment with high-speed wireless protocol stacks, e.g., IEEE 802.11a/b/g, using commodity general-purpose PCs.
• Developers program in familiar programming environments with powerful tools on standard operating systems.
• Software radios implemented on Sora appear like any other network device, and users can run unmodified applications on their software radios with the same performance as commodity hardware wireless devices.
Applications
Conclusion
Sora is a fully programmable software radioplatform on commodity PC architecture– Easy C programming on multi-core CPU– High performance: high processing speed, low
latency, and performance guarantee• Confirmed by SoftWiFi, the first fully interoperable IEEE
802.11 (PHY and MAC) on general purpose processors
Plan to release Sora SDK to research community– H/W: RCB + 2.4G RF front-end set (~$2K USD)
•
•
25
Kun Tan† Jiansong Zhang† Ji Fang‡ He Liu§ Yusheng Ye§
Shen Wang§ Yongguang Zhang† Haitao Wu† Wei Wang†
Geoffrey M. Voelker◊
† Microsoft Research Asia‡ Tsinghua University, Beijing, China
§ Beijing Jiaotong University, Beijing, China◊ UCSD, La Jolla, USA
References