multithreaded processor for software defined...

20
Multithreaded Processor for Software Defined Radio 1 North Lexington Ave, 10 th Floor White Plains, New York 10601 914-287-8500 John Glossner, Ph.D., Founder, CTO & EVP Erdem Hokenek, Ph.D., Founder, Chief H/W Architect Mayan Moudgill, Ph.D., Founder, Chief S/W Architect [email protected]

Upload: dangxuyen

Post on 29-Jul-2018

248 views

Category:

Documents


0 download

TRANSCRIPT

Multithreaded Processor for

Software Defined Radio

1 North Lexington Ave, 10th FloorWhite Plains, New York 10601

914-287-8500

John Glossner, Ph.D., Founder, CTO & EVP

Erdem Hokenek, Ph.D., Founder, Chief H/W Architect

Mayan Moudgill, Ph.D., Founder, Chief S/W [email protected]

http://www.SandbridgeTech.com

Agenda

Motivation for SDR

Multithreaded SDR Processor

Software Development ToolsCompilerSimulatorIDE

Communications System Implementation2Mbps WCDMA802.11b

http://www.SandbridgeTech.com

The Challenges of an Industry

Cost3G is >10x more complex than 2G

but cost should be same, or even lessConvergence phone 2x complexity

WLAN, 2G, and 2.5G integration Traditionally implemented in HW

Moore’s law reduces cost 50% every 18 months6 years until the wireless multimedia is a real consumer market

Time-to-marketGPRS terminals were late

OEMs had to wait until bug free SoCs were available3G terminals will be late

OEMs have to wait until bug free SoCs with ‘reasonable’ power consumption are available

http://www.SandbridgeTech.com

A Whole Industry’s approach failed …

DSP

Fixed

ARM

WLAN802.11

GSM

Bluetooth

GPRS

CDMA2k

CDMAone

JavaWCDMAFDD

IS-136

MPEG-4

MP3

EDGE

PDC

USB

DSP

ARM

VGA

… on advanced wireless systems …

WCDMATD-SCDMA

WCDMATDD

GPS

… with multimedia

http://www.SandbridgeTech.com

Agenda

Motivation for SDR

Multithreaded SDR Processor

Software Development ToolsCompilerSimulatorIDE

Communications System Implementation2Mbps WCDMA802.11b

http://www.SandbridgeTech.com

SandblasterTM Architecture Performs

CompilableDSP

Java ProcessorControl

Processor

SystemProductivityAdvantage

9-12+ months

C programmedLatency hiding architecture

3G Applications Standard 3G, xDSL, 802.11Control Stacks

http://www.SandbridgeTech.com

Processor Execution Models

addi r0,r2,8

muli r8,r0,4

IF RD EX WBIF RD EX WB

IF RD EX WBIF RD EX WB

Pipelined Processor Stalls

addi r0,r2,8

muli r8,r3,4

IF RD EX WBIF RD EX WB

IF RD EX WBIF RD EX WB

Monolithic Processor

addi r0,r2,8

muli r8,r3,4

IF RD EX WB

IF RD EX WB

Pipelined Processor

addi r0,r2,8

muli r8,r3,4

IF RD EX WBIF RD EX WB

IF RD EX WBIF RD EX WB

Pipelined Processor

addi r0,r2,8 IF RD EX WB

IF RD EX WBIF RD EX WB

Multithreaded Processor

muli r8,r0,4 IF RD EX WB

addi r0,r2,8 IF RD EX WBIF RD EX WB

IF RD EX WBIF RD EX WBIF RD EX WBIF RD EX WB

Multithreaded Processor

muli r8,r0,4 IF RD EX WBIF RD EX WB

http://www.SandbridgeTech.com

Multithreaded Architecture

Thread 0# Monitoring FCCH&SCHlu r1,M(r2) lu r1,M(r2) add r2,r3,r4cmp cf2,r5,r3jc cf2,(Synch_off)

Thread 7# Monitoring CPICHlu r1,M(r2)lu r1,M(r2) call (De_scrambler)

stu r3,M(r2)

Thread 6# Handover Code

ctsr r1,sr2 lu r1,M(r2).jc cf0,(UMTS_Mode)

Thread 5# FIR Filter

lvu vr1,M(r4)vmacs

vr3,vr1,vr2,wr0loop lcr0,label(4x)

Thread 4# Pulse shaping Code

lvuvr1,M(r4)vmacs vr3,vr1,vr2,wr0loop lcr0,label(4x)

Thread 3# WCDMA Handover Code

ctsr r1,sr2 lu r1,M(r2).jc cf0,(WCDMA_Mode)

Thread 2# Exit Code ..Start Thread33..Start Thread24

)

Thread 1# WCDMA Main

Start Thread0Start Thread1Start Thread2

Start Thread13Start Thread14loop lc1,0xf0

)

I –C

ache

64K

B64

B L

ines

4W (2

-act

ive)

I- DecodeI- Decode

JumpQ PCCRJTRLCR

Data Buffer

MPY

VRABC

VPR0

MPY

VRABC

VPR1

PABCMPY

VPR2

MPY

VPR3

SAT

VRABCVRABC

PABCPABCPABC

ACC ACC ACC ACCADD ADD ADD ADD

AGEN

(16) 32-bit GPR

LS IQ

LRA LRB

Address

DirLRU Replace

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

8-Banks / 2Active

INT IQ

IRA IRB

ALU

WB

Bus/MemoryInterface

Interrupt SIMDIQ

Sea of Threads

FastCross Thread

Interrupts

Hardware Scheduled

FullyInterlocked

HighlyParallel

CProgrammed

Code&Data Sharing Across Threads

Key to Low Power Implementation

http://www.SandbridgeTech.com

Multithreaded Baseband Processor

High ParallelismVector / SIMD data parallelismMultiple instruction issueThread-level parallelism

I –C

ache

64K

B64

B L

ines

4W (2

-act

ive)

I- DecodeI- Decode

JumpQ PCCR

JTR

LCR

Data Buffer

MPY

VRABC

VPR0

MPY

VRABC

VPR1

PABC

MPY

VPR2

MPY

VPR3

SAT

VRABC VRABC

PABCPABCPABC

ACC ACC ACC ACCADD ADD ADD ADD

AGEN

(16) 32-bit GPR

LS IQ

LRA LRB

Address

DirLRU Replace

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

8-Banks

INT IQ

IRA IRB

ALU

WB

Bus/MemoryInterface

Interrupt SIMDIQI –

Cac

he64

KB

64B

Lin

es4W

(2-a

ctiv

e)

I –C

ache

64K

B64

B L

ines

4W (2

-act

ive)

I- DecodeI- Decode

JumpQ PCPCCRCR

JTRJTR

LCRLCR

Data Buffer

MPY

VRABC

VPR0

MPY

VRABC

VPR1

PABC

MPY

VPR2

MPY

VPR3

SAT

VRABC VRABC

PABCPABCPABC

ACC ACC ACC ACCADD ADD ADD ADD

AGEN

(16) 32-bit GPR

LS IQ

LRA LRB

Address

DirLRU Replace

DirLRU Replace

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

8-Banks

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

Data Memory64KB

8-Banks

INT IQ

IRA IRB

ALU

IRA IRB

ALU

WB

Bus/MemoryInterface

Interrupt SIMDIQ

http://www.SandbridgeTech.com

Performance

Peak3 compound operations/cycle>20 RISC-ops/cycle4 MACS/cycle (MAC-SAT-ADD-SAT)

ExampleL0: lvu %vr0,%r3,8|| vmulreds %ac0,%vr0,%vr0,%ac0|| loop %lc0,L0>20 operations packed into a 64-bit compound ISAVLIW machines may require 512+ bits

20 tap FIR3.63 taps/cycle sustained

http://www.SandbridgeTech.com

Agenda

Motivation for SDR

Multithreaded SDR Processor

Software Development ToolsCompilerSimulatorIDE

Communications System Implementation2Mbps WCDMA802.11b

http://www.SandbridgeTech.com

Compiler Productivity

SANDBRIDGE

Compile

DesignAlgorithms

Write DSPSpecific C

Write DSPAssembly

10x every 10 years

Signal Processing Applications Complexity

1E+3

10E+3

100E+3

1985 1995 2005

Line

s of

C C

ode

Map toFixed Point C

Hand ScheduleOperations on DSP

Final Product Final Product

6-9 Months! 6-9 Months!

SandblasterTM Provides Dramatic Improvement

http://www.SandbridgeTech.com

Compiler

Programmed in C or JavaSuper-computer class compiler

VectorizationDSP instruction generation

Standard LibraryPrintf();

POSIX pthreads or Java threads50k+ testcases used for validation

Industry standards: Plum-Hall, perennial, nullstone

AMR Encoder(out-of-the-box C code)

0

100

200

300

400

500

600

700

SB TI C64x TI C62x SC140 ADI BlackFin

DSP's

Mhz

10 MHz

http://www.SandbridgeTech.com

Simulation Software

Compiled Simulator JIT “Flash” compilation Up to 100 MHz on high end x86Multi-threaded supported

Up to 4 orders of magnitude fasterDramatic development time reductionSignificant productivity improvement

Simulation Speed(1GHz Laptop)

0.114 0.106

0.0020.013

24.639

0.001

0.010

0.100

1.000

10.000

100.000

Mill

ions

of I

nstru

ctio

ns P

er S

econ

d

SB 24.639

TI C64x (Code Composer) 0.114

TI C62x(Code Composer) 0.106

SC140(Metrow erks) 0.002

ADI Blackfin (Visual DSP) 0.013

http://www.SandbridgeTech.com

Context sensitive IDE

IDE based on open source netbeansCommon Java/C programming environmentIntegrated debuggingTransparent HW / Simulation environmentWorks in multiple languages

http://www.SandbridgeTech.com

Agenda

Motivation for SDR

Multithreaded SDR Processor

Software Development ToolsCompilerSimulatorIDE

Communications System Implementation2Mbps WCDMA802.11b

http://www.SandbridgeTech.com

Integration

IF

RF

MMI

APPLICATIONTASKS

PROTOCOLSTACK

DATA I/OLCD, KPD …

L1 CONTROL

L1 BASEBAND SW

http://www.SandbridgeTech.com

Real-time Baseband Performance

Physical channel Segmentation

Radio frame segmentation

2nd interleaving

Physical channel mapping

Channel coding

Rate matching

TrBk concatenation /Code block segmentation

CRC attachment

Radio frameequalization

1 st interleaving

TrCHMultiplexing

Spreading/Scrambling

Filter

Rate matching

Physical channel Segmentation

Radio frame segmentation

2nd interleaving

Physical channel mapping

Channel coding

Rate matching

TrBk concatenation /Code block segmentation

CRC attachment

Radio frameequalization

1 st interleaving

TrCHMultiplexing

Spreading/Scrambling

Filter

Rate matchingFILTER

RAKE Searcher

PN BT#1 PN BT#2 PN BT#3

De-Scrambler

Path Table Building

Timing Management

De-SpreadDe-SpreadDe-SpreadChannelEst/Derot

Path 2Path 3Path 4

Path 1DSCH

Path 1S-CCPCH

De-ScrambleDe-ScrambleDe-Scramble

DPCH

De-SpreadDe-SpreadDe-SpreadDe-Spread

Path 2Path 3Path 4

MRCMeasurements:

SIR

RSCP

ISCP

Ec/Io

Multi Channel Code De-Mux2nd Deinterleaver

1nd Deinterleaver Channel DecodingFurther Processing

FILTER

RAKE Searcher

PN BT#1 PN BT#2 PN BT#3

De-Scrambler

Path Table Building

Timing Management

De-SpreadDe-SpreadDe-SpreadChannelEst/Derot

Path 2Path 3Path 4

De-SpreadDe-SpreadDe-SpreadChannelEst/Derot

Path 2Path 3Path 4

Path 1DSCH

Path 1S-CCPCH

De-ScrambleDe-ScrambleDe-Scramble

DPCHS-CCPCH

De-ScrambleDe-ScrambleDe-Scramble

DPCH

De-SpreadDe-SpreadDe-SpreadDe-Spread

Path 2Path 3Path 4

De-SpreadDe-SpreadDe-SpreadDe-Spread

Path 2Path 3Path 4

MRCMeasurements:

SIR

RSCP

ISCP

Ec/Io

Multi Channel Code De-Mux2nd Deinterleaver

1nd Deinterleaver Channel DecodingFurther Processing

0

10

20

30

40

50

60

70

80

90

100

802.11b GPRS WCDMA1/2/5.5/11Mbps Class 10/12 64/384/2k Kbps

% S

B96

00 U

tiliz

atio

n

Real-time chip, bit, and symbol rate processing1 SB9600 chip for 2Mbps Rx concurrently with 768kbps Tx<75% utilization for 384kbps Rx / 384kbps Tx

Includes functions traditionally implemented in H/WTurbo DecoderRake ReceiverTx/Rx Filters

Concurrent performance on 802.11B and GPRS

http://www.SandbridgeTech.com

Summary

Multithreaded baseband processormulti-threadedhigh-performance and low-power

Sophisticated compiler technologyautomatically generates DSP operationsnear-assembly language performance

Reconfigurable Communications ProtocolsWCDMAGSM, GPRS802.11BluetoothGPS

http://www.SandbridgeTech.com

Populated Multimode Baseband CardJ1

J5

J2 J6

SIP8_1

1

J3

J4

SW1

SW2

OFF

SW3

LE

Ds 18