the scalable configurable instrument processor
Post on 07-Jan-2016
Hayes 1 MAPLD 2005/135
The Scalable Configurable Instrument Processor
John R. Hayes
Johns Hopkins University
Applied Physics Laboratory
john.hayes@jhuapl.edu
Introduction
• Instrument processors require low power, small footprint, etc.
• Few processors meet these requirements
• Design our own using VHDL and FPGAs
• Architectural influences:
– APL’s FRISC (Freja, Flare Genesis)
– Harris RTX2010 (NEAR, ACE, MESSENGER, New Horizons, etc.)
SCIP Processor Features
• Stack Architecture: N op T->T (RTX)
• 16 or 32-bit Internal Data Path
• 16-bit Instruction OpCode
• ALU/Condition Architecture (FRISC3/4)
• Multiply/Divide Steps (FRISC4)
• Barrel Shifter (FRISC4)
• Stack Caches (FRISC3/4)
• Memory-Mapped I/O
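The notation "N op T -> T" can be read as: combine the next-on-stack item (N) with the top-of-stack item (T) and leave the result as the new top. The following is a minimal behavioral sketch in Python, not the SCIP implementation. The operation table `ALU_OPS` and the function names are hypothetical, the stack caches are modeled as a plain list, and the sketch assumes the common Forth convention in which both operands are consumed. The R and SS fields of the instruction formats are not described in the slides and are not modeled.

```python
DBITS = 16                      # data path width; 16 or 32 per the slides
MASK = (1 << DBITS) - 1

# Hypothetical operation table; the real SCIP ALU function set is not modeled.
ALU_OPS = {
    "add": lambda n, t: n + t,
    "sub": lambda n, t: n - t,
    "and": lambda n, t: n & t,
    "or":  lambda n, t: n | t,
    "xor": lambda n, t: n ^ t,
}

def alu_op(stack, op):
    """Apply 'N op T -> T' in place: consume N and T, push the result."""
    t = stack.pop()             # T: top of stack
    n = stack.pop()             # N: next on stack
    stack.append(ALU_OPS[op](n, t) & MASK)   # wrap the result to DBITS
    return stack

def alu_imm(stack, op, imm):
    """Apply 'imm op T -> T': the immediate takes N's place, depth is unchanged."""
    t = stack.pop()
    stack.append(ALU_OPS[op](imm, t) & MASK)
    return stack
```

For example, `alu_op([10, 3], "sub")` leaves `[7]` on the stack, since N is the left operand in this sketch.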
Scalability
• Data path (i.e., buses, ALU, barrel shifter) can be 16 or 32 bits wide
• Buses scale trivially in VHDL:
– bus_x: in unsigned(DBITS-1 downto 0);
• ALU based on function blocks with one level of carry look-ahead scales easily
• Multiplier would not scale well; instead use Booth multiply step; “scaling” done in software
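To illustrate why a multiply step scales where a full multiplier does not, here is a sketch of the textbook radix-2 Booth algorithm in Python: the per-step logic is one add/subtract and a one-bit shift, and software simply repeats the step DBITS times. This is an illustration of the general technique, not the SCIP ALUStep instruction. The function names are hypothetical, and the accumulator is kept as an unbounded Python integer rather than a fixed-width register.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def booth_step(acc, q, q_prev, m, dbits):
    """One radix-2 Booth step on the chain acc:q:q_prev with multiplicand m."""
    low = q & 1
    if (low, q_prev) == (0, 1):
        acc += m                 # end of a run of ones: add multiplicand
    elif (low, q_prev) == (1, 0):
        acc -= m                 # start of a run of ones: subtract multiplicand
    # Arithmetic shift right of acc:q by one bit; the old q bit 0 becomes q_prev.
    q = (q >> 1) | ((acc & 1) << (dbits - 1))
    acc >>= 1                    # Python's >> on ints is an arithmetic shift
    return acc, q, low

def booth_multiply(x, y, dbits=16):
    """Signed dbits x dbits multiply built from dbits Booth steps."""
    m = to_signed(y, dbits)
    acc, q, q_prev = 0, x & ((1 << dbits) - 1), 0
    for _ in range(dbits):       # the "scaling" is just the loop count
        acc, q, q_prev = booth_step(acc, q, q_prev, m, dbits)
    return (acc << dbits) | q    # signed 2*dbits-bit product
```

Changing the width only changes the `dbits` argument and the number of loop iterations; the step itself is unchanged.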
More Scalability
• Barrel shifter (based on funnel shifter) uses N*log(N) resources
-- Funnel shifter.
process(a, b, count)
    variable t: unsigned(2*DBITS-1 downto 0);
begin
    t := a & b;
    for i in 0 to DLBITS-1 loop
        if count(i) = '1' then
            t := to_unsigned(0, 2**i) & t(2*DBITS-1 downto 2**i);
        end if;
    end loop;
    result <= t(DBITS-1 downto 0);
end process;
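The funnel shifter above can be checked against a small reference model. The Python sketch below mirrors the VHDL process: it concatenates a:b, shifts right by 2**i for each set bit i of count, and keeps the low DBITS bits. The three wrapper functions show one plausible way to derive barrel-shifter operations from it; that mapping is an assumption, since the slides do not say how SCIP selects the a and b inputs.

```python
def funnel_shift(a, b, count, dbits=16):
    """Shift the 2*dbits-wide value a:b right by count and keep the low dbits."""
    dlbits = dbits.bit_length() - 1          # log2(dbits) for power-of-two widths
    mask = (1 << dbits) - 1
    t = ((a & mask) << dbits) | (b & mask)   # t := a & b in the VHDL
    for i in range(dlbits):                  # one conditional stage per count bit
        if (count >> i) & 1:
            t >>= 1 << i                     # zero-fill right shift by 2**i
    return t & mask

def shift_right_logical(x, n, dbits=16):
    """Zeros enter from the upper half."""
    return funnel_shift(0, x, n, dbits)

def rotate_right(x, n, dbits=16):
    """The same word in both halves gives a rotate."""
    return funnel_shift(x, x, n, dbits)

def shift_left_logical(x, n, dbits=16):
    """Left shift by n is a funnel shift of x:0 by dbits - n, for 0 < n < dbits."""
    if n == 0:                               # count has only log2(dbits) bits
        return x & ((1 << dbits) - 1)
    return funnel_shift(x, 0, dbits - n, dbits)
```

Each of the log2(N) stages is a row of N-bit-wide multiplexers over the 2N-bit value, which is where the N*log(N) resource estimate comes from.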
Configurability
• SCIP library
– SCIP (DBITS=16 or 32, ABITS=up to 32)
– Clock generator (WBITS)
– AMBA APB bridge (ABITS, DBITS)
• AMBA APB component library
– Interrupt controller (INTS)
– UART (DATABITS, STOPBITS, DIV)
– Parallel ports (BITS)
– Others: I2C-subset, watchdog, etc.
Configuring a System on a Chip (SoC)
• All components except decoders and memory controller are from libraries
• A basic SoC (processor, interrupt controller, and UART) is ~500 lines of VHDL
[Figure: SoC block diagram. Blocks: SCIP, clock, primary decode, memory control, AMBA/APB Bridge, secondary decode, ints, UART. Buses: Din, Dout, A.]
Instruction Set
Instruction  Format                            Function
Call         0DDDDDDDDDDDDDDD                  call
CallL        1000DDDDDDDDDDDDDDDDDDDDDDDDDDDD  call long (32-bit only)
Branch       10010BBFFFFFFFFF                  branch
Qbranch      10011BBFFFFFFFFF                  branch if Fl=0
ALUImm       1010RSSIIIIIAAAA                  imm op T -> T
ALUImmL      1011RSS000 LAAAAiiiiiiiiiiiiiiii  longimm*scale op T -> T
ALUStep      1011RSS001 aaaa                   N op T:MD:Fl -> T:MD:Fl
ALUOp        1011RSS010 AAAA                   N op T -> T
             1011RSS011
ALUOpEx      1011RSS1CCCCAAAA                  N op T -> T; cond -> Fl
ALUTs        1100RSS0CCCCAAAA                  N op T ->; cond -> T
ALUTsEx      1100RSS1CCCCAAAA                  N op T ->; cond -> Fl
ALURdRg      1101RSS0rrrrAAAA                  Reg op T -> T
ALUWrRg      1101RSS1rrrrAAAA                  N op T -> Reg
Shift        1110RSS OO00                      N shift T -> T
ShiftIm      1110RSSIIIIIOO01                  imm shift T -> T
Load         1110RSSIIIIIMM10                  *(imm + T) -> T
Store        1110RSSIIIIIMM11                  *(imm + T) <- T
Special      1111 s                            special
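The fixed bits in the Format column are enough to classify an opcode word. The Python sketch below decodes the first 16-bit word of an instruction into the instruction names from the table, reading each format string most-significant bit first. It is an illustration derived only from the table as printed: the function name `decode` is hypothetical, operand fields (R, SS, A, C, I, and so on) are ignored, and the table row with no name is reported as "(unnamed)".

```python
def decode(opcode):
    """Classify a 16-bit opcode word by the fixed bits listed in the table."""
    op = opcode & 0xFFFF
    if op >> 15 == 0:                        # 0DDDDDDDDDDDDDDD
        return "Call"
    top4 = op >> 12
    if top4 == 0b1000:                       # 32-bit version only
        return "CallL"
    if top4 == 0b1001:                       # bit 11 separates the two branches
        return "Qbranch" if (op >> 11) & 1 else "Branch"
    if top4 == 0b1010:
        return "ALUImm"
    if top4 == 0b1011:
        if (op >> 8) & 1:                    # 1011RSS1...
            return "ALUOpEx"
        sub = (op >> 6) & 0b11               # 1011RSS0xx...
        return ("ALUImmL", "ALUStep", "ALUOp", "(unnamed)")[sub]
    if top4 == 0b1100:
        return "ALUTsEx" if (op >> 8) & 1 else "ALUTs"
    if top4 == 0b1101:
        return "ALUWrRg" if (op >> 8) & 1 else "ALURdRg"
    if top4 == 0b1110:                       # low two bits select the operation
        return ("Shift", "ShiftIm", "Load", "Store")[op & 0b11]
    return "Special"                         # 1111...
```

The one-bit Call prefix means half of the opcode space is devoted to subroutine calls, which suits threaded, Forth-style code.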
16-Bit Version
• Adds Code Page and Data Page Registers to supply upper address bits
• Adds far/near mode bit and special instructions to set/reset mode
• Not compatible with RTX object code, but most RTX source code runs
– Most I/O routines must be rewritten
– APL’s common instrument software library has been ported
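As a rough picture of how page registers can extend a 16-bit machine's address space, the Python sketch below places a page value above a 16-bit offset. The layout (page bits concatenated directly above bit 15), the class and method names, and the default ABITS of 24 are all assumptions made for illustration. The slides do not give the register widths, and the far/near mode bit is not modeled.

```python
class PagedAddressing:
    """Hypothetical model of code/data page registers on a 16-bit data path."""

    def __init__(self, abits=24):
        self.abits = abits        # address width; the slides allow up to 32
        self.code_page = 0        # assumed to supply upper bits for fetches
        self.data_page = 0        # assumed to supply upper bits for loads/stores

    def _extend(self, page, offset16):
        # Assumed layout: page register concatenated above the 16-bit offset.
        full = (page << 16) | (offset16 & 0xFFFF)
        return full & ((1 << self.abits) - 1)

    def fetch_address(self, pc16):
        """Address presented on the bus for an instruction fetch."""
        return self._extend(self.code_page, pc16)

    def data_address(self, addr16):
        """Address presented on the bus for a load or store."""
        return self._extend(self.data_page, addr16)
```

With `data_page = 0x12`, a 16-bit address of `0x3456` would map to `0x123456` under these assumptions.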
32-Bit Version
• Adds Long Subroutine Call instruction
• Adds 32-bit option to Load/Store instructions
• Adds scale to Long Immediate instruction; any 32-bit literal can be constructed with at most two instructions
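One way the two-instruction bound could work is sketched below in Python. It assumes that the long immediate is 16 bits wide (as in the ALUImmL format), that the scale option multiplies it by 2^16, and that the second instruction ORs the low half into T. The slides do not state the scale factor or the ALU operations used, so the function names and the step encoding `(imm16, scaled)` are hypothetical.

```python
def literal_sequence(value):
    """Return at most two (imm16, scaled) steps that build a 32-bit literal."""
    value &= 0xFFFFFFFF
    hi, lo = value >> 16, value & 0xFFFF
    if hi == 0:
        return [(lo, False)]              # fits in one unscaled immediate
    if lo == 0:
        return [(hi, True)]               # fits in one scaled immediate
    return [(hi, True), (lo, False)]      # upper half first, then OR in the lower

def run_sequence(steps):
    """Evaluate the steps, assuming scale = 2**16 and OR as the combining op."""
    t = 0
    for imm16, scaled in steps:
        t |= (imm16 << 16) if scaled else imm16
    return t
```

Under these assumptions, `0x12345678` becomes a scaled immediate `0x1234` followed by an unscaled `0x5678`, while values with an all-zero half need only one instruction.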
Implementation: Simulation
• Port cross-compilers to SCIP, 16-bit and 32-bit versions
• Write architectural simulator in C
• Validate architecture, compiler, and simulator
• Translate C into VHDL
• Use simulator to generate VHDL test bench
• Simulate VHDL test bench to validate translation
Implementation: SCIP Synthesis Area Usage
• Synthesize processor using Synplify Pro
• Synthesize 16-bit SCIP and 32-bit SCIP for three similar parts
                 Xilinx 3S200        Actel 54SX72A   QuickLogic QL6325
                 (seq/comb)          (seq/comb)      (seq/comb)
SCIP (16 bits)   6 / 33 % (+ RAM)    40 / 45 %       50 / 43 %
SCIP (32 bits)   10 / 57 % (+ RAM)   76 / 83 %       -
Implementation: Xilinx
• Xilinx Spartan-3 Starter Kit Board
• Board has 3S200 FPGA and 1 MB SRAM
• SCIP-16 with UART
– Usage: 8% (+ RAM) / 39%
• SCIP-32 with UART
– Usage: 14% (+ RAM) / 65%
• Runs all test programs
Implementation: New Horizons (NH) Demo
• Build up flight-spare NH processor board
• Remove RTX2010 processor
• Replace NH Actel (SX72) with:
– SCIP-16, clock gen., memory I/F, etc.
– NH S/C I/F, watchdog, I2C subset, etc.
– Interrupt control, test port UART, etc.
• Measure power, etc.
Implementation: NH RTX Processor Board
Implementation: NH SCIP Processor Board
Implementation: NH SCIP Results
• Actel (SX72) usage: 59 / 60%
• Running at 6 MHz (instruction rate)
• Board powered from external 5V (Actel core 2.5 V from on-board regulator)
               RAM Test   Empty Loop   Sleep
Entire Board   835 mW     490 mW       275 mW
Actel Core     100 mW     70 mW        30 mW
Implementation: V-Slit Demo
• Redesign of NH RTX processor board
• Replace RTX2010 with QuickLogic QL6325:
– SCIP-16, clock gen., memory I/F, etc.
– QL6325 -> Aeroflex UT6325 path to flight
• Replace Actel SX72 with SX32:
– NH S/C I/F, etc.
– Interrupt control, test port UART, etc.
• Measure power, etc.
Implementation: V-Slit SCIP Processor Board
Implementation: V-Slit Results
• QuickLogic (QL6325) usage: 52 / 45 %
• Running at 6 MHz (instruction rate)
• Board powered from external 3.3 and 2.5V (QuickLogic and Actel share 2.5 V)
                RAM Test   Empty Loop   Sleep
Entire Board    393 mW     332 mW       104 mW
QL/Actel Core   63 mW      35 mW        15 mW
Summary
• Architecture validated
• Implementations on Xilinx, Actel, and QuickLogic tested
• SCIP-16 provides replacement for RTX2010; SCIP-32 provides growth path
• SCIP use planned on several upcoming flight instruments