prof. stephen a. edwards columbia universitysedwards/classes/2005/4840/... · 2005-01-07 · alu...
Post on 22-Mar-2020
3 Views
Preview:
TRANSCRIPT
Processors, FPGAs, and ASICsCSEE W4840
Prof. Stephen A. Edwards
Columbia University
Processors, FPGAs, and ASICs – p. 1/??
Spectrum of IC choices
Full Custom
ASIC
Gate Array
FPGA
PLD
GP Processor
SP Processor
MultifunctionFixed-function
You choosepolygons (Intel)
circuit (Sony)
wires
logic network
logic function
program (e.g., Pentium)
program (e.g., DSP)
settings (e.g., Ethernet)part number (e.g., 74LS00)
Flexibility
Processors, FPGAs, and ASICs – p. 2/??
NMOS Transistor Cross Section
pn+
n+
Al
oxpoly Al
Processors, FPGAs, and ASICs – p. 3/??
Inverter Transistors and Layout
x
x
Vdd
Vssx
Vss
Vdd
x
Processors, FPGAs, and ASICs – p. 4/??
NAND Gate Transistors and Layout
x ∧ y
x
y
Vdd
Vss x yVss
Vdd
x ∧ y
Processors, FPGAs, and ASICs – p. 5/??
Full-custom ICs
Processors, FPGAs, and ASICs – p. 6/??
Standard Cell ASICs
Processors, FPGAs, and ASICs – p. 7/??
Standard Cell ASICs
Processors, FPGAs, and ASICs – p. 8/??
Channeled Gate Arrays
Processors, FPGAs, and ASICs – p. 9/??
Channeled Gate Arrays
Processors, FPGAs, and ASICs – p. 10/??
Sea-of-Gates Gate Arrays
Processors, FPGAs, and ASICs – p. 11/??
FPGAs: Floorplan
DS077-2 (v2.1) July 9, 2003 www.xilinx.com 1Product Specification 1-800-255-7778
© 2003 Xilinx, Inc. All rights reserved. All Xilinx trademarks, registered trademarks, patents, and disclaimers are as listed at http://www.xilinx.com/legal.htm. All other trademarks and registered trademarks are the property of their respective owners. All specifications are subject to change without notice.
Architectural Description
Spartan-IIE ArrayThe Spartan-IIE user-programmable gate array, shown inFigure 1, is composed of five major configurable elements:
• IOBs provide the interface between the package pins and the internal logic
• CLBs provide the functional elements for constructing most logic
• Dedicated block RAM memories of 4096 bits each• Clock DLLs for clock-distribution delay compensation
and clock domain control• Versatile multi-level interconnect structure
As can be seen in Figure 1, the CLBs form the central logicstructure with easy access to all support and routing struc-tures. The IOBs are located around all the logic and mem-ory elements for easy and quick routing of signals on and offthe chip.
Values stored in static memory cells control all the config-urable logic elements and interconnect resources. Thesevalues load into the memory cells on power-up, and canreload if necessary to change the function of the device.
Each of these elements will be discussed in detail in the fol-lowing sections.
Input/Output BlockThe Spartan-IIE IOB, as seen in Figure 2, features inputsand outputs that support a wide variety of I/O signaling stan-dards. These high-speed inputs and outputs are capable ofsupporting various state of the art memory and bus inter-faces. Table 1 lists several of the standards which are sup-ported along with the required reference, output andtermination voltages needed to meet the standard.
The three IOB registers function either as edge-triggeredD-type flip-flops or as level-sensitive latches. Each IOB hasa clock signal (CLK) shared by the three registers and inde-pendent Clock Enable (CE) signals for each register.
0
Spartan-IIE 1.8V FPGA Family: Functional Description
DS077-2 (v2.1) July 9, 2003 0 0 Product Specification
R
Figure 1: Basic Spartan-IIE Family FPGA Block Diagram
DLL DLL
DLLDLL
BLO
CK
RA
MB
LOC
K R
AM
BLO
CK
RA
MB
LOC
K R
AM
I/O LOGIC
CLBs CLBs
CLBs CLBs
DS077_01_052102
Processors, FPGAs, and ASICs – p. 12/??
FPGAs: Routing
Processors, FPGAs, and ASICs – p. 13/??
FPGAs: CLBSpartan-IIE 1.8V FPGA Family: Functional Description
DS077-2 (v2.1) July 9, 2003 www.xilinx.com 5Product Specification 1-800-255-7778
R
Additional LogicThe F5 multiplexer in each slice combines the function gen-erator outputs (Figure 5). This combination provides eithera function generator that can implement any 5-input func-tion, a 4:1 multiplexer, or selected functions of up to nineinputs.
Similarly, the F6 multiplexer combines the outputs of all fourfunction generators in the CLB by selecting one of the twoF5-multiplexer outputs. This permits the implementation ofany 6-input function, an 8:1 multiplexer, or selected func-tions of up to 19 inputs.
Figure 4: Spartan-IIE CLB Slice (two identical slices in each CLB)
I3
I4
I2
I1
Look-UpTable
D
CK
EC
Q
R
S
I3
I4
I2
I1
O
O
Look-UpTable
D
CK
EC
Q
R
SXQ
X
XB
CE
CLK
CIN
BX
F1
F2
F3
SR
BY
F5IN
G1
G2
YQ
Y
YB
COUT
G3
G4
F4
Carryand
ControlLogic
Carryand
ControlLogic
DS001_04_091400
Processors, FPGAs, and ASICs – p. 14/??
PLAs/CPLDs: The 22v10 TIBPAL22V10-15BCHIGH-PERFORM
ANCE IMPACT-X
PROGRAM
MABLE ARRAY LOGIC CIRCUITS
SR
PS
009A – D
3356, OC
TO
BE
R 1989 – R
EV
ISE
D JU
NE
1990
PO
ST
OF
FIC
E B
OX
655303 • D
ALLA
S, T
EX
AS
752654
0 4 8 12 16 20 24 28
Increments
FirstFuseNumbers
32 36 40
Macro-cell
R = 5809P = 5808
R = 5811P = 5810
R = 5813P = 5812
R = 5815P = 5814
R = 5817P = 5816
logic symbol (positive logic)
Asynchronous Reset
23
22
21
20
19
1
2
3
4
5
(to all registers)
396
0
440
880
924
1452
1496
2112
2156
2860
I/O/Q
I/O/Q
I/O/Q
I/O/Q
I/O/Q
I
I
I
I
CLK/I
Macro-cell
Macro-cell
Macro-cell
Macro-cell
Processors, FPGAs, and ASICs – p. 15/??
Example: Euclid’s Algorithm
int gcd(int m, int n){int r;while ((r = m % n) != 0) {
m = n;n = r;
}return n;
}
Processors, FPGAs, and ASICs – p. 16/??
i386 Programmer’s Model
31 0
eax Mostly
ebx General-
ecx Purpose-
edx Registers
esi Source index
edi Destination index
ebp Base pointer
esp Stack pointer
eflags Status word
eip Instruction Pointer
15 0
cs Code segment
ds Data segment
ss Stack segment
es Extra segment
fs Data segment
gs Data segment
Processors, FPGAs, and ASICs – p. 17/??
Euclid on the i386
gcd: pushl %ebpmovl %esp,%ebppushl %ebxmovl 8(%ebp),%eaxmovl 12(%ebp),%ecxjmp .L6
.L4: movl %ecx,%eaxmovl %ebx,%ecx
.L6: cltdidivl %ecxmovl %edx,%ebxtestl %edx,%edxjne .L4movl %ecx,%eaxmovl -4(%ebp),%ebxleaveret
Processors, FPGAs, and ASICs – p. 18/??
SPARC Programmer’s Model
31 0
r0 Always 0
r1 Global Registers...
r7
r8/o0 Output Registers...
r14/o6 Stack Pointer
r15/o7
r16/l0 Local Registers...
r23/l7
31 0
r24/i0 Input Registers...
r30/i6 Frame Pointer
r31/i7 Return Address
PSW Status Word
PC Program Counter
nPC Next PC
Processors, FPGAs, and ASICs – p. 19/??
SPARC Register Windows
The outputregisters of thecalling procedurebecome the inputsto the calledprocedure
The global registersremain unchanged
The local registersare not visibleacross procedures
r8/o0...r15/o7r16/l0...r23/l7
r8/o0 r24/i0... ...r15/o7 r31/i7r16/l0...r23/l7
r8/o0 r24/i0... ...r15/o7 r31/i7r16/l0...r23/l7r24/i0...r31/i7
Processors, FPGAs, and ASICs – p. 20/??
Euclid on the SPARC
gcd:save %sp, -112, %spmov %i0, %o1b .LL3mov %i1, %i0mov %i0, %o1b .LL3mov %i1, %i0
.LL5:mov %o0, %i0
.LL3:mov %o1, %o0call .rem, 0mov %i0, %o1cmp %o0, 0bne .LL5mov %i0, %o1retrestore Processors, FPGAs, and ASICs – p. 21/??
Motorola DSP56301DMA
Overview 1-11
The block diagram in Figure 1-1 illustrates these buses among other components.
1.6 DMA
The DMA block has the following features:
n Six DMA channels supporting internal and external accesses
n One-, two-, and three-dimensional transfers (including circular buffering)
n End-of-block-transfer interrupts
n Triggering from interrupt lines and all peripherals
Figure 1-1. DSP56301 Block Diagram
PLL OnCE™
ClockGenerator
Internal DataBus
Switch
Program RAM4096 × 24(Default)
YABXABPAB
YDBXDBPDBGDB
MODC/IRQBMODB/IRQC
ExternalData Bus
Switch
14
MODA/IRQD
DSP56300
652
24-Bit
24
24
X DataRAM
2048 × 24(Default)
Y DataRAM
2048 × 24(Default)
DDB
DAB
Memory Expansion Area
Peripheral
Core
YM
_EB
XM
_EB
PM
_EB
PIO
_EB
Expansion Area
6
SCI
JTAG
3
RESET
MODD/IRQA
PINIT/NMI
2
Boot-strapROM
EXTAL
XTAL
ADDRESS
CONTROL
DATA
TripleTimer
HostInterface
(HI32)
ESSI
AddressGeneration
UnitSix ChannelDMA Unit
ProgramInterrupt
Controller
ProgramDecode
Controller
ProgramAddress
Generator
Data ALU24 × 24 + 56 → 56-bit
Two 56-bit Accumulators56-bit Barrel Shifter
PowerManagement
ExternalBus
Interface and
I - CacheControl
ExternalAddress
BusSwitch
5
DE
MAC
Processors, FPGAs, and ASICs – p. 22/??
DSP 56000 Programmer’s Model
55 4847 2423 0x1 x0 Sourcey1 y0 Registers
a2 a1 a0 Accumulatorb2 b1 b0 Accumulator
15 0r7...r4r3...r0
15 0n7...n4n3...n0
15 0m7...m4m3...m0
AddressRegisters
15 0Program CounterStatus RegisterLoop AddressLoop Count
15 PC Stack...0
15 SR Stack...0
Stack pointer
Processors, FPGAs, and ASICs – p. 23/??
Motorola DSP56301 ALU
3-2 DSP56300 Family Manual
Data Arithmetic Logic Unit
The Data ALU registers can be read or written over the X Data Bus (XDB) and the Y Data Bus (YDB) as 24- or 48-bit operands. The source operands for the Data ALU, which can be 24, 48, or 56 bits, always originate from Data ALU registers. The results of all Data ALU operations are stored in an accumulator. The Data ALU runs in 16-bit Arithmetic mode when the SA bit in the Status Register (SR) is set. For details on the SR, see Chapter 5, Program Control Unit
Figure 3-1. Data ALU Block Diagram
Bit Field Unit and Barrel Shifter
AccumulatorShifter
Immediate Field
48
56
24
24
56
56
56
56
X Data Bus
Y Data Bus
2424
X0
X1
Y0
Y1
24 24
Multiplier
Accumulatorand Rounding Unit
A (56)
B (56)
Shifter/Limiter
Pipeline Register
P Data Bus
MUX
56
56
Forwarding Register
56
Processors, FPGAs, and ASICs – p. 24/??
Motorola DSP56301 AGU
4-2 DSP56300 Family Manual
Address Generation Unit
n The offset N (stored in the respective offset register)
n Minus N to the selected address register
The offset adder and the reverse-carry adder operate in parallel and share common inputs. The only difference between them is that the carry propagates in opposite directions. Test logic determines which of the three summed results of the full adders is output. Figure 4-1 shows a block diagram of the AGU.
Each Address ALU can update one address register from its respective address register file during one instruction cycle. The contents of the associated modifier register specify the type of arithmetic to be used in the address register update calculation. The modifier value is decoded in the Address ALU.
The two Address ALUs can generate up to two addresses every instruction cycle:
n One for the PAB, or
n One for the XAB, or
n One for the YAB, or
n One for the XAB and one for the YAB
The AGU can directly address 16,777,216 locations on each of the XAB, YAB, and PAB. Using a register triplet to address each operand, the two independent ALUs can work with the two data memories to feed two operands to the Data ALU in a single cycle.
Figure 4-1 AGU Block Diagram
N0
N1
N2
N3 M3
M2
M1
M0
AddressALU
AddressALU
R0
R1
R2
R3 R7
R6
R5
R4 M4
M5
M6
M7 N7
N6
N5
N4
Triple Multiplexer
Low Address ALU High Address ALU
XAB YAB PAB
Program Address Bus
EP
Global Data Bus
Processors, FPGAs, and ASICs – p. 25/??
FIR Filter in 56000
move #samples, r0move #coeffs, r4move #n-1, m0move m0, m4movep y:input, x:(r0)clr a x:(r0)+, x0 y:(r4)+, y0
rep #n-1mac x0,y0,a x:(r0)+, x0 y:(r4)+, y0
macr x0,y0,a (r0)-movep a, y:output
Processors, FPGAs, and ASICs – p. 26/??
TI TMS320C6000 VLIW DSP
2-2
Figure 2–1. TMS320C62x CPU Data Paths
2X
1X
.L2
.S2
.M2
.D2
(B0–B15)
(A0–A15)
.D1
.M1
.S1
.L1
long src
dst
src2
src1
src1
src1
src1
src1
src1
src1
src1
8
8
8
8
88
long dst
long dstdst
dst
dst
dst
dst
dst
dst
src2
src2
src2
src2
src2
src2
src2
long src
Controlregister
file
DA1
DA2
ST1
LD1
LD2
ST2
32
32
Data path A
Data path B
Register file A
Register file B
long srclong dst
long dstlong src
CPU Data Paths and Control
Processors, FPGAs, and ASICs – p. 27/??
FIR in One ’C6 Assembly Instruction
Load a halfword (16 bits)Do this on unit D1
FIRLOOP:LDH .D1 *A1++, A2 ; Fetch next sample
|| LDH .D2 *B1++, B2 ; Fetch next coeff.
|| [B0] SUB .L2 B0, 1, B0 ; Decrement count
|| [B0] B .S2 FIRLOOP ; Branch if non-zero
|| MPY .M1X A2, B2, A3 ; Sample × Coeff.
|| ADD .L1 A4, A3, A4 ; Accumulate result
Use the cross pathPredicated instruction (only if B0 non-zero)
Run these instruction in parallel
Processors, FPGAs, and ASICs – p. 28/??
AX88796 Ethernet Controller
ASIX ELECTRONICS CORPORATION 5
AX88796 L 3-in-1 Local Bus Fast Ethernet Controller
1.0 Introduction
1.1 General Description: The AX88796 provides industrial standard NE2000 registers level compatible instruction set. Various drivers are easy acquired, maintenance and usage. No much additional effort to be paid. Software is easily port to various embedded systems with no pain and tears The AX88796 Fast Ethernet Controller is a high performance and highly integrated local CPU bus Ethernet Controller with embedded 10/100Mbps PHY/Transceiver and 8K*16 bit SRAM. The AX88796 supports both 8 bit and 16 bit local CPU interfaces include MCS-51 series, 80186 series, MC68K series CPU and ISA bus. The AX88796 implements both 10Mbps and 100Mbps Ethernet function based on IEEE802.3 / IEEE802.3u LAN standard. The AX88796 also provides an extra IEEE802.3u compliant media-independent interface (MII) to support other media applications. Using MII interface, Home LAN PHY type media can be supported. As well as, the chip also provides optional Standard Print Port ( parallel port interface ), can be used for printer server device or treat as simple general I/O port. The chip also support upto 3/1 additional General Purpose In/Out pins The main difference between AX88796 and AX88195 are : 1) Embedded packet buffer memory 2) Built-in 10/100Mbps PHY/Transceiver 3) Replace memory I/F with PHY/Transceiver I/F. 4) Canceling SAX address decoding. 5) Fix interrupt status can’t always clean up problem of AX88195. 6) Add upto 3/1 general Purpose In/Out pins. AX88796 use 128-pin LQFP low profile package, 25MHz operation, and single 3.3V operation with 5V I/O tolerance. The ultra low power consumption is an outstanding feature and enlarges the application field. It is suitable for some power consumption sensitive product like small size embedded products, PDA (Personal Digital Assistant) and Palm size computer … etc.
1.2 AX88796 Block Diagram:
Fig - 1 AX88796 Block Diagram
MAC Core
& PHY+
Tranceiver
8K* 16 SRAM and Memory Arbiter
Remote DMA FIFOs NE2000
Registers
Host Interface
STA
SEEPROM I/F
SD[15:0] SA[9:0] Ctl BUS
MII I/F
EECS EECK EEDI EEDO
TPI, TPO
SPP / GPIO
Print Port or General I/O
SMDC SMDIO
Processors, FPGAs, and ASICs – p. 29/??
Ethernet Controller Registers
ASIX ELECTRONICS CORPORATION 32
AX88796 L 3-in-1 Local Bus Fast Ethernet Controller
5.0 Registers Operation
5.1 MAC Core Registers
All registers of MAC Core are 8-bit wide and mapped into pages which are selected by PS (Page Select) in
PAGE 0 (PS1=0,PS0=0) OFFSET READ WRITE
00H Command Register ( CR )
Command Register ( CR )
01H Page Start Register ( PSTART )
Page Start Register ( PSTART )
02H Page Stop Register ( PSTOP )
Page Stop Register ( PSTOP )
03H Boundary Pointer ( BNRY )
Boundary Pointer ( BNRY )
04H Transmit Status Register ( TSR )
Transmit Page Start Address ( TPSR )
05H Number of Collisions Register ( NCR )
Transmit Byte Count Register 0 ( TBCR0 )
06H Current Page Register ( CPR )
Transmit Byte Count Register 1 ( TBCR1 )
07H Interrupt Status Register ( ISR )
Interrupt Status Register ( ISR )
08H Current Remote DMA Address 0 ( CRDA0 )
Remote Start Address Register 0 ( RSAR0 )
09H Current Remote DMA Address 1 ( CRDA1 )
Remote Start Address Register 1 ( RSAR1 )
0AH Reserved Remote Byte Count 0 ( RBCR0 )
0BH Reserved Remote Byte Count 1 ( RBCR1 )
0CH Receive Status Register ( RSR )
Receive Configuration Register ( RCR )
0DH Frame Alignment Errors ( CNTR0 )
Transmit Configuration Register ( TCR )
0EH CRC Errors ( CNTR1 )
Data Configuration Register ( DCR )
0FH Missed Packet Errors ( CNTR2 )
Interrupt Mask Register ( IMR )
10H, 11H Data Port Data Port 12H IFGS1 IFGS1 13H IFGS2 IFGS2 14H MII/EEPROM Access MII/EEPROM Access 15H Test Register Test Register 16H Inter-frame Gap (IFG) Inter-frame Gap (IFG) 17H GPI GPOC
18H - 1AH Standard Printer Port (SPP) Standard Printer Port (SPP) 1BH - 1EH Reserved Reserved
1FH Reset Reserved
Tab - 12 Page 0 of MAC Core Registers Mapping
Processors, FPGAs, and ASICs – p. 30/??
Philips SAA7114H Video Decoder
Processors, FPGAs, and ASICs – p. 31/??
SAA7114H Registers, page 1 of 7 (!)
Processors, FPGAs, and ASICs – p. 32/??
Fixed-function: The 7400 series
2003 Jun 30 4
Philips Semiconductors Product specification
Quad 2-input NAND gate 74HC00; 74HCT00
Fig.2 Pin configuration DHVQFN14.
handbook, halfpage
1 14
1A VCC
7
2
3
4
5
6
1B
1Y
2A
2B
2Y
13
12
11
10
9
4B
4A
4Y
3B
3A
8
GND 3YMNA950
GND(1)
Top view
(1) The die substrate is attached to this pad using conductive dieattach material. It can not be used as a supply pin or input.
handbook, halfpage
MNA211
A
B
Y
Fig.3 Logic diagram (one gate).
handbook, halfpage
MNA212
1Y1A
31B
1
2
2Y2A
62B
4
5
3Y3A
83B
9
10
4Y4A
114B
12
13
Fig.4 Function diagram.
handbook, halfpage
MNA246
3&
&
&
&
2
1
65
4
810
9
1113
12
Fig.5 IEC logic symbol.
2
Functional Diagram
PinoutsCD54HC374, CD54HCT374
(CERDIP)CD74HC374, CD74HCT374
(PDIP, SOIC)TOP VIEW
CD54HC574, CD54HCT574(CERDIP)
CD74HC574, CD74HCT574(PDIP, SOIC)TOP VIEW
TRUTH TABLE
INPUTS OUTPUT
OE CP Dn Qn
L ↑ H H
L ↑ L L
L L X Q0
H X X Z
H = High Level (Steady State)L = Low Level (Steady State)X= Don’t Care↑= Transition from Low to High LevelQ0= The level of Q before the indicated steady-state input
conditions were establishedZ = High Impedance State
11
12
13
14
15
16
17
18
20
19
10
9
8
7
6
5
4
3
2
1OE
Q0
D0
D1
Q1
Q2
D3
D2
Q3
GND
VCC
D7
D6
Q6
Q7
Q5
D5
D4
Q4
CP 11
12
13
14
15
16
17
18
20
19
10
9
8
7
6
5
4
3
2
1OE
D0
D1
D2
D3
D4
D6
D5
D7
GND
VCC
Q1
Q2
Q3
Q0
Q4
Q5
Q6
Q7
CP
Q0
D0
CP
OE
Q1
D1
Q2
D2
Q3
D3
Q4
D4
Q5
D5
Q6
D6
Q7
D7
D
CP Q
D
CP Q
D
CP Q
D
CP Q
D
CP Q
D
CP Q
D
CP Q
D
CP Q
CD54/74HC374, CD54/74HCT374, CD54/74HC574, CD54/74HCT574
7400 74374Quad NAND Gate Octal D Flip-Flop
Processors, FPGAs, and ASICs – p. 33/??
top related