on-chip interconnect protocols - telecom paris€¦ · an example soc : ti omap 5432 an example soc...
TRANSCRIPT
On-Chip InterconnectProtocols
Tarik Graba, Sumanta Chaudhuri<[email protected]>
PlanWhy Protocols ?
BackgroundHandshakeBurst SignalingBlocking & Non-Blocking Transfer
AXIAXI ChannelsAXI HighlightsAXI3/AXI4 DifferencesAXI/OCP Differences
Protocol Compliance
References
2/65 COMELEC Tarik Graba, Sumanta Chaudhuri
An Example SoC : TI OMAP 5432
An Example SoC from TI.OMAP Used in various mobiles/tablets. e.g Amazon
Kindle (OMAP 4430).
3/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AD
VA
NC
E IN
FO
RM
AT
ION
1/MMC2MMC
IPU subsystemDisplay subsystem
IVA HD1080p
- subsystemABE subsystem
SS USB DRD
GPMC
NAND/NOR/PSRAMcontroller
System DMA
USB .0controller
3
64 bits
L4
_C
FG
in
terc
on
ne
ct
L4
_W
KU
Pin
terc
on
ne
ct
L4
_P
ER
in
terc
on
ne
ct
4 x 32 bits
DSP subsystem ISS
- TIMER- GPIO-- SCRM
COUNTER_32K
- _TIMERWD- odule- SAR RAM
General Wakeup Control m
L2 controller + SCUL2 :ROM: 48 K B
: 1 req
cache + MAcache 2 MiB
iGIC 60 .
C64+ (4 issues)Core: 32 b fixed pts.- it
L1: 32 K B cacheL2: 128 K B cache
: 128 reqECM // MMU
DMA (128 ch )
i sharedi shared
INTC .
e .
IVA
HD
n
terc
on
ne
ct
i(3
2 b
its
)
- Sys trl, Acc engines,- Filters, Msg (16 bits)- Seq: ARM968 w/mem- , Mailbox
C .IF
INTC
Shared L2 IF +SL2: 256 K Bi
IPU_C0- 4Cortex M
IPU_C1- 4Cortex M
L1 32 K BMMU/CTM
L2 64 K B RAM / 16 K B ROM
Emulation features
i shared cacheand
i i
SL2
L4-A
BE
n
terc
on
nect
(32 b
its)
i
- 3x M BSP- 1x M PDM- x M ASP- 1x DMIC- 4x TIMER- 1x WD T
CC
1 C
_ IMER
SIMCOP
ISP5
TCTRL and Parallel IFIF ,Serial : CSI2 CCP2
VID1,2,3
GFX,Write back
pipelines
LCD and TVoverlays
SGX544dual-core
3D graphics
GPUsubsystemAESS
L4_CFG
L4 CFG_
1MIFE
DDR3
DMM ( plitter)
sand Tiler
SL2IF
- General Core Control m
b
odule- Spinlock- Mail ox
CM_CORE_AON + profiler
3
1
3
L4_CFG32 bits
- 5x I2C
- 1x HDQ1W
- 4x MCSPI
- 5x UART (1x RDA)I
- 6x TIMER
- 7x GPIO
- 2x MMC
33
3ToMMC1
MMC2DSS
7
8
To HSI,System DMA,
USBUSB H ,
Shared OCP WP,USB TLL
SS DRD,HS ost
HS ,SATA
- EFUSE_CTRL + FROM- OCP2SCP x 2
1To DSP subsystem
From PRM(profiler port)
From CM1(profiler port)
From ebugd subsystem
To L3_INSTR_EMU
MPUMaster 0
EMIF
L3 interconnect_MAIN
DMA
FaceDetect(FDIF)
FDcore
To L3_INSTR
1To FDIF (face detect)
DS
S n
terc
on
nect
i (32 b
its)
DSP_SS nterconnect(128 bits)
i
PRM + profiler1
RD
po
rt
WR
po
rt
ISS megacell
KBD1
Cortex AC0- 15
MPU_
I:32KiB D:32KiB
3-portUSBHS
Host
OHCI
3-port
USBTLL
HS
- DMA
- RFBI
- HDMI video enc- MIPI DSI Ctrl / Mem
x2
HS
IC
HSIC
ELM
1
MPU subsystem
SmartReflex
Cortex AC- 15
MPU_ 1
MPUMaster 1
128
bits
128
bits
128
bits
DM
M 1
DM
M 2
32 bits
32 bits32 bits 64 bits 128 bits 128 bits 32 bits 128 bits 32 bits
Ii
SS
n
terc
on
nect
(32 b
its)
I iSS nterconnect( bits)128
64 bits 64 bits 64 bits 128 bits
64 b
its
64 bits
32 bits
32 bits
32 bits
L4_CFG32 bits
5432-swpu289-001
L4_CFG32 bits
BTE
CBUFF
32 bits 32 bits 128 bits
L4_CFG32 bits
L4_32 bits
PER
USB2PHY
Other modules:
- OCP2SCP_DEBUGSS
- CT_UART / CT_CTA
- CT_CTI_DEBUGSS
L4_CFG_EMU
32 bits
To
MPU SS
L3_INSTR
OCP_WP_NOC
L3_INSTR_EMUCS_DAP
CS_TPIU
CS_ETB
32 KiB RAM DRM
CT_STM128 x 48RAM
XT
RIG
ICEPick
IEEE1149.7adapter
TAPs
To CSTF
To L3_INSTR_EMU
From CM(profiler port)
2FromIVA-HD
ICEmelter
32 bits
Perf
orm
ance M
onitoring
2MIFE
CM_CORE + profiler1
ToL3_INSTR
MPUMaster
EMIF1
L3OCM_RAM
(128 KiB)
32 bits
I:32KiB D:32KiB
88 iRAM: K B
127 requests256 x 64-bit FIFO
32 channels
USB3PHY
EHCI
1-portHSI
32 bits
L4_32 bits
PER
32 bits
L4_CFG32 bits
From DSP
CS_TF_DEBUGSS_1
CS_REP_DEBUGSS_1
CT_TBR32 KiBRAM
CS_REP_DEBUGSS_2
From MPU
From L3_MAIN
- CT_CTM_DEBUGSS
DEBUGSS
DDR3
SATA
GC3202D graphics
BB2Dsubsystem
32
bits
128
bits
128
bits
3
ECCN: 3E991
OMAP5432www.ti.com SWPS051F –MAY 2013–REVISED APRIL 2014
Figure 1-1. OMAP5432 Block Diagram
Copyright © 2013–2014, Texas Instruments Incorporated Device Overview 3Submit Documentation FeedbackProduct Folder Links: OMAP5432
An Example SoC : TI OMAP 5432
An Example SoC from TI.• Used in various mobiles/tablets. e.g Amazon Kindle (OMAP
4430).
• Multiple IPs from multiple vendors on the same chip.• Each IP have different frequency, data width, addressing,
Bandwidth/Latency Requirements etc. etc.• How do they communicate ?• Needs On-Chip Interconnect Protocols.
5/65 COMELEC Tarik Graba, Sumanta Chaudhuri
An Example SoC : TI OMAP 5432
An Example SoC from TI.• Used in various mobiles/tablets. e.g Amazon Kindle (OMAP
4430).• Multiple IPs from multiple vendors on the same chip.
• Each IP have different frequency, data width, addressing,Bandwidth/Latency Requirements etc. etc.
• How do they communicate ?• Needs On-Chip Interconnect Protocols.
5/65 COMELEC Tarik Graba, Sumanta Chaudhuri
An Example SoC : TI OMAP 5432
An Example SoC from TI.• Used in various mobiles/tablets. e.g Amazon Kindle (OMAP
4430).• Multiple IPs from multiple vendors on the same chip.• Each IP have different frequency, data width, addressing,
Bandwidth/Latency Requirements etc. etc.
• How do they communicate ?• Needs On-Chip Interconnect Protocols.
5/65 COMELEC Tarik Graba, Sumanta Chaudhuri
An Example SoC : TI OMAP 5432
An Example SoC from TI.• Used in various mobiles/tablets. e.g Amazon Kindle (OMAP
4430).• Multiple IPs from multiple vendors on the same chip.• Each IP have different frequency, data width, addressing,
Bandwidth/Latency Requirements etc. etc.• How do they communicate ?
• Needs On-Chip Interconnect Protocols.
5/65 COMELEC Tarik Graba, Sumanta Chaudhuri
An Example SoC : TI OMAP 5432
An Example SoC from TI.• Used in various mobiles/tablets. e.g Amazon Kindle (OMAP
4430).• Multiple IPs from multiple vendors on the same chip.• Each IP have different frequency, data width, addressing,
Bandwidth/Latency Requirements etc. etc.• How do they communicate ?• Needs On-Chip Interconnect Protocols.
5/65 COMELEC Tarik Graba, Sumanta Chaudhuri
History : Shared Bus
All masters are connectedto the same set of wires.A bus controller will issuetoken to masters to utilizethe bus.One master blocks the bus.Huge capacitative loading..Not suitable for highperformance applications.
6/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Modern Interconnect : Circuit SwitchedNetwork
7/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Modern Interconnect : Circuit SwitchedNetworkStratégie d’arbitrage
The arbiter switches from one master to another using apre-defined strategy.
Round Robin.Priority Base.Time-out Based.
8/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Modern Interconnect : Packet SwitchedNetwork
Transactions aretransformed into networkpackets.Packets will follow anavailable route to the slave.Packet sizes and networktopology are determinedfrom applicationrequirements.
9/65 COMELEC Tarik Graba, Sumanta Chaudhuri
On-Chip Protocols : Point to PointDissociates network implementation from IP interface.Helps in plug’n play.Only the interface needs to be protocol compliant.
10/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Design Goals for Modern Protocols
Design Reuse, Plug’n play.
Support of High Bandwidth/ Low Latency Traffic.Point-to-Point Protocols. ( As opposed to shared bus)Pipelined/Non-Blocking. Can have multiple outstandingrequests.Be suitable for DRAM Traffic. (Out-of-order data, initialaccess latency)
11/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Design Goals for Modern Protocols
Design Reuse, Plug’n play.Support of High Bandwidth/ Low Latency Traffic.
Point-to-Point Protocols. ( As opposed to shared bus)Pipelined/Non-Blocking. Can have multiple outstandingrequests.Be suitable for DRAM Traffic. (Out-of-order data, initialaccess latency)
11/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Design Goals for Modern Protocols
Design Reuse, Plug’n play.Support of High Bandwidth/ Low Latency Traffic.Point-to-Point Protocols. ( As opposed to shared bus)
Pipelined/Non-Blocking. Can have multiple outstandingrequests.Be suitable for DRAM Traffic. (Out-of-order data, initialaccess latency)
11/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Design Goals for Modern Protocols
Design Reuse, Plug’n play.Support of High Bandwidth/ Low Latency Traffic.Point-to-Point Protocols. ( As opposed to shared bus)Pipelined/Non-Blocking. Can have multiple outstandingrequests.
Be suitable for DRAM Traffic. (Out-of-order data, initialaccess latency)
11/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Design Goals for Modern Protocols
Design Reuse, Plug’n play.Support of High Bandwidth/ Low Latency Traffic.Point-to-Point Protocols. ( As opposed to shared bus)Pipelined/Non-Blocking. Can have multiple outstandingrequests.Be suitable for DRAM Traffic. (Out-of-order data, initialaccess latency)
11/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Some On-Chip Interconnect Protocols
AMBA (Advanced Microcontroller Bus Architecture)Open standard maintained by ARM.
OCP-IP (Open Core Protocol International Partnership)Open standard developed by an industryconsortium (TI, NXP, etc. etc.)
Other standards include CoreConnect, VCI (VirtualComponent Interface), STBus etc. etc.
12/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Quiz
What are the differences between On-Chip Networks &Off-chip Networks ?
13/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Plan
We will discuss some basic concepts : Handshake,Memory-Maps etc.We will discuss the AXI3 Standard first.We will see the differences between AXI3 & AXI4.We will see the differences between AXI OCP.
14/65 COMELEC Tarik Graba, Sumanta Chaudhuri
PlanWhy Protocols ?
BackgroundHandshakeBurst SignalingBlocking & Non-Blocking Transfer
AXIAXI ChannelsAXI HighlightsAXI3/AXI4 DifferencesAXI/OCP Differences
Protocol Compliance
References
15/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Basics : Clocked Interface
16/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Basics : VALID/READY Handshake
A clocked interface can’t bestalled.Handshake mechanism toinclude stalling behavior.
17/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Basics : VALID/READY Handshake
18/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Basics : VALID/READY Handshake
19/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Quiz
What are the pros/cons for a valid/ready interfacecompared to clock interface ?When would use handshake and when would you useclocked interface ?
20/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Bursts
Successive writes/reads to a predefined address pattern.e.g incremental (one dimensional), 2D Bursts.MRMD : Multiple Request Multiple Data.SRMD : Single Request Multiple Data.
21/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Incremental Burst :MRMD
Timing Diagrams 171
OCP-IP Confidential
Figure 30 Incrementing Precise Burst Read
SequenceA. The master starts a read request by driving RD on MCmd, a valid address
on MAddr, four on MBurstLength, INCR on MBurstSeq, and asserts MBurstPrecise. MBurstLength, MBurstSeq and MBurstPrecise must be kept constant during the burst. MReqLast must be deasserted until the last request in the burst. The slave is ready to accept any request, so it asserts SCmdAccept.
B. The master issues the next read in the burst. MAddr is set to the next word-aligned address (incremented by 4 in this case). The slave captures the address of the first request and keeps SCmdAccept asserted.
C. The master issues the next read in the burst. MAddr is set to the next word-aligned address (incremented by 4 in this case). The slave captures the address of the second request and keeps SCmdAccept asserted. The slave responds to the first read by driving DVA on SResp and the read data on SData.
Clk
MAddr
MCmd
MData
SCmdAccept
SResp
SData
IDLE
0x01
RD1 RD2
0x83 0xC4
RD4
DVA4 NULL
D4
IDLE
DVA1
D1
0x42
RD3
DVA2
D2 D3
DVA3
MBurstLength 4 4 44
NULL
Req
ue
st
Ph
ase
Res
pon
se
Ph
ase
2 3 4 5 6 71
MBurstSeq INCR INCR INCRINCR
MBurstPrecise
MReqLast
SRespLast
D E F GA B C
22/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Wrap Burst :MRMD
174 Open Core Protocol Specification
OCP-IP Confidential
10.12 Wrapping Burst ReadFigure 32 illustrates a burst of four 32-bit words, wrapping burst read, with optional burst framing information (MReqLast/SRespLast). MReqLast flags the last request of the burst and SRespLast flags the last response of the burst. As a wrapping burst is precise, the MBurstLength signal is constant during the whole burst, and must be power of two. The wrapping burst address must be aligned to boundary MBurstLength times the OCP word size in bytes.
Figure 32 Wrapping Burst Read
SequenceA. The master starts a read request by driving RD on MCmd, a valid address
on MAddr, four on MBurstLength, WRAP on MBurstSeq, and asserts MBurstPrecise. MBurstLength, MBurstSeq, and MBurstPrecise must be kept constant during the burst. MReqLast must be deasserted until the last request in the burst. The slave is ready to accept any request, so it asserts SCmdAccept.
Clk
MAddr
MCmd
MData
SCmdAccept
SResp
SData
IDLE
0x81
RD1 RD2
0x03 0x44
RD4
DVA4 NULL
D4
IDLE
DVA1
D1
0xC2
RD3
DVA2
D2 D3
DVA3
MBurstLength 4 4 44
NULL
Re
qu
est
P
ha
seR
esp
on
se
Ph
ase
2 3 4 5 6 71
MBurstSeq WRAP WRAP WRAPWRAP
MBurstPrecise
MReqLast
SRespLast
D E F GA B C
23/65 COMELEC Tarik Graba, Sumanta Chaudhuri
SRMD Burst
Timing Diagrams 179
OCP-IP Confidential
Figure 35 Single Request Burst Read
C. The master captures the first response data. The slave issues the second response.
D. The master captures the second response data. The slave issues the third response.
E. The master captures the third response data. The slave issues the fourth response, and asserts SRespLast to indicate the last response of the burst.
F. The master captures the last response data.
Clk
MAddr
MCmd
MData
SCmdAccept
SResp
SData
IDLE
0x01
RD1
DVA4 NULL
IDLE
DVA1
D1
DVA3
D3
MBurstLength 4
NULL
Req
uest
P
hase
Res
pon
se
Pha
se
MBurstSeq INCR
MBurstSingleReq
MBurstPrecise
SRespLast
1 2 3 4 5 6
DVA2
D2
FC D E
D4
A B
24/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Quiz
A processor missed a data in the cache and wants to acache-line fill. Which type of burst shall it use ?What is cache line size for common processorsARM/MIPS ?If the data width of the interface is 128 bits, how manycycles (minimum) are required for a cache line fill ?Which one is better ? SRMD or MRMD ?
25/65 COMELEC Tarik Graba, Sumanta Chaudhuri
DRAM TrafficHigh Initial Latency
Exemple d’une SDRAM
DRAM is the main performance bottleneck in anembedded system.Protocols are designed for efficient utilization of DRAM.DRAM response can come out of order, has high initiallatency.26/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Blocking & Non-Blocking
Blocking :• Master doesn’t emit a new request until the previous is
finished.• Max. Outstanding Reads=1.• e.g AMBA (ARM Advanced Microcontroller Bus
Architecture) (APB,AHB . . .), WishboneNon-Blocking :
• Master emits N requests before waiting for the responses.• Max. Outstanding Reads=N.• e.g AXI,OCP,VCI.
27/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Blocking Read
Timing Diagrams 167
OCP-IP Confidential
10.7 Non-Pipelined ReadFigure 27 shows three read transfers to a slave that cannot pipeline responses after requests. This is the typical behavior of legacy computer bus protocols with a single WAIT or ACK signal. In each transfer, SCmdAccept is asserted in the same cycle that SResp is DVA. Therefore, the request-to-response latency is always 0, but the request accept latency varies from 0 to 2.
Figure 27 Non-Pipelined Read
SequenceA. The master starts the first read request, driving RD on MCmd and a valid
address on MAddr. The slave asserts SCmdAccept, for a request accept latency of 0. When the slave sees the read command, it responds with DVA on SResp and valid data on SData. (This requires a combinational path in the slave from MCmd, and possibly other request phase fields, to SResp, and possibly other response phase fields.)
B. The master launches another read request. It also sees that SResp is DVA and captures the read data from SData. The slave is not ready to respond to the new request, so it deasserts SCmdAccept.
C. The master sees that SCmdAccept is low and extends the request phase. The slave is now ready to respond in the next cycle, so it simultaneously asserts SCmdAccept and drives DVA on SResp and the selected data on SData. The request accept latency is 1.
D. Since SCmdAccept is asserted, the request phase ends. The master sees that SResp is now DVA and captures the data.
E. The master launches a third read request. The slave deasserts SCmdAccept.
Clk
MAddr
MCmd
MData
SCmdAccept
SResp
SData
IDLE
NULL
A1
RD1 IDLERD2
A2 A3
IDLERD3
DVA1 NULL NULLDVA2 DVA3NULL
D1 D2 D3
Req
ues
t P
ha
seR
espo
nse
Ph
ase
2 3 4 5 6 71 8
A B C D E F G
28/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Non-Blocking Read
168 Open Core Protocol Specification
OCP-IP Confidential
F. The slave asserts SCmdAccept after 2 cycles, so the request accept latency is 2. It also drives DVA on SResp and the read data on SData.
G. The master sees that SCmdAccept is asserted, ending the request phase. It also sees that SResp is now DVA and captures the data.
10.8 Pipelined Request and ResponseFigure 28 shows three read transfers using pipelined request and response semantics. In each case, the request is accepted immediately, while the response is returned in the same or a later cycle.
Figure 28 Pipelined Request and Response
SequenceA. The master starts the first read request, driving RD on MCmd and a valid
address on MAddr. The slave asserts SCmdAccept, for a request accept latency of 0.
B. Since SCmdAccept is asserted, the request phase ends. The slave responds to the first request with DVA on SResp and valid data on SData.
C. The master launches a read request and the slave asserts SCmdAccept. The master sees that SResp is DVA and captures the read data from SData. The slave drives NULL on SResp, completing the first response phase.
Clk
MAddr
MCmd
MData
SCmdAccept
SResp
SData
IDLE
NULL
A1
RD1 IDLE
A3
RD3
DVA3 NULL
D3
IDLE
DVA1
D1
A2
RD2
DVA2
D2
NULL
2 3 4 5 6 71R
eque
st
Ph
ase
Res
pons
e
Ph
ase
B C D E F GA
NULL
29/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Out-of-Order Execution
30/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Basics : Memory-Mapped Slaves
31/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Basics : REQUEST & RESPONSE PATH
32/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Quiz
An IP has a databus width of 128 bits. The average readlatency from IP to DRAM is 32 cycles. In the return paththe IP needs an average bandwidth of 16 bytes/cycle. Howmany outstanding read requests should it issue ?Is there a maximum limit on no. of outstanding reads ?
33/65 COMELEC Tarik Graba, Sumanta Chaudhuri
PlanWhy Protocols ?
BackgroundHandshakeBurst SignalingBlocking & Non-Blocking Transfer
AXIAXI ChannelsAXI HighlightsAXI3/AXI4 DifferencesAXI/OCP Differences
Protocol Compliance
References
34/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AMBA History
35/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 WRITE ADDRESS CHANNEL
36/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 WRITE ADDRESS CHANNEL
37/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 WRITE DATA CHANNEL
38/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 WRITE DATA CHANNEL
39/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 WRITE RESPONSE CHANNEL
40/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 WRITE RESPONSE CHANNEL
41/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 WRITE BURST
42/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 READ ADDRESS CHANNEL
43/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 READ RESPONSE CHANNEL
44/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 READ RESPONSE CHANNEL
45/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 READ BURST
46/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI3 READ BURST
47/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI HIGHLIGHTS : Combinatorial Paths
There must be no combinatorial paths between input andoutput signals on both master and slave interfaces.
48/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI HIGHLIGHTS : Reset
The AXI protocol includes a single active LOW resetsignal, ARESETn. The reset signal can be assertedasynchronously, but deassertion must be synchronousafter the rising edge of ACLK.To avoid metastability associated with asynchronousde-assertion of Reset.
49/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI HIGHLIGHTS : Ordering Rules
Key to Out-of-Order Transactions Processing.At a master interface, read data with the same ARID valuemust arrive in the same order in which the master issuedthe addresses.In a sequence of read transactions with different ARIDvalues, the slave can return the read data in a differentorder than that in which the transactions arrived.
50/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI HIGHLIGHTS : Write Data Interleaving
Write Data with different AWIDs can be interleaved.Not supported anymore in AXI4.
51/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI HIGHLIGHTS : 4K Crossing
Bursts must not cross 4KB boundaries.This is enforced so that a burst doesn’t cross over toanother slave.The slave memory maps has to be aligned to 4KBboundaries.
52/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI HIGHLIGHTS : Special Transactions
Cache Related Transactions. Used to indicate cacheallocate and write policy.write policy
• write-back : Processor writes data to cache. Cache writesthe data back to main memory, when the system bus is free.
• write-through : The processor write is complete only whenthe data has reached both main memory and cache..
allocate policy :• Cacheable• Buferable.• Allocate/ Don’t allocate a cache line.
Total 16 different cache related special transactions.(AxCACHE[3 :0]) x=RD/WR
53/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI HIGHLIGHTS : Special Transactions
To indicate protected accesses. AxPROT[1 :0]Exclusive Access
• implemented in the slave.• Signalled by AxLOCK[1 :0], BRESP[1 :0],RRESP[1 :0].• e.g an exclusive write is successful (denoted by EXOKAY) if
no other master has written to the address space betweenprevious read and the write to the same location.
Locked Access• implemented in the the network.• Signalled by AxLOCK[1 :0].• The network guarantees that only the master is allowed
access to a region, until an unlocked access comes fromthe same master.
54/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Features not supported in AXI4
Locked transaction is no longer supported.WID signal disappears. Write Data interleaving is no longersupported.
55/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Additional Features in AXI4
Burstlength for incremental bursts can be 256 (16 forAXI3).Additional signal AxQOS. (x=RD/WR) 4 bit signal, highervalue indicates higher priorityAdditional signal AxREGION. (x=RD/WR). Each memorymapped slave interface can be divided into regions (logicalinterfacs).Additional signal AxUSER. (x=RD/WR).
56/65 COMELEC Tarik Graba, Sumanta Chaudhuri
AXI/OCP Differences
No Separate WR/RD channel like AXI.A common command channel which issue both write/readcommands.OCP can have several of the command channels inparallel. For multi-threading.Supports posted writes. (i.e no response required)
57/65 COMELEC Tarik Graba, Sumanta Chaudhuri
QuizIs there a Bug ?
58/65 COMELEC Tarik Graba, Sumanta Chaudhuri
PlanWhy Protocols ?
BackgroundHandshakeBurst SignalingBlocking & Non-Blocking Transfer
AXIAXI ChannelsAXI HighlightsAXI3/AXI4 DifferencesAXI/OCP Differences
Protocol Compliance
References
59/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Executable Specifications.
AXI specifications are available as a set of SystemVerilogAssertions.Can be instantiated in the netlist for protocol checking.Can be synthesized into FPGA/Emulators.
60/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Constrained random verfication.
Use VIPs(Verification IPs).Generate random AXI Traffic from a BFM (Bus FunctionalMaster) towards IP interface.Constrain the traffic according to IP specs.Functional Coverage.
• Need to check all possible combination of validtransactions.
• e.g check burst length =(1,2,4,8) in each page of 4K in a1M memory space. Need to hit 4 × 256 different cases.
• Need to check that no rule is violated for all combinations.
No Tape-out without 100% coverage.
61/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Constrained random verfication.
62/65 COMELEC Tarik Graba, Sumanta Chaudhuri
Assertion based verfication.
Use ABVIPs (Assertion Based Verification IPs).Can be used in Formal Verification Tools. (e.g IncisiveFormal Verifier)In formal verification, functional coverage is 100% bydefinition.
63/65 COMELEC Tarik Graba, Sumanta Chaudhuri
PlanWhy Protocols ?
BackgroundHandshakeBurst SignalingBlocking & Non-Blocking Transfer
AXIAXI ChannelsAXI HighlightsAXI3/AXI4 DifferencesAXI/OCP Differences
Protocol Compliance
References
64/65 COMELEC Tarik Graba, Sumanta Chaudhuri
ReferencesAmba specs.http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.set.amba/index.html.
Cadence amba vips.http://www.cadence.com/downloads/rl/vip/AMBA.pdf.
Ocp-ip specs.http://www.ocpip.org/uploads/dynamic_areas/Xu4qydXgbYWof7Ihz3Uh/947/Open%20Core%20Protocol%20Specification%203.0.pdf.
Sudeep Pasricha and Nikil Dutt.On-Chip Communication Architectures : System on ChipInterconnect.Morgan Kaufmann Publishers Inc., San Francisco, CA,USA, 2008.
65/65 COMELEC Tarik Graba, Sumanta Chaudhuri