Lecture 7. AMBA
Post on 04-Feb-2016
Prof. Taeweon Suh
Computer Science & Engineering, Korea University
COMP427 Embedded Systems
Korea Univ

AMBA
- Advanced Microcontroller Bus Architecture
- On-chip bus protocol from ARM
- On-chip interconnect specification for the connection and management of functional blocks, including processors and peripheral devices
- Introduced in 1996
- AMBA is a registered trademark of ARM Limited
- AMBA is an open standard
(Source: Wikipedia)

AMBA History
- AMBA
  - ASB
  - APB
- AMBA 2 (1999)
  - AHB: widely used on ARM7, ARM9 and ARM Cortex-M based designs
  - ASB
  - APB2 (or APB)
- AMBA 3 (2003)
  - AXI3 (or AXI v1.0): widely used on ARM Cortex-A processors including Cortex-A9
  - AHB-Lite v1.0
  - APB3 v1.0
  - ATB v1.0
- AMBA 4 (2010)
  - ACE: widely used on the latest ARM Cortex-A processors including Cortex-A7 and Cortex-A15
  - ACE-Lite
  - AXI4
  - AXI4-Lite
  - AXI-Stream v1.0
  - ATB v1.1
  - APB4 v2.0
- Abbreviations
  - ACE: AXI Coherency Extensions
  - AXI: Advanced eXtensible Interface
  - AHB: Advanced High-performance Bus
  - ASB: Advanced System Bus
  - APB: Advanced Peripheral Bus
  - ATB: Advanced Trace Bus
(Source: Wikipedia)

ASB
(Figure omitted. Source: AMBA Specification V2.0)

ASB
(Figure: Hardware Device 0 through Hardware Device 5 and the ASB)

AHB
(Figure omitted. Source: AMBA Specification V2.0)

AHB with 3 Masters and 4 Slaves
- The H prefix indicates AHB signals
(Figure omitted. Source: AMBA Specification V2.0)

AHB Basic Transfer Example with Wait
- HREADY source: slave
(Figure: timing diagram with write data and read data. Source: AMBA Specification V2.0)

AHB Burst Transfer Example
- HREADY source: slave
(Figure omitted. Source: AMBA Specification V2.0)

AHB Split Transaction
- If a slave decides that it may take a number of cycles to obtain and provide data, it gives a SPLIT transfer response
- The arbiter then grants use of the bus to other masters
- HRESP: transfer response from the slave (OKAY, ERROR, RETRY, or SPLIT)
(Source: AMBA Specification V2.0)

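The split behaviour above can be sketched roughly in Python. This is only an illustration: the master names, the `next_grant` helper, and the fixed-priority policy are assumptions made for this example and are not taken from the slides.

```python
from enum import Enum

class HResp(Enum):
    """AHB transfer responses named on the slide (HRESP[1:0] encoding in AMBA 2)."""
    OKAY = 0b00
    ERROR = 0b01
    RETRY = 0b10
    SPLIT = 0b11

def next_grant(requesting, split_masters):
    """Grant the bus to the first requesting master that is not waiting on a SPLIT.

    Fixed priority by list order is an assumption of this sketch.
    """
    for master in requesting:
        if master not in split_masters:
            return master
    return None

requesting = ["M0", "M1", "M2"]   # hypothetical master names
split_masters = set()

first = next_grant(requesting, split_masters)
response = HResp.SPLIT            # pretend the slave cannot answer M0 quickly
if response is HResp.SPLIT:
    split_masters.add(first)      # M0 is masked until the slave is ready again
second = next_grant(requesting, split_masters)
print(first, second)
```

In this toy model the split master is simply skipped, so the bus stays usable while the slow slave prepares its data.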
APB Write/Read
(Figure omitted. Source: AMBA Specification V2.0)

AXI v1.0
- The AMBA AXI protocol is targeted at high-performance, high-frequency system designs
- AXI key features
  - Separate address/control and data phases
  - Support for unaligned data transfers using byte strobes
  - Separate read and write data channels to enable low-cost Direct Memory Access (DMA)
  - Ability to issue multiple outstanding addresses
  - Out-of-order transaction completion
  - Easy addition of register stages to provide timing closure
(Source: AMBA AXI Specification V1.0)

5 Independent Channels
- Read address channel and write address channel
  - Variable-length burst: 1 ~ 16 data transfers
  - Burst with a transfer size of 8 ~ 1024 bits (1B ~ 128B)
- Read data channel
  - Conveys data and any read response info
  - Data bus can be 8, 16, 32, 64, 128, 256, 512, or 1024 bits wide
- Write data channel
  - Data bus can be 8, 16, 32, 64, 128, 256, 512, or 1024 bits wide
- Write response channel
  - Write response info

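As a quick worked example of the burst parameters above, the following Python sketch computes how many bytes one burst moves. The helper name `burst_bytes` and its error checks are assumptions made for this example; only the numeric limits come from the slide.

```python
# Transfer sizes listed on the slide, in bits.
VALID_TRANSFER_BITS = (8, 16, 32, 64, 128, 256, 512, 1024)

def burst_bytes(beats: int, transfer_bits: int) -> int:
    """Return the total bytes moved by a burst of `beats` data transfers."""
    if not 1 <= beats <= 16:
        raise ValueError("an AXI v1.0 burst carries 1 to 16 data transfers")
    if transfer_bits not in VALID_TRANSFER_BITS:
        raise ValueError("transfer size must be one of 8, 16, ..., 1024 bits")
    return beats * (transfer_bits // 8)

print(burst_bytes(4, 32))     # 4 transfers x 4 bytes = 16
print(burst_bytes(16, 1024))  # 16 transfers x 128 bytes = 2048
```

So the largest burst under these limits moves 2048 bytes, and the smallest moves a single byte.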
AXI Read Operation
- Uses the read address channel and the read data channel
- RREADY: driven by the master; indicates that the master can accept the read data and response info
(Figure omitted. Source: AMBA AXI Specification V1.0)

AXI Write Operation
- Uses the write address channel, the write data channel, and the write response channel
- WVALID source: master
- WREADY source: slave
- BVALID source: slave
- BREADY source: master
(Figure omitted. Source: AMBA AXI Specification V1.0)

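The VALID/READY signal pairs listed above follow the AXI handshake rule: a transfer on a channel takes place in a cycle where both VALID and READY are high. A minimal Python sketch of that rule follows; the function name and the sample waveforms are made up for illustration.

```python
def handshake_cycles(valid, ready):
    """Return the cycle indices in which VALID and READY are both high."""
    return [cycle for cycle, (v, r) in enumerate(zip(valid, ready)) if v and r]

# Hypothetical write data channel waveforms, one entry per clock cycle.
wvalid = [0, 1, 1, 1, 0]   # master holds WVALID until the data is accepted
wready = [0, 0, 0, 1, 1]   # slave raises WREADY two cycles later

print(handshake_cycles(wvalid, wready))  # [3]
```

The same check applies to the write response channel with BVALID (from the slave) and BREADY (from the master).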
Out-of-order Completion
- AXI gives an ID tag to every transaction
- Transactions with the same ID are completed in order
- Transactions with different IDs can be completed out of order
(Source: AMBA AXI Specification V1.0)

ID Signals
(Table omitted: ID signals of the write address, write data, write response, read address, and read data channels. Source: AMBA AXI Specification V1.0)

Out-of-order Completion
- Out-of-order transactions can improve system performance in 2 ways
  - Fast-responding slaves can respond ahead of earlier transactions to slower slaves
  - Complex slaves can return data out of order
    - A data item for a later access might be available before the data for an earlier access
- If a master requires that transactions complete in the order in which they are issued, they must all have the same ID tag
- It is not a required feature
  - Simple masters and slaves can process one transaction at a time, in the order they are issued
(Source: AMBA AXI Specification V1.0)

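The ordering rule above (same ID in order, different IDs in any order) can be checked with a short Python sketch. The function name `is_legal_completion` and the transaction names are hypothetical and exist only for this example.

```python
from collections import defaultdict, deque

def is_legal_completion(issued, completed):
    """Check a completion order against the same-ID ordering rule.

    Both arguments are lists of (transaction_name, id_tag) pairs.
    """
    if sorted(issued) != sorted(completed):
        return False                      # something was lost or invented
    pending = defaultdict(deque)
    for name, tag in issued:
        pending[tag].append(name)         # per-ID queue in issue order
    for name, tag in completed:
        if pending[tag][0] != name:
            return False                  # overtook an older same-ID transaction
        pending[tag].popleft()
    return True

issued = [("A", 0), ("B", 1), ("C", 0)]
print(is_legal_completion(issued, [("B", 1), ("A", 0), ("C", 0)]))  # True
print(is_legal_completion(issued, [("C", 0), ("A", 0), ("B", 1)]))  # False
```

In the first case B overtakes A, which is allowed because their IDs differ. In the second case C overtakes A although both use ID 0.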
Addition of Register Slices
- AXI enables the insertion of a register slice in any channel, at the cost of an additional cycle of latency
  - Trade-off between latency and maximum frequency
- It can be advantageous to use
  - A direct and fast connection between a processor and high-performance memory
  - Simple register slices to isolate a longer path to less performance-critical peripherals
(Source: AMBA AXI Specification V1.0)

Backup Slides

A Computer System
(Figure: CPU, North Bridge, South Bridge, Main Memory (DDR2), FSB (Front-Side Bus), DMI (Direct Media I/F), hard disk, USB, PCIe card, I/O devices, graphics card)

A Typical I/O System Schematic (Simplified)
(Figure: CPU core, cache, memory controller, main memory, memory bus and I/O bus, I/O controllers, disks, graphics card, network, interrupts)

I/O Interconnection
- A bus is a shared communication link
  - A single set of wires used to connect multiple components
  - Composed of an address bus, a data bus, and a control bus (read/write)
- Advantages
  - Versatile: new devices can be added easily and can be moved between computer systems that use the same bus standard
  - Low cost: a single set of wires is shared in multiple ways
- Disadvantages
  - Communication bottleneck: bus bandwidth limits the maximum I/O throughput
- The maximum bus speed is largely limited by
  - The length of the bus
  - The number of devices on the bus

I/O Interconnection (Cont)
- I/O devices and interconnections contribute significantly to the performance of a computer system
- Traditionally, parallel shared wires have been used to connect I/O devices
- As the clock frequency for communicating with I/O devices increases, parallel shared wires suffer from clock skew and interference among wires
- Industry has transitioned from parallel shared buses to high-speed serial point-to-point interconnections

Types of Buses
- Processor-memory bus
  - Front Side Bus (FSB), a proprietary bus
    - Replaced by QPI (QuickPath Interconnect) at Intel
    - Replaced by HyperTransport at AMD
  - Short and high speed
  - Matched to the memory system to maximize memory-processor bandwidth
  - Optimized for cache block transfers
- Backplane (backbone) bus
  - Industry standard, e.g., PCI Express
  - Allows processor, memory, and I/O devices to coexist on a single bus
  - Used as an intermediary bus connecting I/O buses to the processor-memory bus
- I/O bus
  - Industry standard, e.g., SATA, USB, FireWire
  - Usually lengthy and slower
  - Needs to accommodate a wide range of I/O devices
(Figure: CPU, North Bridge, South Bridge, Main Memory (DDR2), FSB (Front-Side Bus), DMI (Direct Media I/F), hard disk, USB, and graphics card, labelled with processor-memory bus, backplane bus, and I/O bus)

How Does the CPU Access I/O Devices?
- All I/O devices implement registers, which software uses to control the devices
- So, when programming, where and how does software write to or read from them?
- There are 2 ways to access I/O devices
  - Memory-mapped I/O
  - I/O-mapped I/O
- Memory-mapped I/O
  - The I/O device is mapped into the memory space
  - The CPU generates a memory transaction to access the I/O device
  - To access the I/O device
    - In MIPS, use the lw or sw instructions
    - In x86, use the mov instruction
(Figure: 4 GB address space from 0x0 to 0xFFFF_FFFF (4GB-1); main memory (1 GB) occupies 0x0 to 0x3FFF_FFFF (1GB-1); I/O devices are mapped into the same address space)

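To make the memory-mapped address map concrete, here is a small Python sketch of an address decoder. The 1 GB main memory range follows the figure; the device names and their base addresses are hypothetical values chosen for this example.

```python
MAIN_MEMORY_END = 0x3FFF_FFFF      # 1 GB of main memory, as in the figure
ADDRESS_SPACE_END = 0xFFFF_FFFF    # 4 GB address space

# Hypothetical device windows: (first address, last address).
DEVICE_WINDOWS = {
    "uart": (0x4000_0000, 0x4000_0FFF),
    "timer": (0x4000_1000, 0x4000_1FFF),
}

def decode(address: int) -> str:
    """Return the name of the target that a memory transaction reaches."""
    if not 0 <= address <= ADDRESS_SPACE_END:
        raise ValueError("address outside the 32-bit address space")
    if address <= MAIN_MEMORY_END:
        return "main memory"
    for name, (first, last) in DEVICE_WINDOWS.items():
        if first <= address <= last:
            return name
    return "unmapped"

print(decode(0x0000_1000))  # main memory
print(decode(0x4000_0004))  # uart
```

From the CPU's point of view both accesses are ordinary loads or stores; only the address decides whether memory or a device register responds.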
How Does the CPU Access I/O Devices? (Cont)
- I/O-mapped I/O
  - I/O devices are mapped into a separate I/O space
  - The CPU generates an I/O transaction to access the I/O device
  - To access the I/O device
    - In x86, use the in and out instructions
    - In x86, the I/O space is 64KB
- Hardware support is needed to differentiate the memory space from the I/O space
  - ISA support
    - In x86, the mov instruction generates a memory transaction, and the in and out instructions generate I/O transactions
  - A physical pin on the processor indicating the transaction type (memory or I/O)
    - For example, the pin is driven to 1 for a memory transaction and to 0 for an I/O transaction
(Figure: 64KB I/O space from 0x0 to 0xFFFF (64KB-1) containing I/O devices)

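The role of the memory/I-O pin can be sketched in Python as follows. The pin polarity (1 for memory, 0 for I/O) follows the example on the slide; the function name `route` and the sample address are assumptions for illustration.

```python
IO_SPACE_SIZE = 64 * 1024   # x86 I/O space is 64KB

def route(address: int, mem_io_pin: int):
    """Pick the address space for a transaction from the pin value.

    mem_io_pin = 1 means a memory transaction, 0 means an I/O transaction.
    """
    if mem_io_pin == 1:
        return ("memory", address)
    if not 0 <= address < IO_SPACE_SIZE:
        raise ValueError("I/O address must fit in the 64KB I/O space")
    return ("io", address)

# The same numeric address reaches different targets depending on the pin.
print(route(0x60, 1))  # ('memory', 96)
print(route(0x60, 0))  # ('io', 96)
```

This is why the two spaces can overlap numerically without conflict: the transaction type, not the address, selects the space.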
How Does I/O Communicate with the CPU?
- Polling
  - The CPU periodically checks the status of I/O devices to determine whether they need service
  - The CPU is totally in control
  - Can waste a lot of CPU time because of the speed difference between the CPU and I/O devices
- Interrupt
  - The I/O device issues an interrupt to indicate that it needs attention
  - An I/O interrupt is asynchronous with respect to instruction execution
    - It is not associated with any instruction, so it doesn't prevent any instruction from completing
    - The processor can pick a convenient point in the pipeline to handle the interrupt

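The cost of polling can be illustrated with a toy Python model. The `SlowDevice` class and its `ready_after` parameter are invented for this sketch and do not model any real device.

```python
class SlowDevice:
    """Toy device whose status turns ready only after a number of checks."""

    def __init__(self, ready_after: int):
        self._checks_left = ready_after

    def status_ready(self) -> bool:
        if self._checks_left > 0:
            self._checks_left -= 1
            return False
        return True

def poll_until_ready(device: SlowDevice) -> int:
    """Busy-wait on the status register and count the wasted checks."""
    wasted = 0
    while not device.status_ready():
        wasted += 1
    return wasted

print(poll_until_ready(SlowDevice(ready_after=1000)))  # 1000
```

Every failed check is CPU time that did no useful work. With an interrupt, the CPU would run other code until the device signals that it needs attention.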
DMA (Direct Memory Access)
- Typically, moving data from one place to another involves CPU instructions
  - Load (lw) from a location (e.g., memory in an I/O device)
  - Store (sw) to another location (e.g., main memory)
  - Moving a large chunk of data with CPU instructions could take a large fraction of CPU time
- DMA can transfer large blocks of data directly to and from memory without involving the processor
  - The processor initiates the DMA transfer by supplying the source address, the destination address, and the number of bytes to transfer
  - The DMA controller manages the entire transfer (possibly thousands of bytes in length), arbitrating for the bus
  - When the DMA transfer is complete, the DMA controller interrupts the processor to report completion
- There may be multiple DMA devices in one system
  - The processor and the DMA controllers contend for bus cycles and for memory

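The DMA sequence above (program the transfer, let the controller copy, receive a completion interrupt) can be sketched in Python. The `DmaController` class, its `start` method, and the callback standing in for the interrupt are all assumptions of this toy model; bus arbitration is not modelled.

```python
class DmaController:
    """Toy DMA controller operating on one shared memory array."""

    def __init__(self, memory: bytearray, on_complete):
        self.memory = memory
        self.on_complete = on_complete   # stands in for the interrupt line

    def start(self, src: int, dst: int, nbytes: int) -> None:
        """Copy nbytes from src to dst, then signal completion."""
        size = len(self.memory)
        if min(src, dst, nbytes) < 0 or src + nbytes > size or dst + nbytes > size:
            raise ValueError("transfer falls outside memory")
        # The controller moves the whole block; the processor is not involved.
        self.memory[dst:dst + nbytes] = self.memory[src:src + nbytes]
        self.on_complete(nbytes)

memory = bytearray(64)
memory[0:4] = b"AMBA"
interrupts = []                          # records completion "interrupts"

dma = DmaController(memory, interrupts.append)
dma.start(src=0, dst=32, nbytes=4)       # processor supplies the 3 parameters
print(bytes(memory[32:36]), interrupts)  # b'AMBA' [4]
```

The processor's only work here is the call to `start` and handling the completion callback, which mirrors the division of labour described above.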
Note: an AXI channel transfers information in one direction only.